| Excite was one of the popular Web search engines of the 1990s. The data used in our analysis were made available to the public by the original researchers, who logged these queries on a single day in 1999 and a single day in 2001. |
| Exhibit A – Query Characteristics and Information Needs |
| Web queries submitted to general search engines appear to cover different topics than queries submitted to the academic Web site or the health information site. Popular queries are ranked by frequency (number of occurrences). |
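| The frequency ranking described above can be sketched as follows. This is a minimal illustration, not the procedure used in the analysis: the function name `rank_queries`, the sample queries, and the case and whitespace normalization are all assumptions. |

```python
from collections import Counter


def rank_queries(queries, top_n=10):
    """Return up to top_n (query, count) pairs, most frequent first."""
    # Assumption: queries that differ only in case or spacing count as the same query.
    normalized = [" ".join(q.lower().split()) for q in queries]
    # Drop empty submissions so they do not appear in the ranking.
    normalized = [q for q in normalized if q]
    return Counter(normalized).most_common(top_n)


# Hypothetical log lines, invented for illustration only.
sample_log = ["free music", "Free  Music", "weather", "free music", "maps"]
top_queries = rank_queries(sample_log, top_n=2)
```

| Here `top_queries` holds the two most frequent normalized queries with their occurrence counts. |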
| Exhibit B – Corpus-Linguistic Characteristics and Information Needs |
| The corpus-linguistic analysis is based on unique queries (identical queries are counted once), treated as representations of information needs. A query, regardless of how many times it was submitted by different searchers or by the same searcher, is treated as a representation of one information need. We identified the user vocabulary and the co-occurrences of words. |
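| A rough sketch of this step is given below. It assumes whitespace tokenization and case folding, and it counts each term and each word pair once per unique query; the function name `corpus_stats` and the sample queries are hypothetical and are not taken from the original analysis. |

```python
from collections import Counter
from itertools import combinations


def corpus_stats(queries):
    """Return (unique_queries, vocabulary, cooccurrence) for a list of raw queries."""
    # Collapse repeated submissions: each distinct query stands for one information need.
    unique_queries = {" ".join(q.lower().split()) for q in queries}
    unique_queries.discard("")

    vocabulary = Counter()    # term -> number of unique queries containing it
    cooccurrence = Counter()  # (term_a, term_b) -> number of unique queries containing both
    for query in unique_queries:
        # Sort the distinct terms so each pair is always stored in the same order.
        terms = sorted(set(query.split()))
        vocabulary.update(terms)
        cooccurrence.update(combinations(terms, 2))
    return unique_queries, vocabulary, cooccurrence


# Hypothetical queries, invented for illustration only.
unique, vocab, pairs = corpus_stats(
    ["free music", "free music", "Free Music", "free games"]
)
```

| In this sample, the three submissions of "free music" collapse into one unique query, so the term "free" is counted twice (once per unique query) rather than four times. |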
| Exhibit C – Interaction Behaviors within a Session |
| The Excite query corpus does include user IDs as well as the pages viewed. However, setting session boundaries still requires a preset cutoff value in order to analyze how a searcher initiated and reiterated a search session. Quantitative analysis clusters interaction behaviors based on variables such as Session Size (the number of submitted queries), Query Length (the average number of terms per query), Term Popularity (the average frequency of terms based on corpus frequency), Term Use (the average frequency of usage of each query term within the session), Query Interval (the average time between consecutive queries), and Pages Viewed (the average number of pages requested per query within a session). |
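| The cutoff-based session splitting and three of the variables above (Session Size, Query Length, Query Interval) might be computed roughly as follows. The record layout `(user_id, timestamp_seconds, query)`, the 30-minute cutoff, and the function names are assumptions made for illustration; the source does not state the cutoff value that was used. |

```python
from collections import defaultdict

# Hypothetical cutoff: the source does not say which value was chosen.
CUTOFF_SECONDS = 30 * 60


def split_sessions(records, cutoff=CUTOFF_SECONDS):
    """Group (user_id, timestamp_seconds, query) records into per-user sessions.

    A new session starts whenever the gap between two consecutive
    queries from the same user exceeds the cutoff.
    """
    by_user = defaultdict(list)
    for user_id, timestamp, query in records:
        by_user[user_id].append((timestamp, query))

    sessions = []
    for user_id, events in by_user.items():
        events.sort()  # chronological order within each user
        current = [events[0]]
        for previous, event in zip(events, events[1:]):
            if event[0] - previous[0] > cutoff:
                sessions.append((user_id, current))
                current = []
            current.append(event)
        sessions.append((user_id, current))
    return sessions


def session_metrics(events):
    """Compute Session Size, Query Length, and Query Interval for one session."""
    times = [timestamp for timestamp, _ in events]
    lengths = [len(query.split()) for _, query in events]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "session_size": len(events),
        "query_length": sum(lengths) / len(lengths),
        # A single-query session has no interval to average.
        "query_interval": sum(gaps) / len(gaps) if gaps else None,
    }


# Hypothetical records, invented for illustration only.
records = [
    ("u1", 0, "cheap flights"),
    ("u1", 60, "cheap flights paris"),
    ("u1", 10000, "weather"),
    ("u2", 5, "news"),
]
sessions = split_sessions(records)
metrics = [session_metrics(events) for _, events in sessions]
```

| With these sample records, user u1 is split into two sessions because the third query arrives well after the cutoff. Term Popularity, Term Use, and Pages Viewed are left out of the sketch because they need corpus-level term counts and page-request data. |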
| Long search sessions are likely problematic in that most of the reiterations/moves are either unsuccessful or unsystematic. Quantitative analysis provides a basis for identifying problematic searches, and subsequent qualitative analysis can reveal the underlying cognitive factors (knowledge structure and mental models of Web search systems). |
