|
These queries were logged on a health information Website over a one-year period. |
| Exhibit A – Query Characteristics and Information Needs |
| Queries submitted to this Website represent quite different information needs from those submitted to the other two search engines. Popular queries are ranked by frequency (number of occurrences). |
| Exhibit B – Corpus-Linguistic Characteristics and Information Needs |
| The corpus-linguistic analysis is based on unique queries (identical submissions are collapsed into one), viewed as representations of information needs. A query, regardless of how many times it was submitted by different searchers or by the same searcher, is treated technically as a representation of one information need. We identified the user vocabulary and the co-occurrences of words. |
| Exhibit C – Interaction Behaviors within a Session |
| As with the transaction logs from the academic Website, the transaction logs in this corpus do not include user identification. We rely on IP addresses and query intervals to identify sessions and to observe how a searcher initiated and subsequently reiterated the query. Quantitative analysis clusters interaction behaviors based on variables such as Session Size (the number of submitted queries), Query Length (the average number of terms per query), Term Popularity (the average frequency of terms based on corpus frequency), Term Use (the average frequency of usage of each query term within the session), Query Interval (the average time between consecutive queries), and Pages Viewed (the average number of pages requested per query within a session). |
| The long search sessions are likely problematic in that most of the reiterations/moves are either unsuccessful or unsystematic. Quantitative analysis provides a basis for identifying problematic searches, and subsequent qualitative analysis can reveal the underlying cognitive factors (knowledge structure and mental models of Web search systems). |
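
The session identification and session variables described under Exhibit C can be illustrated with a minimal sketch. Several things here are assumptions rather than details from the study: the record layout `(ip, timestamp, query, pages_viewed)`, the 30-minute inactivity cutoff, whitespace tokenization, and the choice to average Term Popularity over term occurrences rather than distinct terms. The function names are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

# Assumed inactivity cutoff; the source does not state the interval it used.
SESSION_GAP = timedelta(minutes=30)


def split_sessions(records, gap=SESSION_GAP):
    """Group records by IP address and start a new session whenever the
    time between consecutive queries from that IP exceeds `gap`."""
    by_ip = {}
    for rec in sorted(records, key=lambda r: (r[0], r[1])):
        by_ip.setdefault(rec[0], []).append(rec)

    sessions = []
    for recs in by_ip.values():
        current = [recs[0]]
        for prev, rec in zip(recs, recs[1:]):
            if rec[1] - prev[1] > gap:
                sessions.append(current)
                current = []
            current.append(rec)
        sessions.append(current)
    return sessions


def session_metrics(session, corpus_freq):
    """Compute the six session variables for one session.
    `corpus_freq` maps each term to its frequency in the whole log."""
    terms = [t for _, _, q, _ in session for t in q.lower().split()]
    times = [ts for _, ts, _, _ in session]
    intervals = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    distinct = set(terms)
    return {
        "session_size": len(session),
        "query_length": len(terms) / len(session),
        "term_popularity": (sum(corpus_freq[t] for t in terms) / len(terms)
                            if terms else 0.0),
        # Total term occurrences divided by distinct terms in the session.
        "term_use": len(terms) / len(distinct) if terms else 0.0,
        # Seconds between consecutive queries; 0.0 for one-query sessions.
        "query_interval": (sum(intervals) / len(intervals)
                           if intervals else 0.0),
        "pages_viewed": sum(p for _, _, _, p in session) / len(session),
    }


# Made-up records for illustration only.
t0 = datetime(2005, 1, 1, 9, 0, 0)
records = [
    ("10.0.0.1", t0, "knee pain", 1),
    ("10.0.0.1", t0 + timedelta(minutes=2), "knee pain treatment", 2),
    ("10.0.0.1", t0 + timedelta(hours=3), "flu symptoms", 0),
    ("10.0.0.2", t0 + timedelta(minutes=1), "flu vaccine", 1),
]
corpus_freq = Counter(t for _, _, q, _ in records for t in q.lower().split())
sessions = split_sessions(records)
metrics = [session_metrics(s, corpus_freq) for s in sessions]
```

The resulting per-session dictionaries could then be passed to a clustering routine of the analyst's choice. An IP address shared by several searchers would be merged into one stream under this heuristic, which is a known limitation of IP-based session identification.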

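The vocabulary and co-occurrence step described under Exhibit B can be sketched in a similar way. This is an illustration under stated assumptions, not the study's procedure: queries are normalized by lowercasing and collapsing whitespace, terms are split on whitespace, each term and each term pair is counted once per unique query, and the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations


def vocabulary_and_cooccurrence(queries):
    """Collapse repeated queries, then count terms and unordered term
    pairs, each once per unique query."""
    # Assumed normalization: lowercase and collapse runs of whitespace.
    unique_queries = {" ".join(q.lower().split()) for q in queries}

    vocab = Counter()
    pairs = Counter()
    for query in unique_queries:
        terms = sorted(set(query.split()))
        vocab.update(terms)
        # Sorted terms give each pair one canonical order.
        pairs.update(combinations(terms, 2))
    return vocab, pairs


# Made-up queries for illustration only.
log = ["knee pain", "Knee  pain", "knee pain treatment", "flu symptoms"]
vocab, pairs = vocabulary_and_cooccurrence(log)
```

Because repeated submissions are collapsed first, the counts reflect how many distinct information needs mention a term, not how often the term was typed.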