Ray R. Larson
School of Information Management and Systems
University of California, Berkeley
Berkeley, CA 94720-4600
This paper briefly discusses the UC Berkeley entry in the TREC-8 Interactive Track. In this
year’s study twelve searchers conducted six searches each, half on the Cheshire
II system and the other half on the Zprise system, for a total of 72 searches.
Questionnaires were administered to each participant to gather information
about basic demographic and searching experience, about each search, about each
of the systems, and finally, about the user’s perceptions of the systems. In
this paper I will briefly describe the systems used in the study and how they
differ in design goals and implementation. The results of the interactive track
evaluations and the information derived from the questionnaires are then
discussed and future improvements to the Cheshire II system are considered.
The primary goals of the UC Berkeley entry in the
TREC-8 Interactive track were 1) to attempt to replicate our entries in the
TREC-6 and TREC-7 Interactive tracks with a larger number of participants
(searchers), and 2) to evaluate changes to the experimental system (Cheshire II)
to see if there were substantial differences in the ranking of the systems
between previous years' entries and this year's. In addition, we are continuing to
use the same systems, questionnaires, and complete TREC-7 Interactive track
protocol to obtain further information that we hope to combine with the data
obtained in previous TREC interactive track experiments for further analysis.
In TREC-8 we used virtually identical
implementations of the Cheshire II system and the ZPRISE system as those used
in previous TRECs. The database and indexing for each system were also the same
as for TREC-6 and TREC-7 (Larson & McDonough, . The changes made to the
Cheshire II system for this year’s experiment are discussed below.
The design and retrieval algorithm of the
Cheshire II system have been discussed in both the TREC-6 and
TREC-7 papers, and only the highlights of that
description will be repeated here. The Cheshire II system finds its primary
usage in full text or structured metadata collections based on SGML and XML,
often as the search engine behind a variety of WWW-based “search pages” or as a
Z39.50 server for particular applications. The Cheshire II system includes the
following features:
1. It supports SGML and XML as the primary database format of the underlying
search engine.
2. It is a client/server application where the interfaces (clients) communicate
with the search engine (server) using the Z39.50 v.3 Information Retrieval
Protocol.
3. It includes a programmable graphical direct manipulation interface under X on
Unix and NT. There is also a CGI interpreter version that combines client and
server capabilities.
4. It permits users to enter natural language queries, and these may be combined
with Boolean logic by users who wish to use it.
5. It uses probabilistic ranking methods based on the Logistic Regression
research carried out at Berkeley to match the user's initial query with
documents in the database.
6. It supports open-ended, exploratory browsing by following dynamically
established linkages between records in the database, in order to retrieve
materials related to those already found. These can be dynamically generated
“hypersearches” that let users issue a Boolean query with a mouse click to find
all items that share some field with a displayed record.
7. It uses the user's selection of relevant citations to refine the initial
search statement and automatically construct new search statements for relevance
feedback searching.
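The “hypersearch” idea in item 6 can be pictured with a small in-memory sketch. This is only an illustration under stated assumptions: the record layout (Python dictionaries with list-valued fields) and the function name `hypersearch` are hypothetical, and the real system issues a Boolean query to the Z39.50 server rather than filtering a list.

```python
def hypersearch(records, displayed, field):
    """Return the other records that share a value of `field` with `displayed`.

    Hypothetical stand-in for a Boolean query generated from a displayed
    record; `records` is a list of dicts whose fields hold lists of values.
    """
    wanted = set(displayed.get(field, []))
    return [
        record
        for record in records
        if record is not displayed and wanted & set(record.get(field, []))
    ]


# Example: clicking the "subject" field of record 1 finds record 2.
catalog = [
    {"id": 1, "subject": ["banking", "mergers"]},
    {"id": 2, "subject": ["mergers"]},
    {"id": 3, "subject": ["sports"]},
]
related = hypersearch(catalog, catalog[0], "subject")
```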
The Cheshire II search engine supports both probabilistic
and Boolean searching. The design rationale and features of the Cheshire II
search engine have been discussed in the TREC-6 and TREC-7 papers (Larson &
McDonough, 1998; Gey, Jiang, Chen & Larson, 1999).
The Cheshire search engine functions as a Z39.50
information retrieval protocol server providing access to a set of databases.
In the TREC-8 experiments the TREC Financial Times (FT) database was the only
database used by participants. The system supports various methods for
translating a searcher's query into the terms used in indexing the database.
These methods include elimination of unused words using field-specific stopword
lists, field-specific query-to-key conversion or “normalization”
functions, and standard stemming algorithms (the Porter stemmer).
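The query translation steps just listed can be sketched roughly as follows. This is a minimal illustration, not the Cheshire II code: the field names, the stopword lists, and the function names are assumptions, and `crude_stem` is a deliberately simple suffix stripper standing in for the Porter stemmer.

```python
import re

# Hypothetical field-specific stopword lists (not the actual Cheshire II lists).
STOPWORDS = {
    "title": {"the", "a", "an", "of", "and"},
    "text": {"the", "a", "an", "of", "and", "in", "to", "is", "on"},
}


def normalize(token):
    """Lowercase a token and drop every non-alphanumeric character."""
    return re.sub(r"[^a-z0-9]", "", token.lower())


def crude_stem(token):
    """Strip one common suffix; a rough stand-in for the Porter stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def query_to_keys(query, field):
    """Translate a natural language query into index keys for one field."""
    stop = STOPWORDS.get(field, set())
    keys = []
    for raw in query.split():
        token = normalize(raw)
        if not token or token in stop:
            continue  # skip empty tokens and field-specific stopwords
        keys.append(crude_stem(token))
    return keys
```

Because the stopword list is chosen per field, the same query can yield different keys for different indexes: in this sketch, "in" is dropped for the `text` field but kept for `title`.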
The Cheshire II search engine supports both
Boolean and probabilistic searching on any indexed element of the database. In
probabilistic searching, a natural language query can be used to retrieve the
records that are estimated to have the highest probability of being relevant
given the user's query. The search engine supports a simple form of relevance
feedback, where any items found in an initial search (Boolean or probabilistic)
can be selected and used as queries in a relevance feedback search.
The probabilistic retrieval algorithm used in
the Cheshire II search engine is based on the logistic regression algorithms developed by Berkeley researchers
(Cooper, et al. 1992, 1994a, 1994b). The Cheshire II search engine also
supports complete Boolean operations on indexed elements in the database, and
supports searches that combine probabilistic and Boolean elements.
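The general shape of logistic regression ranking can be sketched as follows: a weighted sum of statistical clues for a query-document pair gives the log odds of relevance, which is then converted to a probability and used to order the results. The clue names, the intercept, and the coefficients below are hypothetical placeholders chosen for illustration; they are not the values or features published by Cooper et al.

```python
import math

# Hypothetical coefficients for illustration; NOT the published Berkeley values.
INTERCEPT = -3.5
COEFFS = {"query_tf": 0.4, "doc_tf": 0.3, "idf": 0.2, "matches": 0.1}


def relevance_probability(features):
    """Map clue values for a query-document pair to an estimated P(relevant)."""
    log_odds = INTERCEPT + sum(COEFFS[name] * features[name] for name in COEFFS)
    return 1.0 / (1.0 + math.exp(-log_odds))


def rank(candidates):
    """Sort (doc_id, features) pairs by decreasing estimated probability."""
    scored = [(doc_id, relevance_probability(f)) for doc_id, f in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Since the logistic function is monotonic, ranking by probability is equivalent to ranking by log odds; the probability form is simply easier to interpret.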
Relevance feedback is supported and implemented
quite simply, as probabilistic retrieval based on extraction of content-bearing
elements (such as titles, subject headings, etc.) from any items that have
already been seen and selected by a user. At the present time we do not use any
methods for eliminating poor search terms from the selected records, nor
special enhancements for terms common between multiple selected records (Salton
& Buckley 1990).
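A minimal sketch of this style of relevance feedback is shown below. It assumes records are available as simple dictionaries and that `title` and `subject` are the content-bearing fields; both are illustrative assumptions, since the actual records are SGML/XML documents. As in the description above, no term selection or boosting of shared terms is applied.

```python
# Hypothetical names of the content-bearing fields.
CONTENT_FIELDS = ("title", "subject")


def feedback_query(selected_records):
    """Build the text of a probabilistic query from user-selected records."""
    parts = []
    for record in selected_records:
        for field in CONTENT_FIELDS:
            value = record.get(field)
            if value:
                parts.append(value)
    # No filtering of poor terms and no extra weight for terms shared
    # between records; the extracted text is used as the query as-is.
    return " ".join(parts)
```

The resulting string would then be submitted as an ordinary natural language query to the probabilistic search.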
The design of the Cheshire II client interface
(shown with the TREC FT database in Figure 1) has also been discussed in
previous TREC papers. This discussion will concentrate on changes made to the
interface for the purposes of our TREC-8 experiment. The Cheshire II interface
was intended to provide a generic
interface to Z39.50 servers, primarily for search and display of library
catalog information and other bibliographic databases. The principal
goals in the interface design were:
1. to support a consistent interface to a wide variety of Z39.50 servers, and to
dynamically adapt to the particular server;
2. to reduce the cognitive load on users wishing to interact with multiple
distributed information retrieval systems by providing a single interface for
them all;
3. to minimize use of additional windows during users' interactions with the
client in order to allow them to concentrate on formulating queries and
evaluating the results, and not expend additional mental effort and time
switching their focus of attention from the search interface to display clients.
As pointed out in the TREC-7 paper (Gey, Jiang,
Chen & Larson, 1999), the interface design assumed that most of the
information retrieved and viewed in the search interface would be brief
metadata records for documents, and not full text documents themselves. The
ability to view full-text documents such as the FT articles used in the interactive
track experiments was initially added to the existing interface as longer
records that could be scrolled in the main display window. However, comments
and questionnaire responses from TREC-7 participants indicated that the
separate document viewing window associated with the ZPRISE system was
preferable to having to do so much scrolling to accomplish the Interactive
Track tasks. The primary change to the Cheshire II client interface was the
addition of a full-text display window that included controls for
selecting/saving the displayed document. This window is shown in Figure 1. The
full-text window is invoked by the “Full Text” button next to the “Select”
button for each record. The “Full Text” button changes color to indicate the
currently displayed full-text document (blue) or previously seen documents
(orange/gold). The full-text window also included controls for stepping
directly to the next or previous full-text document in the retrieval list.
In addition, the Boolean NOT operator, requested by
several searchers in TREC-7, was brought out to the interface and integrated
with the Boolean search capability.
The second (control) system used in the TREC-7 Interactive track at Berkeley was
the Zprise system from NIST. This system was used in the same configuration and
with the same database indexing setup as used for the global control system in
our TREC-6 and TREC-7 Interactive Track
entries. Zprise, as configured for this test, was limited to a total of 24
retrieved items, and relevance feedback was disabled. However, the interface was
set up so that it provided a very good fit for the tasks involved in the
interactive track. For example, documents were viewed in full text form in a
separate window from the short display (consisting primarily of title and date
as well as control elements for indicating relevant documents and for moving
around in the brief display). Most of our users found the ZPRISE displays simple
to learn and to operate; in fact, most found that the operations required to carry
out the Interactive Track tasks were easier to do on the ZPRISE interface than
they were on the Cheshire II interface. This was not entirely surprising, since
the ZPRISE interface is designed to support TREC-like databases containing full
text. We had hoped that the addition of the full-text display to the Cheshire
II system would lead to a smaller difference in preference (and, we hoped,
smaller differences in the aspectual recall and precision figures) when compared to
TREC-7. But, as discussed below, this hope was not fulfilled.

The administration of the Interactive Track followed the protocols set down in the
track guidelines. This mandated a minimum group of 12 participant searchers,
each of whom conducts 6 searches, half on the control system (ZPRISE, identified
as “Z”) and half on the experimental system (Cheshire II, identified as “C”).
Each searcher was asked to use the features of the respective interfaces to
select as relevant those documents that they considered to be relevant to one or
more aspects of the specific topic.
The pooled results for all systems were
evaluated at NIST by the TREC evaluators, and “Aspectual Precision” and
“Aspectual Recall” were calculated for each searcher. The values
for Aspectual Precision and Recall by TREC topic for the two Berkeley systems
(“C” and “Z”, the Cheshire II and ZPRISE systems respectively) are shown
in boldface in Tables 1 and 2. The control system “Z” performed considerably
better than the experimental system in terms of Aspectual Precision and
noticeably better in terms of Aspectual Recall. Needless to say, this is a
disappointing result, and our analysis has yet to reveal any obvious reason for
the discrepancy. We believe that the difference may be due to the more complex
interactions required to perform the search tasks on the generic Cheshire II
interface than on the ZPRISE system; certainly the comments of participants on
the questionnaires indicated that most of them preferred the ZPRISE system.
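For readers unfamiliar with the two measures, the sketch below shows one way they can be computed for a single search. It assumes the usual Interactive Track definitions, which are not restated in this paper: aspectual recall is the fraction of the aspects identified for a topic that are covered by the saved documents, and aspectual precision is the fraction of saved documents that contain at least one aspect. The function name and data layout are hypothetical, and the official figures were computed by NIST, not by this code.

```python
def aspectual_scores(saved_docs, doc_aspects, total_aspects):
    """Compute (aspectual precision, aspectual recall) for one search.

    saved_docs: document ids the searcher saved.
    doc_aspects: dict mapping document id -> set of aspect ids judged present.
    total_aspects: number of distinct aspects identified for the topic.
    """
    saved = list(dict.fromkeys(saved_docs))  # drop duplicate saves, keep order
    covered = set()
    useful = 0
    for doc in saved:
        aspects = doc_aspects.get(doc, set())
        if aspects:
            useful += 1          # document contains at least one aspect
            covered |= aspects   # accumulate the aspects covered so far
    precision = useful / len(saved) if saved else 0.0
    recall = len(covered) / total_aspects if total_aspects else 0.0
    return precision, recall
```

Note that saving several documents covering the same aspect raises precision but not recall, which is one reason the two figures can diverge between systems.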
In the following section we will examine the
characteristics of the searchers as reported in the questionnaires administered
during the experiments. Figure 3 summarizes the average aspectual precision and
recall for each of the systems participating in the TREC-7 Interactive Track.

The administration of the interactive track followed the track guidelines with a
single group of 12 participants. While only one of the participants had used
either the experimental (Cheshire II) or control (ZPRISE) systems in searching
tasks, some had seen demonstrations of the experimental system. The searchers
who participated in the study were volunteers drawn from the School of
Information Management and Systems at UC Berkeley (a call for participation was
sent to all students and faculty at SIMS and the first 12 volunteers were
scheduled for search sessions). A pre-search questionnaire asked each
participant the following:
1. What high school/college/university degrees/diplomas do you have (or expect
to have)?
2. What is your occupation?
3. What is your gender?
4. What is your age?
5. Have you participated in previous TREC searching studies?
6. Overall how long have you been doing online searching?
7. Experience with using a point-and-click interface (e.g. Windows, Macintosh)
8. Experience searching on computerized library catalogs either locally or
remotely
9. Experience searching on CD-ROM systems
10. Experience searching on commercial online systems (BRS afterdark, Dialog,
Lexis-Nexis, etc.)
11. Experience searching on the World Wide Web search services (Alta Vista,
Excite, Yahoo, Hotbot, etc.)
12. Experience searching on other systems
13. How often do you conduct a search on any kind of system
14. “I enjoy carrying out information searches”
All of the participants, except one
undergraduate, held college degrees (one held a PhD, three others were PhD
students with previous undergraduate and graduate degrees, and the remaining
participants were Masters students in the SIMS program). Three of the
participants (P1, P2, and P3) had over 8 years of experience in online
searching on other systems. As observed last year, once again the most
frequently used search systems were the Web search services and the next most
frequent were online catalogs. It appears that most recent searchers will be
gaining their experience from the WWW and possibly from online library
catalogs, and will probably not have experience (or as much experience) with
traditional Boolean systems such as Dialog.
Following each search the participants were
given a questionnaire asking:
1. Are you familiar with this topic
2. Was it easy to get started on this search
3.