Information Access for a Digital Library:
Cheshire II and the Berkeley Environmental Digital Library
Ray R. Larson
School of Information Management & Systems
University of California, Berkeley


Chad Carson
Computer Science Division, EECS
University of California, Berkeley
carson@eecs.berkeley.edu

UCB Digital Library Project: Research Agenda

Testbed: An Environmental Digital Library

The Environmental Library - Users/Contributors

The Environmental Library - Contents

The Environmental Library - Contents

Botanical Data:

Geographical Data:

Documents:

Documents - cont.

Photographs:

Testbed Success Stories

Research Highlights

User Interface Paradigms: Multivalent Documents

Multivalent Documents

GIS in the MVD Framework

GIS Viewer Example http://elib.cs.berkeley.edu/annotations/gis/buildings.html

Overview of Cheshire II

Overview of Cheshire II

Cheshire II Searching

Current Usage of Cheshire II

Image Retrieval Research

Blobworld: use regions for retrieval

Outline

Creating and using Blobworld

Extract features for each pixel

Find groups in feature space

Find regions in the image

Describe regions by color, texture, shape

Creating and using Blobworld

Querying: let user see the representation

Query experiments

Distinctive objects

Distinctive objects and backgrounds

Distinctive scenes

Index to search huge collections

Index using conventional IR methods

Indexing and Retrieval with Cheshire II

Conclusions

Further Information

Select appropriate scale for processing

Initialize means using image data

Grouping: Expectation-Maximization

How many Gaussians?

Find groups in feature space

EM math

Encode similarity between color bins

Fourier descriptors for shape

Rank images by distance

Index lower-dimensional histograms

Projection details

How many dimensions do we need?

Indexing doesn’t hurt precision too much