Cheshire Configuration - Components


The components tag starts the section of the configuration file that deals with component indexing. It contains one or more componentdef tags, each of which defines a component database, in the same way as clusterdef and indexdef.

Component configuration allows sections of the SGML documents to be indexed separately and retrieved without the rest of the document. For example, in a catalog where multiple items are recorded within a single document (for instance, EAD), this can be used to retrieve the information about a single item. Equally, in a full-text document it could be used to retrieve a page, a column or even a single line at a time.

Componentdef is used to encapsulate a definition of a component database in the same manner as indexdef.

The componentname tag contains the filename of the database where the component information for this definition is to be kept. This is information about where the components are in the documents, not the indexes of the data in them, and as such needs to be unique across component definitions. For example:
<componentname> /home/cheshire/cheshire/teidocs/page_components </componentname>

The componentnorm tag specifies the normalisation to be done on the components while indexing. There are currently two possible values, NONE and COMPRESS. If COMPRESS is given, all markup inside the component will be flattened out; NONE leaves the markup intact. If the tag is not supplied, it defaults to NONE.
<componentnorm> NONE </componentnorm>

The compstarttag tag indicates the tag to be treated as a component or, if compendtag is also given, the tag at which the component starts. It contains a single tag specification (without the tagspec tag). For example, if we wanted to create a component index of 'stanza' tags for a poetry DTD:

  <ftag> stanza </ftag>

Components may also be defined as everything between two different tags, for example between empty <linestart> and <lineend> tags. In this case the first tag is recorded in compstarttag as above, and the second in compendtag. If the component is just the contents of a single tag, then don't use the compendtag field. The compendtag field for the line example would be:

  <ftag> lineend </ftag>
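
Taken together, the two fields for the line example might look like the sketch below. The placement of ftag directly inside compstarttag and compendtag is an assumption drawn from the description above (a tag specification without the tagspec wrapper), so check it against a working configuration file. The comment uses standard SGML comment syntax and is there for the reader only; drop it if the configuration parser objects.

```sgml
<!-- Sketch only: nesting is assumed from the prose, not taken from a tested configuration -->
<compstarttag>
  <ftag> linestart </ftag>
</compstarttag>
<compendtag>
  <ftag> lineend </ftag>
</compendtag>
```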

In exactly the same way as the indexes field in the main file definition, componentindexes contains index definitions to be applied to the component database.
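
As a rough illustration of how the pieces in this section fit together, a complete component definition might be laid out as follows. This is a sketch under assumptions: the nesting of the elements is inferred from the descriptions above rather than copied from a tested configuration, the file path and the 'stanza' tag are simply reused from the earlier examples, and the index definitions are left as a placeholder comment because their content depends on the indexes wanted.

```sgml
<!-- Sketch only: element nesting is assumed from the descriptions in this section -->
<components>
  <componentdef>
    <componentname> /home/cheshire/cheshire/teidocs/page_components </componentname>
    <componentnorm> NONE </componentnorm>
    <compstarttag>
      <ftag> stanza </ftag>
    </compstarttag>
    <componentindexes>
      <!-- one or more index definitions, written as in the indexes field of the main file definition -->
    </componentindexes>
  </componentdef>
</components>
```

Because compendtag is omitted here, each component would be the contents of a single stanza tag. A second componentdef, with its own unique componentname, would be needed to index a different kind of component from the same documents.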