Client Scripting

Two clients are provided with Cheshire: a Z39.50 client (ztcl) and a web client (webcheshire). The Z39.50 client communicates via the zserver configured in inetd.conf and the services files, whereas webcheshire uses the indexes and database configurations directly. Each has TCL/TK as its underlying scripting language, with different extensions to interrogate the databases.

It is important to make clear at this point the distinction between a Cheshire client and a web client. The Cheshire client tells the Cheshire server to do the search and then processes the results. The results could be processed into any form desired, but typically into HTML for display in a web browser. In this way, the Cheshire client is still a part of the web server, as it is just a CGI script.

This page makes no attempt to instruct the reader in the use of TCL/TK; it only goes over the extensions and the ways in which ztcl and webcheshire can be used in the context of SGML search and retrieval. As the process is similar for both, we'll go through it step by step.

The zselect command is used only in the ztcl client, and establishes a connection to a Z39.50 server.
Usage: zselect servername [address database port] [authentication]

The client comes with several pre-established servers, which can be displayed using 'parray Z_HOSTS', and these can be connected to using just the server name field. To connect to your own server, however, you'll need to give all the information.

The servername, if not from the Z_HOSTS list, can be anything that you want to call this particular session.
Address is the IP address, or appropriate host name for the server you're connecting to.
Database is the name of the database, as determined by the 'filetag' field in the DBCONFIGFILE.
Port is the port on which to connect - this is defined in the /etc/zserver.conf file that was set up in the first step.
Authentication supplies a password when connecting to Z39.50 servers that require a login.
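
As a rough sketch, connecting to your own server might look like the following. The session name, host and port are placeholders rather than values from a real installation, so substitute the ones from your own configuration. The check around zselect is only there so the snippet can be loaded in a plain tclsh; inside ztcl the command is simply run.

```tcl
# Hypothetical connection details; replace with your own.
set session  "mysession"   ;# any label you like for this session
set address  "localhost"   ;# host name or IP address of the zserver
set database "archives"    ;# filetag of the database in the DBCONFIGFILE
set port     2100          ;# assumed port; use the one your zserver listens on

# Build the call as a list so each argument stays a single word.
set cmd [list zselect $session $address $database $port]

if {[llength [info commands zselect]]} {
    # Running inside ztcl: open the connection.
    eval $cmd
} else {
    # Plain tclsh: just show the command that would be issued.
    puts $cmd
}
```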

This step is done in webcheshire by setting special variables:

CHESHIRE_CONFIGFILE: This is the file path of the local configuration file.
CHESHIRE_DATABASE: This is the name of the database in the configuration file to search, as defined in the filetag field.
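
A minimal sketch of that setup in a webcheshire script follows. The file path is invented for illustration, and "archives" stands in for whatever filetag your own configuration defines.

```tcl
# Hypothetical path; point this at your own configuration file.
set CHESHIRE_CONFIGFILE "/home/cheshire/archives/DBCONFIGFILE"

# Must match the filetag of a database defined in that file.
set CHESHIRE_DATABASE "archives"
```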

The next thing that needs to be done is to set up some of the attributes of the search. This is done in ztcl using the zset command, which has the same usage as 'set' from regular TCL:
Usage: zset variable value

The following variables should be set before doing a search:

ElementSetNames (name): This is the name of the format in which to retrieve the information. For specially defined formats, this is the name defined in the name attribute of the displaydef field. The default is F, which will retrieve the full record.

RecordSyntax (name): This is the syntax in which the record should be returned. This is different from the format, in that the format sets which set of tags to return, whereas the Syntax is how to return them. Possible values are: SGML (return the record as is in SGML), GRS1 (return the record in the Generic Record Syntax), MARC (return a MARC record for data that is in USMARC sgml form), SUTRS (Simple Unstructured Text Record Syntax), EXPLAIN and OPAC.

NumRequested (number): This is the number of records to be displayed, regardless of how many records match the search query. This defaults to all the returned records.

StartPosition (number): The first record to display. This defaults to the first record, but when combined with NumRequested above is useful in displaying particular records by themselves.

These should also be set in the webcheshire client, but as above this is done via special cheshire variables. They are, respectively, CHESHIRE_ELEMENTSET, CHESHIRE_RECSYNTAX, CHESHIRE_NUMREQUESTED and CHESHIRE_NUMSTART.
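
For example, to show the second page of ten results as full SGML records, the settings might be sketched as below. The page size and page number are illustrative, and the start calculation assumes that record positions count from 1.

```tcl
# Illustrative paging values (assumptions, not Cheshire defaults).
set page     2
set per_page 10

# Assumes positions count from 1, so page 2 starts at record 11.
set start [expr {($page - 1) * $per_page + 1}]

# webcheshire: ordinary Tcl variables.
set CHESHIRE_ELEMENTSET   F          ;# full record
set CHESHIRE_RECSYNTAX    SGML       ;# record returned as stored
set CHESHIRE_NUMREQUESTED $per_page
set CHESHIRE_NUMSTART     $start

# ztcl equivalents (only meaningful inside ztcl):
#   zset ElementSetNames F
#   zset RecordSyntax    SGML
#   zset NumRequested    10
#   zset StartPosition   11
```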

The zfind command instigates a Z39.50 search. It has a simple basic usage, but the search query can be constructed in a very complex manner. It will return either an error message or the number of hits found in the index. The matching records can then be displayed with zdisplay, discussed next.
Usage: zfind (indexname query)

The indexname field is the name of the index in the database to search, as set in the indxtag field in the configuration file. The query is the search to perform, set out as discussed below.

The webcheshire equivalent is called 'search' and takes the same arguments, but it returns the number of hits followed directly by the records in the desired syntax.

Note well that Cheshire does some internal mapping of index names. These mappings are based on the use attributes of the indexes, so if you call an index 'mydate' but give it the attributes associated with a date type, then searching on 'date' will work as well. This is important to note: if you don't wish to give the attributes as well, you may need to change the names of your indexes to something that isn't in one of these default mappings.

Queries are built up from blocks of indexname and search term pairs. For example, the simplest query is just a word to search for in the index:
zfind author sanderson

This can be built up with boolean operators: AND, OR, and NOT. For example:
zfind author sanderson AND title cheshire

These can be nested with brackets, thus to find an article by someone called Sanderson with either cheshire or medieval in the title:
zfind author sanderson AND (title cheshire OR title medieval)

To search for a phrase rather than a word - a search term with spaces in it - the term can be enclosed in { } brackets. So to search by a full title:
zfind author sanderson AND title {full text retrieval of medieval manuscripts}

There are several relational operators that can be put between the index name and the search term, which can be especially useful for searching date indexes. The simple ones, which have the same meaning as elsewhere, are: < <= > and >=. There is also a 'within' operator, '<=>', useful for finding a date within a range, for example. The @ character is used in Cheshire to indicate that the results should be probabilistically ranked. Some examples are in order:
zfind idnumber < 20
zfind date <=> 1800-1850
zfind topic @ {history malaysia ruins}
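
When queries are assembled by a script rather than typed in, a few small helpers can keep the pieces straight. The procs below (clause, all_of, any_of) are hypothetical names written for this illustration and are not part of ztcl or webcheshire; they only build the query string described above.

```tcl
# Build one "indexname [relation] term" block.
proc clause {index term {relation ""}} {
    # Wrap multi-word terms in braces so they are searched as a phrase.
    if {[regexp {\s} $term]} {
        set term "{$term}"
    }
    if {$relation == ""} {
        return "$index $term"
    }
    return "$index $relation $term"
}

# Join blocks with AND.
proc all_of {args} {
    return [join $args " AND "]
}

# Join blocks with OR and bracket the group so it nests safely.
proc any_of {args} {
    set joined [join $args " OR "]
    return "($joined)"
}

set titles [any_of [clause title cheshire] [clause title medieval]]
set query  [all_of [clause author sanderson] $titles]

puts "zfind $query"
# prints: zfind author sanderson AND (title cheshire OR title medieval)
```

In ztcl the resulting string could then be run with eval, for example eval "zfind $query".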

The zdisplay command is used only in the ztcl client, and displays the result set retrieved, or a part thereof. The simplest use is just the command without any arguments, which will display the entire result set of the most recent search. It may also take three different arguments:

resultset: The name of the resultset from which to display.
number_of_records: The number of records to display.
first_record: The record with which to start.

The zscan command does a scan of the requested database, returning the keys around the search term and how many records each appears in.
Usage: zscan indexname term stepsize numreq position

The arguments need some explanation. Firstly, the indexname is the name of the index as defined in the indxtag field.
Term is the term around which to scan.
Stepsize is the number of terms between each of the terms returned. This allows refining of scans from a wide scan down to a more finely grained one. To get every term, then, this should be 0.
Numreq is the number of keys requested.
Position is the numerical position at which the search term should appear in the list.
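
Putting those arguments together, a scan for ten consecutive keys starting at a given term might be sketched as follows. The index name and term are placeholders, and the snippet assumes that a position of 1 puts the search term first in the returned list. As before, the check around zscan is only there so the snippet loads in a plain tclsh.

```tcl
# Placeholder index and term; use an index defined in your own configuration.
set indexname "title"
set term      "medieval"
set stepsize  0    ;# 0 returns every term, with none skipped
set numreq    10   ;# ask for ten keys
set position  1    ;# assumed to mean the term comes first in the list

set cmd [list zscan $indexname $term $stepsize $numreq $position]

if {[llength [info commands zscan]]} {
    # Running inside ztcl: perform the scan and show what comes back.
    puts [eval $cmd]
} else {
    # Plain tclsh: just show the command that would be issued.
    puts $cmd
}
```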

As these terms are the keys in the database, searching for them in that index is guaranteed to find results. However, a term may appear in multiple records (as indicated by the number of records field that is returned with the term), and hence an intermediate stage may be required to get to the explicit record desired.

The local version of this command is 'lscan' and has the same arguments.

The zsort command will sort a stored resultset based on the contents of a tag or attribute in the SGML. (This is only in version 2.32 and above.)
Usage: zsort [-in resultsetname] [-out resultsetname] { -tag tagname1 | -attribute attributename1 } [ -ignore_case -case_sensitive -ascending -descending -ascending_freq -descending_freq -missing_null -missing_quit -missing_value value ] { -tag tagname2 | -attribute attributename2 } ...

When this works, I'll finish the documentation!

The webcheshire version is called local_sort (as lsort is already a TCL command for sorting lists).

The zshow command displays information about the Z39.50 session.

The zclose command will close the current Z39.50 session.

There is a global variable automatically set on startup called 'cheshire_version'. This will contain the version of the running code. This is available from version 2.30 and on.

Sample Search Script

Here's a very simple search script. For a much more complex example, see the GRS1 handler example.

# Simple cheshire search script

# Set up environment
set CHESHIRE_DATABASE "archives"

# Define query
set query "search subject history"

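# Search; catch returns non-zero if the search command raised an error
set err [catch {eval $query} qresults]
if {$err} {
    puts "Search failed: $qresults"
    exit 1
}

# The hit count is the second item of the first sublist of the first element
set hits [lindex [lindex [lindex $qresults 0] 0] 1]
# Everything after the first element is a record
set sresults [lrange $qresults 1 end]

# Display the records
puts $sresults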
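
The indexing in the script implies that the first element of the returned list carries the search status, with the hit count as the second item of its first sublist, and that the remaining elements are the records. The mock value below is an assumption constructed to fit those lindex calls, not captured webcheshire output; the labels and records in it are invented. It lets the unpacking be tried in a plain tclsh.

```tcl
# Mock return value shaped to match the indexing used in the script.
# The "Hits"/"Returning" labels and the records are hypothetical.
set qresults [list \
    {{Hits 2} {Returning 2}} \
    "<doc>first record</doc>" \
    "<doc>second record</doc>"]

# First element, first sublist, second item: the hit count.
set hits [lindex [lindex [lindex $qresults 0] 0] 1]

# Everything after the first element is a record.
set sresults [lrange $qresults 1 end]

puts "Hits: $hits"
foreach record $sresults {
    puts $record
}
```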