Okapi-Pack

Centre For Interactive Systems Research
City University
London EC1V 0BH

"Okapi-Pack" Introduction.

Okapi-Pack is a complete implementation of the Okapi system. It is available from the Centre For Interactive Systems Research (CISR) under the BSD license.

The distributed system requires around 100 Mbytes of disc space. There are two versions of Okapi-Pack, one for Solaris and one for Linux. The Solaris version is designed to run on a Sun SPARCstation with a minimum of 16 Mbytes of memory running Solaris 2.5/2.6. The Linux version runs on Red Hat Linux 6.0/6.1.

The graphical user interfaces provided are written in a combination of C/C++ and Tcl/Tk. All binaries were compiled with gcc V2.7. The GUIs have been tested with Tcl-7.4 / Tk-4.0 and Tcl-7.6 / Tk-4.2.

The package comprises:

1. Indexing Software.

Software to enable users to create and index Okapi-type databases. Included is a graphical user interface, indexer, which provides a basic introduction to the process. indexer allows the creation and indexing of both text and abstracting and indexing (ai) databases. Although the interface will only deal with databases that can be accommodated in one disc volume, the programs called by the application are capable of creating and indexing larger databases that may extend over several volumes. These programs are documented in Appendix E and Appendix F.

There are two sample databases, both just over 1000 records in size, provided with the system:

  1. med.sample : a small text database generated from the Medlars collection.
  2. cacm.sample : a small ai database generated from the CACM collection.

The sample databases were both downloaded from Cornell University (ftp to ftp.cs.cornell.edu and change to directory pub/smart). We are trying to obtain a current database that is closer in size to parts of the TREC collection.

For text databases made up of larger records, it is possible to generate positional information for paragraphs so that a passage search may be implemented. This means that it is possible to conduct an Okapi search such that the system will attempt to find, for each document in the ranked hitlist, the "best" sub-passage within the document.
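
The idea of picking a "best" sub-passage can be illustrated with a minimal sketch. This is not the Okapi implementation: the function name best_passage, the max_len limit and the simple term-density score are all assumptions made for illustration. The sketch only shows how paragraph boundaries let a system score runs of consecutive paragraphs and keep the highest-scoring one.

```python
import re
from typing import List, Tuple


def best_passage(paragraphs: List[str], query_terms: List[str],
                 max_len: int = 3) -> Tuple[int, int, float]:
    """Return (start, end, score) for the best run of consecutive paragraphs.

    `end` is exclusive. The score is a toy term-density measure (assumption):
    the fraction of tokens in the passage that are query terms.
    """
    terms = {t.lower() for t in query_terms}
    # Tokenise each paragraph once; paragraph boundaries play the role of
    # the positional information mentioned in the text.
    para_tokens = [re.findall(r"[a-z0-9]+", p.lower()) for p in paragraphs]

    best = (0, 0, 0.0)
    for start in range(len(paragraphs)):
        last_end = min(start + max_len, len(paragraphs))
        for end in range(start + 1, last_end + 1):
            tokens = [tok for para in para_tokens[start:end] for tok in para]
            if not tokens:
                continue
            matches = sum(1 for tok in tokens if tok in terms)
            score = matches / len(tokens)
            if score > best[2]:
                best = (start, end, score)
    return best
```

Calling best_passage on the paragraphs of each document in a ranked hitlist would give one candidate passage per document. A real system would use a proper term-weighting function in place of the density score.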

2. The Basic Search System (BSS).

The BSS consists of a set of low-level commands, implemented as a C library, that enables users to build their own interfaces based around it. The BSS commands are documented in Appendix J.

Corresponding to i0+ is an executable i1+, which may be used as a command-line interface, both for trying things out and in shell scripts.

3. The Okapi Interactive Interface.

okapi is a configurable interface that calls BSS commands. It allows users to conduct relevance feedback searches of both text and ai databases. The system allows users to:

  1. Build an initial query by entering single terms and/or phrases.
  2. Conduct a search on a given query formulation.
  3. View full documents and make relevance judgements.
  4. Incrementally expand the query as relevance judgements are made.
  5. Modify the current state of the query by adding/removing terms and clearing relevance feedback information.
  6. Change some interface parameters interactively.
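
Step 4 above, incremental query expansion, can be sketched as follows. This page does not say how expansion terms are chosen, so the sketch assumes the Robertson/Sparck Jones relevance weight, a standard choice in probabilistic retrieval, and ranks candidate terms by r times that weight. The function names rsj_weight and expansion_terms, and the input data structures, are hypothetical and are not BSS commands.

```python
import math
from collections import Counter
from typing import Dict, List, Set


def rsj_weight(r: int, R: int, n: int, N: int) -> float:
    """Robertson/Sparck Jones relevance weight with 0.5 smoothing.

    r: relevant documents containing the term
    R: documents judged relevant so far
    n: documents in the collection containing the term
    N: documents in the collection
    """
    return math.log(((r + 0.5) * (N - n - R + r + 0.5)) /
                    ((n - r + 0.5) * (R - r + 0.5)))


def expansion_terms(relevant_docs: List[Set[str]], doc_freq: Dict[str, int],
                    N: int, query: Set[str], k: int = 5) -> List[str]:
    """Suggest up to k new terms from the documents judged relevant.

    Terms already in the query are skipped. Candidates are ordered by
    r * weight (an assumed selection value), ties broken alphabetically.
    """
    R = len(relevant_docs)
    r_counts = Counter(term for doc in relevant_docs for term in doc)
    scored = []
    for term, r in r_counts.items():
        if term in query:
            continue
        weight = rsj_weight(r, R, doc_freq[term], N)
        scored.append((r * weight, term))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:k]]
```

Each time the user marks another document as relevant, the function can be called again with the enlarged set of relevant documents, which is what makes the expansion incremental.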





Last modified:   12th November 2001