Okapi-Pack

Centre For Interactive Systems Research
City University
London EC1V 0BH


The Okapi Interactive Interface.

Contents

  1. Setting Up The System.
  2. Running The Interface.
  3. The Structure of the GUI.
  4. User Interaction.
  5. A Sample Search.

For details about the exact searching process and the implementation of incremental query expansion please refer to Appendix G.

1. Setting Up The System.

Please refer to Section C, which describes how to install the interface. In particular, note the sub-sections on setting the environment variables:

BSS_PARMPATH      The database parameter files
BSS_TEMPPATH      Directory for storage of temporary files
GUI_CONFIG_FILES  The location of the two interface configuration files ".okapi_rc" and ".okapi_db"
OKAPI_LOGS_DIR    The location of the logfiles for each search

NOTE: The system will not operate unless these instructions have been followed.
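
As a quick check before starting the interface, the four variables listed above can be tested for presence. The following is a minimal sketch only, not part of Okapi-Pack: the function name missing_okapi_vars is hypothetical, and the check confirms only that each variable is set, not that its value is correct.

```python
import os

# Variable names taken from the list in this section.
REQUIRED_VARS = (
    "BSS_PARMPATH",
    "BSS_TEMPPATH",
    "GUI_CONFIG_FILES",
    "OKAPI_LOGS_DIR",
)


def missing_okapi_vars(environ=None):
    """Return the required variables that are unset or empty.

    Hypothetical helper: `environ` defaults to the real process
    environment, but any mapping can be passed in for testing.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_okapi_vars() with no argument inspects the current environment; an empty list means all four variables are set.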

2. Running The Interface.

The interface program, called okapi, has been developed from the one designed for City University's experiments for the TREC Interactive Track. It is designed to be run with three parameters:

    okapi <topic_no> <user_id> <rf_flag>

    topic_no   An INTEGER number assigned to the query.
    user_id    A character string identifier for the searcher.
    rf_flag    An INTEGER flag to turn relevance feedback off (0) or on (1).

E.g. for user zebra to search on topic 262 with relevance feedback on, we would enter at the Unix prompt:

    okapi 262 zebra 1
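
The three parameters can be checked before the command is issued. The sketch below is an illustration under stated assumptions: the helper check_okapi_args is hypothetical, and whether okapi itself performs the same checks is not documented here. It only enforces the types given above (integer topic number, character string user, flag of 0 or 1).

```python
def check_okapi_args(topic_no, user_id, rf_flag):
    """Validate the three parameters and return the command as a list.

    Hypothetical helper, not part of Okapi-Pack. Raises ValueError if
    topic_no or rf_flag is not an integer, if rf_flag is not 0 or 1,
    or if user_id is empty.
    """
    topic = int(topic_no)   # ValueError if not an integer
    flag = int(rf_flag)     # ValueError if not an integer
    if flag not in (0, 1):
        raise ValueError("rf_flag must be 0 (off) or 1 (on)")
    user = str(user_id)
    if not user:
        raise ValueError("user_id must be a non-empty string")
    return ["okapi", str(topic), user, str(flag)]
```

For the example above, check_okapi_args("262", "zebra", "1") returns the list ["okapi", "262", "zebra", "1"].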

The parameters <user_id> and <topic_no> define the subdirectory of <OKAPI_LOGS_DIR> in which the three logfiles history, termset and relsfile will be kept. The purpose and structure of these three files are discussed in Appendix M.

Typing the preceding command will result in the display of the Main Window (Figure 1). Full documents that the user wishes to see as the result of a search will be shown in a separate Pop-Up Window (Figure 2). The following sections describe the structure of the interface and user interaction.

3. The Structure Of The GUI.

The Okapi interface provided is composed of a main window, divided into six areas as shown in Figure 1, and a pop-up window in which full documents selected by the searcher are displayed.

    1. Term Entry Box
    2. Working Query
    3. Hitlist
    4. Removed Terms
    5. Rels Pool
    6a. Search   6b. Options   6c. Exit

Figure 1: The Main Window

3.1. Main Window

  1. Term Entry Box: A text entry widget for user input of query terms.
  2. Working Query: A scrollable, ranked list of the terms in the current working query.
  3. Removed Terms: A scrollable list of any terms removed by the user from the working query, displayed in removal order.
  4. Hitlist: A scrollable, ranked hitlist for the current iteration.
  5. Pool of Positive Relevance Judgments: A scrollable, ranked list of positive user relevance judgments.
  6. Function Buttons: At the bottom of the window are either two or three context-sensitive buttons.

    1. Search
    2. Query Options
    3. Exit

NOTE: The "Exit" button is always enabled. The "Search" button is disabled immediately after a search has been executed. It is re-enabled when the "working query" is modified in any way.
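
The enabling rule in the note above can be pictured as a small two-state model. This is a sketch for illustration only, not the interface's own code: the class name is hypothetical, and the initial state (Search disabled until a query has been entered) is an assumption that the note does not cover.

```python
class SearchButtonState:
    """Toy model of the Search/Exit enabling rule (hypothetical)."""

    exit_enabled = True  # "Exit" is always enabled

    def __init__(self):
        # Assumption: with an empty working query there is nothing
        # to search, so "Search" starts disabled.
        self.search_enabled = False

    def query_modified(self):
        """Any change to the working query re-enables "Search"."""
        self.search_enabled = True

    def search_executed(self):
        """Running a search disables "Search" until the next change."""
        if not self.search_enabled:
            raise RuntimeError("Search button is disabled")
        self.search_enabled = False
```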

3.2. Full Document.

A pop-up text window for the display of a full record selected from the hitlist. At the bottom of the window are buttons that enable the searcher to make a relevance judgement on the document.


    Full Document Display

    Relevant   Not Relevant

Figure 2: The Pop-Up Window

The functionality to carry out a passage search is included in the interface. When a search is performed on a database that has been indexed to include positional information about paragraphs, the system will search for the best sub-passage of each document. In this case, when each full document is displayed, a third button will be available which allows the user to choose the sub-passage for relevance feedback rather than the whole document. This will not be the case for the sample databases provided with "Okapi-Pack", as the records are too short. However, by examining the indexing programs described in Appendix E and Appendix F, it should be possible to see how to produce a suitable database.

4. User Interaction.

4.1. Using the keyboard.

The user may enter an initial query or modify an existing one by typing words into the term entry box. Query terms may be entered or added to by typing:

  1. a list of one or more single words. e.g.

    rules regulations laws

  2. a single phrase terminated with a plus sign. e.g.

    stock market +

    A phrase defined in this way, if it exists in the database, will consist of:

    1. a true phrase
    2. a same sentence occurrence of the defined terms
    3. a combination of both a true phrase and a same sentence occurrence.

    The type of phrase found will be determined by the value of the environment variable <BOTH_PHRASE_OPS> which is set in .okapi_rc.

    BOTH_PHRASE_OPS   Meaning
    0                 TRUE PHRASES only, possibly with intervening stopwords.
    1                 SAME SENTENCE occurrences and TRUE PHRASES if they also exist.

    The default value of <BOTH_PHRASE_OPS> is 1. Change it to 0 if you only want true phrases. This may be necessary if you have a text database in which there are no sentence delimiter characters in the fields.

    The way in which the interface creates user-defined phrases is described in more detail in Appendix G.

In both cases (1) and (2), entry is completed by pressing the <Return> key. Errors may be corrected in the normal way by moving the cursor and inserting/deleting characters.

If all input terms are found in the database, the term entry box will be cleared once the terms are displayed in the "working query" window. If, however, one or more terms do not exist in the database the entered terms will be highlighted in the term entry box. The text may be edited or deleted in the usual way.
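
The two entry forms described in this section (a list of single words, or a single phrase terminated with a plus sign) can be told apart with a few lines of code. The sketch below is illustrative only: parse_entry is a hypothetical name, it does not look anything up in a database, and it assumes that a plus sign typed directly after the last word is treated the same as one separated by a space.

```python
def parse_entry(line):
    """Classify one line typed into the term entry box.

    Returns ("phrase", [words]) if the line ends with a plus sign,
    otherwise ("words", [words]). Hypothetical helper for illustration.
    """
    text = line.strip()
    if text.endswith("+"):
        # Drop the terminating plus sign; the rest is one phrase.
        return ("phrase", text[:-1].split())
    return ("words", text.split())
```

With the examples from this section, "rules regulations laws" is classified as three single words and "stock market +" as one two-word phrase.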

4.2. Using the Mouse.

Only the <left> mouse button is used by the interface. The user may perform certain actions by:

  1. Single-clicking <left> over interface buttons, e.g. Search and Relevant.

  2. Double-clicking <left> over text displayed on the screen to:

    1. show a full record from the hitlist.
    2. remove a term from the working query (it will move to the window below the working query window).
    3. restore a term from the set of removed terms to the working query.

  3. Moving the mouse pointer over any of the four display windows to enable vertical cursor control for that window by use of the up/down, Pg Up/Pg Dn (possibly modified by the shift or control keys) and Home and End keys.

4.3. Control key combinations.

The functions "Search" and "Exit" may be executed by pressing "Ctrl-S" and "Ctrl-X" respectively.

5. A Sample Search.

The functionality of the interface will be described by running through a short search. User entries and actions will be indicated by >>.

5.1. Building a "Working Query".

The algorithm for constructing the working query is described in Appendix G.

The initial "working query" is built by entering terms via the keyboard.

Enter in the Term Entry Box:
>> stock market rules + <Ret>
>> insider trading + <Ret>
>> regulations laws <Ret>

These will be displayed in the Working Query Window as:

149   0   :   stock market rules (B)
308   0   :   insider trading
10472   0   :   regulations
13292   0   :   laws

where:

  1. The last term entered is displayed in red; existing terms are displayed in green.
  2. The terms are ranked by RSV.
  3. Phrases may be followed by nothing or by a bracketed B or S (see Appendix G).
  4. The numbers displayed on the left of each term are, from left to right:
    1. the number of documents in the database containing the term,
    2. the number of documents judged as relevant by the user that contain the term (this will be initially zero).
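
A line of the Working Query display can be split into the fields just described. This is a minimal sketch that assumes the layout shown in the example above (two integers, a colon, the term, and an optional bracketed B or S); the names QueryLine and parse_query_line are hypothetical, and the meaning of the B and S flags is left to Appendix G.

```python
import re
from typing import NamedTuple, Optional


class QueryLine(NamedTuple):
    n_docs: int                 # documents in the database containing the term
    n_relevant: int             # relevant-judged documents containing the term
    term: str
    phrase_flag: Optional[str]  # "B", "S" or None


# Two integers, a colon, the term, then an optional "(B)" or "(S)".
_LINE = re.compile(r"^\s*(\d+)\s+(\d+)\s*:\s*(.+?)(?:\s+\((B|S)\))?\s*$")


def parse_query_line(line):
    """Parse one displayed working-query line (assumed layout)."""
    match = _LINE.match(line)
    if match is None:
        raise ValueError("unrecognised working query line: %r" % line)
    n_docs, n_relevant, term, flag = match.groups()
    return QueryLine(int(n_docs), int(n_relevant), term, flag)
```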

5.2. Searching the database on the Working query.

>> Single-left-click the Search button

Each entry of the ranked hitlist will consist of three components.

  1. A title line, highlighted in green, consisting of, e.g.

    2             rank order in the set
    FT924-14664   document identifier
    [736]         Okapi weight normalised onto the range 1-1000
    1/2 pages     Passage / Document lengths in screen pages of 2000 characters

  2. About 3 lines consisting of approximately the first 180 characters of the text field of the document.

  3. Several highlighted lines showing the query term occurrence within the document. E.g.

          insider trading (2) regulations (1) law (1)

    Note: Terms may occur here in different forms than in the working query. This is because:

    1. terms are stemmed, e.g. LAW, Law, LAWS, laws, etc. are all treated as "law" by the system.
    2. some sets of terms are treated as a synonym group by the system, e.g. Netherlands, Holland and Dutch.
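
The title line gives passage and document lengths in screen pages of 2000 characters. The sketch below shows that arithmetic under one assumption: a partly filled page is counted as a whole page (the rounding rule is not stated in this section). The function names are illustrative and are not taken from the interface.

```python
PAGE_CHARS = 2000  # characters per screen page, as stated above


def screen_pages(n_chars):
    """Number of screen pages needed for n_chars characters.

    Assumption: partial pages are rounded up, and even an empty
    text occupies one page.
    """
    return max(1, -(-n_chars // PAGE_CHARS))  # ceiling division


def length_label(passage_chars, document_chars):
    """Format the passage / document length part of a title line."""
    return "%d/%d pages" % (
        screen_pages(passage_chars),
        screen_pages(document_chars),
    )
```

For instance, a 1500-character passage in a 3200-character document would be labelled "1/2 pages" under this rounding assumption.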

5.3. Viewing full documents.

>> Double-left-click on any line in the appropriate hitlist entry (including the blank line at the end).

This will result in the document being displayed in a pop-up window.

The user may scroll up and down through the document using the "Up"/"Down" and "Page Up"/"Page Down" cursor keys.

At the bottom of the window are two buttons:

    Relevant       Relevance Feedback (RF) from the full record text.
    Not Relevant

The user must make a relevance judgement before moving on to any other part of the search.

5.4. Making a relevance judgement.

Making a positive relevance judgement will cause terms extracted from the appropriate section of the document to be merged with the current set of "candidate terms". The title information about every document judged as relevant by the user is displayed in the relspool window immediately below the hitlist window.

5.5. Modifying the working query.

The working query may be modified at any time as follows.

  1. Adding new user-defined terms. All such terms will go into the working query (if there is room).
  2. Making positive (full or passage only) relevance judgements. This may cause extracted terms that satisfy the system-defined threshold condition to be automatically added to the working query.

5.6. Further searches.

Further searches can be made once the working query has been modified as described above. The result will be a new document set from which a new hitlist is formed. A member of the new document set that occurred in a previous "hitlist" and has already been "viewed" will:

  1. not appear in the new "hitlist"
  2. be assigned its normalised weight from the new document set if it is already displayed in the relspool. This may result in some re-ordering of the "relspool".
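
The two rules above can be sketched as follows. This is an illustration only and not the interface's own code: the function names and the representation of documents as (identifier, weight) pairs are hypothetical, the relspool is assumed to be ordered by descending weight, and a relspool entry that is absent from the new document set is assumed to keep its old weight.

```python
def next_hitlist(new_doc_set, viewed_ids):
    """Drop already-viewed documents from a new ranked document set.

    new_doc_set: list of (doc_id, weight) pairs in rank order.
    viewed_ids:  set of identifiers of documents already viewed.
    """
    return [(doc, w) for doc, w in new_doc_set if doc not in viewed_ids]


def reweight_relspool(relspool, new_doc_set):
    """Give relspool entries their weight from the new document set.

    Assumptions: entries missing from the new set keep their old
    weight, and the pool is re-ranked by descending weight.
    """
    new_weights = dict(new_doc_set)
    updated = [(doc, new_weights.get(doc, w)) for doc, w in relspool]
    return sorted(updated, key=lambda pair: pair[1], reverse=True)
```

The re-ordering mentioned in rule 2 arises in reweight_relspool when a document's new weight moves it past another entry in the pool.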

5.8. Quitting.

Clicking the "Exit" button finishes the application cleanly. The database and all open files are closed. All temporary files are deleted.



>> Single-left-click on one of the three buttons.


Last modified:   12th November 2001