<!-- $Id: tools.xml,v 1.9 2001-10-26 20:13:44 adam Exp $ -->
<chapter id="tools"><title>Supporting Tools</title>
In support of the service API (primarily the ASN module, which
provides the programmatic interface to the Z39.50 APDUs), &yaz; contains
a collection of tools that support the development of applications.
<sect1 id="tools.query"><title>Query Syntax Parsers</title>
Since the type-1 (RPN) query structure has no direct, useful string
representation, every origin application needs to provide some form of
mapping from a local query notation or representation to a
<token>Z_RPNQuery</token> structure. Some programmers will prefer to
construct the query manually, perhaps using
<function>odr_malloc()</function> to simplify memory management.
The &yaz; distribution includes two separate, query-generating tools
that may be of use to you.
<sect2><title id="PQF">Prefix Query Format</title>
Since RPN, or reverse Polish notation, is really just a fancy way of
describing a suffix notation format (operator follows operands), it
may seem confusing that we now introduce a prefix
notation for RPN. The reason is one of simple laziness: it's somewhat
simpler to interpret a prefix format, and this utility was designed
for maximum simplicity, to provide a baseline representation for use
in simple test applications and scripting environments (like Tcl). The
demonstration client included with YAZ uses the PQF.
The PQF is defined by the pquery module in the YAZ library. The
<filename>pquery.h</filename> file provides the declarations of the
following functions:
Z_RPNQuery *p_query_rpn (ODR o, oid_proto proto, const char *qbuf);
Z_AttributesPlusTerm *p_query_scan (ODR o, oid_proto proto,
                                    Odr_oid **attributeSetP, const char *qbuf);
int p_query_attset (const char *arg);
The function <function>p_query_rpn()</function> takes as arguments an
&odr; stream (see section <link linkend="odr">The ODR Module</link>)
to provide a memory source (the structure created is released on
the next call to <function>odr_reset()</function> on the stream), a
protocol identifier (one of the constants <token>PROTO_Z3950</token> and
<token>PROTO_SR</token>), and
finally a null-terminated string holding the query string.
If the parse went well, <function>p_query_rpn()</function> returns a
pointer to a <literal>Z_RPNQuery</literal> structure which can be
placed directly into a <literal>Z_SearchRequest</literal>.
The <literal>p_query_attset</literal> function specifies which attribute
set to use if the query doesn't specify one with the
<literal>@attrset</literal> operator.
It returns 0 if the argument is a
valid attribute set specifier; otherwise it returns -1.
The grammar of the PQF is as follows:
Query ::= [ '@attrset' AttSet ] QueryStruct.
QueryStruct ::= [ Attribute ] Simple | Complex.
Attribute ::= '@attr' [ AttSet ] AttributeType '=' AttributeValue.
AttributeType ::= integer.
AttributeValue ::= integer | string.
Complex ::= Operator QueryStruct QueryStruct.
Operator ::= '@and' | '@or' | '@not' | '@prox' Proximity.
Simple ::= ResultSet | Term.
ResultSet ::= '@set' string.
Term ::= string | '"' string '"'.
Proximity ::= Exclusion Distance Ordered Relation WhichCode UnitCode.
Exclusion ::= '1' | '0' | 'void'.
Distance ::= integer.
Ordered ::= '1' | '0'.
Relation ::= integer.
WhichCode ::= 'known' | 'private' | integer.
UnitCode ::= integer.
You will note that the syntax above is a fairly faithful
representation of RPN, except for the Attribute, which has been
moved a step away from the term, allowing you to associate one or more
attributes with an entire query structure. The parser will
automatically apply the given attributes to each term as required.
The following are all examples of valid queries in the PQF.
@or "dylan" "zimmerman"
@or @and bob dylan @set Result-1
@attr 4=1 @and @attr 1=1 "bob dylan" @attr 1=4 "slow train coming"
@attr 4=1 @attr 1=4 "self portrait"
@prox 0 3 1 2 k 2 dylan zimmerman
@and @attr 2=4 @attr gils 1=2038 -114 @attr 2=2 @attr gils 1=2039 -109
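To illustrate why a prefix format is simple to interpret, the following
is a minimal sketch of a recursive reader that rewrites the boolean
subset of the PQF into a parenthesized infix string. It is not the
pquery module: the names <function>pqf_to_infix()</function>,
<function>to_infix()</function>, <function>next_token()</function> and
<function>append()</function> are hypothetical, and
<literal>@attr</literal>, <literal>@set</literal>,
<literal>@attrset</literal> and <literal>@prox</literal> are deliberately
rejected to keep the example short.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Append s to the NUL-terminated buffer out; -1 if it does not fit. */
static int append(char *out, size_t size, const char *s)
{
    size_t used = strlen(out);

    if (used + strlen(s) + 1 > size)
        return -1;
    strcpy(out + used, s);
    return 0;
}

/* Copy the next token into tok and advance *p.  Quotes are stripped and
 * *quoted is set for a quoted term.  Returns 0 at end of input. */
static int next_token(const char **p, char *tok, size_t size, int *quoted)
{
    size_t n = 0;

    while (isspace((unsigned char) **p))
        (*p)++;
    if (**p == '\0')
        return 0;
    *quoted = (**p == '"');
    if (*quoted)
        (*p)++;
    while (**p != '\0' &&
           (*quoted ? **p != '"' : !isspace((unsigned char) **p))) {
        if (n + 1 < size)
            tok[n++] = **p;     /* overlong tokens are silently truncated */
        (*p)++;
    }
    if (*quoted && **p == '"')
        (*p)++;
    tok[n] = '\0';
    return 1;
}

/* Read one QueryStruct: either a term, or an operator followed by two
 * QueryStructs.  The operator is known before its operands are read,
 * so no stack or lookahead is needed. */
static int to_infix(const char **p, char *out, size_t size)
{
    char tok[128];
    int quoted;
    const char *op;

    if (!next_token(p, tok, sizeof tok, &quoted))
        return -1;                      /* an operand is missing */
    if (quoted || tok[0] != '@')
        return append(out, size, tok);  /* Simple: a plain term */
    if (strcmp(tok, "@and") == 0)
        op = " and ";
    else if (strcmp(tok, "@or") == 0)
        op = " or ";
    else if (strcmp(tok, "@not") == 0)
        op = " not ";
    else
        return -1;                      /* not handled in this sketch */
    if (append(out, size, "(") != 0 || to_infix(p, out, size) != 0 ||
        append(out, size, op) != 0 || to_infix(p, out, size) != 0)
        return -1;
    return append(out, size, ")");
}

/* Returns 0 and fills out on success, -1 on any error. */
int pqf_to_infix(const char *query, char *out, size_t size)
{
    const char *p = query;
    char tok[128];
    int quoted;

    assert(query != NULL && out != NULL && size > 0);
    out[0] = '\0';
    if (to_infix(&p, out, size) != 0)
        return -1;
    /* anything left over after a complete query is an error */
    return next_token(&p, tok, sizeof tok, &quoted) ? -1 : 0;
}
```

Given a sufficiently large buffer, the first example query above would be
rewritten as <literal>(dylan or zimmerman)</literal>.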
<sect2><title id="CCL">Common Command Language</title>
Not all users enjoy typing in prefix query structures and numerical
attribute values, even in a minimalistic test client. In the library
world, the more intuitive Common Command Language (or ISO 8777) has
enjoyed some popularity, especially before the widespread
availability of graphical interfaces. It is still useful in
applications where, for one reason or another, you need to provide a
symbolic language for expressing boolean query structures.
The <ulink url="http://europagate.dtv.dk/">EUROPAGATE</ulink>
research project working under the Libraries programme
of the European Commission's DG XIII has, amongst other useful tools,
implemented a general-purpose CCL parser which produces an output
structure that can be trivially converted to the internal RPN
representation of &yaz; (the <literal>Z_RPNQuery</literal> structure).
Since the CCL utility, along with the rest of the software
produced by EUROPAGATE, is made freely available under a liberal
license, it is included as a supplement to &yaz;.
<sect3><title>CCL Syntax</title>
The CCL parser obeys the following grammar for the FIND argument.
The syntax is annotated by the lines prefixed with
<literal>‐‐</literal>.
CCL-Find ::= CCL-Find Op Elements
Op ::= "and" | "or" | "not"
-- The above means that Elements are separated by boolean operators.
Elements ::= '(' CCL-Find ')'
          | Qualifiers Relation Terms
          | Qualifiers Relation '(' CCL-Find ')'
          | Qualifiers '=' string '-' string
-- Elements is either a recursive definition, a result set reference, a
-- list of terms, qualifiers followed by terms, qualifiers followed
-- by a recursive definition or qualifiers in a range (lower - upper).
Set ::= 'set' = string
-- Reference to a result set
Terms ::= Terms Prox Term
-- Proximity of terms.
-- This basically means that a term may include a blank
Qualifiers ::= Qualifiers ',' string
-- Qualifiers is a list of strings separated by comma
Relation ::= '=' | '>=' | '<=' | '<>' | '>' | '<'
-- Relational operators. This really doesn't follow the ISO8777
-- Proximity operator
The following queries are all valid:
(dylan and bob) or set=1
Assuming that the qualifiers <literal>ti</literal>, <literal>au</literal>
and <literal>date</literal> are defined, we may use:
au=(bob dylan and slow train coming)
date>1980 and (ti=((self portrait)))
<sect3><title>CCL Qualifiers</title>
Qualifiers are used to direct the search to a particular searchable
index, such as the title (ti) and author (au) indexes. The CCL standard
itself doesn't specify a particular set of qualifiers, but it does
suggest a few short-hand notations. You can customize the CCL parser
to support a particular set of qualifiers to reflect the current target
profile. Traditionally, a qualifier would map to a particular
use-attribute within the BIB-1 attribute set. However, you could also
define qualifiers that would set, for example, the
Consider a scenario where the target supports ranked searches in the
title index. In this case, the user could specify
ti,ranked=knuth computer
where <literal>ranked</literal> would map to relation=relevance
(2=102) and <literal>ti</literal> would map to title (1=4).
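The mapping just described can be sketched as a small table lookup. The
code below is only an illustration, not the CCL parser: the table
<literal>quals</literal> and the function
<function>quals_to_pqf()</function> are hypothetical, and the table holds
nothing but the two qualifiers from the example. For each qualifier in a
comma-separated list it emits the corresponding attribute in PQF
notation, followed by the quoted term.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical qualifier table mirroring the example in the text. */
struct qual_attr {
    const char *name;
    int type;       /* BIB-1 attribute type  */
    int value;      /* BIB-1 attribute value */
};

static const struct qual_attr quals[] = {
    { "ti",     1, 4   },   /* use = title          */
    { "ranked", 2, 102 },   /* relation = relevance */
};

/* Write "@attr T=V ... "term"" for a comma-separated qualifier list.
 * Returns 0 on success, -1 for an unknown qualifier or a short buffer. */
int quals_to_pqf(const char *qual_list, const char *term,
                 char *out, size_t size)
{
    const size_t n = sizeof quals / sizeof quals[0];
    const char *p = qual_list;
    size_t used = 0;
    int written;

    assert(qual_list != NULL && term != NULL && out != NULL);
    while (*p != '\0') {
        size_t len = strcspn(p, ",");   /* length of this qualifier */
        size_t i;

        for (i = 0; i < n; i++)
            if (strlen(quals[i].name) == len &&
                strncmp(quals[i].name, p, len) == 0)
                break;
        if (i == n)
            return -1;                  /* qualifier not in the table */
        written = snprintf(out + used, size - used, "@attr %d=%d ",
                           quals[i].type, quals[i].value);
        if (written < 0 || (size_t) written >= size - used)
            return -1;
        used += (size_t) written;
        p += len;
        if (*p == ',')
            p++;
    }
    written = snprintf(out + used, size - used, "\"%s\"", term);
    if (written < 0 || (size_t) written >= size - used)
        return -1;
    return 0;
}
```

With this table, the qualifier list <literal>ti,ranked</literal> and the
term <literal>knuth computer</literal> give
<literal>@attr 1=4 @attr 2=102 "knuth computer"</literal>.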
A "profile" with a set of predefined CCL qualifiers can be read from a
file. The YAZ client reads its CCL qualifiers from a file named
<filename>default.bib</filename>. Each line in the file has the form:
<replaceable>qualifier-name</replaceable>
<replaceable>type</replaceable>=<replaceable>val</replaceable>
<replaceable>type</replaceable>=<replaceable>val</replaceable> ...
where <replaceable>qualifier-name</replaceable> is the name of the
qualifier to be used (e.g. <literal>ti</literal>),
<replaceable>type</replaceable> is a BIB-1 category type and
<replaceable>val</replaceable> is the corresponding BIB-1 attribute
value.
The <replaceable>type</replaceable> may be numeric, or it may be one of
<literal>u</literal> (use), <literal>r</literal> (relation),
<literal>p</literal> (position), <literal>s</literal> (structure),
<literal>t</literal> (truncation) or <literal>c</literal> (completeness).
The <replaceable>qualifier-name</replaceable> <literal>term</literal>
has a special meaning:
the types and values for this definition are used when
<emphasis>no</emphasis> qualifiers are present.
Consider the following definition:
Two qualifiers are defined, <literal>ti</literal> and
<literal>au</literal>.
They both set the structure-attribute to phrase (1).
<literal>ti</literal>
sets the use-attribute to 4. <literal>au</literal> sets the
When no qualifiers are used in the query the structure-attribute is
set to free-form-text (105).
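To make the line format concrete, here is a minimal sketch of how one
definition line might be split into its parts. It is not the code behind
<function>ccl_qual_file</function>; the names
<literal>struct qual_def</literal>, <function>attr_type()</function> and
<function>parse_qual_line()</function> are hypothetical. The sketch
assumes that the letters <literal>u</literal>, <literal>r</literal>,
<literal>p</literal>, <literal>s</literal>, <literal>t</literal> and
<literal>c</literal> stand for the BIB-1 attribute types 1 to 6, and that
every <replaceable>val</replaceable> is a decimal number.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ATTRS 8

/* Hypothetical result structure: one qualifier and its attributes. */
struct qual_def {
    char name[32];
    int count;
    int type[MAX_ATTRS];
    int value[MAX_ATTRS];
};

/* Map a type token (a letter or a number) to an attribute type number.
 * Returns -1 if the token is neither. */
static int attr_type(const char *s, size_t len)
{
    static const char letters[] = "urpstc";     /* types 1..6 */
    const char *hit;
    size_t i;
    int n = 0;

    if (len == 0)
        return -1;
    if (len == 1 && s[0] != '\0' && (hit = strchr(letters, s[0])) != NULL)
        return (int) (hit - letters) + 1;
    for (i = 0; i < len; i++) {
        if (!isdigit((unsigned char) s[i]))
            return -1;
        n = n * 10 + (s[i] - '0');
    }
    return n;
}

/* Parse "name type=val type=val ..." into def.
 * Returns 0 on success, -1 on a malformed line.  Uses strtok(), so it
 * is not reentrant. */
int parse_qual_line(const char *line, struct qual_def *def)
{
    char buf[256];
    char *tok;

    assert(line != NULL && def != NULL);
    if (strlen(line) >= sizeof buf)
        return -1;
    strcpy(buf, line);
    tok = strtok(buf, " \t\n");
    if (tok == NULL || strlen(tok) >= sizeof def->name)
        return -1;
    strcpy(def->name, tok);
    def->count = 0;
    while ((tok = strtok(NULL, " \t\n")) != NULL) {
        char *eq = strchr(tok, '=');
        char *end;
        long val;
        int type;

        if (eq == NULL || def->count == MAX_ATTRS)
            return -1;
        type = attr_type(tok, (size_t) (eq - tok));
        val = strtol(eq + 1, &end, 10);
        if (type < 0 || end == eq + 1 || *end != '\0')
            return -1;
        def->type[def->count] = type;
        def->value[def->count] = (int) val;
        def->count++;
    }
    return 0;
}
```

A line such as <literal>ti u=4 s=1</literal> would yield the name
<literal>ti</literal> with the pairs 1=4 and 4=1, matching the
description of <literal>ti</literal> above.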
<sect3><title>CCL API</title>
All public definitions can be found in the header file
<filename>ccl.h</filename>. A profile identifier is of type
<literal>CCL_bibset</literal>. A profile must be created with the call
to the function <function>ccl_qual_mk</function> which returns a profile
handle of type <literal>CCL_bibset</literal>.
To read a file containing qualifier definitions the function
<function>ccl_qual_file</function> may be convenient. This function
takes an already opened <literal>FILE</literal> handle pointer as
argument along with a <literal>CCL_bibset</literal> handle.
To parse a simple string with a FIND query use the function
struct ccl_rpn_node *ccl_find_str (CCL_bibset bibset, const char *str,
                                   int *error, int *pos);
which takes the CCL profile (<literal>bibset</literal>) and query
(<literal>str</literal>) as input. Upon successful completion the RPN
tree is returned. If an error occurs, such as a syntax error, the integer
pointed to by <literal>error</literal> holds the error code and
<literal>pos</literal> holds the offset inside the query string at which
the error occurred.
An English representation of the error may be obtained by calling
the <literal>ccl_err_msg</literal> function. The error codes are
listed in <filename>ccl.h</filename>.
To convert the CCL RPN tree (type
<literal>struct ccl_rpn_node *</literal>)
to the Z_RPNQuery of YAZ, the function <function>ccl_rpn_query</function>
must be used. This function, which is part of YAZ, is implemented in
<filename>yaz-ccl.c</filename>.
After calling this function the CCL RPN tree is probably no longer
needed. The <literal>ccl_rpn_delete</literal> function destroys the CCL
RPN tree.
A CCL profile may be destroyed by calling the
<function>ccl_qual_rm</function> function.
The token names for the CCL operators may be changed by setting the
globals (all of type <literal>char *</literal>)
<literal>ccl_token_and</literal>, <literal>ccl_token_or</literal>,
<literal>ccl_token_not</literal> and <literal>ccl_token_set</literal>.
An operator may have aliases, i.e. there may be more than one name for
the operator. To do this, separate each alias with a space character.
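As an illustration of the alias convention, the following sketch tests
whether a word is one of the names in a space-separated alias list. The
function <function>token_has_alias()</function> is hypothetical and is
not the matching code used by the CCL parser; in particular, the
comparison here is case-sensitive, which is an assumption of this
sketch only.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if word equals one of the space-separated names in aliases,
 * 0 otherwise.  An empty word never matches. */
int token_has_alias(const char *aliases, const char *word)
{
    const char *p = aliases;
    size_t wlen;

    assert(aliases != NULL && word != NULL);
    wlen = strlen(word);
    while (*p != '\0') {
        size_t len = strcspn(p, " ");   /* length of the current alias */

        if (len > 0 && len == wlen && strncmp(p, word, len) == 0)
            return 1;
        p += len;
        while (*p == ' ')               /* skip the separator(s) */
            p++;
    }
    return 0;
}
```

If, say, <literal>ccl_token_and</literal> were set to the (made-up)
value <literal>"and och"</literal>, both <literal>and</literal> and
<literal>och</literal> would match, while <literal>or</literal> would
not.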
<sect1 id="tools.oid"><title>Object Identifiers</title>
The basic YAZ representation of an OID is an array of integers,
terminated with the value -1. The &odr; module provides two
utility functions to create and copy data elements of this type:
Odr_oid *odr_getoidbystr(ODR o, char *str);
Creates an OID based on a string-based representation using dots (.)
to separate elements in the OID.
Odr_oid *odr_oiddup(ODR odr, Odr_oid *o);
Creates a copy of the OID referenced by the <emphasis>o</emphasis>
parameter.
Both functions take an &odr; stream as parameter. This stream is used to
allocate memory for the data elements, which is released on a
subsequent call to <function>odr_reset()</function> on that stream.
The OID module provides a higher-level representation of the
family of object identifiers which describe the Z39.50 protocol and its
related objects. The definition of the module interface is given in
the <filename>oid.h</filename> file.
The interface is mainly based on the <literal>oident</literal> structure.
The definition of this structure looks like this:
typedef struct oident
int oidsuffix[OID_SIZE];
The proto field takes one of the values
If you don't care about talking to SR-based implementations (few
exist, and they may become fewer still if and when the ISO SR and ANSI
Z39.50 documents are merged into a single standard), you can ignore
this field on incoming packages, and always set it to PROTO_Z3950
for outgoing packages.
The oclass field takes one of the values
corresponding to the OID classes defined by the Z39.50 standard.
Finally, the value field takes one of the values
again, corresponding to the specific OIDs defined by the standard.
The desc field contains a brief, mnemonic name for the OID in question.
struct oident *oid_getentbyoid(int *o);
takes as argument an OID, and returns a pointer to a static area
containing an <literal>oident</literal> structure. You typically use
this function when you receive a PDU containing an OID, and you wish
to branch out depending on the specific OID value.
int *oid_ent_to_oid(struct oident *ent, int *dst);
Takes as argument an <literal>oident</literal> structure (in which
the <literal>proto</literal>, <literal>oclass</literal>, and
<literal>value</literal> fields are assumed to be set correctly)
and returns a pointer to the buffer given by <literal>dst</literal>,
which holds the
representation of the corresponding OID. The function returns
NULL and the array <literal>dst</literal> is unchanged if a mapping
couldn't take place.
The array <literal>dst</literal> should be at least of size
<literal>OID_SIZE</literal>.
The <function>oid_ent_to_oid()</function> function can be used whenever
you need to prepare a PDU containing one or more OIDs. The separation of
the <literal>protocol</literal> element from the remainder of the
OID-description makes it simple to write applications that can
communicate with either Z39.50 or OSI SR-based applications.
oid_value oid_getvalbyname(const char *name);
takes as argument a mnemonic OID name, and returns the
<literal>value</literal> field of the first entry in the database that
contains the given name in its <literal>desc</literal> field.
Finally, the module provides the following utility functions, whose
meaning should be obvious:
void oid_oidcpy(int *t, int *s);
void oid_oidcat(int *t, int *s);
int oid_oidcmp(int *o1, int *o2);
int oid_oidlen(int *o);
The OID module has been criticized (and perhaps rightly so)
for needlessly abstracting the
representation of OIDs. Other toolkits use a simple
string-representation of OIDs with good results. In practice, we have
found the interface comfortable and quick to work with, and it is a
simple matter (for what it's worth) to create applications compatible
with both ISO SR and Z39.50. Finally, the use of the
<literal>oident</literal> database is by no means mandatory.
You can easily create your own system for representing OIDs, as long
as it is compatible with the low-level integer-array representation.
<sect1 id="tools.nmem"><title>Nibble Memory</title>
Sometimes when you need to allocate and construct a large,
interconnected complex of structures, it can be a bit of a pain to
release the associated memory again. For the structures describing the
Z39.50 PDUs and related structures, it is convenient to use the
memory-management system of the &odr; subsystem (see
<link linkend="odr-use">Using ODR</link>). However, in some circumstances
where you might otherwise benefit from using a simple nibble memory
management system, it may be impractical to use
<function>odr_malloc()</function> and <function>odr_reset()</function>.
For this purpose, the memory manager which also supports the &odr;
streams is made available in the NMEM module. The external interface
to this module is given in the <filename>nmem.h</filename> file.
The following prototypes are given:
NMEM nmem_create(void);
void nmem_destroy(NMEM n);
void *nmem_malloc(NMEM n, int size);
void nmem_reset(NMEM n);
int nmem_total(NMEM n);
void nmem_init(void);
void nmem_exit(void);
The <function>nmem_create()</function> function returns a pointer to a
memory control handle, which can be released again by
<function>nmem_destroy()</function> when no longer needed.
The function <function>nmem_malloc()</function> allocates a block of
memory of the requested size. A call to <function>nmem_reset()</function>
or <function>nmem_destroy()</function> will release all memory allocated
on the handle since it was created (or since the last call to
<function>nmem_reset()</function>). The function
<function>nmem_total()</function> returns the number of bytes currently
allocated on the handle.
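The allocate-many, release-all-at-once pattern described above can be
sketched in a few lines. This is not the NMEM implementation: all
<literal>arena_</literal> names are hypothetical, and for brevity each
request is served by its own <function>malloc()</function> call and
linked into a list, whereas a real nibble allocator would presumably
carve small requests out of larger blocks. Thread safety is also left
out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every allocation.  The extra union members
 * only serve to keep the user data suitably aligned for common types. */
union header {
    union header *next;
    long double ld;
    long long ll;
    void *p;
};

typedef struct arena {
    union header *blocks;   /* every block allocated since the last reset */
    size_t total;           /* bytes handed out since the last reset */
} arena;

arena *arena_create(void)
{
    arena *a = malloc(sizeof *a);

    if (a != NULL) {
        a->blocks = NULL;
        a->total = 0;
    }
    return a;
}

/* Allocate size bytes; the caller never frees the block individually. */
void *arena_malloc(arena *a, size_t size)
{
    union header *h;

    assert(a != NULL);
    h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->next = a->blocks;    /* remember the block for arena_reset() */
    a->blocks = h;
    a->total += size;
    return h + 1;           /* user data starts right after the header */
}

/* Release everything allocated on the handle, keeping the handle. */
void arena_reset(arena *a)
{
    assert(a != NULL);
    while (a->blocks != NULL) {
        union header *next = a->blocks->next;

        free(a->blocks);
        a->blocks = next;
    }
    a->total = 0;
}

size_t arena_total(const arena *a)
{
    assert(a != NULL);
    return a->total;
}

/* Release all memory and the handle itself. */
void arena_destroy(arena *a)
{
    arena_reset(a);
    free(a);
}
```

The point of the design is that the caller builds an arbitrarily tangled
structure with many small allocations and then discards all of it with a
single call, without having to walk the structure.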
The nibble memory pool is shared amongst threads. POSIX
mutexes and WIN32 critical sections are used to keep the
module thread safe. Function <function>nmem_init()</function>
initializes the nibble memory library and it is called automatically
the first time the <literal>YAZ.DLL</literal> is loaded. &yaz; uses
function <function>DllMain</function> to achieve this. You should
<emphasis>not</emphasis> call <function>nmem_init</function> or
<function>nmem_exit</function> unless you're absolutely sure what
you're doing. Note that in previous &yaz; versions you'd have to call
<function>nmem_init</function> yourself.