1 <!-- $Id: frontend.xml,v 1.5 2001-08-13 09:42:54 adam Exp $ -->
2 <chapter><title id="server">Making an IR Server for Your Database</title>
3 <sect1><title>Introduction</title>
6 If you aren't into documentation, a good way to learn how the
7 back end interface works is to look at the <filename>backend.h</filename>
8 file. Then, look at the small dummy-server in
9 <filename>ztest/ztest.c</filename>. Finally, you can have a look at
10 the <filename>seshigh.c</filename> file, which is where most of the
11 logic of the frontend server is located. The <filename>backend.h</filename>
12 file also makes a good reference, once you've chewed your way through
13 the prose of this file.
If you have a database system that you would like to make available by
means of Z39.50, &yaz; basically offers you two options. You
can use the APIs provided by the &asn;, &odr;, and &comstack; modules to
create and decode PDUs, and exchange them with a client.
22 Using this low-level interface gives you access to all fields and
23 options of the protocol, and you can construct your server as close
24 to your existing database as you like.
25 It is also a fairly involved process, requiring
26 you to set up an event-handling mechanism, protocol state machine,
27 etc. To simplify server implementation, we have implemented a compact
28 and simple, but reasonably full-functioned server-frontend that will
29 handle most of the protocol mechanics, while leaving you to
30 concentrate on your database interface.
35 The backend interface was designed in anticipation of a specific
36 integration task, while still attempting to achieve some degree of
37 generality. We realize fully that there are points where the
38 interface can be improved significantly. If you have specific
39 functions or parameters that you think could be useful, send us a
40 mail (or better, sign on to the mailing list referred to in the
41 top-level README file). We will try to fit good suggestions into future
42 releases, to the extent that it can be done without requiring
43 too many structural changes in existing applications.
48 <sect1><title>The Database Frontend</title>
51 We refer to this software as a generic database frontend. Your
52 database system is the <emphasis>backend database</emphasis>, and the
53 interface between the two is called the <emphasis>backend API</emphasis>.
54 The backend API consists of a small number of function handlers and
55 structure definitions. You are required to provide the
56 <function>main()</function> routine for the server (which can be
57 quite simple), as well as a set of handlers to match each of the prototypes.
58 The interface functions that you write can use any mechanism you like
59 to communicate with your database system: You might link the whole
60 thing together with your database application and access it by
61 function calls; you might use IPC to talk to a database server
62 somewhere; or you might link with third-party software that handles
63 the communication for you (like a commercial database client library).
64 At any rate, the handlers will perform the tasks of:
82 Scanning the database index (optional - if you wish to implement SCAN).
86 Extended Services (optional).
90 Result-Set Delete (optional).
94 Result-Set Sort (optional).
100 (more functions will be added in time to support as much of
101 Z39.50-1995 as possible).
105 <sect1><title>The Backend API</title>
The header files that you need to use the interface are in the
109 <filename>include/yaz</filename> directory. They are called
110 <filename>statserv.h</filename> and <filename>backend.h</filename>. They
111 will include other files from the <filename>include/yaz</filename>
112 directory, so you'll probably want to use the -I option of your
113 compiler to tell it where to find the files. When you run
114 <literal>make</literal> in the top-level &yaz; directory,
everything you need to create your server is put in the
116 <filename>lib/libyaz.a</filename> library.
120 <sect1><title>Your main() Routine</title>
123 As mentioned, your <function>main()</function> routine can be quite brief.
124 If you want to initialize global parameters, or read global configuration
tables, this is the place to do it. At the end of the routine, you should call
130 int statserv_main(int argc, char **argv,
131 bend_initresult *(*bend_init)(bend_initrequest *r),
132 void (*bend_close)(void *handle));
136 The third and fourth arguments are pointers to handlers. Handler
137 <function>bend_init</function> is called whenever the server receives
138 an Initialize Request, so it serves as a Z39.50 session initializer. The
<function>bend_close</function> handler is called when the session is closed.
144 <function>statserv_main</function> will establish listening sockets
145 according to the parameters given. When connection requests are received,
146 the event handler will typically <function>fork()</function> and
147 create a sub-process to handle a new connection.
Alternatively, the server may be set up to create a thread for each
connection.
If you use global variables together with forking, be aware
that they cannot be shared between associations unless you explicitly
disable forking with command-line parameters.
156 The server provides a mechanism for controlling some of its behavior
157 without using command-line options. The function
161 statserv_options_block *statserv_getcontrol(void);
165 will return a pointer to a <literal>struct statserv_options_block</literal>
166 describing the current default settings of the server. The structure
167 contains these elements:
171 <literal>int dynamic</literal></term><listitem><para>
172 A boolean value, which determines whether the server
173 will fork on each incoming request (TRUE), or not (FALSE). Default is
TRUE. This flag is only read by UNIX-based servers (WIN32-based servers
do not fork).
176 </para></listitem></varlistentry>
179 <literal>int threads</literal></term><listitem><para>
180 A boolean value, which determines whether the server
181 will create a thread on each incoming request (TRUE), or not (FALSE).
182 Default is FALSE. This flag is only read by UNIX-based servers
183 that offer POSIX Threads support.
184 WIN32-based servers always operate in threaded mode.
185 </para></listitem></varlistentry>
188 <literal>int inetd</literal></term><listitem><para>
189 A boolean value, which determines whether the server
will operate under a UNIX INET daemon (inetd). Default is FALSE.
191 </para></listitem></varlistentry>
194 <literal>int loglevel</literal></term><listitem><para>
195 Set this by ORing the constants defined in
196 <filename>include/yaz/yaz-log.h</filename>.
197 </para></listitem></varlistentry>
200 <literal>char logfile[ODR_MAXNAME+1]</literal></term>
201 <listitem><para>File for diagnostic output ("": stderr).
202 </para></listitem></varlistentry>
205 <literal>char apdufile[ODR_MAXNAME+1]</literal></term>
207 Name of file for logging incoming and outgoing APDUs
208 ("": don't log APDUs, "-":
209 <literal>stderr</literal>).
210 </para></listitem></varlistentry>
213 <literal>char default_listen[1024]</literal></term>
214 <listitem><para>Same form as the command-line specification of
215 listener address. "": no default listener address.
216 Default is to listen at "tcp:@:9999". You can only
217 specify one default listener address in this fashion.
218 </para></listitem></varlistentry>
221 <literal>enum oid_proto default_proto;</literal></term>
222 <listitem><para>Either <literal>PROTO_SR</literal> or
223 <literal>PROTO_Z3950</literal>.
Default is <literal>PROTO_Z3950</literal>.
225 </para></listitem></varlistentry>
228 <literal>int idle_timeout;</literal></term>
<listitem><para>Maximum session idle time, in minutes. Zero indicates
230 no (infinite) timeout. Default is 120 minutes.
231 </para></listitem></varlistentry>
234 <literal>int maxrecordsize;</literal></term>
<listitem><para>Maximum permissible record (message) size. Default
is 1 MB. This amount of memory will only be allocated if a
client requests a very large number of records in one operation.
Set it to a lower number if you are worried about resource
consumption on your host system.
241 </para></listitem></varlistentry>
244 <literal>char configname[ODR_MAXNAME+1]</literal></term>
245 <listitem><para>Passed to the backend when a new connection is received.
246 </para></listitem></varlistentry>
249 <literal>char setuid[ODR_MAXNAME+1]</literal></term>
250 <listitem><para>Set user id to the user specified, after binding
251 the listener addresses.
252 </para></listitem></varlistentry>
255 <literal>void (*bend_start)(struct statserv_options_block *p)</literal>
<listitem><para>Pointer to a function that is called after the
command-line options have been parsed, but before the server
starts listening.
For forked UNIX servers this handler is called in the parent
process; for threaded servers this handler is called in the
main thread.
The default value of this pointer is NULL, in which case it
is not invoked by the frontend server.
265 When the server operates as an NT service this handler is called
266 whenever the service is started.
267 </para></listitem></varlistentry>
270 <literal>void (*bend_stop)(struct statserv_options_block *p)</literal>
272 <listitem><para>Pointer to function which is called whenever the server
273 has stopped listening for incoming connections. This function pointer
274 has a default value of NULL in which case it isn't called.
275 When the server operates as an NT service this handler is called
276 whenever the service is stopped.
277 </para></listitem></varlistentry>
280 <literal>void *handle</literal></term>
281 <listitem><para>User defined pointer (default value NULL).
282 This is a per-server handle that can be used to specify "user-data".
283 Do not confuse this with the session-handle as returned by bend_init.
284 </para></listitem></varlistentry>
290 The pointer returned by <literal>statserv_getcontrol</literal> points to
291 a static area. You are allowed to change the contents of the structure,
292 but the changes will not take effect before you call
296 void statserv_setcontrol(statserv_options_block *block);
You should generally update this structure before calling
302 <function>statserv_main()</function>.
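<para>
The get/modify/set cycle described above can be sketched as follows. This is
only an illustration: the structure is a reduced stand-in for the one declared
in <filename>statserv.h</filename>, the two accessor functions are local mocks
that imitate the behavior described in the text, and
<function>configure_server</function> is a hypothetical helper name.
</para>

```c
#include <stdio.h>

/* Reduced stand-in for struct statserv_options_block; the real structure
 * in <yaz/statserv.h> has more members (see the list above). */
typedef struct statserv_options_block {
    int dynamic;               /* fork on each incoming request? */
    int idle_timeout;          /* minutes; 0 means no timeout */
    char default_listen[1024]; /* default listener address */
    void *handle;              /* per-server user data */
} statserv_options_block;

/* Mock of the server's active settings, so the sketch runs on its own. */
static statserv_options_block active_settings = { 1, 120, "tcp:@:9999", NULL };

/* Mock accessor: hands out a static copy of the active settings. */
statserv_options_block *statserv_getcontrol(void)
{
    static statserv_options_block copy;
    copy = active_settings;
    return &copy;
}

/* Mock accessor: changes take effect only once this has been called. */
void statserv_setcontrol(statserv_options_block *block)
{
    active_settings = *block;
}

/* Hypothetical helper, meant to be called before statserv_main(). */
void configure_server(void *app_data)
{
    statserv_options_block *opts = statserv_getcontrol();

    opts->dynamic = 0;        /* do not fork; handy while debugging */
    opts->idle_timeout = 30;  /* drop sessions idle for 30 minutes */
    snprintf(opts->default_listen, sizeof(opts->default_listen),
             "%s", "tcp:@:2100");
    opts->handle = app_data;  /* per-server data, not the session handle */
    statserv_setcontrol(opts);
}
```

<para>
In a real server the two accessors come from the &yaz; library, and only the
body of <function>configure_server</function> would be your own code.
</para>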
307 <sect1><title>The Backend Functions</title>
310 For each service of the protocol, the backend interface declares one or
311 two functions. You are required to provide implementations of the
312 functions representing the services that you wish to implement.
315 <sect2><title>Init</title>
bend_initresult *(*bend_init)(bend_initrequest *r);
322 This handler is called once for each new connection request, after
323 a new process/thread has been created, and an Initialize Request has
324 been received from the client. The pointer to the
325 <function>bend_init</function> handler is passed in the call to
<function>statserv_main</function>.
Unlike previous versions of YAZ, the <function>bend_init</function> handler
also defines the Z39.50 services that the backend
wishes to support. Pointers to <emphasis>all</emphasis> service handlers,
including search and fetch, must be specified in this handler.
The request and result structures are defined as
339 typedef struct bend_initrequest
341 Z_IdAuthentication *auth;
342 ODR stream; /* encoding stream */
343 ODR print; /* printing stream */
344 Z_ReferenceId *referenceId;/* reference ID */
345 char *peer_name; /* dns host of peer (client) */
347 char *implementation_name;
348 char *implementation_version;
349 int (*bend_sort) (void *handle, bend_sort_rr *rr);
350 int (*bend_search) (void *handle, bend_search_rr *rr);
351 int (*bend_fetch) (void *handle, bend_fetch_rr *rr);
352 int (*bend_present) (void *handle, bend_present_rr *rr);
353 int (*bend_esrequest) (void *handle, bend_esrequest_rr *rr);
354 int (*bend_delete)(void *handle, bend_delete_rr *rr);
355 int (*bend_scan)(void *handle, bend_scan_rr *rr);
356 int (*bend_segment)(void *handle, bend_segment_rr *rr);
359 typedef struct bend_initresult
361 int errcode; /* 0==OK */
362 char *errstring; /* system error string or NULL */
363 void *handle; /* private handle to the backend module */
368 In general, the server frontend expects that the
369 <literal>bend_*result</literal> pointer that you return is valid at
least until the next call to a <literal>bend_*</literal> function.
371 This applies to all of the functions described herein. The parameter
372 structure passed to you in the call belongs to the server frontend, and
373 you should not make assumptions about its contents after the current
374 function call has completed. In other words, if you want to retain any
375 of the contents of a request structure, you should copy them.
379 The <literal>errcode</literal> should be zero if the initialization of
380 the backend went well. Any other value will be interpreted as an error.
381 The <literal>errstring</literal> isn't used in the current version, but
382 one option would be to stick it in the initResponse as a VisibleString.
383 The <literal>handle</literal> is the most important parameter. It should
384 be set to some value that uniquely identifies the current session to
385 the backend implementation. It is used by the frontend server in any
386 future calls to a backend function.
387 The typical use is to set it to point to a dynamically allocated state
388 structure that is private to your backend module.
392 The <literal>auth</literal> member holds the authentication information
part of the Z39.50 Initialize Request. Interpret this if your server
requires authentication.
The members <literal>peer_name</literal>,
<literal>implementation_name</literal> and
<literal>implementation_version</literal> hold the DNS name of the client,
and the name and version of the client's Z39.50 implementation.
The <literal>bend_</literal> members are set to NULL when
<function>bend_init</function> is called. Modify the pointers by
setting them to point to your backend functions.
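<para>
A minimal sketch of an init handler along these lines is shown below. The
structures are simplified stand-ins for those in
<filename>backend.h</filename> (only the members used here are declared), and
<literal>my_session</literal>, <function>my_init</function>,
<function>my_search</function>, <function>my_fetch</function> and
<function>my_close</function> are hypothetical names chosen for illustration.
</para>

```c
#include <stdlib.h>

/* Simplified stand-ins for the types declared in <yaz/backend.h>. */
typedef struct bend_search_rr bend_search_rr;  /* left opaque in this sketch */
typedef struct bend_fetch_rr bend_fetch_rr;    /* left opaque in this sketch */

typedef struct bend_initrequest {
    char *peer_name;
    int (*bend_search)(void *handle, bend_search_rr *rr);
    int (*bend_fetch)(void *handle, bend_fetch_rr *rr);
} bend_initrequest;

typedef struct bend_initresult {
    int errcode;      /* 0==OK */
    char *errstring;  /* system error string or NULL */
    void *handle;     /* private handle to the backend module */
} bend_initresult;

/* Hypothetical per-session state, private to the backend. */
typedef struct my_session {
    int searches_run;
} my_session;

static int my_search(void *handle, bend_search_rr *rr)
{
    my_session *s = handle;  /* the handle set by my_init comes back here */
    (void) rr;
    s->searches_run++;
    return 0;
}

static int my_fetch(void *handle, bend_fetch_rr *rr)
{
    (void) handle;
    (void) rr;
    return 0;
}

/* Session initializer: allocate state and register the service handlers. */
bend_initresult *my_init(bend_initrequest *r)
{
    /* Static, so it stays valid until the next bend_* call. */
    static bend_initresult result;
    my_session *s = calloc(1, sizeof(*s));

    result.errstring = NULL;
    result.handle = s;
    result.errcode = s ? 0 : 1;  /* any nonzero value signals failure */
    if (!s)
        return &result;

    r->bend_search = my_search;  /* both pointers are NULL on entry */
    r->bend_fetch = my_fetch;
    return &result;
}

/* Session finalizer: release what my_init allocated. */
void my_close(void *handle)
{
    free(handle);
}
```

<para>
The static result keeps the example short; in a threaded server it would be
shared between sessions, so a per-session allocation may be a better choice
there.
</para>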
412 <sect2><title>Search and retrieve</title>
<para>We now describe the handlers that are required to support search
and retrieve. You must provide two functions: one for search and one
for fetch (retrieval of one record). If desired, you can provide a
third handler, which is called when a Present Request is received and
allows you to optimize the retrieval of multiple records.
422 int (*bend_search) (void *handle, bend_search_rr *rr);
425 char *setname; /* name to give to this set */
426 int replace_set; /* replace set, if it already exists */
427 int num_bases; /* number of databases in list */
428 char **basenames; /* databases to search */
429 Z_ReferenceId *referenceId;/* reference ID */
430 Z_Query *query; /* query structure */
431 ODR stream; /* encode stream */
432 ODR decode; /* decode stream */
433 ODR print; /* print stream */
435 bend_request request;
436 bend_association association;
438 int hits; /* number of hits */
439 int errcode; /* 0==OK */
440 char *errstring; /* system error string or NULL */
The <function>bend_search</function> handler is a fairly close
approximation of the protocol Search Request and Response PDUs.
The <literal>setname</literal> is the resultSetName from the protocol.
You are required to establish a mapping between the set name and whatever
your backend database likes to use.
Similarly, the <literal>replace_set</literal> is a boolean value
corresponding to the replaceIndicator field in the protocol.
<literal>num_bases</literal> is the number of database names provided by
the client, and <literal>basenames</literal> is an array of character
pointers to those names.
The <literal>query</literal> is the full query structure as defined in
the protocol ASN.1 specification.
It can be any of the possible query types, and it's up to you to
determine if you can handle the provided query type.
459 Rather than reproduce the C interface here, we'll refer you to the
460 structure definitions in the file
461 <filename>include/yaz/z-core.h</filename>. If you want to look at the
462 attributeSetId OID of the RPN query, you can either match it against
463 your own internal tables, or you can use the
464 <literal>oid_getentbyoid</literal> function provided by &yaz;.
468 The structure contains a number of hits, and an
469 <literal>errcode/errstring</literal> pair. If an error occurs
470 during the search, or if you're unhappy with the request, you should
471 set the errcode to a value from the BIB-1 diagnostic set. The value
472 will then be returned to the user in a nonsurrogate diagnostic record
473 in the response. The <literal>errstring</literal>, if provided, will
474 go in the addinfo field. Look at the protocol definition for the
475 defined error codes, and the suggested uses of the addinfo field.
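<para>
As a rough illustration of the result members, the following sketch shows a
search handler that accepts a single, known database and reports BIB-1
diagnostics otherwise. The structure is a reduced stand-in for
<literal>bend_search_rr</literal>; the database name
<literal>Default</literal> and the helper
<function>count_matches</function> are assumptions made for this example.
</para>

```c
#include <string.h>

/* Reduced stand-in for bend_search_rr; see <yaz/backend.h> for the
 * full definition. */
typedef struct bend_search_rr {
    char *setname;     /* name to give to this set */
    int num_bases;     /* number of databases in list */
    char **basenames;  /* databases to search */
    int hits;          /* number of hits */
    int errcode;       /* 0==OK */
    char *errstring;   /* system error string or NULL */
} bend_search_rr;

#define MY_DATABASE "Default"  /* hypothetical database name */

/* Hypothetical stand-in for the real database lookup. */
static int count_matches(const char *setname)
{
    (void) setname;
    return 42;
}

int my_search(void *handle, bend_search_rr *rr)
{
    (void) handle;
    rr->hits = 0;
    rr->errcode = 0;
    rr->errstring = NULL;

    if (rr->num_bases != 1) {
        /* BIB-1 23: specified combination of databases not supported */
        rr->errcode = 23;
        return 0;
    }
    if (strcmp(rr->basenames[0], MY_DATABASE) != 0) {
        /* BIB-1 109: database unavailable; addinfo names the database */
        rr->errcode = 109;
        rr->errstring = rr->basenames[0];
        return 0;
    }
    rr->hits = count_matches(rr->setname);
    return 0;
}
```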
480 int (*bend_fetch) (void *handle, bend_fetch_rr *rr);
482 typedef struct bend_fetch_rr {
483 char *setname; /* set name */
484 int number; /* record number */
485 Z_ReferenceId *referenceId;/* reference ID */
486 oid_value request_format; /* One of the CLASS_RECSYN members */
487 int *request_format_raw; /* same as above (raw OID) */
488 Z_RecordComposition *comp; /* Formatting instructions */
489 ODR stream; /* encoding stream - memory source if req */
490 ODR print; /* printing stream */
492 char *basename; /* name of database that provided record */
493 int len; /* length of record or -1 if structured */
494 char *record; /* record */
495 int last_in_set; /* is it? */
496 oid_value output_format; /* format */
497 int *output_format_raw; /* used instead of above if not-null */
498 int errcode; /* 0==success */
499 char *errstring; /* system error string or NULL */
500 int surrogate_flag; /* surrogate diagnostic */
The frontend server calls the <function>bend_fetch</function> handler
when it needs database records to fulfill a Search Request or a Present
Request.
508 The <literal>setname</literal> is simply the name of the result set
509 that holds the reference to the desired record.
510 The <literal>number</literal> is the offset into the set (with 1
being the first record in the set). The <literal>request_format</literal> field
512 is the record format requested by the client (See section
513 <link linkend="oid">Object Identifiers</link>). The value
514 <literal>VAL_NONE</literal> indicates that the client did not
515 request a specific format. The <literal>stream</literal> argument
516 is an &odr; stream which should be used for
517 allocating space for structured data records.
518 The stream will be reset when all records have been assembled, and
519 the response package has been transmitted.
520 For unstructured data, the backend is responsible for maintaining a
521 static or dynamic buffer for the record between calls.
525 In the structure, the <literal>basename</literal> is the name of the
526 database that holds the
527 record. <literal>len</literal> is the length of the record returned, in
528 bytes, and <literal>record</literal> is a pointer to the record.
<literal>last_in_set</literal> should be nonzero only if the record
530 returned is the last one in the given result set.
531 <literal>errcode</literal> and <literal>errstring</literal>, if
532 given, will be interpreted as a global error pertaining to the
533 set, and will be returned in a non-surrogate-diagnostic.
534 If you wish to return the error as a surrogate-diagnostic
535 (local error) you can do this by setting
536 <literal>surrogate_flag</literal> to 1 also.
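<para>
The following sketch shows how a fetch handler might fill in these members for
unstructured records held in a static table. The structure is a reduced
stand-in for <literal>bend_fetch_rr</literal>, and the record table and
database name are invented for the example.
</para>

```c
#include <string.h>

/* Reduced stand-in for bend_fetch_rr; see <yaz/backend.h> for the
 * full definition. */
typedef struct bend_fetch_rr {
    char *setname;       /* set name */
    int number;          /* record number, 1 is the first record */
    char *basename;      /* name of database that provided record */
    int len;             /* length of record or -1 if structured */
    char *record;        /* record */
    int last_in_set;     /* is it? */
    int errcode;         /* 0==success */
    char *errstring;     /* system error string or NULL */
    int surrogate_flag;  /* surrogate diagnostic */
} bend_fetch_rr;

/* Hypothetical result set; static storage keeps the record buffers
 * valid between calls, as required for unstructured data. */
static char db_name[] = "Default";
static char rec1[] = "first record";
static char rec2[] = "second record";
static char rec3[] = "third record";
static char *records[] = { rec1, rec2, rec3 };
#define NUM_RECORDS 3

int my_fetch(void *handle, bend_fetch_rr *rr)
{
    (void) handle;
    rr->errcode = 0;
    rr->errstring = NULL;
    rr->surrogate_flag = 0;
    rr->last_in_set = 0;

    if (rr->number < 1 || rr->number > NUM_RECORDS) {
        /* BIB-1 13: present request out-of-range (non-surrogate) */
        rr->errcode = 13;
        return 0;
    }
    rr->basename = db_name;
    rr->record = records[rr->number - 1];   /* number is 1-based */
    rr->len = (int) strlen(rr->record);     /* unstructured: byte length */
    rr->last_in_set = (rr->number == NUM_RECORDS);
    return 0;
}
```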
540 If the <literal>len</literal> field has the value -1, then
541 <literal>record</literal> is assumed to point to a constructed data
type. The <literal>output_format</literal> field will be used to determine
543 which encoder should be used to serialize the data.
548 If your backend generates structured records, it should use
549 <function>odr_malloc()</function> on the provided stream for allocating
550 data: This allows the frontend server to keep track of the record sizes.
The <literal>output_format</literal> field is mapped to an object identifier
556 in the direct reference of the resulting EXTERNAL representation
562 The current version of &yaz; only supports the direct reference mode.
567 int (*bend_present) (void *handle, bend_present_rr *rr);
570 char *setname; /* set name */
572 int number; /* record number */
573 oid_value format; /* One of the CLASS_RECSYN members */
574 Z_ReferenceId *referenceId;/* reference ID */
575 Z_RecordComposition *comp; /* Formatting instructions */
576 ODR stream; /* encoding stream */
577 ODR print; /* printing stream */
578 bend_request request;
579 bend_association association;
581 int hits; /* number of hits */
582 int errcode; /* 0==OK */
583 char *errstring; /* system error string or NULL */
The <function>bend_present</function> handler is called when
the server receives a Present Request. The <literal>setname</literal>,
<literal>start</literal> and <literal>number</literal> are the
name of the result set, the start position, and the number of records to
be retrieved, respectively. <literal>format</literal> and
<literal>comp</literal> are the preferred transfer syntax and element
specifications of the Present Request.
Note that this handler serves as a supplement to
<function>bend_fetch</function> and need not be defined in order to
support search and retrieve.
604 <sect2><title>Delete</title>
For backends that support deletion of result sets, only one handler
must be defined.
612 int (*bend_delete)(void *handle, bend_delete_rr *rr);
614 typedef struct bend_delete_rr {
618 Z_ReferenceId *referenceId;
619 int delete_status; /* status for the whole operation */
620 int *statuses; /* status each set - indexed as setnames */
628 The delete set function definition is rather primitive, mostly because
629 we have had no practical need for it as of yet. If someone wants
630 to provide a full delete service, we'd be happy to add the
631 extra parameters that are required. Are there clients out there
632 that will actually delete sets they no longer need?
<sect2><title>Scan</title>
For servers that wish to offer the scan service, one handler
must be defined.
int (*bend_scan)(void *handle, bend_scan_rr *rr);
649 BEND_SCAN_SUCCESS, /* ok */
650 BEND_SCAN_PARTIAL /* not all entries could be found */
653 typedef struct bend_scan_rr {
654 int num_bases; /* number of elements in databaselist */
655 char **basenames; /* databases to search */
656 oid_value attributeset;
657 Z_ReferenceId *referenceId; /* reference ID */
658 Z_AttributesPlusTerm *term;
659 ODR stream; /* encoding stream - memory source if required */
660 ODR print; /* printing stream */
662 int *step_size; /* step size */
663 int term_position; /* desired index of term in result list/returned */
664 int num_entries; /* number of entries requested/returned */
666 struct scan_entry *entries;
667 bend_scan_status status;
675 <sect1><title>Application Invocation</title>
678 The finished application has the following
679 invocation syntax (by way of <function>statserv_main()</function>):
683 <replaceable>appname</replaceable> [-szSiTu -a <replaceable>apdufile</replaceable> -l <replaceable>logfile</replaceable> -v <replaceable>loglevel</replaceable> -c <replaceable>config</replaceable>]
684 [listener ...]
692 <varlistentry><term>-a <replaceable>file</replaceable></term>
694 Specify a file for dumping PDUs (for diagnostic purposes).
695 The special name "-" sends output to
696 <literal>stderr</literal>.
697 </para></listitem></varlistentry>
699 <varlistentry><term>-S</term>
701 Don't fork or make threads on connection requests. This is good for
702 debugging, but not recommended for real operation: Although the
703 server is asynchronous and non-blocking, it can be nice to keep
704 a software malfunction (okay then, a crash) from affecting all
706 </para></listitem></varlistentry>
708 <varlistentry><term>-T</term>
Operate the server in threaded mode. The server creates a thread
for each connection rather than forking a process. Only available
on UNIX systems that offer POSIX threads.
713 </para></listitem></varlistentry>
715 <varlistentry><term>-s</term>
717 Use the SR protocol (obsolete).
718 </para></listitem></varlistentry>
720 <varlistentry><term>-z</term>
722 Use the Z39.50 protocol (default). These two options complement
723 each other. You can use both multiple times on the same command
724 line, between listener-specifications (see below). This way, you
725 can set up the server to listen for connections in both protocols
726 concurrently, on different local ports.
727 </para></listitem></varlistentry>
729 <varlistentry><term>-l <replaceable>file</replaceable></term>
730 <listitem><para>The logfile.
731 </para></listitem></varlistentry>
733 <varlistentry><term>-c <replaceable>config</replaceable></term>
734 <listitem><para>A user option that serves as a specifier for some
735 sort of configuration, e.g. a filename.
736 The argument to this option is transferred to member
<literal>configname</literal> of the
738 <literal>statserv_options_block</literal>.
739 </para></listitem></varlistentry>
741 <varlistentry><term>-v <replaceable>level</replaceable></term>
743 The log level. Use a comma-separated list of members of the set
744 {fatal,debug,warn,log,all,none}.
745 </para></listitem></varlistentry>
747 <varlistentry><term>-u <replaceable>userid</replaceable></term>
Set user ID. Sets the real UID of the server process to that of the
given user. It's useful if you aren't comfortable with having the
server run as root, but you need to start it as such to bind a
privileged port.
753 </para></listitem></varlistentry>
755 <varlistentry><term>-w <replaceable>dir</replaceable></term>
758 </para></listitem></varlistentry>
760 <varlistentry><term>-i</term>
762 Use this when running from the <application>inetd</application> server.
763 </para></listitem></varlistentry>
765 <varlistentry><term>-t <replaceable>minutes</replaceable></term>
767 Idle session timeout, in minutes.
768 </para></listitem></varlistentry>
770 <varlistentry><term>-k <replaceable>size</replaceable></term>
772 Maximum record size/message size, in kilobytes.
773 </para></listitem></varlistentry>
779 A listener specification consists of a transport mode followed by a
780 colon (:) followed by a listener address. The transport mode is
781 either <literal>osi</literal> or <literal>tcp</literal>.
785 For TCP, an address has the form
789 hostname | IP-number [: portnumber]
793 The port number defaults to 210 (standard Z39.50 port).
797 For osi, the address form is
801 [t-selector /] hostname | IP-number [: portnumber]
805 The transport selector is given as a string of hex digits (with an even
806 number of digits). The default port number is 102 (RFC1006 port).
816 osi:0402/dbserver.osiworld.com:3000
820 In both cases, the special hostname "@" is mapped to
821 the address INADDR_ANY, which causes the server to listen on any local
822 interface. To start the server listening on the registered ports for
823 Z39.50 and SR over OSI/RFC1006, and to drop root privileges once the
824 ports are bound, execute the server like this (from a root shell):
828 my-server -u daemon tcp:@ -s osi:@
You can replace <literal>daemon</literal> with another user, e.g. your
833 own account, or a dedicated IR server account.
834 <literal>my-server</literal> should be the name of your
835 server application. You can test the procedure with the
836 <application>yaz-ztest</application> application.
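<para>
To make the listener syntax concrete, the sketch below splits a specification
into transport mode, optional transport selector, host and port, applying the
default ports mentioned above. It is an illustration of the syntax only and is
not how &yaz; itself parses listener addresses; the
<literal>listener_spec</literal> structure and the
<function>parse_listener</function> function are hypothetical, and IPv6
literals are not handled.
</para>

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for a parsed listener specification. */
typedef struct listener_spec {
    char mode[4];    /* "tcp" or "osi" */
    char tsel[64];   /* OSI transport selector (hex digits), "" if none */
    char host[256];  /* hostname, IP number, or "@" for any interface */
    int port;        /* explicit port, or the default for the mode */
} listener_spec;

/* Parse "mode:address" into *out. Returns 0 on success, -1 if malformed. */
int parse_listener(const char *spec, listener_spec *out)
{
    const char *colon = strchr(spec, ':');
    const char *addr, *slash, *port_sep;
    size_t n;

    if (!colon || colon - spec != 3)
        return -1;
    if (strncmp(spec, "tcp", 3) == 0)
        out->port = 210;            /* standard Z39.50 port */
    else if (strncmp(spec, "osi", 3) == 0)
        out->port = 102;            /* RFC1006 port */
    else
        return -1;
    memcpy(out->mode, spec, 3);
    out->mode[3] = '\0';

    addr = colon + 1;
    out->tsel[0] = '\0';
    slash = strchr(addr, '/');
    if (slash) {
        /* A transport selector is only meaningful for osi. */
        if (strcmp(out->mode, "osi") != 0)
            return -1;
        n = (size_t) (slash - addr);
        /* Must be a non-empty, even-length run of hex digits. */
        if (n == 0 || n % 2 != 0 || n >= sizeof(out->tsel))
            return -1;
        if (strspn(addr, "0123456789abcdefABCDEF") != n)
            return -1;
        memcpy(out->tsel, addr, n);
        out->tsel[n] = '\0';
        addr = slash + 1;
    }

    port_sep = strchr(addr, ':');
    n = port_sep ? (size_t) (port_sep - addr) : strlen(addr);
    if (n == 0 || n >= sizeof(out->host))
        return -1;
    memcpy(out->host, addr, n);
    out->host[n] = '\0';

    if (port_sep) {
        char *end;
        long p = strtol(port_sep + 1, &end, 10);
        if (end == port_sep + 1 || *end != '\0' || p < 1 || p > 65535)
            return -1;
        out->port = (int) p;
    }
    return 0;
}
```

<para>
With this sketch, <literal>tcp:@</literal> yields host <literal>@</literal> on
port 210, and <literal>osi:0402/dbserver.osiworld.com:3000</literal> yields
transport selector <literal>0402</literal> with an explicit port of 3000.
</para>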
842 <!-- Keep this comment at the end of the file
847 sgml-minimize-attributes:nil
848 sgml-always-quote-attributes:t
851 sgml-parent-document: "yaz.xml"
852 sgml-local-catalogs: "../../docbook/docbook.cat"
853 sgml-namecase-general:t