1 <!-- $Id: frontend.xml,v 1.3 2001-07-19 23:29:40 adam Exp $ -->
2 <chapter><title id="server">Making an IR Server for Your Database</title>
4 <sect1><title>Introduction</title>
7 If you aren't into documentation, a good way to learn how the
8 backend interface works is to look at the <filename>backend.h</filename>
9 file. Then, look at the small dummy-server in
10 <filename>ztest/ztest.c</filename>. Finally, you can have a look at
11 the <filename>seshigh.c</filename> file, which is where most of the
12 logic of the frontend server is located. The <filename>backend.h</filename>
13 file also makes a good reference, once you've chewed your way through
14 the prose of this file.
18 If you have a database system that you would like to make available by
means of Z39.50, &yaz; basically offers you two options. You
can use the APIs provided by the &asn;, &odr;, and &comstack;
modules to create and decode PDUs, and exchange them with a client.
23 Using this low-level interface gives you access to all fields and
24 options of the protocol, and you can construct your server as close
25 to your existing database as you like.
26 It is also a fairly involved process, requiring
27 you to set up an event-handling mechanism, protocol state machine,
28 etc. To simplify server implementation, we have implemented a compact
29 and simple, but reasonably full-functioned server-frontend that will
30 handle most of the protocol mechanics, while leaving you to
31 concentrate on your database interface.
36 The backend interface was designed in anticipation of a specific
37 integration task, while still attempting to achieve some degree of
38 generality. We realise fully that there are points where the
39 interface can be improved significantly. If you have specific
40 functions or parameters that you think could be useful, send us a
41 mail (or better, sign on to the mailing list referred to in the
42 toplevel README file). We will try to fit good suggestions into future
43 releases, to the extent that it can be done without requiring
44 too many structural changes in existing applications.
49 <sect1><title>The Database Frontend</title>
52 We refer to this software as a generic database frontend. Your
53 database system is the <emphasis>backend database</emphasis>, and the
54 interface between the two is called the <emphasis>backend API</emphasis>.
55 The backend API consists of a small number of function handlers and
56 structure definitions. You are required to provide the
57 <function>main()</function> routine for the server (which can be
58 quite simple), as well as a set of handlers to match each of the prototypes.
59 The interface functions that you write can use any mechanism you like
60 to communicate with your database system: You might link the whole
61 thing together with your database application and access it by
62 function calls; you might use IPC to talk to a database server
63 somewhere; or you might link with third-party software that handles
64 the communication for you (like a commercial database client library).
65 At any rate, the handlers will perform the tasks of:
83 Scanning the database index (optional - if you wish to implement SCAN).
87 Extended Services (optional).
91 Result-Set Delete (optional).
95 Result-Set Sort (optional).
101 (more functions will be added in time to support as much of
102 Z39.50-1995 as possible).
106 <sect1><title>The Backend API</title>
The header files that you need to use the interface are in the
110 <filename>include/yaz</filename> directory. They are called
111 <filename>statserv.h</filename> and <filename>backend.h</filename>. They
112 will include other files from the <filename>include/yaz</filename>
113 directory, so you'll probably want to use the -I option of your
114 compiler to tell it where to find the files. When you run
115 <literal>make</literal> in the toplevel &yaz; directory,
everything you need to create your server is put in the
117 <filename>lib/libyaz.a</filename> library.
121 <sect1><title>Your main() Routine</title>
124 As mentioned, your <function>main()</function> routine can be quite brief.
125 If you want to initialize global parameters, or read global configuration
tables, this is the place to do it. At the end of the routine, you should call:
131 int statserv_main(int argc, char **argv,
132 bend_initresult *(*bend_init)(bend_initrequest *r),
133 void (*bend_close)(void *handle));
137 The third and fourth arguments are pointers to handlers. Handler
138 <function>bend_init</function> is called whenever the server receives
139 an Initialize Request, so it serves as a Z39.50 session initializer. The
<function>bend_close</function> handler is called when the session is closed.
145 <function>statserv_main</function> will establish listening sockets
146 according to the parameters given. When connection requests are received,
147 the event handler will typically <function>fork()</function> and
148 create a sub-process to handle a new connection.
Alternatively, the server may be set up to create a thread for each connection.
If you use global variables and forking, you should be aware
151 that these cannot be shared between associations, unless you explicitly
152 disable forking by command line parameters.
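<para>
As a rough illustration of how brief the routine can be, the sketch below
hands control to <function>statserv_main()</function> right after any global
setup. It is not taken from the &yaz; sources: the
<function>statserv_main()</function> defined here is a stand-in that only
checks its arguments, so that the example compiles without the library, and
<function>my_init</function>, <function>my_close</function> and
<function>server_main</function> are hypothetical names. In a real server,
<function>server_main</function> would be your <function>main()</function>.
</para>

```c
#include <assert.h>
#include <stddef.h>

/* Opaque stand-ins for the types declared in backend.h. */
typedef struct bend_initrequest bend_initrequest;
typedef struct bend_initresult bend_initresult;

/* Stand-in for the library function: the real one sets up listeners and
   runs the event loop. This one only checks that it was given handlers. */
static int statserv_main(int argc, char **argv,
                         bend_initresult *(*bend_init)(bend_initrequest *r),
                         void (*bend_close)(void *handle))
{
    (void) argv;
    return (argc > 0 && bend_init != NULL && bend_close != NULL) ? 0 : 1;
}

/* Placeholder session handlers (hypothetical names). */
static bend_initresult *my_init(bend_initrequest *r)
{
    (void) r;
    return NULL; /* a real handler returns a filled-in result */
}

static void my_close(void *handle)
{
    (void) handle;
}

/* In a real server this function would be main(). */
static int server_main(int argc, char **argv)
{
    assert(argv != NULL);
    /* Read global configuration here, before any process or thread is
       created for a connection. */
    return statserv_main(argc, argv, my_init, my_close);
}
```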
156 The server provides a mechanism for controlling some of its behavior
157 without using command-line options. The function
161 statserv_options_block *statserv_getcontrol(void);
will return a pointer to a <literal>struct statserv_options_block</literal>
166 describing the current default settings of the server. The structure
167 contains these elements:
170 <varlistentry><term>int dynamic</term><listitem><para>
171 A boolean value, which determines whether the server
172 will fork on each incoming request (TRUE), or not (FALSE). Default is
TRUE. This flag is only read by UNIX-based servers (WIN32-based servers do not fork).
175 </para></listitem></varlistentry>
177 <varlistentry><term>int threads</term><listitem><para>
178 A boolean value, which determines whether the server
179 will create a thread on each incoming request (TRUE), or not (FALSE).
180 Default is FALSE. This flag is only read by UNIX-based servers that offer
181 POSIX Threads support. WIN32-based servers always operate in threaded mode.
182 </para></listitem></varlistentry>
184 <varlistentry><term>int inetd</term><listitem><para>
185 A boolean value, which determines whether the server
will operate under a UNIX INET daemon (inetd). Default is FALSE.
187 </para></listitem></varlistentry>
189 <varlistentry><term>int loglevel</term><listitem><para>
190 Set this by ORing the constants defined in
191 <filename>include/yaz/yaz-log.h</filename>.
192 </para></listitem></varlistentry>
194 <varlistentry><term>char logfile[ODR_MAXNAME+1]</term>
195 <listitem><para>File for diagnostic output ("": stderr).
196 </para></listitem></varlistentry>
<varlistentry><term>char apdufile[ODR_MAXNAME+1]</term>
<listitem><para>
Name of file for logging incoming and outgoing APDUs ("": don't
200 log APDUs, "-": <literal>stderr</literal>).
201 </para></listitem></varlistentry>
203 <varlistentry><term>char default_listen[1024]</term>
204 <listitem><para>Same form as the command-line specification of
205 listener address. "": no default listener address.
206 Default is to listen at "tcp:@:9999". You can only
207 specify one default listener address in this fashion.
208 </para></listitem></varlistentry>
210 <varlistentry><term>enum oid_proto default_proto;</term>
211 <listitem><para>Either <literal>PROTO_SR</literal> or
<literal>PROTO_Z3950</literal>. Default is <literal>PROTO_Z3950</literal>.
213 </para></listitem></varlistentry>
214 <varlistentry><term>int idle_timeout;</term>
<listitem><para>Maximum session idle time, in minutes. Zero indicates
216 no (infinite) timeout. Default is 120 minutes.
217 </para></listitem></varlistentry>
219 <varlistentry><term>int maxrecordsize;</term>
<listitem><para>Maximum permissible record (message) size. Default
is 1 MB. This amount of memory will only be allocated if a client requests a
very large number of records in one operation (or a big record). Set it
to a lower number
if you are worried about resource consumption on your host system.
225 </para></listitem></varlistentry>
227 <varlistentry><term>char configname[ODR_MAXNAME+1]</term>
228 <listitem><para>Passed to the backend when a new connection is received.
229 </para></listitem></varlistentry>
231 <varlistentry><term>char setuid[ODR_MAXNAME+1]</term>
232 <listitem><para>Set user id to the user specified, after binding
233 the listener addresses.
234 </para></listitem></varlistentry>
<varlistentry>
<term>void (*bend_start)(struct statserv_options_block *p)</term>
<listitem><para>Pointer to a function which is called after the
command line options have been parsed, but before the server
starts listening.
For forked UNIX servers this handler is called in the parent
process; for threaded servers this handler is called in the
main thread.
The default value of this pointer is NULL, in which case it
245 isn't invoked by the frontend server.
246 When the server operates as an NT service this handler is called
247 whenever the service is started.
248 </para></listitem></varlistentry>
<varlistentry>
<term>void (*bend_stop)(struct statserv_options_block *p)</term>
<listitem><para>Pointer to a function which is called whenever the server
253 has stopped listening for incoming connections. This function pointer
254 has a default value of NULL in which case it isn't called.
255 When the server operates as an NT service this handler is called
256 whenever the service is stopped.
257 </para></listitem></varlistentry>
259 <varlistentry><term>void *handle</term>
260 <listitem><para>User defined pointer (default value NULL).
261 This is a per-server handle that can be used to specify "user-data".
262 Do not confuse this with the session-handle as returned by bend_init.
263 </para></listitem></varlistentry>
269 The pointer returned by <literal>statserv_getcontrol</literal> points to
270 a static area. You are allowed to change the contents of the structure,
but the changes will not take effect until you call
275 void statserv_setcontrol(statserv_options_block *block);
Note that you should generally update this structure before calling
281 <function>statserv_main()</function>.
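<para>
The get/modify/set sequence described above might look roughly like the
following. This is only a sketch: the structure is cut down to three of the
members listed earlier, and <function>statserv_getcontrol</function> and
<function>statserv_setcontrol</function> are local stand-ins that mimic the
documented behavior (a static copy that has no effect until it is set), so
that the example is self-contained. The helper name
<function>configure_server</function> and the values chosen are hypothetical.
</para>

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for the options block: only three of the documented members. */
typedef struct statserv_options_block {
    int dynamic;               /* fork on each incoming connection? */
    int idle_timeout;          /* minutes; 0 means no timeout */
    char default_listen[1024]; /* default listener address */
} statserv_options_block;

/* Stand-ins for the library functions, using the defaults given above. */
static statserv_options_block current_options = { 1, 120, "tcp:@:9999" };

static statserv_options_block *statserv_getcontrol(void)
{
    static statserv_options_block copy; /* the static area handed out */
    copy = current_options;
    return &copy;
}

static void statserv_setcontrol(statserv_options_block *block)
{
    current_options = *block;
}

/* Hypothetical helper: adjust the defaults before statserv_main() runs. */
static void configure_server(void)
{
    statserv_options_block *opts = statserv_getcontrol();

    assert(opts != NULL);
    opts->dynamic = 0;       /* single process, convenient while debugging */
    opts->idle_timeout = 30; /* drop idle sessions after 30 minutes */
    snprintf(opts->default_listen, sizeof opts->default_listen,
             "%s", "tcp:@:2100");
    statserv_setcontrol(opts); /* nothing takes effect before this call */
}
```

<para>
Calling such a helper at the top of <function>main()</function>, before
<function>statserv_main()</function>, lets command-line options still
override the new defaults.
</para>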
286 <sect1><title>The Backend Functions</title>
289 For each service of the protocol, the backend interface declares one or
290 two functions. You are required to provide implementations of the
291 functions representing the services that you wish to implement.
294 <sect2><title>Init</title>
bend_initresult *(*bend_init)(bend_initrequest *r);
301 This handler is called once for each new connection request, after
302 a new process/thread has been created, and an Initialize Request has
303 been received from the client. The pointer to the
304 <function>bend_init</function> handler is passed in the call to
<function>statserv_main</function>.
Unlike previous versions of YAZ, the <function>bend_init</function> handler also
defines the Z39.50 services that the backend
wishes to support. Pointers to <emphasis>all</emphasis> service handlers,
including search and fetch, must be specified in this handler.
The request and result structures are defined as
typedef struct bend_initrequest
{
    Z_IdAuthentication *auth;
    ODR stream;                /* encoding stream */
    ODR print;                 /* printing stream */
    Z_ReferenceId *referenceId;/* reference ID */
    char *peer_name;           /* dns host of peer (client) */

    char *implementation_name;
    char *implementation_version;
    int (*bend_sort) (void *handle, bend_sort_rr *rr);
    int (*bend_search) (void *handle, bend_search_rr *rr);
    int (*bend_fetch) (void *handle, bend_fetch_rr *rr);
    int (*bend_present) (void *handle, bend_present_rr *rr);
    int (*bend_esrequest) (void *handle, bend_esrequest_rr *rr);
    int (*bend_delete)(void *handle, bend_delete_rr *rr);
    int (*bend_scan)(void *handle, bend_scan_rr *rr);
    int (*bend_segment)(void *handle, bend_segment_rr *rr);
} bend_initrequest;

typedef struct bend_initresult
{
    int errcode;       /* 0==OK */
    char *errstring;   /* system error string or NULL */
    void *handle;      /* private handle to the backend module */
} bend_initresult;
347 In general, the server frontend expects that the
348 <literal>bend_*result</literal> pointer that you return is valid at
least until the next call to a <literal>bend_*</literal> function.
350 This applies to all of the functions described herein. The parameter
351 structure passed to you in the call belongs to the server frontend, and
352 you should not make assumptions about its contents after the current
353 function call has completed. In other words, if you want to retain any
354 of the contents of a request structure, you should copy them.
358 The <literal>errcode</literal> should be zero if the initialization of
359 the backend went well. Any other value will be interpreted as an error.
360 The <literal>errstring</literal> isn't used in the current version, but
361 one option would be to stick it in the initResponse as a VisibleString.
362 The <literal>handle</literal> is the most important parameter. It should
363 be set to some value that uniquely identifies the current session to
364 the backend implementation. It is used by the frontend server in any
365 future calls to a backend function.
366 The typical use is to set it to point to a dynamically allocated state
367 structure that is private to your backend module.
371 The <literal>auth</literal> member holds the authentication information
part of the Z39.50 Initialize Request. Interpret this if your server
373 requires authentication.
377 The members <literal>peer_name</literal>,
378 <literal>implementation_name</literal> and
<literal>implementation_version</literal> hold the DNS name of the client, the name
of the client (Z39.50) implementation, and its version.
The <literal>bend_</literal> members are set to NULL when
385 <function>bend_init</function> is called. Modify the pointers by
386 setting them to point to backend functions.
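<para>
A minimal sketch of an init handler, under some assumptions: the structures
are reduced stand-ins for the ones shown above, the session structure
<literal>struct my_session</literal> and all <literal>my_</literal> names are
hypothetical, and the meaning of the handlers' integer return value is not
covered here, so the sketch simply returns 0. The result structure is stored
inside the per-session state, which keeps it valid for the life of the
session without sharing it between threads.
</para>

```c
#include <assert.h>
#include <stdlib.h>

/* Reduced stand-ins for the structures shown above. */
typedef struct bend_search_rr bend_search_rr;
typedef struct bend_fetch_rr bend_fetch_rr;

typedef struct bend_initrequest {
    char *peer_name;
    int (*bend_search)(void *handle, bend_search_rr *rr);
    int (*bend_fetch)(void *handle, bend_fetch_rr *rr);
} bend_initrequest;

typedef struct bend_initresult {
    int errcode;     /* 0==OK */
    char *errstring;
    void *handle;    /* private handle to the backend module */
} bend_initresult;

/* Hypothetical per-session state, private to the backend. */
struct my_session {
    bend_initresult result; /* lives as long as the session */
    int searches_run;
};

static int my_search(void *handle, bend_search_rr *rr)
{
    struct my_session *s = handle;
    (void) rr;
    s->searches_run++;
    return 0; /* assumed convention for this sketch */
}

static int my_fetch(void *handle, bend_fetch_rr *rr)
{
    (void) handle;
    (void) rr;
    return 0;
}

static bend_initresult *my_init(bend_initrequest *r)
{
    /* Returned only when allocation fails; it is never modified. */
    static bend_initresult failed = { 1, NULL, NULL };
    struct my_session *s = calloc(1, sizeof *s);

    assert(r != NULL);
    if (s == NULL)
        return &failed;          /* any non-zero errcode is an error */
    s->result.errcode = 0;
    s->result.errstring = NULL;
    s->result.handle = s;        /* passed back to every later handler */
    r->bend_search = my_search;  /* all other handlers stay NULL */
    r->bend_fetch = my_fetch;
    return &s->result;
}

/* The matching close handler releases the session state. */
static void my_close(void *handle)
{
    free(handle);
}
```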
391 <sect2><title>Search and retrieve</title>
<para>We now describe the handlers that are required to support search
and retrieve. You must support two functions: one for search and one
for fetch (retrieval of one record). If desired, you can provide a
third handler, which is called when a Present Request is received and
allows you to optimize retrieval of multiple records.
int (*bend_search) (void *handle, bend_search_rr *rr);

typedef struct {
    char *setname;             /* name to give to this set */
    int replace_set;           /* replace set, if it already exists */
    int num_bases;             /* number of databases in list */
    char **basenames;          /* databases to search */
    Z_ReferenceId *referenceId;/* reference ID */
    Z_Query *query;            /* query structure */
    ODR stream;                /* encode stream */
    ODR decode;                /* decode stream */
    ODR print;                 /* print stream */

    bend_request request;
    bend_association association;

    int hits;                  /* number of hits */
    int errcode;               /* 0==OK */
    char *errstring;           /* system error string or NULL */
} bend_search_rr;
The <function>bend_search</function> handler is a fairly close
approximation of the protocol Search Request and Response PDUs.
427 The <literal>setname</literal> is the resultSetName from the protocol.
428 You are required to establish a mapping between the set name and whatever
429 your backend database likes to use.
430 Similarly, the <literal>replace_set</literal> is a boolean value
431 corresponding to the resultSetIndicator field in the protocol.
432 <literal>num_bases/basenames</literal> is a length of/array of character
433 pointers to the database names provided by the client.
434 The <literal>query</literal> is the full query structure as defined in the
435 protocol ASN.1 specification.
436 It can be either of the possible query types, and it's up to you to
437 determine if you can handle the provided query type.
438 Rather than reproduce the C interface here, we'll refer you to the
439 structure definitions in the file
440 <filename>include/yaz/z-core.h</filename>. If you want to look at the
441 attributeSetId OID of the RPN query, you can either match it against
442 your own internal tables, or you can use the
443 <literal>oid_getentbyoid</literal> function provided by &yaz;.
447 The structure contains a number of hits, and an
448 <literal>errcode/errstring</literal> pair. If an error occurs
449 during the search, or if you're unhappy with the request, you should
450 set the errcode to a value from the BIB-1 diagnostic set. The value
451 will then be returned to the user in a nonsurrogate diagnostic record
452 in the response. The <literal>errstring</literal>, if provided, will
453 go in the addinfo field. Look at the protocol definition for the
454 defined error codes, and the suggested uses of the addinfo field.
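<para>
The error convention above can be sketched as follows. The structure is a
stand-in holding only the members the example touches, the database name
<literal>Default</literal> and the fixed hit count are hypothetical, and query
evaluation is left out entirely. The diagnostic numbers used are BIB-1 23
(specified combination of databases not supported) and 109 (database
unavailable); consult the diagnostic set for the codes that fit your
backend.
</para>

```c
#include <assert.h>
#include <string.h>

/* Stand-in with only the members this sketch uses. */
typedef struct {
    char *setname;     /* name to give to this set */
    int num_bases;     /* number of databases in list */
    char **basenames;  /* databases to search */
    int hits;          /* number of hits */
    int errcode;       /* 0==OK */
    char *errstring;   /* goes in the addinfo field */
} bend_search_rr;

static int my_search(void *handle, bend_search_rr *rr)
{
    (void) handle;
    assert(rr != NULL);
    rr->hits = 0;
    rr->errcode = 0;
    rr->errstring = NULL;

    if (rr->num_bases != 1) {
        rr->errcode = 23;  /* this backend searches one database at a time */
        return 0;
    }
    if (strcmp(rr->basenames[0], "Default") != 0) {
        rr->errcode = 109; /* database unavailable */
        rr->errstring = rr->basenames[0]; /* name the offending database */
        return 0;
    }
    /* A real backend would evaluate rr->query here and remember the
       result set under rr->setname. */
    rr->hits = 42; /* hypothetical hit count */
    return 0;
}
```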
int (*bend_fetch) (void *handle, bend_fetch_rr *rr);

typedef struct bend_fetch_rr {
    char *setname;             /* set name */
    int number;                /* record number */
    Z_ReferenceId *referenceId;/* reference ID */
    oid_value request_format;  /* One of the CLASS_RECSYN members */
    int *request_format_raw;   /* same as above (raw OID) */
    Z_RecordComposition *comp; /* Formatting instructions */
    ODR stream;                /* encoding stream - memory source if req */
    ODR print;                 /* printing stream */

    char *basename;            /* name of database that provided record */
    int len;                   /* length of record or -1 if structured */
    char *record;              /* record */
    int last_in_set;           /* is it? */
    oid_value output_format;   /* format */
    int *output_format_raw;    /* used instead of above if not-null */
    int errcode;               /* 0==success */
    char *errstring;           /* system error string or NULL */
    int surrogate_flag;        /* surrogate diagnostic */
} bend_fetch_rr;
484 The frontend server calls the <function>bend_fetch</function> handler
when it needs database records to fulfill a Search Request or a Present Request.
487 The <literal>setname</literal> is simply the name of the result set
488 that holds the reference to the desired record.
489 The <literal>number</literal> is the offset into the set (with 1
being the first record in the set). The <literal>request_format</literal> field
491 is the record format requested by the client (See section
492 <link linkend="oid">Object Identifiers</link>). The value
493 <literal>VAL_NONE</literal> indicates that the client did not
494 request a specific format. The <literal>stream</literal> argument
495 is an &odr; stream which should be used for
496 allocating space for structured data records.
497 The stream will be reset when all records have been assembled, and
498 the response package has been transmitted.
499 For unstructured data, the backend is responsible for maintaining a static
500 or dynamic buffer for the record between calls.
504 In the structure, the <literal>basename</literal> is the name of the
505 database that holds the
506 record. <literal>len</literal> is the length of the record returned, in
507 bytes, and <literal>record</literal> is a pointer to the record.
<literal>last_in_set</literal> should be nonzero only if the record
509 returned is the last one in the given result set.
510 <literal>errcode</literal> and <literal>errstring</literal>, if
511 given, will be interpreted as a global error pertaining to the
512 set, and will be returned in a non-surrogate-diagnostic.
513 If you wish to return the error as a surrogate-diagnostic
514 (local error) you can do this by setting
515 <literal>surrogate_flag</literal> to 1 also.
519 If the <literal>len</literal> field has the value -1, then
520 <literal>record</literal> is assumed to point to a constructed data
type. The <literal>output_format</literal> field will be used to determine
522 which encoder should be used to serialize the data.
527 If your backend generates structured records, it should use
528 <function>odr_malloc()</function> on the provided stream for allocating
529 data: This allows the frontend server to keep track of the record sizes.
The <literal>output_format</literal> field is mapped to an object identifier
in the direct reference of the resulting EXTERNAL representation of the record.
541 The current version of &yaz; only supports the direct reference mode.
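<para>
For the unstructured case, a fetch handler might be sketched like this. The
structure is again a reduced stand-in, the three-record result set and the
record text are made up for the example, and BIB-1 diagnostic 13 (present
request out of range) is used for positions outside the set. The record is
kept in a static buffer, as the text above allows; a threaded server would
more likely keep that buffer in its per-session state.
</para>

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in with only the members this sketch uses. */
typedef struct bend_fetch_rr {
    char *setname;      /* set name */
    int number;         /* record number, 1 is the first record */
    char *basename;     /* name of database that provided record */
    int len;            /* length of record or -1 if structured */
    char *record;       /* record */
    int last_in_set;    /* is it? */
    int errcode;        /* 0==success */
    char *errstring;    /* system error string or NULL */
    int surrogate_flag; /* surrogate diagnostic */
} bend_fetch_rr;

#define MY_SET_SIZE 3   /* hypothetical: every result set has 3 records */

static int my_fetch(void *handle, bend_fetch_rr *rr)
{
    /* Must stay valid after we return, until the frontend is done. */
    static char buf[128];

    (void) handle;
    assert(rr != NULL);
    rr->errcode = 0;
    rr->errstring = NULL;
    rr->surrogate_flag = 0;
    rr->record = NULL;
    rr->len = 0;
    rr->last_in_set = 0;

    if (rr->number < 1 || rr->number > MY_SET_SIZE) {
        rr->errcode = 13; /* present request out of range */
        return 0;
    }
    /* %.32s bounds the set name so the text always fits in buf. */
    rr->len = snprintf(buf, sizeof buf, "Record %d of set %.32s",
                       rr->number, rr->setname);
    rr->record = buf;
    rr->basename = "Default"; /* hypothetical database name */
    rr->last_in_set = (rr->number == MY_SET_SIZE);
    return 0;
}
```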
int (*bend_present) (void *handle, bend_present_rr *rr);

typedef struct {
    char *setname;             /* set name */
    int start;                 /* start position */
    int number;                /* record number */
    oid_value format;          /* One of the CLASS_RECSYN members */
    Z_ReferenceId *referenceId;/* reference ID */
    Z_RecordComposition *comp; /* Formatting instructions */
    ODR stream;                /* encoding stream - memory source if required */
    ODR print;                 /* printing stream */
    bend_request request;
    bend_association association;

    int hits;                  /* number of hits */
    int errcode;               /* 0==OK */
    char *errstring;           /* system error string or NULL */
} bend_present_rr;
567 The <function>bend_present</function> handler is called when
the server receives a Present Request. The <literal>setname</literal>,
<literal>start</literal> and <literal>number</literal> are the
name of the result set, the start position, and the number of records to
be retrieved, respectively. <literal>format</literal> and
<literal>comp</literal> are the preferred transfer syntax and element
specifications of the Present Request.
Note that this handler serves as a supplement to
<function>bend_fetch</function> and need not be defined in order to
support search and retrieve.
583 <sect2><title>Delete</title>
For backends that support deletion of a result set, only one handler
must be defined.
591 int (*bend_delete)(void *handle, bend_delete_rr *rr);
593 typedef struct bend_delete_rr {
597 Z_ReferenceId *referenceId;
598 int delete_status; /* status for the whole operation */
599 int *statuses; /* status each set - indexed as setnames */
607 The delete set function definition is rather primitive, mostly because we
608 have had no practical need for it as of yet. If someone wants
609 to provide a full delete service, we'd be happy to add the
610 extra parameters that are required. Are there clients out there
611 that will actually delete sets they no longer need?
<sect2><title>Scan</title>
For servers that wish to offer the scan service, one handler
must be defined.
int (*bend_scan)(void *handle, bend_scan_rr *rr);

typedef enum {
    BEND_SCAN_SUCCESS,  /* ok */
    BEND_SCAN_PARTIAL   /* not all entries could be found */
} bend_scan_status;
632 typedef struct bend_scan_rr {
633 int num_bases; /* number of elements in databaselist */
634 char **basenames; /* databases to search */
635 oid_value attributeset;
636 Z_ReferenceId *referenceId; /* reference ID */
637 Z_AttributesPlusTerm *term;
638 ODR stream; /* encoding stream - memory source if required */
639 ODR print; /* printing stream */
641 int *step_size; /* step size */
642 int term_position; /* desired index of term in result list/returned */
643 int num_entries; /* number of entries requested/returned */
645 struct scan_entry *entries;
646 bend_scan_status status;
654 <sect1><title>Application Invocation</title>
657 The finished application has the following
658 invocation syntax (by way of <function>statserv_main()</function>):
662 <replaceable>appname</replaceable> [-szSiTu -a <replaceable>apdufile</replaceable> -l <replaceable>logfile</replaceable> -v <replaceable>loglevel</replaceable> -c <replaceable>config</replaceable>]
663 [listener ...]
671 <varlistentry><term>-a <replaceable>file</replaceable></term>
<listitem><para>
Specify a file for dumping PDUs (for diagnostic purposes).
674 The special name "-" sends output to
675 <literal>stderr</literal>.
676 </para></listitem></varlistentry>
678 <varlistentry><term>-S</term>
<listitem><para>
Don't fork or make threads on connection requests. This is good for
debugging, but not recommended for real operation: Although the
server is asynchronous and non-blocking, it can be nice to keep
a software malfunction (okay then, a crash) from affecting all
current users.
685 </para></listitem></varlistentry>
687 <varlistentry><term>-T</term>
<listitem><para>
Operate the server in threaded mode. The server creates a thread
for each connection rather than forking a process. Only available
on UNIX systems that offer POSIX threads.
692 </para></listitem></varlistentry>
694 <varlistentry><term>-s</term>
<listitem><para>
Use the SR protocol (obsolete).
697 </para></listitem></varlistentry>
699 <varlistentry><term>-z</term>
<listitem><para>
Use the Z39.50 protocol (default). These two options complement
702 each other. You can use both multiple times on the same command
703 line, between listener-specifications (see below). This way, you
704 can set up the server to listen for connections in both protocols
705 concurrently, on different local ports.
706 </para></listitem></varlistentry>
708 <varlistentry><term>-l <replaceable>file</replaceable></term>
709 <listitem><para>The logfile.
710 </para></listitem></varlistentry>
712 <varlistentry><term>-c <replaceable>config</replaceable></term>
713 <listitem><para>A user option that serves as a specifier for some
714 sort of configuration, e.g. a filename.
715 The argument to this option is transferred to member
<literal>configname</literal> of the
717 <literal>statserv_options_block</literal>.
718 </para></listitem></varlistentry>
720 <varlistentry><term>-v <replaceable>level</replaceable></term>
<listitem><para>
The log level. Use a comma-separated list of members of the set
723 {fatal,debug,warn,log,all,none}.
724 </para></listitem></varlistentry>
726 <varlistentry><term>-u <replaceable>userid</replaceable></term>
<listitem><para>
Set user ID. Sets the real UID of the server process to that of the
given user. It's useful if you aren't comfortable with having the
server run as root, but you need to start it as such to bind a
privileged port.
732 </para></listitem></varlistentry>
734 <varlistentry><term>-w <replaceable>dir</replaceable></term>
737 </para></listitem></varlistentry>
739 <varlistentry><term>-i</term>
<listitem><para>
Use this when running from the <application>inetd</application> server.
742 </para></listitem></varlistentry>
744 <varlistentry><term>-t <replaceable>minutes</replaceable></term>
<listitem><para>
Idle session timeout, in minutes.
747 </para></listitem></varlistentry>
749 <varlistentry><term>-k <replaceable>size</replaceable></term>
<listitem><para>
Maximum record size/message size, in kilobytes.
752 </para></listitem></varlistentry>
758 A listener specification consists of a transport mode followed by a
759 colon (:) followed by a listener address. The transport mode is
760 either <literal>osi</literal> or <literal>tcp</literal>.
764 For TCP, an address has the form
768 hostname | IP-number [: portnumber]
772 The port number defaults to 210 (standard Z39.50 port).
776 For osi, the address form is
780 [t-selector /] hostname | IP-number [: portnumber]
784 The transport selector is given as a string of hex digits (with an even
785 number of digits). The default port number is 102 (RFC1006 port).
795 osi:0402/dbserver.osiworld.com:3000
799 In both cases, the special hostname "@" is mapped to
800 the address INADDR_ANY, which causes the server to listen on any local
801 interface. To start the server listening on the registered ports for
802 Z39.50 and SR over OSI/RFC1006, and to drop root privileges once the
803 ports are bound, execute the server like this (from a root shell):
807 my-server -u daemon tcp:@ -s osi:@
You can replace <literal>daemon</literal> with another user, e.g. your
812 own account, or a dedicated IR server account.
813 <literal>my-server</literal> should be the name of your
814 server application. You can test the procedure with the
815 <application>yaz-ztest</application> application.
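<para>
To make the listener syntax described above concrete, here is a small
stand-alone parser for it. This is not the parser used by &yaz;: the
<literal>struct listener</literal> type and the
<function>parse_listener</function> function are hypothetical, and the sketch
only covers the forms given in this section (transport mode, optional hex
transport selector for <literal>osi</literal>, host, optional port, with the
default ports 210 and 102).
</para>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type; not part of YAZ. */
struct listener {
    char mode[4];   /* "tcp" or "osi" */
    char tsel[32];  /* transport selector (osi only), "" if absent */
    char host[128]; /* host name, IP number or "@" */
    int port;
};

/* Returns 0 on success, -1 if the specification is malformed. */
static int parse_listener(const char *spec, struct listener *out)
{
    const char *addr, *slash, *colon;
    size_t n;

    assert(spec != NULL && out != NULL);
    if (strncmp(spec, "tcp:", 4) != 0 && strncmp(spec, "osi:", 4) != 0)
        return -1;
    memcpy(out->mode, spec, 3);
    out->mode[3] = '\0';
    /* Default ports: 210 for Z39.50 over TCP, 102 for RFC1006. */
    out->port = strcmp(out->mode, "tcp") == 0 ? 210 : 102;
    out->tsel[0] = '\0';
    addr = spec + 4;

    /* Optional transport selector: an even number of hex digits. */
    slash = strchr(addr, '/');
    if (slash != NULL && strcmp(out->mode, "osi") == 0) {
        n = (size_t) (slash - addr);
        if (n == 0 || n % 2 != 0 || n >= sizeof out->tsel)
            return -1;
        if (strspn(addr, "0123456789abcdefABCDEF") != n)
            return -1;
        memcpy(out->tsel, addr, n);
        out->tsel[n] = '\0';
        addr = slash + 1;
    }

    /* Host, optionally followed by ":" and a port number. */
    colon = strchr(addr, ':');
    n = colon != NULL ? (size_t) (colon - addr) : strlen(addr);
    if (n == 0 || n >= sizeof out->host)
        return -1;
    memcpy(out->host, addr, n);
    out->host[n] = '\0';
    if (colon != NULL) {
        char *end;
        long port = strtol(colon + 1, &end, 10);
        if (end == colon + 1 || *end != '\0' || port < 1 || port > 65535)
            return -1;
        out->port = (int) port;
    }
    return 0;
}
```

<para>
With this sketch, <literal>tcp:@</literal> yields host <literal>@</literal>
on port 210, and <literal>osi:0402/dbserver.osiworld.com:3000</literal>
yields selector <literal>0402</literal>, host
<literal>dbserver.osiworld.com</literal> and port 3000.
</para>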
821 <!-- Keep this comment at the end of the file
826 sgml-minimize-attributes:nil
827 sgml-always-quote-attributes:t
830 sgml-parent-document: "yaz.xml"
831 sgml-local-catalogs: "../../docbook/docbook.cat"
832 sgml-namecase-general:t