2 <title>The YAZ Proxy</title>
4 The YAZ proxy is a transparent Z39.50-to-Z39.50 gateway. That is,
5 it is a Z39.50 server which has as its back-end a Z39.50 client
6 that forwards requests on to another server (known as the
7 <firstterm>backend target</firstterm>).
10 The YAZ Proxy is useful for debugging Z39.50 software, logging
11 APDUs, redirecting Z39.50 packets through firewalls, etc.
12 Furthermore, it offers facilities that often
13 boost performance for connectionless Z39.50 clients such
17 Unlike most other server software, the proxy runs as a single thread
18 in a single process. Every I/O operation
19 is non-blocking, which keeps it lightweight and fast.
20 It does not store any state information on the hard drive,
21 except for any log files you ask for.
24 <section id="proxy-example">
25 <title>Example: Using the Proxy to Log APDUs</title>
27 Suppose you use a commercial Z39.50 client for which you do not
28 have source code, and it's not behaving how you think it should
29 when running against some specific server that you have no control
30 over. One way to diagnose the problem is to find out what packets
31 (APDUs) are being sent and received, but not all client
32 applications have facilities to do APDU logging.
35 No problem. Run the proxy on a friendly machine, get it to log
36 APDUs, and point the errant client at the proxy instead of
37 directly at the server that's causing it problems.
40 Suppose the server is running on <literal>foo.bar.com</literal>,
41 port 18398. Run the proxy on the machine of your choice, say
42 <literal>your.company.com</literal>, like this:
45 yaz-proxy -a - -t tcp:foo.bar.com:18398 tcp:@:9000
48 (The <literal>-a -</literal> option requests APDU logging on
49 standard output, <literal>-t tcp:foo.bar.com:18398</literal>
50 specifies where the backend target is, and
51 <literal>tcp:@:9000</literal> tells the proxy to listen on port
52 9000 and accept connections from any machine.)
55 Now change your client application's configuration so that instead
56 of connecting to <literal>foo.bar.com</literal> port 18398, it
57 connects to <literal>your.company.com</literal> port 9000, and
58 start it up. It will work exactly as usual, but all the packets
59 will be sent via the proxy, which will generate a log like this:
64 referenceId OCTETSTRING(len=4) 69 6E 69 74
65 protocolVersion BITSTRING(len=1)
66 options BITSTRING(len=2)
67 preferredMessageSize 1048576
68 maximumRecordSize 1048576
69 implementationId 'Mike Taylor (id=169)'
70 implementationName 'Net::Z3950.pm (Perl)'
71 implementationVersion '0.31'
75 referenceId OCTETSTRING(len=4) 69 6E 69 74
76 protocolVersion BITSTRING(len=1)
77 options BITSTRING(len=2)
78 preferredMessageSize 1048576
79 maximumRecordSize 1048576
82 implementationName 'GFS/YAZ / Zebra Information Server'
83 implementationVersion 'YAZ 1.9.1 / Zebra 1.3.3'
87 referenceId OCTETSTRING(len=1) 30
90 mediumSetPresentNumber 0
92 resultSetName 'default'
97 smallSetElementSetNames choice
101 mediumSetElementSetNames choice
104 preferredRecordSyntax OID: 1 2 840 10003 5 10
108 attributeSetId OID: 1 2 840 10003 3 1
116 general OCTETSTRING(len=7) 6D 69 6E 65 72 61 6C
125 <section id="proxy-target">
126 <title>Specifying the Backend Target</title>
128 When the proxy accepts a Z39.50 client session, it
129 determines the backend target by the following rules:
132 <para> If the <literal>InitializeRequest</literal> PDU from the
134 <link linkend="otherinfo-encoding"><literal>otherInfo</literal></link>
136 <literal>1.2.840.10003.10.1000.81.1</literal>, then the
137 contents of that element specify the target to be used, in the
138 usual YAZ address format (typically
139 <literal>tcp:<parameter>hostname</parameter>:<parameter>port</parameter></literal>)
141 <ulink url="http://www.indexdata.dk/yaz/doc/comstack.addresses.php"
142 >the Addresses section of the YAZ manual</ulink>.
146 <para> Otherwise, the proxy uses the default target, if one was
147 specified on the command line with the <literal>-t</literal>
148 option. A default target can also be specified in the
153 <para> Otherwise, the proxy closes the connection with
160 <section id="proxy-keepalive">
161 <title>Keep-alive Facility</title>
163 The keep-alive facility lets the proxy keep its connection to the
164 backend open, even if the client closes the connection to the proxy.
167 If the same or another client then connects to the proxy and requests the
168 same backend, it is assigned to this existing backend session. In this case, the
169 proxy sends an initialize response directly to the client and the
170 initialize handshake with the backend is omitted.
173 When a client reconnects, query and record caching work better if the
174 proxy assigns it to the same backend session as before, and the result set
175 (if any) is reused. To achieve this, Index Data defined a session
176 cookie which identifies the backend session.
179 The cookie is defined by the client and sent as part of the
180 Initialize Request, where it is passed in an
181 <link linkend="otherinfo-encoding"><literal>otherInfo</literal></link>
182 element with OID <literal>1.2.840.10003.10.1000.81.2</literal>.
185 Clients that do not send a cookie as part of the initialize request
186 may still see better performance, since the init handshake is saved.
190 <section id="query-cache">
191 <title>Query Caching</title>
193 Simple stateless clients often send identical Z39.50 searches
194 in a relatively short period of time (e.g. in order to produce a
195 results-list page, the next page,
196 a single full record, etc.). For many targets, it is
197 much more expensive to produce a new result set than to
198 reuse an existing one.
201 The proxy tries to solve that by remembering the last query for each
202 backend target, so that if an identical query is received next, it
203 is turned into Present Requests rather than new Search Requests.
207 A future release will probably allow for
208 an arbitrary-sized cache for targets supporting named result sets.
212 You can enable or disable query caching using option <literal>-o</literal>.
216 <section id="record-cache">
217 <title>Record Caching</title>
219 As an option, the proxy may also cache result set records for the
221 The proxy takes into account the Record Syntax and CompSpec.
222 The CompSpec includes simple element set names as well.
223 By default the cache is 200000 bytes per session.
227 <section id="query-validation">
228 <title>Query Validation</title>
230 The proxy may also be configured to trap particular attributes in
231 Type-1 queries and send Bib-1 diagnostics back to the client without
232 even consulting the backend target. This facility may be useful if
233 a target does not properly issue diagnostics when unsupported attributes
238 <section id="record-validation">
239 <title>Record Syntax Validation</title>
241 The proxy may be configured to accept, reject or convert records.
242 When a record syntax is accepted, the proxy passes search/present requests to the
243 backend target under the assumption that the target can honor the
244 request (in fact, it may not). When a record is rejected because
245 the record syntax is "unsupported", the proxy returns a diagnostic to the
246 client. Finally, the proxy may convert records.
249 In the current version the only supported conversion is
250 MARC21/USMARC in MARC-8 charset to MARCXML in UTF-8. Future versions of
251 the proxy may do other record/charset conversions.
255 <section id="other-optimizations">
256 <title>Other Optimizations</title>
258 We've had some plans to support global caching of result set records,
259 but this has not yet been implemented.
263 <section id="proxy-config-file">
264 <title>Proxy Configuration File</title>
266 The proxy can optionally read a configuration file, given with option
267 <literal>-c</literal> followed by the filename of the config file.
270 The config file is in XML format. The YAZ proxy must be compiled
271 with <ulink url="http://www.xmlsoft.org/">libxml2</ulink> support in
272 order for the config file facility to be enabled.
275 <para>To check that a config file is well-formed, yaz-proxy may
276 be invoked without specifying a listening port, i.e.
278 yaz-proxy -c myconfig.xml
280 If this does not produce errors, the file is well-formed.
283 <section id="proxy-config-header">
284 <title>Proxy Configuration Header</title>
286 The proxy config file must have a root element called
287 <literal>proxy</literal>. All information except an optional XML
288 header must be stored within the <literal>proxy</literal> element.
291 <?xml version="1.0"?>
294 <!-- content here .. -->
298 <section id="proxy-config-target">
299 <title>Configuration: target</title>
301 The element <literal>target</literal>, which may be repeated zero
302 or more times with parent element <literal>proxy</literal>, contains
303 information about each backend target.
304 The <literal>target</literal> element has two attributes:
305 <literal>name</literal>, which holds the logical name of the backend
306 target (required), and <literal>default</literal> (optional), which
307 (when given) specifies that the backend target is the default target,
308 equivalent to command line option <literal>-t</literal>.
312 <?xml version="1.0"?>
315 <target name="server1" default="1">
316 <!-- description of server1 .. -->
318 <target name="server2">
319 <!-- description of server2 .. -->
325 <section id="proxy-config-url">
326 <title>Configuration: url</title>
328 The <literal>url</literal> element, which may be repeated one or more times,
329 should be a child of the <literal>target</literal> element.
330 The character data (CDATA) of <literal>url</literal> is the Z-URL of the backend.
333 Multiple <literal>url</literal> elements may be used. In that case, when
334 a client initiates a session, the proxy chooses the URL with the lowest
335 number of active sessions, thereby distributing the load. It is
336 assumed that each URL represents the same database (data).
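<para>
 As a minimal sketch of the load distribution described above, a
 target with two interchangeable backends might be configured as
 follows. The target name and both host names are hypothetical and
 chosen only for illustration.
</para>
<screen><![CDATA[
<target name="mirrored" default="1">
  <!-- both URLs are assumed to serve the same database -->
  <url>tcp:db1.example.com:210</url>
  <url>tcp:db2.example.com:210</url>
</target>
]]></screen>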
339 <section id="proxy-config-keepalive">
340 <title>Configuration: keepalive</title>
341 <para>The <literal>keepalive</literal> element holds information about
342 the keepalive Z39.50 sessions. Keepalive sessions are proxy-to-backend
343 sessions that are no longer associated with a client session.
345 <para>The <literal>keepalive</literal> element, which is a child of
346 the <literal>target</literal> element, holds two elements:
347 <literal>bandwidth</literal> and <literal>pdu</literal>.
348 The <literal>bandwidth</literal> is the maximum total bytes
349 transferred to/from the target. If a target session exceeds this
350 amount it is shut down (and no longer kept alive).
351 The <literal>pdu</literal> is the maximum number of requests sent
352 to the target. If a target session exceeds this amount it is
353 shut down. The idea of these two limits is to avoid very long
354 sessions that eat resources in a backend (one that leaks!).
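<para>
 A keepalive section along these lines might look as follows. This is
 only a sketch: the target name, the URL and both limit values are
 illustrative assumptions, not recommended settings.
</para>
<screen><![CDATA[
<target name="server1">
  <url>tcp:foo.bar.com:18398</url>
  <keepalive>
    <!-- retire a kept-alive session after about 1 MB of traffic -->
    <bandwidth>1000000</bandwidth>
    <!-- ... or after 1000 requests, whichever comes first -->
    <pdu>1000</pdu>
  </keepalive>
</target>
]]></screen>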
357 <section id="proxy-config-limit">
358 <title>Configuration: limit</title>
360 The <literal>limit</literal> section specifies bandwidth and PDU request
361 limits for an active session.
362 The proxy records bandwidth and PDU requests during the last 60 seconds
363 (1 minute). The <literal>limit</literal> may include the
364 elements <literal>bandwidth</literal>, <literal>pdu</literal>,
365 and <literal>retrieve</literal>. The <literal>bandwidth</literal>
366 measures the number of bytes transferred within the last minute.
367 The <literal>pdu</literal> is the number of requests in the last
368 minute. The <literal>retrieve</literal> holds the maximum number of records to
369 be retrieved in one Present Request.
372 If a bandwidth/pdu limit is reached, the proxy will postpone the
373 requests to the target and wait one or more seconds. The idea of the
374 limit is to ensure that clients that download hundreds or thousands of
375 records do not hurt other users.
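<para>
 The following sketch shows what a limit section with all three
 elements might look like. The numbers are illustrative assumptions,
 and placing <literal>limit</literal> inside <literal>target</literal>
 is assumed here by analogy with the other per-target elements.
</para>
<screen><![CDATA[
<target name="server1">
  <limit>
    <!-- at most about 2 MB transferred per minute -->
    <bandwidth>2000000</bandwidth>
    <!-- at most 50 requests per minute -->
    <pdu>50</pdu>
    <!-- at most 100 records in one Present Request -->
    <retrieve>100</retrieve>
  </limit>
</target>
]]></screen>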
379 <section id="proxy-config-attribute">
380 <title>Configuration: attribute</title>
382 The <literal>attribute</literal> element specifies whether to accept or
383 reject a particular attribute (type, value) pair.
386 The <literal>attribute</literal> has two required attributes:
387 <literal>type</literal> which is the Attribute Type-1 type, and
388 <literal>value</literal> which is the Attribute Type-1 value.
391 If attribute <literal>error</literal> is given, it holds a
392 Bib-1 diagnostic which is sent to the client if the particular
393 (type, value) pair is part of a query.
396 If attribute <literal>error</literal> is not given, the attribute
397 type, value is accepted and passed to the backend target.
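<para>
 As a sketch, a target that accepts the Bib-1 Use attribute 4 (title)
 but rejects Use attribute 1016 (any) with Bib-1 diagnostic 114
 (unsupported Use attribute) might be described as follows. The
 chosen attribute values are illustrative, and placing
 <literal>attribute</literal> inside <literal>target</literal> is
 assumed here by analogy with the other per-target elements.
</para>
<screen><![CDATA[
<target name="server1">
  <!-- no error attribute: accepted and passed to the backend -->
  <attribute type="1" value="4"/>
  <!-- error attribute: the proxy answers with diagnostic 114 -->
  <attribute type="1" value="1016" error="114"/>
</target>
]]></screen>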
401 <section id="proxy-config-syntax">
402 <title>Configuration: syntax</title>
404 The <literal>syntax</literal> element specifies whether to accept or
405 reject a particular record syntax request from the client.
408 The <literal>syntax</literal> has one required attribute:
409 <literal>type</literal> which is the Preferred Record Syntax.
412 If attribute <literal>error</literal> is given, it holds a
413 Bib-1 diagnostic which is sent to the client if the particular
414 record syntax is part of a present or search request.
417 If attribute <literal>error</literal> is not given, the record syntax
418 is accepted and passed to the backend target.
421 If attribute <literal>marcxml</literal> is given, the proxy will
422 perform MARC21 to MARCXML conversion. In this case the
423 <literal>type</literal> should be XML. The proxy will use
424 preferred record syntax USMARC/MARC21 against the backend target.
426 <para>To accept USMARC and offer MARCXML records but reject
427 all other requests, the following configuration could be used:
430 <target name="mytarget">
431 <syntax type="usmarc"/>
432 <syntax type="xml" marcxml="1"/>
433 <syntax type="*" error="238"/>
440 <section id="proxy-config-target-timeout">
441 <title>Configuration: target-timeout</title>
443 The element <literal>target-timeout</literal> is a child of element
444 <literal>target</literal> and specifies the amount of time, in seconds, before
445 a target session is shut down.
449 <section id="proxy-config-client-timeout">
450 <title>Configuration: client-timeout</title>
452 The element <literal>client-timeout</literal> is a child of element
453 <literal>target</literal> and specifies the amount of time, in seconds, before
454 a client session is shut down.
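<para>
 The two timeout elements, <literal>target-timeout</literal> and
 <literal>client-timeout</literal>, might be combined as in the sketch
 below. The values of 30 and 60 seconds are illustrative assumptions.
</para>
<screen><![CDATA[
<target name="server1">
  <!-- shut down a target session after 30 seconds -->
  <target-timeout>30</target-timeout>
  <!-- shut down a client session after 60 seconds -->
  <client-timeout>60</client-timeout>
</target>
]]></screen>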
458 <section id="proxy-config-preinit">
459 <title>Configuration: preinit</title>
461 The element <literal>preinit</literal> is the child of element
462 <literal>target</literal> and specifies the number of spare
463 connections to a target. By default no spare connections are
464 created by the proxy. If the proxy uses a target exclusively or
465 a lot, the preinit sessions will ensure that target sessions
466 have been made before the client makes a connection and will therefore
467 reduce the connect-init handshake dramatically. Never set this to
472 <section id="proxy-config-max-clients">
473 <title>Configuration: max-clients</title>
475 The element <literal>max-clients</literal> is the child of element
476 <literal>proxy</literal> and specifies the total number of
477 allowed connections to targets (all targets). If this limit
478 is reached the proxy will close the least recently used connection.
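<para>
 Because <literal>max-clients</literal> is a child of
 <literal>proxy</literal> rather than of a particular
 <literal>target</literal>, it sits alongside the target descriptions,
 as in this sketch. The limit of 50 is an illustrative assumption.
</para>
<screen><![CDATA[
<?xml version="1.0"?>
<proxy>
  <target name="server1" default="1">
    <!-- description of server1 .. -->
  </target>
  <!-- applies to connections across all targets -->
  <max-clients>50</max-clients>
</proxy>
]]></screen>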
483 <section id="proxy-usage">
484 <title>Proxy Usage</title>
487 <refentry id="yaz-proxy">
491 <section id="otherinfo-encoding"><title>OtherInformation Encoding</title>
493 The proxy uses the OtherInformation definition to carry
494 information about the target address and cookie.
497 OtherInformation ::= [201] IMPLICIT SEQUENCE OF SEQUENCE{
498 category [1] IMPLICIT InfoCategory OPTIONAL,
500 characterInfo [2] IMPLICIT InternationalString,
501 binaryInfo [3] IMPLICIT OCTET STRING,
502 externallyDefinedInfo [4] IMPLICIT EXTERNAL,
503 oid [5] IMPLICIT OBJECT IDENTIFIER}}
505 InfoCategory ::= SEQUENCE{
506 categoryTypeId [1] IMPLICIT OBJECT IDENTIFIER OPTIONAL,
507 categoryValue [2] IMPLICIT INTEGER}
510 The <literal>categoryTypeId</literal> is either
511 OID 1.2.840.10003.10.1000.81.1 or 1.2.840.10003.10.1000.81.2,
512 for proxy target and proxy cookie respectively. The
513 integer element <literal>categoryValue</literal> is set to 0.
514 The value of the proxy target or cookie is stored in element
515 <literal>characterInfo</literal> of the <literal>information</literal>
520 <!-- Keep this comment at the end of the file
525 sgml-minimize-attributes:nil
526 sgml-always-quote-attributes:t
529 sgml-parent-document: "yaz++.xml"
530 sgml-local-catalogs: nil
531 sgml-namecase-general:t