1 <!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook V4.4//EN"
2 "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd" [
3 <!ENTITY copyright SYSTEM "copyright.xml">
4 <!ENTITY % idcommon SYSTEM "common/common.ent">
7 <refentry id="ref-zoom">
9 <productname>Metaproxy</productname>
10 <info><orgname>Index Data</orgname></info>
14 <refentrytitle>zoom</refentrytitle>
15 <manvolnum>3mp</manvolnum>
16 <refmiscinfo class="manual">Metaproxy Module</refmiscinfo>
20 <refname>zoom</refname>
21 <refpurpose>Metaproxy ZOOM Module</refpurpose>
25 <title>DESCRIPTION</title>
27 This filter implements a generic client based on
28 <ulink url="&url.yaz.zoom;">ZOOM</ulink> of YAZ.
29 The client implements the protocols that ZOOM C does: Z39.50, SRU
30 (GET, POST, SOAP) and Solr.
34 This filter only deals with Z39.50 on input. The following services
35 are supported: init, search, present and close. The backend target
36 is selected based on the database as part of search and
37 <emphasis>not</emphasis> as part of init.
41 This filter is an alternative to the z3950_client filter but also
42 shares properties with the virt_db filter, in that the target is selected
43 for a specific database.
47 The ZOOM filter relies on a target profile description, which is
48 XML based. It picks the profile for a given database from a web service
49 or it may be locally given for each unique database (AKA virtual database
50 in virt_db). Target profiles are directly and indirectly given as part
51 of the <literal>torus</literal> element in the configuration.
57 <title>CONFIGURATION</title>
59 The configuration consists of six parts: <literal>torus</literal>,
60 <literal>fieldmap</literal>, <literal>cclmap</literal>,
61 <literal>contentProxy</literal>, <literal>log</literal>
62 and <literal>zoom</literal>.
67 The <literal>torus</literal> element specifies target profiles
68 and takes the following content:
72 <term>attribute <literal>url</literal></term>
75 URL of Web service to be used when fetching target profiles from
76 a remote service (Torus normally).
79 The sequence <literal>%query</literal> is replaced with a CQL
80 query for the Torus search.
83 The special sequence <literal>%realm</literal> is replaced by the value
84 of attribute <literal>realm</literal> or by the realm DATABASE parameter.
87 The special sequence <literal>%db</literal> is replaced with
88 a single database while searching. Note that this sequence
89 is no longer needed, because the <literal>%query</literal> can already
90 query for a single database by using CQL query
91 <literal>udb==...</literal>.
96 <term>attribute <literal>content_url</literal></term>
99 URL of Web service to be used to fetch target profile
100 for a given database (udb) of type content. Semantics otherwise like
101 <literal>url</literal> attribute above.
105 <varlistentry id="auth_url">
106 <term>attribute <literal>auth_url</literal></term>
109 URL of Web service to be used for auth/IP lookup. If this is
110 defined, all access is granted or denied as part of Z39.50 Init
111 by the ZOOM module and the use of database parameters realm and
112 torus_url is not allowed. If this setting is not defined,
113 all access is allowed and realm and/or torus_url may be used.
117 <varlistentry id="auth_hostname">
118 <term>attribute <literal>auth_hostname</literal></term>
121 Limits IP lookup to a given logical hostname.
126 <term>attribute <literal>realm</literal></term>
129 The default realm value. Used for %realm in URL, unless
130 specified in DATABASE parameter.
135 <term>attribute <literal>proxy</literal></term>
138 HTTP proxy to be used for fetching target profiles.
143 <term>attribute <literal>xsldir</literal></term>
146 Directory that is searched for XSL stylesheets. Stylesheets
147 are specified in the target profile by the
148 <literal>transform</literal> element.
153 <term>attribute <literal>element_transform</literal></term>
156 Specifies the element that triggers retrieval and transform using
157 the parameters elementSet, recordEncoding, requestSyntax, transform
158 from the target profile. Default value
159 is "pz2", due to the fact that for historical reasons the
160 common format is that used in Pazpar2.
165 <term>attribute <literal>element_raw</literal></term>
168 Specifies an element that triggers retrieval using the
169 parameters elementSet, recordEncoding, requestSyntax from the
170 target profile. Same actions as for element_transform, but without
171 the XSL transform. Useful for debugging.
172 The default value is "raw".
177 <term>attribute <literal>explain_xsl</literal></term>
180 Specifies a stylesheet that converts one or more Torus records
181 to ZeeExplain records. The content of recordData is assumed to be
182 holding each Explain record.
187 <term>attribute <literal>record_xsl</literal></term>
190 Specifies a stylesheet that converts retrieval records after
191 transform/literal operations.
194 When Metaproxy creates a content proxy session, the XSL parameter
195 <literal>cproxyhost</literal> is passed to the transform.
200 <term>element <literal>records</literal></term>
203 Local target profiles. This element may include zero or
204 more <literal>record</literal> elements (one per target
205 profile). See section TARGET PROFILE.
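As a minimal sketch of locally given profiles, the fragment below defines a
single target inline. The database name <literal>mydb</literal> and the host
<literal>z3950.example.org</literal> are hypothetical placeholders; only the
element names are taken from this page.
<screen><![CDATA[
<torus>
  <records>
    <!-- one record element per target profile -->
    <record>
      <udb>mydb</udb>
      <zurl>z3950.example.org:210/mydb</zurl>
    </record>
  </records>
</torus>
]]></screen>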
211 <refsect2 id="fieldmap">
212 <title>fieldmap</title>
214 The <literal>fieldmap</literal> may be specified zero or more times and
215 specifies the map from CQL fields to CCL fields. It takes the following content:
220 <term>attribute <literal>cql</literal></term>
223 CQL field that we are mapping "from".
228 <term>attribute <literal>ccl</literal></term>
231 CCL field that we are mapping "to".
237 <refsect2 id="cclmap_base">
238 <title>cclmap</title>
240 The third part of the configuration consists of zero or more
241 <literal>cclmap</literal> elements that specify the
242 <emphasis>base</emphasis> CCL profile to be used for all targets.
243 This configuration, thus, will be combined with cclmap-definitions
244 from the target profile.
248 <title>contentProxy</title>
250 The <literal>contentProxy</literal> element controls content proxy'ing. It
252 is optional and must only be defined if content proxy'ing is enabled.
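A minimal sketch, assuming the cf-proxy configuration lives at the hypothetical
path <filename>/etc/cf-proxy/cproxy.cfg</filename>, and using the
<literal>config_file</literal> attribute described below:
<screen><![CDATA[
<!-- the path is an assumption; point it at the real cf-proxy configuration -->
<contentProxy config_file="/etc/cf-proxy/cproxy.cfg"/>
]]></screen>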
256 <term>attribute <literal>config_file</literal></term>
259 Specifies the file that configures the cf-proxy system. Metaproxy
260 uses the settings <literal>sessiondir</literal> and
261 <literal>proxyhostname</literal> from that file to configure the
262 name of the proxy host and the directory of parameter files for the cf-proxy.
267 <term>attribute <literal>server</literal></term>
270 Specifies the content proxy host. The host is of the form
271 host[:port]. That is without a method (such as HTTP) and optional
276 This setting is deprecated. Use the config_file (above)
277 to inform about the proxy server.
283 <term>attribute <literal>tmp_file</literal></term>
286 Specifies a filename of a session file for content proxy'ing. The
287 file should be an absolute filename that includes
288 <literal>XXXXXX</literal> which is replaced by a unique filename
289 using the mkstemp(3) system call. The default value of this
290 setting is <literal>/tmp/cf.XXXXXX.p</literal>.
294 This setting is deprecated. Use the config_file (above)
295 to inform about the session file area.
305 The <literal>log</literal> element controls logging for the ZOOM filter.
310 <term>attribute <literal>apdu</literal></term>
313 If the value of apdu is "true", then protocol packages
314 (APDUs and HTTP packages) from the ZOOM filter will be
315 logged to the yaz_log system. A value of "false" disables
316 logging of protocol packages (the default).
327 The <literal>zoom</literal> element controls settings for the ZOOM filter.
332 <term>attribute <literal>timeout</literal></term>
335 Is an integer that specifies, in seconds, how long an operation
336 may take before ZOOM gives up. Default value is 40.
341 <term>attribute <literal>proxy_timeout</literal></term>
344 Is an integer that specifies, in seconds, how long
345 a proxy check will wait before giving up. Default value is 1.
354 <title>QUERY HANDLING</title>
356 The ZOOM filter accepts three query types: RPN (Type-1), CCL and CQL.
360 Queries are converted in two separate steps. In the first step
361 the input query is converted to RPN/Type-1. This is always
362 the common internal format between step 1 and step 2.
363 In step 2 the query is converted to the native query type of the target.
366 Step 1: For RPN, the query is passed unmodified to the target.
369 Step 1: for CCL, the query is converted to RPN via
370 <link linkend="zoom-torus-cclmap"><literal>cclmap</literal></link>
372 elements of the target profile as well as
373 <link linkend="cclmap_base">base CCL maps</link>.
376 Step 1: For CQL, the query is converted to CCL. The mappings of
377 CQL fields to CCL fields are handled by
378 <link linkend="fieldmap"><literal>fieldmap</literal></link>
379 elements as part of the target profile. The resulting CCL query
380 is then converted to RPN as described above (via
381 <literal>cclmap</literal>).
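As a sketch of step 1 for CQL, assume the <literal>fieldmap</literal> and
<literal>cclmap_ti</literal> values shown below (both borrowed from the
EXAMPLES section; the search term is hypothetical). The comments trace the
intermediate query forms. The RPN actually produced may differ depending on
the full CCL profile in effect.
<screen><![CDATA[
<!-- fieldmap: CQL field dc.title maps to CCL field ti -->
<fieldmap cql="dc.title" ccl="ti"/>

<!-- cclmap_ti in the target profile: CCL field ti maps to use attribute 4 -->
<cclmap_ti>u=4 t=l,r</cclmap_ti>

<!-- CQL query as received:   dc.title=water   -->
<!-- CCL after fieldmap:      ti=water         -->
<!-- RPN (Type-1) after CCL:  @attr 1=4 water  -->
]]></screen>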
384 Step 2: If the target is Z39.50-based, the RPN query is passed verbatim.
385 If the target is SRU-based, the RPN will be converted to CQL.
386 If the target is Solr-based, the RPN will be converted to Solr's query type.
392 <title>SORTING</title>
394 The ZOOM module actively handles CQL sorting, using the SORTBY parameter
395 which was introduced in SRU version 1.2. The conversion from a SORTBY clause
396 to the native sort of a target is driven by two parameters:
397 <link linkend="zoom-torus-sortStrategy">
398 <literal>sortStrategy</literal>
400 and <link linkend="zoom-torus-sortmap">
401 <literal>sortmap_</literal><replaceable>field</replaceable>
405 A sort field that does not have an equivalent
406 <literal>sortmap_</literal> mapping is passed unmodified through the
407 conversion. No diagnostic is thrown.
412 <title>TARGET PROFILE</title>
414 The ZOOM module is driven by a number of settings that specify how
415 to handle each target.
416 Note that unknown elements are silently <emphasis>ignored</emphasis>.
419 The elements, in alphabetical order, are:
423 <term id="zoom-torus-authentication">authentication</term><listitem>
425 Authentication parameters to be sent to the target. For
426 Z39.50 targets, this will be sent as part of the
427 Init Request. Authentication consists of two components: username
428 and password, separated by a slash.
431 If this value is omitted or empty, no authentication information is sent.
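For example, a profile entry of the following form would send the hypothetical
username <literal>guest</literal> with the hypothetical password
<literal>secret</literal>:
<screen><![CDATA[
<!-- username and password separated by a slash -->
<authentication>guest/secret</authentication>
]]></screen>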
437 <term id="zoom-torus-authenticationMode">authenticationMode</term><listitem>
439 Specifies how authentication parameters are passed to server
440 for SRU. Possible values are: <literal>url</literal>
441 and <literal>basic</literal>. For the url mode username and password
442 are carried in URL arguments x-username and x-password.
443 For the basic mode, HTTP basic authentication is used.
444 The setting only takes effect
445 if <link linkend="zoom-torus-authentication">authentication</link>
449 If this value is omitted, HTTP basic authentication is used.
454 <varlistentry id="zoom-torus-cclmap">
455 <term>cclmap_<replaceable>field</replaceable></term><listitem>
457 This value specifies the CCL field (qualifier) definition for some
458 field. For Z39.50 targets this most likely will specify the
459 mapping to a numeric use attribute + a structure attribute.
460 For SRU targets, the use attribute should be string based, in
461 order to make the RPN to CQL conversion work properly (step 2).
467 <term>cfAuth</term><listitem>
469 When cfAuth is defined, its value will be used as authentication
470 to backend target and authentication setting will be specified
471 as part of a database. This is like a "proxy" for authentication and
472 is used for Connector Framework based targets.
478 <term id="zoom-torus-cfproxy">cfProxy</term><listitem>
480 Specifies HTTP proxy for the target in the form
481 <replaceable>host</replaceable>:<replaceable>port</replaceable>.
487 <term>cfSubDB</term><listitem>
489 Specifies sub database for a Connector Framework based target.
494 <varlistentry id="zoom-torus-contentConnector">
495 <term>contentConnector</term><listitem>
497 Specifies a database for content-based proxy'ing.
503 <term>elementSet</term><listitem>
505 Specifies the elementSet to be sent to the target if record
506 transform is enabled (not to be confused with the record_transform
507 module). The record transform is enabled only if the client uses
508 record syntax = XML and an element set determined by
509 the <literal>element_transform</literal> /
510 <literal>element_raw</literal> from the configuration.
511 By default, these are the element sets <literal>pz2</literal>
512 and <literal>raw</literal>.
513 If record transform is not enabled, this setting is
514 not used and the element set specified by the client
521 <term>literalTransform</term><listitem>
523 Specifies an XSL stylesheet to be used if record
524 transform is enabled; see description of elementSet.
525 The XSL transform is only used if the element set is set to the
526 value of <literal>element_transform</literal> in the configuration.
529 The value of literalTransform is the XSL - string encoded.
535 <term>piggyback</term><listitem>
537 A value of 1/true is a hint to the ZOOM module that this Z39.50
538 target supports piggyback searches, i.e. Search Response with
539 records. Any other value (false) will prevent the ZOOM module
540 from making use of piggyback (all records are part of the Present Response).
546 <term>queryEncoding</term><listitem>
548 If this value is defined, all queries will be converted
549 to this encoding. This should be used for all Z39.50 targets that
550 do not use UTF-8 for query terms.
556 <term>recordEncoding</term><listitem>
558 Specifies the character encoding of records that are returned
559 by the target. This is primarily used for targets where records
560 are not UTF-8 encoded already. This setting is only used
561 if the record transform is enabled (see description of elementSet).
567 <term>requestSyntax</term><listitem>
569 Specifies the record syntax to be specified for the target
570 if record transform is enabled; see description of elementSet.
571 If record transform is not enabled, the record syntax of the
572 client is passed verbatim to the target.
577 <varlistentry id="zoom-torus-sortmap">
578 <term>sortmap_<replaceable>field</replaceable></term><listitem>
580 This value specifies the native field for a target. The form of the value is
581 given by <link linkend="zoom-torus-sortStrategy">sortStrategy</link>.
586 <varlistentry id="zoom-torus-sortStrategy">
587 <term>sortStrategy</term><listitem>
589 Specifies sort strategy for a target. One of:
590 <literal>z3950</literal>, <literal>type7</literal>,
591 <literal>cql</literal>, <literal>sru11</literal> or
592 <literal>embed</literal>. The <literal>embed</literal> strategy chooses type-7
593 or CQL sortby depending on whether Type-1 or CQL is
594 actually sent to the target.
600 <term>sru</term><listitem>
602 If this setting is set, it specifies that the target is web service
603 based. The value must be one of: <literal>get</literal>,
604 <literal>post</literal>, <literal>soap</literal>
605 or <literal>solr</literal>.
610 <varlistentry id="sruVersion">
611 <term>sruVersion</term><listitem>
613 Specifies the SRU version to use. If unset, version 1.2 will be
614 used. Some servers do not support this version, in which case
615 version 1.1 or even 1.0 could be set instead.
620 <varlistentry id="transform">
621 <term>transform</term><listitem>
623 Specifies an XSL stylesheet filename to be used if record
624 transform is enabled; see description of elementSet.
625 The XSL transform is only used if the element set is set to the
626 value of <literal>element_transform</literal> in the configuration.
631 <varlistentry id="udb">
632 <term>udb</term><listitem>
634 This value is required and specifies the unique database for
635 this profile. All target profiles should hold a unique database.
640 <varlistentry id="urlRecipe">
641 <term>urlRecipe</term><listitem>
643 The value of this field is a string that generates a dynamic link
644 based on record content. If the resulting string is non-zero in length,
645 a new field, <literal>metadata</literal> with attribute
646 <literal>type="generated-url"</literal>, is generated.
647 The content of this field is the result of the URL recipe conversion.
648 The urlRecipe value may refer to an existing metadata element by
649 ${field[pattern/result/flags]}, which will take content
650 of field and perform a regular expression conversion using the pattern
651 given. For example: <literal>${md-title[\s+/+/g]}</literal> takes
652 metadata element <literal>title</literal> and converts one or more
653 spaces to a plus character.
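A sketch of a complete recipe, assuming a hypothetical link resolver at
<literal>www.example.com</literal>; only the substitution syntax is taken from
the description above:
<screen><![CDATA[
<!-- host and path are assumptions; ${md-title[...]} is the documented syntax -->
<urlRecipe>http://www.example.com/search?title=${md-title[\s+/+/g]}</urlRecipe>
]]></screen>
Under the rules described above, a record whose title is
<literal>water supply</literal> should yield a generated URL ending in
<literal>title=water+supply</literal>.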
658 <varlistentry id="zurl">
659 <term>zurl</term><listitem>
661 This setting is mandatory and specifies the ZURL of the
662 target in the form of host/database. The HTTP method should
663 not be provided as this is guessed from the "sru" attribute value.
670 <title>DATABASE parameters</title>
672 Extra information may be carried in the Z39.50 Database or SRU path,
673 such as authentication to be passed to backend etc. Some of
674 the parameters override TARGET profile values. The format is
677 udb,parm1=value1&parm2=value2&...
680 Where udb is the unique database recognised by the backend and parm1,
681 value1, .. are parameters to be passed. The following describes the
682 supported parameters. Like form values in HTTP the parameters and
683 values are URL encoded. The separator, though, between udb and parameters
684 is a comma rather than a question mark. What follows a question mark are
685 HTTP arguments (in this case SRU arguments).
689 <term>content-password</term>
692 The password to be used for content proxy session. If this parameter
693 is not given, value of parameter <literal>password</literal> is passed
694 to content proxy session.
699 <term>content-proxy</term>
702 Specifies proxy to be used for content proxy session. If this parameter
703 is not given, value of parameter <literal>proxy</literal> is passed
704 to content proxy session.
709 <term>content-user</term>
712 The user to be used for content proxy session. If this parameter
713 is not given, value of parameter <literal>user</literal> is passed
714 to content proxy session.
719 <term>cproxysession</term>
722 Specifies the session ID for content proxy. This parameter is, generally,
723 not used by anything but the content proxy itself when invoking
729 <term>nocproxy</term>
732 If this parameter is specified, content-proxying is disabled
738 <term>password</term>
741 Specifies password to be passed to backend. It is also passed
742 to content proxy session unless overridden by content-password.
743 If this parameter is omitted, the password will be taken from
744 TARGET profile setting
745 <link linkend="zoom-torus-authentication">
746 <literal>authentication</literal>
756 Specifies one or more proxies for backend. If this parameter is
757 omitted, the proxy will be taken from TARGET profile setting
758 <link linkend="zoom-torus-cfproxy">
759 <literal>cfProxy</literal></link>.
760 The parameter is a list of comma-separated host:port entries.
761 Both host and port must be given for each proxy.
769 Session realm to be used for this target. It changes the resulting
770 URL to be used for getting a target profile, by changing the
771 value that gets substituted for the %realm string. This parameter
772 is not allowed if access is controlled by
773 <link linkend="auth_url">auth_url</link>
780 <term>torus_url</term>
783 Sets the URL to be used for Torus records fetch - overriding value
784 of <literal>url</literal> attribute of element <literal>torus</literal>
785 in zoom configuration. This parameter is not allowed if access is
787 <link linkend="auth_url">auth_url</link> in configuration.
796 Specifies user to be passed to backend. It is also passed
797 to content proxy session unless overridden by content-user.
798 If this parameter is omitted, the user will be taken from TARGET
800 <link linkend="zoom-torus-authentication">
801 <literal>authentication</literal>
811 All parameters that have the prefix "x-" are passed verbatim
819 <title>SCHEMA</title>
820 <literallayout><xi:include
821 xi:href="../xml/schema/filter_zoom.rnc"
823 xmlns:xi="http://www.w3.org/2001/XInclude" />
828 <title>EXAMPLES</title>
830 In the example below, target definitions (Torus records) are fetched
831 from a web service via a proxy. A CQL profile is configured which
832 maps to a set of CCL fields ("no field", au, ti and su). Presumably
833 the fetched target definitions map the CCL fields to their native RPN.
834 A CCL field "ocn" is mapped for all targets. Logging of APDUs is enabled,
835 and a timeout is given.
839 url="http://torus.indexdata.com/src/records/?query=%query"
840 proxy="localhost:3128"
842 <fieldmap cql="cql.anywhere"/>
843 <fieldmap cql="cql.serverChoice"/>
844 <fieldmap cql="dc.creator" ccl="au"/>
845 <fieldmap cql="dc.title" ccl="ti"/>
846 <fieldmap cql="dc.subject" ccl="su"/>
850 <attr type="u" value="12"/>
851 <attr type="s" value="107"/>
862 Here is another example with two locally defined targets: A
863 Solr target and a Z39.50 target.
871 <cclmap_term>t=z</cclmap_term>
872 <cclmap_ti>u=title t=z</cclmap_ti>
874 <zurl>ocs-test.indexdata.com/solr/select</zurl>
878 <cclmap_term>t=l,r</cclmap_term>
879 <cclmap_ti>u=4 t=l,r</cclmap_ti>
880 <zurl>z3950.loc.gov:7090/voyager</zurl>
884 <fieldmap cql="cql.serverChoice"/>
885 <fieldmap cql="dc.title" ccl="ti"/>
893 <title>SEE ALSO</title>
896 <refentrytitle>metaproxy</refentrytitle>
897 <manvolnum>1</manvolnum>
902 <refentrytitle>virt_db</refentrytitle>
903 <manvolnum>3mp</manvolnum>
911 <!-- Keep this comment at the end of the file