-<!-- $Id: book.xml,v 1.11 2006-04-21 17:08:12 mike Exp $ -->
+<?xml version="1.0" standalone="no"?>
+<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1//EN"
+ "http://www.oasis-open.org/docbook/xml/4.1/docbookx.dtd"
+[
+ <!ENTITY local SYSTEM "local.ent">
+ <!ENTITY manref SYSTEM "manref.xml">
+ <!ENTITY progref SYSTEM "progref.xml">
+ <!ENTITY % common SYSTEM "common/common.ent">
+ %common;
+ <!-- Next line allows imagedata/@format="PDF" and is taken from
+ http://lists.oasis-open.org/archives/docbook/200303/msg00163.html
+ -->
+ <!ENTITY % local.notation.class "| PDF">
+ <!-- Next line is necessary for some XML parsers, for reasons I
+ don't understand. I got this from
+ http://lists.oasis-open.org/archives/docbook/200303/msg00180.html
+ -->
+ <!NOTATION PDF SYSTEM "PDF">
+]>
+<!-- $Id: book.xml,v 1.34 2006-06-10 14:29:11 adam Exp $ -->
+<book id="metaproxy">
<bookinfo>
<title>Metaproxy - User's Guide and Reference</title>
<author>
</author>
<copyright>
<year>2006</year>
- <holder>Index Data</holder>
+ <holder>Index Data ApS</holder>
</copyright>
<abstract>
<simpara>
Metaproxy is a universal router, proxy and encapsulated
metasearcher for information retrieval protocols. It accepts,
processes, interprets and redirects requests from IR clients using
- standard protocols such as ANSI/NISO Z39.50 (and in the future SRU
- and SRW), as well as functioning as a limited
- HTTP server. Metaproxy is configured by an XML file which
+ standard protocols such as
+ <ulink url="&url.z39.50;">ANSI/NISO Z39.50</ulink>
+ (and in the future <ulink url="&url.sru;">SRU</ulink>
+ and <ulink url="&url.srw;">SRW</ulink>), as
+ well as functioning as a limited
+ <ulink url="&url.http;">HTTP</ulink> server.
+ Metaproxy is configured by an XML file which
specifies how the software should function in terms of routes that
the request packets can take through the proxy, each step on a
route being an instantiation of a filter. Filters come in many
should not at this stage redistribute the code without explicit
written permission from the copyright holders, Index Data ApS.
</simpara>
+ <simpara>
+ <inlinemediaobject>
+ <imageobject>
+ <imagedata fileref="common/id.png" format="PNG"/>
+ </imageobject>
+ <imageobject>
+ <imagedata fileref="common/id.eps" format="EPS"/>
+ </imageobject>
+ </inlinemediaobject>
+ </simpara>
</abstract>
</bookinfo>
<title>Introduction</title>
+ <para>
+ <ulink url="&url.metaproxy;">Metaproxy</ulink>
+ is a standalone program that acts as a universal router, proxy and
+ encapsulated metasearcher for information retrieval protocols such
+ as <ulink url="&url.z39.50;">Z39.50</ulink>, and in the future
+ <ulink url="&url.sru;">SRU</ulink> and <ulink url="&url.srw;">SRW</ulink>.
+ To clients, it acts as a server of these protocols: it can be searched,
+ records can be retrieved from it, etc.
+ To servers, it acts as a client: it searches in them,
+ retrieves records from them, etc. It satisfies its clients'
+ requests by transforming them, multiplexing them, forwarding them
+ on to zero or more servers, merging the results, transforming
+ them, and delivering them back to the client. In addition, it
+ acts as a simple <ulink url="&url.http;">HTTP</ulink> server; support
+ for further protocols can be added in a modular fashion, through the
+ creation of new filters.
+ </para>
+ <screen>
+ Anything goes in!
+ Anything goes out!
+ Fish, bananas, cold pyjamas,
+ Mutton, beef and trout!
+ - attributed to Cole Porter.
+ </screen>
+ <para>
+ Metaproxy is a more capable alternative to
+ <ulink url="&url.yazproxy;">YAZ Proxy</ulink>,
+ being more powerful, flexible, configurable and extensible. Among
+ its many advantages over the older, more pedestrian work are
+ support for multiplexing (encapsulated metasearching), routing by
+ database name, authentication and authorisation and serving local
+ files via HTTP. Equally significant, its modular architecture
+ facilitates the creation of pluggable modules implementing further
+ functionality.
+ </para>
+ <para>
+ This manual will briefly describe Metaproxy's licensing situation
+ before giving an overview of its architecture, then discussing the
+ key concept of a filter in some depth and giving an overview of
+ the various filter types, then discussing the configuration file
+ format. After this come several optional chapters which may be
+ freely skipped: a detailed discussion of virtual databases and
+ multi-database searching, some notes on writing extensions
+ (additional filter types) and a high-level description of the
+ source code. Finally comes the reference guide, which contains
+ instructions for invoking the <command>metaproxy</command>
+ program, and detailed information on each type of filter,
+ including examples.
+ </para>
+ </chapter>
+
+ <chapter id="license">
+ <title>The Metaproxy License</title>
+ <orderedlist numeration="arabic">
+ <listitem>
+ <para>
+ You are allowed to download this software for evaluation purposes.
+ You can unpack it, build it, run it, see how it works and how it fits
+ your needs, all at zero cost.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ You may NOT deploy the software. For the purposes of this license,
+ deployment means running it for any purpose other than evaluation,
+ whether or not you or anyone else makes a profit from doing so. If
+ you wish to deploy the software, you must first contact Index Data and
+ arrange to purchase a DEPLOYMENT LICENCE. If you are unsure
+ whether or not your proposed use of the software constitutes
+ deployment, email us at <literal>info@indexdata.com</literal>
+ for clarification.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ You may modify your copy of the software (fix bugs, add features)
+ if you need to. We encourage you to send your changes back to us for
+ integration into the master copy, but you are not obliged to do so. You
+ may NOT pass your changes on to any other party.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ There is NO WARRANTY for this software, to the extent permitted by
+ applicable law. We provide the software ``as is'' without warranty of
+ any kind, either expressed or implied, including, but not limited to, the
+ implied warranties of MERCHANTABILITY and FITNESS FOR A
+ PARTICULAR PURPOSE. The entire risk as to the quality and
+ performance of the software is with you. Should the software prove
+ defective, you assume the cost of all necessary servicing, repair or
+ correction. In no event unless required by applicable law will we be
+ liable to you for damages, arising out of the use of the software,
+ including but not limited to loss of data or data being rendered
+ inaccurate.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ All rights to the software are reserved by Index Data except where
+ this license explicitly says otherwise.
+ </para>
+ </listitem>
+ </orderedlist>
+ </chapter>
+
+ <chapter id="installation">
+ <title>Installation</title>
+ <para>
+ Metaproxy depends on the following tools/libraries:
+ <variablelist>
+ <varlistentry><term><ulink url="&url.yazplusplus;">YAZ++</ulink></term>
+ <listitem>
+ <para>
+ This is a C++ library based on <ulink url="&url.yaz;">YAZ</ulink>.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry><term><ulink url="&url.libxslt;">Libxslt</ulink></term>
+ <listitem>
+ <para>This is an XSLT processor - based on
+ <ulink url="&url.libxml2;">Libxml2</ulink>. Both Libxml2 and
+ Libxslt must be installed with the development components
+ (header files, etc.) as well as the run-time libraries.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry><term><ulink url="&url.boost;">Boost</ulink></term>
+ <listitem>
+ <para>
+ The popular C++ library. Initial versions of Metaproxy
+ were built with Boost 1.33.0. Version 1.33.1 works too.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ <para>
+ In order to compile Metaproxy a modern C++ compiler is
+ required. Boost, in particular, requires the C++ compiler
+ to support the newest language features. Refer to the Boost
+ <ulink url="&url.boost.compilers.status;">Compiler Status</ulink>
+ for more information.
+ </para>
+ <para>
+ We have successfully built Metaproxy using the compilers
+ <ulink url="&url.gcc;">GCC</ulink> version 4.0 and
+ <ulink url="&url.vstudio;">Microsoft Visual Studio</ulink> 2003/2005.
+ </para>
+
+ <section id="installation.unix">
+ <title>Installation on Unix (from Source)</title>
+ <para>
+ Here is a quick step-by-step guide on how to compile all the
+ tools that Metaproxy uses. Most systems provide binary packages for
+ at least some of the required tools. If, for example, Libxml2 and
+ Libxslt are already installed as development packages, use those
+ and skip compiling them.
+ </para>
+
<para>
- <ulink url="http://www.indexdata.com/metaproxy/">Metaproxy</ulink>
- is a standalone program that acts as a universal router, proxy and
- encapsulated metasearcher for information retrieval protocols such
- as Z39.50, and in the future SRU and SRW. To clients, it acts as a
- server of these
- protocols: it can be searched, records can be retrieved from it,
- etc. To servers, it acts as a client: it searches in them,
- retrieves records from them, etc. it satisfies its clients'
- requests by transforming them, multiplexing them, forwarding them
- on to zero or more servers, merging the results, transforming
- them, and delivering them back to the client. In addition, it
- acts as a simple HTTP server; support for further protocols can be
- added in a modular fashion, through the creation of new filters.
+ Libxml2/libxslt:
</para>
<screen>
- Anything goes in!
- Anything goes out!
- Cold bananas, fish, pyjamas,
- Mutton, beef and trout!
- - attributed to Cole Porter.
+ gunzip -c libxml2-version.tar.gz|tar xf -
+ cd libxml2-version
+ ./configure
+ make
+ su
+ make install
+ </screen>
+ <screen>
+ gunzip -c libxslt-version.tar.gz|tar xf -
+ cd libxslt-version
+ ./configure
+ make
+ su
+ make install
</screen>
<para>
- Metaproxy is a more capable alternative to
- <ulink url="http://www.indexdata.com/yazproxy/">YAZ Proxy</ulink>,
- being more powerful, flexible, configurable and extensible. Among
- its many advantages over the older, more pedestrian work are
- support for multiplexing (encapsulated metasearching), routing by
- database name, authentication and authorisation and serving local
- files via HTTP. Equally significant, its modular architecture
- facilitites the creation of pluggable modules implementing further
- functionality.
+ YAZ/YAZ++:
</para>
- </chapter>
+ <screen>
+ gunzip -c yaz-version.tar.gz|tar xf -
+ cd yaz-version
+ ./configure
+ make
+ su
+ make install
+ </screen>
+ <screen>
+ gunzip -c yazpp-version.tar.gz|tar xf -
+ cd yazpp-version
+ ./configure
+ make
+ su
+ make install
+ </screen>
+ <para>
+ Boost:
+ </para>
+ <screen>
+ gunzip -c boost-version.tar.gz|tar xf -
+ cd boost-version
+ ./configure
+ make
+ su
+ make install
+ </screen>
+ <para>
+ Metaproxy:
+ </para>
+ <screen>
+ gunzip -c metaproxy-version.tar.gz|tar xf -
+ cd metaproxy-version
+ ./configure
+ make
+ su
+ make install
+ </screen>
+ </section>
+ <section id="installation.debian">
+ <title>Installation on Debian GNU/Linux</title>
+ <para>
+ All dependencies for Metaproxy are available as
+ <ulink url="&url.debian;">Debian</ulink>
+ packages for the sarge (stable in 2005) and etch (testing in 2005)
+ distributions.
+ </para>
+ <para>
+ The procedure for Debian-based systems, such as
+ <ulink url="&url.ubuntu;">Ubuntu</ulink>, is probably similar.
+ <para>
+ There is currently no official Debian package for YAZ++,
+ and the Debian package for YAZ is probably too old.
+ Update the <filename>/etc/apt/sources.list</filename>
+ to include the Index Data repository.
+ See the YAZ <ulink url="&url.yaz.download.debian;">Download Debian</ulink>
+ page for more information.
+ </para>
+ <screen>
+ apt-get install libxslt1-dev
+ apt-get install libyazpp-dev
+ apt-get install libboost-dev
+ apt-get install libboost-thread-dev
+ apt-get install libboost-date-time-dev
+ apt-get install libboost-program-options-dev
+ apt-get install libboost-test-dev
+ </screen>
+ <para>
+ With these packages installed, the usual configure + make
+ procedure can be used for Metaproxy as outlined in
+ <xref linkend="installation.unix"/>.
+ </para>
+ </section>
+ <section id="installation.windows">
+ <title>Installation on Windows</title>
+ <para>
+ Metaproxy can be compiled with Microsoft
+ <ulink url="&url.vstudio;">Visual Studio</ulink>.
+ Versions 2003 (C 7.1) and 2005 (C 8.0) are known to work.
+ </para>
+ <section id="installation.windows.boost">
+ <title>Boost</title>
+ <para>
+ Get Boost from its <ulink url="&url.boost;">home page</ulink>.
+ You also need Boost Jam (an alternative to make).
+ That's also available from the Boost home page.
+ The files to be downloaded are called something like:
+ <filename>boost_1_33-1.exe</filename>
+ and
+ <filename>boost-jam-3.1.12-1-ntx86.zip</filename>.
+ Unpack Boost Jam first. Put <filename>bjam.exe</filename>
+ in your system path. Open a command prompt and ensure
+ that it can be found automatically. If not, check the PATH.
+ The Boost .exe is a self-extracting archive containing the
+ complete source for Boost. Compile that source with
+ Boost Jam.
+ The compilation takes a while.
+ For Visual Studio 2003, use
+ <screen>
+ bjam "-sTOOLS=vc-7_1"
+ </screen>
+ Here <literal>vc-7_1</literal> refers to a "Toolset" (compiler system).
+ For Visual Studio 2005, use
+ <screen>
+ bjam "-sTOOLS=vc-8_0"
+ </screen>
+ To install the libraries in a common place, use
+ <screen>
+ bjam "-sTOOLS=vc-7_1" install
+ </screen>
+ (or vc-8_0 for VS 2005).
+ </para>
+ <para>
+ By default, the Boost build process installs the resulting
+ libraries + header files in
+ <literal>\boost\lib</literal>, <literal>\boost\include</literal>.
+ </para>
+ <para>
+ For more information about installing Boost refer to the
+ <ulink url="&url.boost.getting.started;">getting started</ulink>
+ pages.
+ </para>
+ </section>
- <chapter id="licence">
- <title>The Metaproxy Licence</title>
- <para>
- <emphasis role="strong">
- No decision has yet been made on the terms under which
- Metaproxy will be distributed.
- </emphasis>
- It is possible that, unlike
- other Index Data products, metaproxy may not be released under a
- free-software licence such as the GNU GPL. Until a decision is
- made and a public statement made, then, and unless it has been
- delivered to you other specific terms, please treat Metaproxy as
- though it were proprietary software.
- The code should not be redistributed without explicit
- written permission from the copyright holders, Index Data ApS.
- </para>
- </chapter>
+ <section id="installation.windows.libxslt">
+ <title>Libxslt</title>
+ <para>
+ <ulink url="&url.libxslt;">Libxslt</ulink> can be downloaded
+ for Windows from
+ <ulink url="&url.libxml2.download.win32;">here</ulink>.
+ </para>
+ <para>
+ Libxslt has other dependencies, but these can all be downloaded
+ from the same site. Get the following:
+ iconv, zlib, libxml2, libxslt.
+ </para>
+ </section>
+ <section id="installation.windows.yaz">
+ <title>YAZ</title>
+ <para>
+ <ulink url="&url.yaz;">YAZ</ulink> can be downloaded
+ for Windows from
+ <ulink url="&url.yaz.download.win32;">here</ulink>.
+ </para>
+ </section>
+ <section id="installation.windows.yazplusplus">
+ <title>YAZ++</title>
+ <para>
+ Get <ulink url="&url.yazplusplus;">YAZ++</ulink> as well.
+ Version 1.0 or later is required. For now get it from
+ Index Data's
+ <ulink url="&url.snapshot.download;">Snapshot area</ulink>.
+ </para>
+ <para>
+ YAZ++ includes NMAKE makefiles, similar to those found in the
+ YAZ package.
+ </para>
+ </section>
+
+ <section id="installation.windows.metaproxy">
+ <title>Metaproxy</title>
+ <para>
+ Metaproxy is shipped with NMAKE makefiles as well - similar
+ to those found in the YAZ++/YAZ packages. Adjust this Makefile
+ to point to the proper locations of Boost, Libxslt, Libxml2,
+ zlib, iconv, yaz and yazpp.
+ </para>
+
+ <variablelist>
+ <varlistentry><term><literal>DEBUG</literal></term>
+ <listitem><para>
+ If set to 1, the software is
+ compiled with debugging libraries (code generation is
+ multi-threaded debug DLL).
+ If set to 0, the software is compiled with release libraries
+ (code generation is multi-threaded DLL).
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>BOOST</literal></term>
+ <listitem>
+ <para>
+ Boost install location
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>BOOST_VERSION</literal></term>
+ <listitem>
+ <para>
+ Boost version (replace . with _).
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>BOOST_TOOLSET</literal></term>
+ <listitem>
+ <para>
+ Boost toolset.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>LIBXSLT_DIR</literal>,
+ <literal>LIBXML2_DIR</literal> ..</term>
+ <listitem>
+ <para>
+ Specify the locations of Libxslt, libiconv, libxml2 and
+ zlib.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ <para>
+ After successful compilation you'll find
+ <literal>metaproxy.exe</literal> in the
+ <literal>bin</literal> directory.
+ </para>
+ </section>
+
+ </section>
+ </chapter>
+
<chapter id="architecture">
<title>The Metaproxy Architecture</title>
<para>
<title><literal>multi</literal>
(mp::filter::Multi)</title>
<para>
- Performs multicast searching.
+ Performs multi-database searching.
See
<link linkend="multidb">the extended discussion</link>
of virtual databases and multi-database searching below.
file (included in the distribution as
<literal>metaproxy/etc/config0.xml</literal>).
This file defines a very simple configuration that simply proxies
- to whatever backend server the client requests, but logs each
+ to whatever back-end server the client requests, but logs each
request and response. This can be useful for debugging complex
client-server dialogues.
</para>
- <screen><![CDATA[
-<?xml version="1.0"?>
+ <screen><![CDATA[<?xml version="1.0"?>
<yp2 xmlns="http://indexdata.dk/yp2/config/1">
<start route="start"/>
<filters>
a <literal>log</literal> filter that emits a message for each
request; they are then fed into a <literal>z3950_client</literal>
filter, which forwards the requests to the client-specified
- backend Z39.509 server. When the response arrives, it is handed
+ back-end Z39.50 server. When the response arrives, it is handed
back to the <literal>log</literal> filter, which emits another
message; and then to the front-end filter, which returns the
response to the client.
Two of Metaproxy's filters are concerned with multiple-database
operations. Of these, <literal>virt_db</literal> can work alone
to control the routing of searches to one of a number of servers,
- while <literal>multi</literal> can work with the output of
- <literal>virt_db</literal> to perform multicast searching, merging
- the results into a unified result-set. The interaction between
- these two filters is necessarily complex, reflecting the real
- complexity of multicast searching in a protocol such as Z39.50
- that separates initialisation from searching, with the database to
- search known only during the latter operation.
+ while <literal>multi</literal> can work together with
+ <literal>virt_db</literal> to perform multi-database searching, merging
+ the results into a unified result-set - ``metasearch in a box''.
+ </para>
+ <para>
+ The interaction between
+ these two filters is necessarily complex: it reflects the real,
+ irreducible complexity of multi-database searching in a protocol such
+ as Z39.50 that separates initialisation from searching, and in
+ which the database to be searched is not known at initialisation
+ time.
+ </para>
+ <para>
+ It's possible to use these filters without understanding the
+ details of their functioning and the interaction between them; the
+ next two sections of this chapter are ``HOWTO'' guides for doing
+ just that. However, debugging complex configurations will require
+ a deeper understanding, which the last two sections of this
+ chapter attempt to provide.
+ </para>
+ </section>
+
+
+ <section id="multidb.virt_db">
+ <title>Virtual databases with the <literal>virt_db</literal> filter</title>
+ <para>
+ Working alone, the purpose of the
+ <literal>virt_db</literal>
+ filter is to route search requests to one of a selection of
+ back-end databases. In this way, a single Z39.50 endpoint
+ (running Metaproxy) can provide access to several different
+ underlying services, including those that would otherwise be
+ inaccessible due to firewalls. In many useful configurations, the
+ back-end databases are local to the Metaproxy installation, but
+ the software does not enforce this, and any valid Z39.50 servers
+ may be used as back-ends.
+ </para>
+ <para>
+ For example, a <literal>virt_db</literal>
+ filter could be set up so that searches in the virtual database
+ ``lc'' are forwarded to the Library of Congress bibliographic
+ catalogue server, and searches in the virtual database ``marc''
+ are forwarded to the toy database of MARC records that Index Data
+ hosts for testing purposes. A <literal>virt_db</literal>
+ configuration to make this switch would look like this:
+ </para>
+ <screen><![CDATA[<filter type="virt_db">
+ <virtual>
+ <database>lc</database>
+ <target>z3950.loc.gov:7090/voyager</target>
+ </virtual>
+ <virtual>
+ <database>marc</database>
+ <target>indexdata.dk/marc</target>
+ </virtual>
+</filter>]]></screen>
+ <para>
+ As well as being useful in its own right, this filter also provides
+ the foundation for multi-database searching.
+ </para>
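+ <para>
+ As a minimal sketch of how this fragment fits into a complete
+ configuration, the following route places the
+ <literal>virt_db</literal> filter between a
+ <literal>frontend_net</literal> filter and a
+ <literal>z3950_client</literal> filter. The port, thread count and
+ timeout are illustrative values assumed for this sketch, not
+ recommendations.
+ </para>
+ <screen><![CDATA[<?xml version="1.0"?>
+<yp2 xmlns="http://indexdata.dk/yp2/config/1">
+  <start route="start"/>
+  <routes>
+    <route id="start">
+      <!-- Accept Z39.50 connections from clients (illustrative port) -->
+      <filter type="frontend_net">
+        <threads>10</threads>
+        <port>@:9000</port>
+      </filter>
+      <!-- Map each virtual database name to exactly one back-end -->
+      <filter type="virt_db">
+        <virtual>
+          <database>lc</database>
+          <target>z3950.loc.gov:7090/voyager</target>
+        </virtual>
+        <virtual>
+          <database>marc</database>
+          <target>indexdata.dk/marc</target>
+        </virtual>
+      </filter>
+      <!-- Forward each search to the single target chosen above -->
+      <filter type="z3950_client">
+        <timeout>30</timeout>
+      </filter>
+    </route>
+  </routes>
+</yp2>]]></screen>
+ <para>
+ Because every virtual database here names exactly one target, no
+ <literal>multi</literal> filter is needed in this route.
+ </para>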
+ </section>
+
+
+ <section id="multidb.multi">
+ <title>Multi-database search with the <literal>multi</literal> filter</title>
+ <para>
+ To arrange for Metaproxy to broadcast searches to multiple back-end
+ servers, the configuration needs to include two components: a
+ <literal>virt_db</literal>
+ filter that specifies multiple
+ <literal>&lt;target&gt;</literal>
+ elements, and a subsequent
+ <literal>multi</literal>
+ filter. Here, for example, is a complete configuration that
+ broadcasts searches to both the Library of Congress catalogue and
+ Index Data's tiny testing database of MARC records:
+ </para>
+ <screen><![CDATA[<?xml version="1.0"?>
+<yp2 xmlns="http://indexdata.dk/yp2/config/1">
+ <start route="start"/>
+ <routes>
+ <route id="start">
+ <filter type="frontend_net">
+ <threads>10</threads>
+ <port>@:9000</port>
+ </filter>
+ <filter type="virt_db">
+ <virtual>
+ <database>lc</database>
+ <target>z3950.loc.gov:7090/voyager</target>
+ </virtual>
+ <virtual>
+ <database>marc</database>
+ <target>indexdata.dk/marc</target>
+ </virtual>
+ <virtual>
+ <database>all</database>
+ <target>z3950.loc.gov:7090/voyager</target>
+ <target>indexdata.dk/marc</target>
+ </virtual>
+ </filter>
+ <filter type="multi"/>
+ <filter type="z3950_client">
+ <timeout>30</timeout>
+ </filter>
+ </route>
+ </routes>
+</yp2>]]></screen>
+ <para>
+ (Using a
+ <literal>virt_db</literal>
+ filter that specifies multiple
+ <literal>&lt;target&gt;</literal>
+ elements but without a subsequent
+ <literal>multi</literal>
+ filter yields surprising and undesirable results, as will be
+ described below. Don't do that.)
+ </para>
+ <para>
+ Metaproxy can be invoked with this configuration as follows:
+ </para>
+ <screen>../src/metaproxy --config config-simple-multi.xml</screen>
+ <para>
+ And thereafter, Z39.50 clients can connect to the running server
+ (on port 9000, as specified in the configuration) and search in
+ any of the databases
+ <literal>lc</literal> (the Library of Congress catalogue),
+ <literal>marc</literal> (Index Data's test database of MARC records)
+ or
+ <literal>all</literal> (both of these). As an example, a session
+ using the YAZ command-line client <literal>yaz-client</literal> is
+ here included (edited for brevity and clarity):
+ </para>
+ <screen><![CDATA[$ yaz-client @:9000
+Connecting...OK.
+Z> base lc
+Z> find computer
+Search was a success.
+Number of hits: 10000, setno 1
+Elapsed: 5.521070
+Z> base marc
+Z> find computer
+Search was a success.
+Number of hits: 10, setno 3
+Elapsed: 0.060187
+Z> base all
+Z> find computer
+Search was a success.
+Number of hits: 10010, setno 4
+Elapsed: 2.237648
+Z> show 1
+[marc]Record type: USmarc
+001 11224466
+003 DLC
+005 00000000000000.0
+008 910710c19910701nju 00010 eng
+010 $a 11224466
+040 $a DLC $c DLC
+050 00 $a 123-xyz
+100 10 $a Jack Collins
+245 10 $a How to program a computer
+260 1 $a Penguin
+263 $a 8710
+300 $a p. cm.
+Elapsed: 0.119612
+Z> show 2
+[VOYAGER]Record type: USmarc
+001 13339105
+005 20041229102447.0
+008 030910s2004 caua 000 0 eng
+035 $a (DLC) 2003112666
+906 $a 7 $b cbc $c orignew $d 4 $e epcn $f 20 $g y-gencatlg
+925 0 $a acquire $b 1 shelf copy $x policy default
+955 $a pc10 2003-09-10 $a pv12 2004-06-23 to SSCD; $h sj05 2004-11-30 $e sj05 2004-11-30 to Shelf.
+010 $a 2003112666
+020 $a 0761542892
+040 $a DLC $c DLC $d DLC
+050 00 $a MLCM 2004/03312 (G)
+245 10 $a 007, everything or nothing : $b Prima's official strategy guide / $c created by Kaizen Media Group.
+246 3 $a Double-O-seven, everything or nothing
+246 30 $a Prima's official strategy guide
+260 $a Roseville, CA : $b Prima Games, $c c2004.
+300 $a 161 p. : $b col. ill. ; $c 28 cm.
+500 $a "Platforms: Nintendo GameCube, Macintosh, PC, PlayStation 2 computer entertainment system, Xbox"--P. [4] of cover.
+650 0 $a Video games.
+710 2 $a Kaizen Media Group.
+856 42 $3 Publisher description $u http://www.loc.gov/catdir/description/random052/2003112666.html
+Elapsed: 0.150623
+Z>
+]]></screen>
+ <para>
+ As can be seen, the first record in the result set is from the
+ Index Data test database, and the second from the Library of
+ Congress database. The result-set continues alternating records
+ round-robin style until the point where one of the databases'
+ records are exhausted.
+ </para>
+ <para>
+ This example uses only two back-end databases; more may be used.
+ There is no limitation imposed on the number of databases that may
+ be metasearched in this way: issues of resource usage and
+ administrative complexity dictate the practical limits.
+ </para>
+ <para>
+ What happens when one of the databases doesn't respond? By default,
+ the entire multi-database search fails, and the appropriate
+ diagnostic is returned to the client. This is usually appropriate
+ during development, when technicians need maximum information, but
+ can be inconvenient in deployment, when users typically don't want
+ to be bothered with problems of this kind and prefer just to get
+ the records from the databases that are available. To obtain this
+ latter behaviour add an empty
+ <literal>&lt;hideunavailable&gt;</literal>
+ element inside the
+ <literal>multi</literal> filter:
+ </para>
+ <screen><![CDATA[ <filter type="multi">
+ <hideunavailable/>
+ </filter>]]></screen>
+ <para>
+ Under this regime, an error is reported to the client only if
+ <emphasis>all</emphasis> the databases in a multi-database search
+ are unavailable.
+ </para>
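+ <para>
+ For orientation, here is a sketch of a deployment-oriented route
+ that combines this setting with the multi-database configuration
+ shown earlier in this section. It reuses the same targets, port
+ and timeout, which remain illustrative values; only the
+ <literal>multi</literal> filter differs.
+ </para>
+ <screen><![CDATA[<route id="start">
+  <filter type="frontend_net">
+    <threads>10</threads>
+    <port>@:9000</port>
+  </filter>
+  <filter type="virt_db">
+    <virtual>
+      <database>all</database>
+      <target>z3950.loc.gov:7090/voyager</target>
+      <target>indexdata.dk/marc</target>
+    </virtual>
+  </filter>
+  <!-- Return records from whichever back-ends respond -->
+  <filter type="multi">
+    <hideunavailable/>
+  </filter>
+  <filter type="z3950_client">
+    <timeout>30</timeout>
+  </filter>
+</route>]]></screen>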
+ </section>
+
+
+ <section id="multidb.what">
+ <title>What's going on?</title>
+ <warning>
+ <title>Lark's vomit</title>
+ <para>
+ This section goes into a level of technical detail that is
+ probably not necessary in order to configure and use Metaproxy.
+ It is provided only for those who like to know how things work.
+ You should feel free to skip on to the next section if this one
+ doesn't seem like fun.
+ </para>
+ </warning>
+ <para>
+ Hold on tight - this may get a little hairy.
</para>
<para>
In the general course of things, a Z39.50 Init request may carry
with it an otherInfo packet of type <literal>VAL_PROXY</literal>,
whose value indicates the address of a Z39.50 server to which the
ultimate connection is to be made. (This otherInfo packet is
- supported by YAZ-based Z39.50 servers and clients, but has not yet
+ supported by YAZ-based Z39.50 clients and servers, but has not yet
been ratified by the Maintenance Agency and so is not widely used
in non-Index Data software. We're working on it.)
The <literal>VAL_PROXY</literal> packet functions
<ulink url="http://www.w3.org/Protocols/rfc2616/rfc2616.html"
>the HTTP 1.1 specification</ulink>.
</para>
+ <para>
+ Within Metaproxy, Search requests that are part of the same
+ session as an Init request that carries a
+ <literal>VAL_PROXY</literal> otherInfo are also annotated with the
+ same information. The role of the <literal>virt_db</literal>
+ filter is to rewrite this otherInfo packet dependent on the
+ virtual database that the client wants to search.
+ </para>
+ <para>
+ When Metaproxy receives a Z39.50 Init request from a client, it
+ doesn't immediately forward that request to the back-end server.
+ Why not? Because it doesn't know <emphasis>which</emphasis>
+ back-end server to forward it to until the client sends a Search
+ request that specifies the database that it wants to search in.
+ Instead, it just treasures the Init request up in its heart; and,
+ later, the first time the client does a search on one of the
+ specified virtual databases, a connection is forged to the
+ appropriate server and the Init request is forwarded to it. If,
+ later in the session, the same client searches in a different
+ virtual database, then a connection is forged to the server that
+ hosts it, and the same cached Init request is forwarded there,
+ too.
+ </para>
+ <para>
+ All of this clever Init-delaying is done by the
+ <literal>frontend_net</literal> filter. The
+ <literal>virt_db</literal> filter knows nothing about it; in
+ fact, because the Init request that is received from the client
+ doesn't get forwarded until a Search request is received, the
+ <literal>virt_db</literal> filter (and the
+ <literal>z3950_client</literal> filter behind it) doesn't even get
+ invoked at Init time. The <emphasis>only</emphasis> thing that a
+ <literal>virt_db</literal> filter ever does is rewrite the
+ <literal>VAL_PROXY</literal> otherInfo in the requests that pass
+ through it.
+ </para>
+ <para>
+ It is possible for a <literal>virt_db</literal> filter to contain
+ multiple
+ <literal>&lt;target&gt;</literal>
+ elements. What does this mean? Only that the filter will add
+ multiple <literal>VAL_PROXY</literal> otherInfo packets to the
+ Search requests that pass through it. That's because the virtual
+ DB filter is dumb, and does exactly what it's told - no more, no
+ less.
+ If a Search request with multiple <literal>VAL_PROXY</literal>
+ otherInfo packets reaches a <literal>z3950_client</literal>
+ filter, this is an error. That filter doesn't know how to deal
+ with multiple targets, so it will either just pick one and search
+ in it, or (better) fail with an error message.
+ </para>
+ <para>
+ The <literal>multi</literal> filter comes to the rescue! This is
+ the only filter that knows how to deal with multiple
+ <literal>VAL_PROXY</literal> otherInfo packets, and it does so by
+ making multiple copies of the entire Search request: one for each
+ <literal>VAL_PROXY</literal>. Each of these new copies is then
+ passed down through the remaining filters in the route. (The
+ copies are handled in parallel through the
+ spawning of new threads.) Since the copies each have only one
+ <literal>VAL_PROXY</literal> otherInfo, they can be handled by the
+ <literal>z3950_client</literal> filter, which happily deals with
+ each one individually. When the results of the individual
+ searches come back up to the <literal>multi</literal> filter, it
+ merges them into a single Search response, which is what
+ eventually makes it back to the client.
+ </para>
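+ <para>
+ One way to observe this fan-out, offered as an untested sketch
+ rather than a recommended configuration, is to place a
+ <literal>log</literal> filter immediately after the
+ <literal>multi</literal> filter. If the description above holds,
+ each per-target copy of a Search request should then pass through
+ the <literal>log</literal> filter separately. The
+ <literal>&lt;message&gt;</literal> element and its value are
+ assumptions made for illustration; consult the reference section
+ for the options that the <literal>log</literal> filter actually
+ accepts.
+ </para>
+ <screen><![CDATA[<filter type="virt_db">
+  <virtual>
+    <database>all</database>
+    <target>z3950.loc.gov:7090/voyager</target>
+    <target>indexdata.dk/marc</target>
+  </virtual>
+</filter>
+<!-- Splits one Search request into one copy per target -->
+<filter type="multi"/>
+<!-- Assumed option: a label to identify these log lines -->
+<filter type="log">
+  <message>after-multi</message>
+</filter>
+<filter type="z3950_client">
+  <timeout>30</timeout>
+</filter>]]></screen>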
+ </section>
+
+
+ <section id="multidb.picture">
+ <title>A picture is worth a thousand words (but only five hundred on 64-bit architectures)</title>
+ <simpara>
+ <inlinemediaobject>
+ <imageobject>
+ <imagedata fileref="multi.pdf" format="PDF" scale="50"/>
+ </imageobject>
+ <imageobject>
+ <imagedata fileref="multi.png" format="PNG"/>
+ </imageobject>
+ <textobject>
+ <!-- Fall back if none of the images can be used -->
+ <phrase>
+ [Here there should be a diagram showing the progress of
+ packages through the filters during a simple virtual-database
+ search and a multi-database search, but it seems that your
+ toolchain has not been able to include the diagram in this
+ document. This is because of LaTeX suckage. Time to move to
+ OpenOffice. Yes, really.]
+ </phrase>
+ </textobject>
+<!-- ### This used to work with an older version of DocBook
+ <caption>
+ <para>Caption: progress of packages through filters.</para>
+ </caption>
+-->
+ </inlinemediaobject>
+ </simpara>
</section>
</chapter>
&manref;
</section>
</chapter>
-
-
+</book>
<!-- Keep this comment at the end of the file
Local variables:
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
- sgml-parent-document: "main.xml"
+ sgml-parent-document: nil
sgml-local-catalogs: nil
sgml-namecase-general:t
- nxml-child-indent: 1
End:
-->