Use Odr_oid for OIDs. Requires YAZ 3.0.2
[metaproxy-moved-to-github.git] / doc / book.xml
index d7e7383..427962f 100644 (file)
@@ -18,7 +18,7 @@
      -->
      <!NOTATION PDF SYSTEM "PDF">
 ]>
-<!-- $Id: book.xml,v 1.52 2007-01-18 09:39:38 adam Exp $ -->
+<!-- $Id: book.xml,v 1.58 2007-04-19 06:55:50 adam Exp $ -->
 <book id="metaproxy">
  <bookinfo>
   <title>Metaproxy - User's Guide and Reference</title>
    </para>
   </section>
 
+  <section id="installation.rpm">
+   <title>Installation on RPM based Linux Systems</title>
+   <para>
+    All external dependencies for Metaproxy are available as 
+    RPM packages, either from your distribution site, or from the 
+    <ulink url="http://fr.rpmfind.net/">RPMfind</ulink> site.
+   </para>
+   <para>
+    For example, an installation of the requires Boost C++ development 
+    libraries on RedHat Fedora C4 and C5 can be done like this:
+    <screen> 
+    wget ftp://fr.rpmfind.net/wlinux/fedora/core/updates/testing/4/SRPMS/boost-1.33.0-3.fc4.src.rpm
+    sudo rpmbuild --buildroot src/ --rebuild -p fc4/boost-1.33.0-3.fc4.src.rpm
+    sudo rpm -U /usr/src/redhat/RPMS/i386/boost-*rpm
+    </screen>
+   </para>
+   <para>
+    The  <ulink url="&url.yaz;">YAZ</ulink> library is needed to
+    compile &metaproxy;, see there
+    for more information on available RPM packages.
+   </para>
+   <para>
+    There is currently no official RPM package for YAZ++.
+    See the <ulink url="&url.yazplusplus;">YAZ++</ulink> pages 
+    for more information on a Unix tarball install.
+   </para>
+   <para>
+    With these packages installed, the usual configure + make
+    procedure can be used for Metaproxy as outlined in
+    <xref linkend="installation.unix"/>.
+   </para>
+  </section>
+
   <section id="installation.windows">
    <title>Installation on Windows</title>
    <para>
   </section>
  </chapter>
  
+<chapter id="yazproxy-comparison">
+ <title>YAZ Proxy Comparison</title>
+ <para>
+  The table below lists facilities either supported by either
+   <ulink url="&url.yazproxy;">YAZ Proxy</ulink> or Metaproxy.
+ </para>
+<table id="yazproxy-comparison-table">
+ <title>Metaproxy / YAZ Proxy comparison</title>
+ <tgroup cols="3">
+  <thead>
+   <row>
+    <entry>Facility</entry>
+    <entry>Metaproxy</entry>
+    <entry>YAZ Proxy</entry>
+   </row>
+  </thead>
+  <tbody>
+   <row>
+    <entry>Z39.50 server</entry>
+    <entry>Using filter <literal>frontend_net</literal></entry>
+    <entry>Supported</entry>
+   </row>
+   <row>
+    <entry>SRU server</entry>
+    <entry>Supported with filter <literal>sru_z3950</literal></entry>
+    <entry>Supported</entry>
+   </row>
+   <row>
+    <entry>Z39.50 client</entry>
+    <entry>Supported with filter <literal>z3950_client</literal></entry>
+    <entry>Supported</entry>
+   </row>
+   <row>
+    <entry>SRU client</entry>
+    <entry>Unsupported</entry>
+    <entry>Unsupported</entry>
+   </row>
+   <row>
+    <entry>Connection reuse</entry>
+    <entry>Supported with filter <literal>session_shared</literal></entry>
+    <entry>Supported</entry>
+   </row>
+   <row>
+    <entry>Connection share</entry>
+    <entry>Supported with filter <literal>session_shared</literal></entry>
+    <entry>Unsupported</entry>
+   </row>
+   <row>
+    <entry>Result set reuse</entry>
+    <entry>Supported with filter <literal>session_shared</literal></entry>
+    <entry>Within one Z39.50 session / HTTP keep-alive</entry>
+   </row>
+   <row>
+    <entry>Record cache</entry>
+    <entry>Unsupported</entry>
+    <entry>Supported for last result set within one Z39.50/HTTP-keep alive session</entry>
+   </row>
+   <row>
+    <entry>Z39.50 Virtual database, i.e. select any Z39.50 target for database</entry>
+    <entry>Supported with filter <literal>virt_db</literal></entry>
+    <entry>Unsupported</entry>
+   </row>
+   <row>
+    <entry>SRU Virtual database, i.e. select any Z39.50 target for path</entry>
+    <entry>Supported with filter <literal>virt_db</literal>, 
+     <literal>sru_z3950</literal></entry>
+    <entry>Supported</entry>
+   </row>
+   <row>
+    <entry>Multi target search</entry>
+    <entry>Supported with filter <literal>multi</literal> (round-robin)</entry>
+    <entry>Unsupported</entry>
+   </row>
+   <row>
+    <entry>Retrieval and search limits</entry>
+    <entry>Unsupported</entry>
+    <entry>Supported</entry>
+   </row>
+   <row>
+    <entry>Bandwidth limits</entry>
+    <entry>Unsupported</entry>
+    <entry>Supported</entry>
+   </row>
+   <row>
+    <entry>Connect limits</entry>
+    <entry>Unsupported</entry>
+    <entry>Supported</entry>
+   </row>
+   <row>
+    <entry>Retrieval sanity check and conversions</entry>
+    <entry>Supported using filter <literal>record_transform</literal></entry>
+    <entry>Supported</entry>
+   </row>
+   <row>
+    <entry>Query check</entry>
+    <entry>
+     Supported in a limited way using <literal>query_rewrite</literal>
+    </entry>
+    <entry>Supported</entry>
+   </row>
+   <row>
+    <entry>Query rewrite</entry>
+    <entry>Supported with <literal>query_rewrite</literal></entry>
+    <entry>Unsupported</entry>
+   </row>
+   <row>
+    <entry>Session invalidate for -1 hits</entry>
+    <entry>Unsupported</entry>
+    <entry>Supported</entry>
+   </row>
+   <row>
+    <entry>Architecture</entry>
+    <entry>Multi-threaded + select for networked modules such as
+      <literal>frontend_net</literal>)</entry>
+    <entry>Single-threaded using select</entry>
+   </row>
+
+   <row>
+    <entry>Extensability</entry>
+    <entry>Most functionality implemented as loadable modules</entry>
+    <entry>Unsupported and experimental</entry>
+   </row>
+
+   <row>
+    <entry><ulink url="&url.usemarcon;">USEMARCON</ulink></entry>
+    <entry>Unsupported</entry>
+    <entry>Supported</entry>
+   </row>
+
+   <row>
+    <entry>Portability</entry>
+    <entry>
+     Requires YAZ, YAZ++ and modern C++ compiler supporting
+     <ulink url="&url.boost;">Boost</ulink>.
+    </entry>
+    <entry>
+     Requires YAZ and YAZ++.
+     STL is not required so pretty much any C++ compiler out there should work.
+    </entry>
+   </row>
+
+  </tbody>
+ </tgroup>
+</table>
+</chapter>
+
  <chapter id="architecture">
   <title>The Metaproxy Architecture</title>
   <para>
       plugins that provide new filters.  The filter API is small and
       conceptually simple, but there are many details to master.  See
       the section below on
-      <link linkend="extensions">extensions</link>.
+      <link linkend="filters">Filters</link>.
      </para>
     </listitem>
    </varlistentry>
@@ -741,8 +920,8 @@ Figure out what additional information we need in:
      sets Z39.50 packages to Z_Close, and HTTP_Request packages to
      HTTP_Response err code 400 packages, and adds a suitable bounce
      message. 
-     The bounce filter is usually added at end of each filter chain
-     config.xml to prevent infinite hanging of for example HTTP
+     The bounce filter is usually added at end of each filter chain route
+     to prevent infinite hanging of for example HTTP
      requests packages when only the Z39.50 client partial sink 
      filter is found in the
      route.  
@@ -750,6 +929,19 @@ Figure out what additional information we need in:
    </section>
    
    <section>
+    <title><literal>cql_rpn</literal>
+    (mp::filter::CQLtoRPN)</title>
+    <para>
+     A query language transforming filter which catches Z39.50 
+     <literal>searchRequest</literal>
+     packages containing <literal>CQL</literal> queries, transforms
+     those to <literal>RPN</literal> queries,
+     and sends the <literal>searchRequests</literal> on to the next
+     filters. It is among other things useful in a SRU context. 
+    </para>
+   </section>
+   
+   <section>
     <title><literal>frontend_net</literal>
      (mp::filter::FrontendNet)</title>
     <para>
@@ -764,7 +956,8 @@ Figure out what additional information we need in:
     <title><literal>http_file</literal>
      (mp::filter::HttpFile)</title>
     <para>
-     A partial sink which swallows only HTTP_Request packages, and 
+     A partial sink which swallows only 
+     <literal>HTTP_Request</literal> packages, and 
      returns the contents of files from the local
      filesystem in response to HTTP requests.  
      It lets Z39.50 packages and all other forthcoming package types
@@ -821,7 +1014,9 @@ Figure out what additional information we need in:
    <title><literal>query_rewrite</literal>
      (mp::filter::QueryRewrite)</title>
     <para>
-     Rewrites Z39.50 Type-1 and Type-101 (``RPN'') queries by a
+     Rewrites Z39.50 <literal>Type-1</literal> 
+     and <literal>Type-101</literal> (``<literal>RPN</literal>'') 
+     queries by a
      three-step process: the query is transliterated from Z39.50
      packet structures into an XML representation; that XML
      representation is transformed by an XSLT stylesheet; and the
@@ -851,10 +1046,8 @@ Figure out what additional information we need in:
     <para>
      This filter implements global sharing of
      result sets (i.e. between threads and therefore between
-     clients), yielding performance improvements especially when
-     incoming requests are from a stateless environment such as a
-     web-server, in which the client process representing a session
-     might be any one of many.
+     clients), yielding performance improvements by clever resource
+     pooling. 
     </para>
    </section>
 
@@ -1123,7 +1316,26 @@ Figure out what additional information we need in:
     which returns the response to the client.
    </para>
   </section>
-  <section id="checking.xml.syntax">
+
+  <section id="config-file-modularity">
+   <title>Config file modularity</title>
+   <para>
+    Metaproxy XML configuration snippets can be reused by other
+    filters using the <literal>XInclude</literal> standard, as seen in
+    the <literal>/etc/config-sru-to-z3950.xml</literal> example SRU 
+    configuration.
+   <screen><![CDATA[
+    <filter id="sru" type="sru_z3950">
+      <database name="Default">
+       <xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
+                    href="explain.xml"/>
+      </database>
+    </filter>
+]]></screen>
+    </para>
+  </section>
+
+  <section id="config-file-syntax-check">
    <title>Config file syntax checking</title>
    <para>
     The distribution contains RelaxNG Compact and XML syntax checking
  </chapter>
 
 
+ <chapter id="sru-server">
+  <title>Combined SRU webservice and Z39.50 server configuration</title>
+  <para>
+   Metaproxy can act as 
+   <ulink url="&url.sru;">SRU</ulink> and 
+   <ulink url="&url.srw;">SRW</ulink> 
+   web service server, which translates web service requests to 
+   <ulink url="&url.z39.50;">ANSI/NISO Z39.50</ulink> packages and
+   sends them off to common available targets.
+  </para>
+  <para>
+  A typical setup for this operation needs a filter route including the
+  following modules: 
+  </para>
+   
+  <table id="sru-server-table-config" frame="top">
+   <title>SRU/Z39.50 Server Filter Route Configuration</title>
+   <tgroup cols="3">
+    <thead>
+     <row>
+      <entry>Filter</entry>
+      <entry>Importance</entry>
+      <entry>Purpose</entry>
+     </row>
+    </thead>
+    
+    <tbody>
+     <row>
+      <entry><literal>frontend_net</literal></entry>
+      <entry>required</entry>
+      <entry>Accepting HTTP connections and passing them to following
+      filters. Since this filter also accepts Z39.50 connections, the
+      server works as SRU and Z39.50 server on the same port.</entry>
+     </row>
+     <row>
+      <entry><literal>sru_z3950</literal></entry>
+      <entry>required</entry>
+      <entry>Accepting SRU GET/POST/SOAP explain and
+       searchRetrieve requests for the the configured databases.
+       Explain requests are directly served from the static XML configuration.
+       SearchRetrieve requests are
+       transformed  to Z39.50 search and present packages.
+       All other HTTP and Z39.50 packages are passed unaltered.</entry>
+     </row>
+     <row>
+      <entry><literal>http_file</literal></entry>
+      <entry>optional</entry>
+      <entry>Serving HTTP requests from the filesystem. This is only
+      needed if the server should serve XSLT stylesheets, static HTML
+      files or Java Script for thin browser based clients.
+       Z39.50 packages are passed unaltered.</entry>
+     </row>
+     <row>
+      <entry><literal>cql_rpn</literal></entry>
+      <entry>required</entry>
+      <entry>Usually, Z39.50 servers do not talk CQL, hence the
+      translation of the CQL query language to RPN is mandatory in
+      most cases. Affects only  Z39.50 search packages.</entry>
+     </row>
+     <row>
+      <entry><literal>record_transform</literal></entry>
+      <entry>optional</entry>
+      <entry>Some Z39.50 backend targets can not present XML record
+      syntaxes in common wanted element sets. using this filter, one
+      can transform binary MARC records to MARCXML records, and
+      further transform those to any needed XML schema/format by XSLT
+      transformations. Changes only  Z39.50 present packages.</entry>
+     </row>
+     <row>
+      <entry><literal>session_shared</literal></entry>
+      <entry>optional</entry>
+      <entry>The stateless nature of web services requires frequent
+      re-searching of the same targets for display of paged result set
+      records. This might be an unacceptable burden for the accessed
+      backend Z39.50 targets, and this mosule can be added for
+      efficient backend target resource pooling.</entry>
+     </row>
+     <row>
+      <entry><literal>z3950_client</literal></entry>
+      <entry>required</entry>
+      <entry>Finally, a Z39.50 package sink is needed in the filter
+      chain to provide the response packages. The Z39.50 client module
+      is used to access external targets over the network, but any
+      coming local Z39.50 package sink could be used instead of.</entry>
+     </row>
+     <row>
+      <entry><literal>bounce</literal></entry>
+      <entry>required</entry>
+      <entry>Any Metaproxy package arriving here did not do so by
+      purpose, and is bounced back with connection closure. this
+      prevents inifinite package hanging inside the SRU server.</entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+  <para> 
+   A typical minimal example  <ulink url="&url.sru;">SRU</ulink> and 
+   <ulink url="&url.srw;">SRW</ulink> server configuration file is found
+   in the tarball distribution at 
+   <literal>etc/config-sru-to-z3950.xml</literal>.
+  </para> 
+  <para>
+   Off course, any other metaproxy modules can be integrated into a
+   SRU server solution, including, but not limited to, load balancing, 
+   multiple target querying 
+   (see  <xref linkend="multidb"/>), and complex RPN query rewrites. 
+  </para>
+
+
+ </chapter>
 
+ <!--
  <chapter id="extensions">
   <title>Writing extensions for Metaproxy</title>
   <para>### To be written</para>
  </chapter>
-
+ -->