+ </listitem>
+ </varlistentry>
+ </variablelist>
+ <para>
+ From Pazpar2 version 1.1 the ICU wrapper from YAZ is used.
+ Refer to the <ulink url="&url.yaz.yaz-icu;">yaz-icu</ulink>
+ utility for more information.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>relevance</term>
+ <listitem>
+ <para>
+ Specifies the ICU rule set used for relevance ranking.
+ The child element of 'relevance' must be 'icu_chain' and the
+ 'id' attribute of the icu_chain is ignored. This
+ definition is obsolete and should be replaced by the equivalent
+ construct:
+ <screen>
+ &lt;icu_chain id="relevance" locale="en"&gt;..&lt;/icu_chain&gt;
+ </screen>
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>sort</term>
+ <listitem>
+ <para>
+ Specifies the ICU rule set used for sorting.
+ The child element of 'sort' must be 'icu_chain' and the
+ 'id' attribute of the icu_chain is ignored. This
+ definition is obsolete and should be replaced by the equivalent
+ construct:
+ <screen>
+ &lt;icu_chain id="sort" locale="en"&gt;..&lt;/icu_chain&gt;
+ </screen>
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>mergekey</term>
+ <listitem>
+ <para>
+ Specifies ICU tokenization and transformation rules
+ for tokens that are used in Pazpar2's mergekey.
+ The child element of 'mergekey' must be 'icu_chain' and the
+ 'id' attribute of the icu_chain is ignored. This
+ definition is obsolete and should be replaced by the equivalent
+ construct:
+ <screen>
+ &lt;icu_chain id="mergekey" locale="en"&gt;..&lt;/icu_chain&gt;
+ </screen>
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>facet</term>
+ <listitem>
+ <para>
+ Specifies ICU tokenization and transformation rules
+ for tokens that are used in Pazpar2's facets.
+ The child element of 'facet' must be 'icu_chain' and the
+ 'id' attribute of the icu_chain is ignored. This
+ definition is obsolete and should be replaced by the equivalent
+ construct:
+ <screen>
+ &lt;icu_chain id="facet" locale="en"&gt;..&lt;/icu_chain&gt;
+ </screen>
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>settings</term>
+ <listitem>
+ <para>
+ Specifies target settings for this service. Refer to
+ <xref linkend="target_settings"/>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>timeout</term>
+ <listitem>
+ <para>
+ Specifies timeout parameters for this service.
+ The <literal>timeout</literal>
+ element supports the following attributes:
+ <literal>session</literal>, <literal>z3950_operation</literal>, and
+ <literal>z3950_session</literal>, which specify the
+ session timeout, the Z39.50 operation timeout, and the
+ Z39.50 session timeout, respectively. The Z39.50 operation
+ timeout is the time Pazpar2 will wait for an active Z39.50/SRU
+ operation before it gives up (times out). The Z39.50 session
+ timeout is the time Pazpar2 will keep an idle session
+ (one with no operation in progress) alive.
+ </para>
+ <para>
+ The following ordering is recommended but not required:
+ z3950_operation (30) &lt; session (60) &lt; z3950_session (180).
+ The default values are given in parentheses.
+ </para>
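+ <para>
+ As a sketch, a <literal>timeout</literal> element that spells out all
+ three attributes, using the default values listed above, would look
+ like this:
+ <screen><![CDATA[
+ <timeout session="60" z3950_operation="30" z3950_session="180"/>
+ ]]></screen>
+ </para>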
+ </listitem>
+ </varlistentry>
+ </variablelist> <!-- Data elements in service directive -->
+ </listitem>
+ </varlistentry>
+ </variablelist> <!-- Data elements in server directive -->
+ </refsect2>
+ </refsect1>
+
+ <refsect1>
+ <title>EXAMPLE</title>
+ <para>
+ Below is a working example configuration:
+ </para>
+ <screen>
+ <![CDATA[
+<?xml version="1.0" encoding="UTF-8"?>
+<pazpar2 xmlns="http://www.indexdata.com/pazpar2/1.0">
+
+ <threads number="10"/>
+ <server>
+ <listen port="9004"/>
+ <service>
+ <metadata name="title" brief="yes" sortkey="skiparticle"
+ merge="longest" rank="6"/>
+ <metadata name="isbn" merge="unique"/>
+ <metadata name="date" brief="yes" sortkey="numeric"
+ type="year" merge="range" termlist="yes"/>
+ <metadata name="author" brief="yes" termlist="yes"
+ merge="longest" rank="2"/>
+ <metadata name="subject" merge="unique" termlist="yes" rank="3"/>
+ <metadata name="url" merge="unique"/>
+ <icu_chain id="relevance" locale="el">
+ <transform rule="[:Control:] Any-Remove"/>
+ <tokenize rule="l"/>
+ <transform rule="[[:WhiteSpace:][:Punctuation:]] Remove"/>
+ <casemap rule="l"/>
+ </icu_chain>
+ <settings src="mysettings"/>
+ <timeout session="60"/>
+ </service>
+ </server>
+</pazpar2>
+ ]]>
+ </screen>
+ </refsect1>
+
+ <refsect1 id="config-include">
+ <title>INCLUDE FACILITY</title>
+ <para>
+ The XML configuration may be partitioned into multiple files by using
+ the <literal>include</literal> element which takes a single attribute,
+ <literal>src</literal>. The value of the <literal>src</literal> attribute is
+ a regular shell-like glob pattern. For example,
+ <screen><![CDATA[
+ <include src="/etc/pazpar2/conf.d/*.xml"/>
+ ]]></screen>
+ </para>
+ <para>
+ The include facility requires Pazpar2 version 1.2.
+ </para>
+ </refsect1>
+
+ <refsect1 id="target_settings">
+ <title>TARGET SETTINGS</title>
+ <para>
+ Pazpar2 features a cunning scheme by which you can associate various
+ kinds of attributes, or settings, with search targets. This can be done
+ through XML files which are read at startup; each file can associate
+ one or more settings with one or more targets. The file format is generic
+ in nature, designed to support a wide range of application requirements. The
+ settings can be purely technical things, such as how to perform a title
+ search against a given target, or they can associate arbitrary name=value
+ pairs with groups of targets -- for instance, if you would like to
+ place all commercial full-text bases in one group for selection
+ purposes, or you would like to control what targets are accessible
+ to users by default. Per-database settings values can even be used
+ to drive sorting, facet/termlist generation, or end-user interface display
+ logic.
+ </para>
+
+ <para>
+ During startup, Pazpar2 will recursively read a specified directory
+ (can be identified in the pazpar2.cfg file or on the command line), and
+ process any settings files found therein.
+ </para>
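+
+ <para>
+ In the example configuration earlier in this document, the settings
+ location is given with the <literal>settings</literal> element of the
+ service. The fragment below repeats that construct on its own; it
+ assumes that <literal>mysettings</literal> names a directory holding
+ the settings files, which the example itself does not state.
+ <screen><![CDATA[
+ <service>
+ <!-- assumed here: "mysettings" is a directory of settings files -->
+ <settings src="mysettings"/>
+ </service>
+ ]]></screen>
+ </para>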
+
+ <para>
+ Clients of the Pazpar2 webservice interface can selectively override
+ settings for individual targets within the scope of one session. This
+ can be used in conjunction with an external authentication system to
+ determine which resources are to be accessible to which users. Pazpar2
+ itself has no notion of end-users, and so can be used in conjunction
+ with any type of authentication system. Similarly, the authentication
+ tokens submitted to access-controlled search targets can be
+ overridden, to allow use of Pazpar2 in a consortial or multi-library
+ environment, where different end-users may need to be represented to
+ some search targets in different ways. This, again, can be managed
+ using an external database or other lookup mechanism. Setting overrides
+ can be performed either using the
+ <link linkend="command-init">init</link> or the
+ <link linkend="command-settings">settings</link> webservice
+ command.
+ </para>
+
+ <para>
+ In fact, every setting that applies to a database (except pz:id, which
+ can only be used for filtering targets to use for a search) can be overridden
+ on a per-session basis. This allows the client to override specific CCL fields
+ for searching, etc., to meet the needs of a session or user.
+ </para>
+
+ <para>
+ Finally, as an extreme case of this, the webservice client can
+ introduce entirely new targets, on the fly, as part of the
+ <link linkend="command-init">init</link> or
+ <link linkend="command-settings">settings</link> command.
+ This is useful if you desire to manage information
+ about your search targets in a separate application such as a database.
+ You do not need any static settings file whatsoever to run Pazpar2 -- as
+ long as the webservice client is prepared to supply the necessary
+ information at the beginning of every session.
+ </para>
+
+ <note>
+ <para>
+ The following discussion of practical issues related to session and settings
+ management is cast in terms of a user interface based on Ajax/Javascript
+ technology. It would apply equally well to many other kinds of browser-based logic.
+ </para>
+ </note>
+
+ <para>
+ Typically, a Javascript client is not allowed to directly alter the parameters
+ of a session. There are two reasons for this. One has to do with access
+ to information: typically, information about a user will be stored in a
+ system on the server side, or it will be accessible in some way from the server.
+ The other has to do with trust: since the Javascript client cannot be entirely
+ trusted (some hostile agent might in fact 'pretend' to be a regular webservice
+ client), it is more robust to control session settings from scripting that you
+ run as part of your webserver. Typically, this can be handled during the session
+ initialization, as follows:
+ </para>
+
+ <para>
+ Step 1: The Javascript client loads, and asks the webserver for a new Pazpar2
+ session ID. This can be done using a Javascript call, for instance. Note that
+ it is possible to submit Ajax XMLHttpRequest calls either to Pazpar2 or to the
+ webserver that Pazpar2 is proxying for. See (XXX Insert link to Pazpar2 protocol).
+ </para>
+
+ <para>
+ Step 2: Code on the webserver authenticates the user, by database lookup,
+ LDAP access, NCIP, etc. It then determines which resources the user has access to,
+ and any user-specific parameters that are to be applied during this session.
+ </para>
+
+ <para>
+ Step 3: The webserver initializes a new Pazpar2 session, and sets user-specific
+ parameters as necessary, using the init webservice command. A new session ID is
+ returned.
+ </para>
+
+ <para>
+ Step 4: The webserver returns this session ID to the Javascript client, which then
+ uses the session ID to submit searches, show results, etc.
+ </para>
+
+ <para>
+ Step 5: When the Javascript client ceases to use the session, Pazpar2 destroys
+ any session-specific information.
+ </para>
+
+ <refsect2>
+ <title>SETTINGS FILE FORMAT</title>
+ <para>
+ Each file contains a root element named <literal>settings</literal>. It may
+ contain one or more <literal>set</literal> elements. The settings and set
+ elements may contain the following attributes. Attributes in the set node
+ override those in the settings root element. Each set node must
+ specify (directly, or inherited from the parent node) at least a
+ target, name, and value.
+ </para>
+
+ <variablelist>
+ <varlistentry>
+ <term>target</term>
+ <listitem>
+ <para>
+ This specifies the search target to which this setting should be
+ applied. Targets are identified by their Z39.50 URL, generally
+ including the host, port, and database name (e.g.
+ <literal>bagel.indexdata.com:210/marc</literal>).
+ Two wildcard forms are accepted:
+ * (asterisk) matches all known targets;
+ <literal>bagel.indexdata.com:210/*</literal> matches all
+ known databases on the given host.
+ </para>
+ <para>
+ A precedence system determines what happens if there are
+ overlapping values for the same setting name for the same
+ target. A setting for a specific target name overrides a
+ setting which specifies the target using a wildcard. This makes it
+ easy to set defaults for all targets, and then override them
+ for specific targets or hosts. If there are
+ multiple overlapping settings with the same name and target
+ value, the 'precedence' attribute determines what happens.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>name</term>
+ <listitem>
+ <para>
+ The name of the setting. This can be anything you like.
+ However, Pazpar2 reserves a number of setting names for
+ specific purposes, all starting with 'pz:', and it is a good
+ idea to avoid that prefix if you make up your own setting
+ names. See below for a list of reserved variables.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>value</term>
+ <listitem>
+ <para>
+ The value of the setting. Generally, this can be anything you
+ want -- however, some of the reserved settings may expect
+ specific kinds of values.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>precedence</term>
+ <listitem>
+ <para>
+ This should be an integer. If not provided, the default value
+ is 0. If two (or more) settings have the same content for
+ target and name, the precedence value determines the outcome.
+ If both settings have the same precedence value, they are both
+ applied to the target(s). If one has a higher value, then the
+ value of that setting is applied, and the other one is ignored.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
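+
+ <para>
+ To illustrate the precedence rules described above, here is a minimal
+ sketch of a settings file. The setting name
+ <literal>category</literal> and its values are made up for this
+ example; they are not reserved Pazpar2 settings. Going by the rules
+ above, the specific target should end up with the value
+ <literal>special</literal>: the wildcard default is overridden by the
+ settings that name the target, and of those the one with the higher
+ precedence is applied.
+ <screen><![CDATA[
+ <settings name="category">
+
+ <!-- default for all known targets -->
+ <set target="*" value="general"/>
+
+ <!-- two settings with the same target and name;
+ the one with the higher precedence is applied -->
+ <set target="bagel.indexdata.com:210/marc" value="test"/>
+ <set target="bagel.indexdata.com:210/marc" value="special"
+ precedence="1"/>
+
+ </settings>
+ ]]></screen>
+ </para>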
+
+ <para>
+ By setting defaults for target, name, or value in the root
+ settings node, you can use the settings files in many different
+ ways. For instance, you can use a single file to set defaults for
+ many different settings, like search fields, retrieval syntaxes,
+ etc. You can have one file per server, which groups settings for
+ that server or target. You could also have one file which associates
+ a number of targets with a given setting, for instance, to associate
+ many databases with a given category or class that makes sense
+ within your application.
+ </para>
+
+ <para>
+ The following examples illustrate uses of the settings system to
+ associate settings with targets to meet different requirements.
+ </para>
+
+ <para>
+ The example below associates a set of default values that can be
+ used across many targets. Note the wildcard for targets.
+ This associates the given settings with all targets for which no
+ other information is provided.
+ <screen><![CDATA[
+ <settings target="*">
+
+ <!-- This file introduces default settings for pazpar2 -->
+
+ <!-- mapping for unqualified search -->
+ <set name="pz:cclmap:term" value="u=1016 t=l,r s=al"/>
+
+ <!-- field-specific mappings -->
+ <set name="pz:cclmap:ti" value="u=4 s=al"/>
+ <set name="pz:cclmap:su" value="u=21 s=al"/>
+ <set name="pz:cclmap:isbn" value="u=7"/>
+ <set name="pz:cclmap:issn" value="u=8"/>
+ <set name="pz:cclmap:date" value="u=30 r=r"/>
+
+ <set name="pz:limitmap:title" value="rpn:@attr 1=4 @attr 6=3"/>
+ <set name="pz:limitmap:date" value="ccl:date"/>
+
+ <!-- Retrieval settings -->
+
+ <set name="pz:requestsyntax" value="marc21"/>
+ <set name="pz:elements" value="F"/>
+
+ <!-- Query encoding -->
+ <set name="pz:queryencoding" value="iso-8859-1"/>
+
+ <!-- Result normalization settings -->
+
+ <set name="pz:nativesyntax" value="iso2709"/>
+ <set name="pz:xslt" value="../etc/marc21.xsl"/>
+
+ </settings>
+
+ ]]></screen>
+ </para>
+
+ <para>
+ The next example shows certain settings overridden for one target,
+ one which returns XML records containing DublinCore elements, and
+ which furthermore requires a username/password.
+ <screen><![CDATA[
+ <settings target="funkytarget.com:210/db1">
+ <set name="pz:requestsyntax" value="xml"/>
+ <set name="pz:nativesyntax" value="xml"/>
+ <set name="pz:xslt" value="../etc/dublincore.xsl"/>
+
+ <set name="pz:authentication" value="myuser/password"/>
+ </settings>
+ ]]></screen>
+ </para>
+
+ <para>
+ The following example associates a specific name/value combination
+ with a number of targets. The targets below are access-restricted,
+ and can only be used by users with special credentials.
+ <screen><![CDATA[
+ <settings name="pz:allow" value="0">
+ <set target="funkytarget.com:210/*"/>
+ <set target="commercial.com:2100/expensiveDb"/>
+ </settings>
+ ]]></screen>
+ </para>
+
+ </refsect2>