+ <title>ALVIS Canonical Indexing Format</title>
+ <para>The output of the indexing XSLT stylesheets must contain
+ certain elements in the magic
+ <literal>xmlns:z="http://indexdata.dk/zebra/xslt/1"</literal>
+ namespace. The output of the XSLT indexing transformation is then
+ parsed using DOM methods, and the contained instructions are
+ performed on the <emphasis>magic elements and their
+ subtrees</emphasis>.
+ </para>
+ <para>
+ For example, the output of the command
+ <screen>
+ xsltproc xsl/oai2index.xsl one-record.xml
+ </screen>
+ might look like this:
+ <screen>
+ <![CDATA[
+ <?xml version="1.0" encoding="UTF-8"?>
+ <z:record xmlns:z="http://indexdata.dk/zebra/xslt/1"
+ z:id="oai:JTRS:CP-3290---Volume-I"
+ z:rank="47896"
+ z:type="update">
+ <z:index name="oai:identifier" type="0">
+ oai:JTRS:CP-3290---Volume-I</z:index>
+ <z:index name="oai:datestamp" type="0">2004-07-09</z:index>
+ <z:index name="oai:setspec" type="0">jtrs</z:index>
+ <z:index name="dc:all" type="w">
+ <z:index name="dc:title" type="w">Proceedings of the 4th
+ International Conference and Exhibition:
+ World Congress on Superconductivity - Volume I</z:index>
+ <z:index name="dc:creator" type="w">Kumar Krishen and *Calvin
+ Burnham, Editors</z:index>
+ </z:index>
+ </z:record>
+ ]]>
+ </screen>
+ </para>
+ <para>This means the following: From the original XML file
+ <literal>one-record.xml</literal> (or from the XML record DOM of the
+ same form coming from a split input file), the indexing
+ stylesheet produces an indexing XML record, which is defined by
+ the <literal>record</literal> element in the magic namespace
+ <literal>xmlns:z="http://indexdata.dk/zebra/xslt/1"</literal>.
+ Zebra uses the content of
+ <literal>z:id="oai:JTRS:CP-3290---Volume-I"</literal> as internal
+ record ID, and - in case static ranking is set - the content of
+ <literal>z:rank="47896"</literal> as static rank. Following the
+ discussion in <xref linkend="administration-ranking"/>
+ we see that this record is internally ordered
+ lexicographically according to the value of the string
+ <literal>oai:JTRS:CP-3290---Volume-I47896</literal>.
+ The type of action performed during indexing is defined by
+ <literal>z:type="update"</literal>, with recognized values
+ <literal>insert</literal>, <literal>update</literal>, and
+ <literal>delete</literal>.
+ </para>
+ <para>In this example, the following literal indexes are constructed:
+ <screen>
+ oai:identifier
+ oai:datestamp
+ oai:setspec
+ dc:all
+ dc:title
+ dc:creator
+ </screen>
+ where the indexing type is defined in the
+ <literal>type</literal> attribute
+ (any value from the standard configuration
+ file <filename>default.idx</filename> will do). Finally, any
+ <literal>text()</literal> node content recursively contained
+ inside the <literal>index</literal> will be filtered through the
+ appropriate charmap for character normalization, and will be
+ inserted in the index.
+ </para>
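+ <para>
+ As a rough illustration of how such an indexing record can be
+ produced, the following is a minimal sketch of an indexing
+ stylesheet. It is not the <filename>xsl/oai2index.xsl</filename>
+ stylesheet used above. It assumes that the input record is rooted
+ in an OAI-PMH <literal>record</literal> element carrying
+ <literal>oai_dc</literal> metadata, which is an assumption about the
+ layout of <filename>one-record.xml</filename>, and it omits the
+ <literal>z:rank</literal> attribute because the source of the rank
+ value is not specified here.
+ <screen>
+ <![CDATA[
+ <?xml version="1.0" encoding="UTF-8"?>
+ <!-- Sketch only: the input namespaces and element paths are assumptions -->
+ <xsl:stylesheet version="1.0"
+   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
+   xmlns:z="http://indexdata.dk/zebra/xslt/1"
+   xmlns:oai="http://www.openarchives.org/OAI/2.0/"
+   xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
+   xmlns:dc="http://purl.org/dc/elements/1.1/"
+   exclude-result-prefixes="oai oai_dc dc">
+
+   <xsl:output method="xml" indent="yes" encoding="UTF-8"/>
+
+   <!-- Suppress text nodes that no template handles explicitly -->
+   <xsl:template match="text()"/>
+
+   <!-- Emit one indexing record per input OAI record -->
+   <xsl:template match="/oai:record">
+     <z:record z:id="{normalize-space(oai:header/oai:identifier)}"
+               z:type="update">
+       <!-- type="0": inserted without character normalization -->
+       <z:index name="oai:identifier" type="0">
+         <xsl:value-of select="normalize-space(oai:header/oai:identifier)"/>
+       </z:index>
+       <z:index name="oai:datestamp" type="0">
+         <xsl:value-of select="oai:header/oai:datestamp"/>
+       </z:index>
+       <xsl:for-each select="oai:header/oai:setSpec">
+         <z:index name="oai:setspec" type="0">
+           <xsl:value-of select="."/>
+         </z:index>
+       </xsl:for-each>
+       <!-- Nested index elements: the text also lands in dc:all -->
+       <z:index name="dc:all" type="w">
+         <xsl:for-each select="oai:metadata/oai_dc:dc/dc:title">
+           <z:index name="dc:title" type="w">
+             <xsl:value-of select="."/>
+           </z:index>
+         </xsl:for-each>
+         <xsl:for-each select="oai:metadata/oai_dc:dc/dc:creator">
+           <z:index name="dc:creator" type="w">
+             <xsl:value-of select="."/>
+           </z:index>
+         </xsl:for-each>
+       </z:index>
+     </z:record>
+   </xsl:template>
+ </xsl:stylesheet>
+ ]]>
+ </screen>
+ Nesting the <literal>dc:title</literal> and
+ <literal>dc:creator</literal> elements inside the
+ <literal>dc:all</literal> element is what makes their text nodes
+ count as recursively contained in both indexes.
+ </para>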
+ <para>
+ Specific to this example, we see that the single word
+ <literal>oai:JTRS:CP-3290---Volume-I</literal> will be inserted
+ literally, byte for byte and without any form of character
+ normalization, into the index named <literal>oai:identifier</literal>,
+ the text
+ <literal>Kumar Krishen and *Calvin Burnham, Editors</literal>
+ will be inserted using the <literal>w</literal> character
+ normalization defined in <filename>default.idx</filename> into
+ the index <literal>dc:creator</literal> (that is, after character
+ normalization the index will keep the individual words
+ <literal>kumar</literal>, <literal>krishen</literal>,
+ <literal>and</literal>, <literal>calvin</literal>,
+ <literal>burnham</literal>, and <literal>editors</literal>), and
+ finally both the texts
+ <literal>Proceedings of the 4th International Conference and Exhibition:
+ World Congress on Superconductivity - Volume I</literal>
+ and
+ <literal>Kumar Krishen and *Calvin Burnham, Editors</literal>
+ will be inserted into the index <literal>dc:all</literal> using
+ the same character normalization map <literal>w</literal>.
+ </para>
+ <para>
+ Finally, this example configuration can be queried using PQF
+ queries, either transported by Z39.50 (here using yaz-client)
+ <screen>
+ <![CDATA[
+ Z> open localhost:9999
+ Z> elem dc
+ Z> form xml
+ Z>
+ Z> f @attr 1=dc:creator Kumar
+ Z> scan @attr 1=dc:creator adam
+ Z>
+ Z> f @attr 1=dc:title @attr 4=2 "proceeding congress superconductivity"
+ Z> scan @attr 1=dc:title abc
+ ]]>
+ </screen>
+ or using the proprietary
+ extensions <literal>x-pquery</literal> and
+ <literal>x-pScanClause</literal> to
+ SRU and SRW
+ <screen>
+ <![CDATA[
+ http://localhost:9999/?version=1.1&operation=searchRetrieve&x-pquery=%40attr+1%3Ddc%3Acreator+%40attr+4%3D6+%22the
+ http://localhost:9999/?version=1.1&operation=scan&x-pScanClause=@attr+1=dc:date+@attr+4=2+a
+ ]]>
+ </screen>
+ See <xref linkend="server-sru"/> for more information on SRU/SRW
+ configuration, and <xref linkend="gfs-config"/> or
+ <ulink url="http://www.indexdata.dk/yaz/doc/tools.tkl#tools.cql">
+ the YAZ manual CQL section</ulink>
+ for the details
+ of the YAZ frontend server
+ <ulink url="http://www.loc.gov/standards/sru/cql/">CQL</ulink>
+ configuration.
+ </para>
+ <para>
+ Notice that there are no <filename>*.abs</filename>,
+ <filename>*.est</filename>, <filename>*.map</filename>, or other GRS-1
+ filter configuration files involved in this process, and that the
+ literal index names are used during search and retrieval.
+ </para>