X-Git-Url: http://lists.indexdata.com/cgi-bin?a=blobdiff_plain;f=doc%2Fpazpar2_conf.xml;h=6dca6102efc16dac11f792b71e2decbd42790831;hb=b2807317725db68d786503711be67ecf163115b7;hp=8eb16c2a6b3c5ae93cb92b8fd9188224b2bb38ce;hpb=d7dc14dcdfbd1ecdc805a0d649203f3b9888749c;p=pazpar2-moved-to-github.git diff --git a/doc/pazpar2_conf.xml b/doc/pazpar2_conf.xml index 8eb16c2..6dca610 100644 --- a/doc/pazpar2_conf.xml +++ b/doc/pazpar2_conf.xml @@ -9,7 +9,6 @@ %idcommon; ]> - Pazpar2 @@ -60,8 +59,9 @@ server - This section governs overall behavior of the client. The data - elements are described below. + This section governs overall behavior of the server. The data + elements are described below. From Pazpar2 version 1.2 this is + a repeatable element. @@ -101,91 +101,30 @@ - - - relevance - - - Specifies ICU tokenization and normalization rules - for tokens that are used in Pazpar2's relevance ranking. The 'id' - attribute is currently not used, and the 'locale' - attribute must be set to one of the locale strings - defined in ICU. The child elements listed below can be - in any order, except the 'index' element which logically - belongs to the end of the list. The stated tokenization, - normalization and charmapping instructions are performed - in order from top to bottom. - - - casemap - - - The attribute 'rule' defines the direction of the - per-character casemapping, allowed values are "l" - (lower), "u" (upper), "t" (title). - - - - normalize - - - Normalization and transformation of tokens follows - the rules defined in the 'rule' attribute. For - possible values we refer to the extensive ICU - documentation found at the - ICU - transformation home page. Set filtering - principles are explained at the - ICU set and - filtering page. - - - - tokenize - - - Tokenization is the only rule in the ICU chain - which splits one token into multiple tokens. 
The - 'rule' attribute may have the following values: - "s" (sentence), "l" (line-break), "w" (word), and - "c" (character), the later probably not being - very useful in a pruning Pazpar2 installation. - - - - index - - - Finally the 'index' element instruction - without - any 'rule' attribute - is used to store the tokens - after chain processing in the relevance ranking - unit of Pazpar2. It will always be the last - instruction in the chain. - - - - - - - sort + relevance / sort / mergekey - Specifies ICU tokenization and normalization rules - for tokens that are used in Pazpar2's sorting. The contents - is similar to that of relevance. + Specifies character set normalization for relevancy / sorting + and the mergekey - for the server. These definitions serve as + default for services that don't have these given. For the meaning + of these settings refer to the "relevance" element inside service. - mergekey + settings - Specifies ICU tokenization and normalization rules - for tokens that are used in Pazpar2's mergekey. The contents - is similar to that of relevance. + Specifies target settings for the server. These settings serve + as default for all services which don't have these given. + The settings element requires one attribute 'src' which specifies + a settings file or a directory. If a directory is given, all + files with suffix .xml are read from this + directory. Refer to + for more information. @@ -201,7 +140,16 @@ extraction of data from the internal representation, primarily through the 'metadata' sub-element. - + + Pazpar2 version 1.2 and later allow multiple service elements. + Multiple services must each be given a unique ID by specifying + attribute id. + A single service may be unnamed (service ID omitted). The + service ID is referred to in the + init webservice + command's service parameter. 
+ + metadata @@ -314,6 +262,29 @@ + mergekey + + + If set to 'required', the value of this + metadata element is appended to the resulting mergekey if + the metadata is present in a record instance. + If the metadata element is not present, a unique mergekey + will be generated instead. + + + If set to 'optional', the value of this + metadata element is appended to the resulting mergekey if + the metadata is present in a record instance. If the metadata + is not present, it will be empty. + + + If set to 'no' or the mergekey attribute is + omitted, the metadata will not be used in the creation of a + mergekey. + + + + setting @@ -328,64 +299,196 @@ the value to decide how to deal with other data values. - The purpose of using settings in this way can either be to control the behavior of normalization stylesheet in a database- dependent way, or to easily make database-dependent values available to display-logic in your user interface, without having to implement complicated interactions between the user interface and your configuration system. + + + + + relevance + + + Specifies ICU tokenization and transformation rules + for tokens that are used in Pazpar2's relevance ranking. + The 'id' attribute is currently not used, and the 'locale' + attribute must be set to one of the locale strings + defined in ICU. The child elements listed below can be + in any order, except the 'index' element which logically + belongs to the end of the list. The stated tokenization, + transformation and charmapping instructions are performed + in order from top to bottom. + + + casemap + + + The attribute 'rule' defines the direction of the + per-character casemapping; allowed values are "l" + (lower), "u" (upper), "t" (title). + + + + transform + + + Normalization and transformation of tokens follow + the rules defined in the 'rule' attribute. For + possible values we refer to the extensive ICU + documentation found at the + ICU + transformation home page. 
Set filtering + principles are explained at the + ICU set and + filtering page. + + + + tokenize + + + Tokenization is the only rule in the ICU chain + which splits one token into multiple tokens. The + 'rule' attribute may have the following values: + "s" (sentence), "l" (line-break), "w" (word), and + "c" (character), the latter probably not being + very useful in a pruning Pazpar2 installation. + + + + + + From Pazpar2 version 1.1 the ICU wrapper from YAZ is used. + Refer to the yaz-icu + utility for more information. + + + + + + sort + + + Specifies ICU tokenization and transformation rules + for tokens that are used in Pazpar2's sorting. The content + is similar to that of relevance. + + + + + + mergekey + + + Specifies ICU tokenization and transformation rules + for tokens that are used in Pazpar2's mergekey. The content + is similar to that of relevance. + + + + + + settings + + + Specifies target settings for this service. Refer to + . + + + + + + timeout + + + Specifies timeout parameters for this service. + The timeout + element supports the following attributes: + session, z3950_operation, + z3950_session which specify + 'session timeout', 'Z39.50 operation timeout', + 'Z39.50 session timeout' respectively. The Z39.50 operation + timeout is the time Pazpar2 will wait for an active Z39.50/SRU + operation before it gives up (times out). The Z39.50 session + timeout is the time Pazpar2 will keep the session alive for + an idle session (no operation). + + + The following is recommended but not required: + z3950_operation (30) < session (60) < z3950_session (180). + The default values are given in parentheses. + + + + + - + EXAMPLE Below is a working example configuration: - - - - - - - - - - - - - - - - - - - - - -]]> + + + + + + + + + + + + + + + + + + + + + + + + + + ]]> - + + INCLUDE FACILITY + + The XML configuration may be partitioned into multiple files by using + the include element which takes a single attribute, + src. 
The value of the src attribute is + a regular shell-like glob pattern. For example, + + ]]> + + + The include facility requires Pazpar2 version 1.2. + + + TARGET SETTINGS Pazpar2 features a cunning scheme by which you can associate various @@ -421,7 +524,9 @@ environment, where different end-users may need to be represented to some search targets in different ways. This, again, can be managed using an external database or other lookup mechanism. Setting overrides - can be performed either using the 'init' or the 'settings' webservice + can be performed either using the + init or the + settings webservice command. @@ -434,8 +539,10 @@ Finally, as an extreme case of this, the webservice client can - introduce entirely new targets, on the fly, as part of the init or - settings command. This is useful if you desire to manage information + introduce entirely new targets, on the fly, as part of the + init or + settings command. + This is useful if you desire to manage information about your search targets in a separate application such as a database. You do not need any static settings file whatsoever to run Pazpar2 -- as long as the webservice client is prepared to supply the necessary @@ -590,7 +697,6 @@ - @@ -605,7 +711,10 @@ - + + + + @@ -685,7 +794,7 @@ The element set name to be used when retrieving records from a - server (not yet implemented). + server. @@ -713,8 +822,24 @@ For iso2709, can also specify a native character set, e.g. "iso2709;latin-1". If no character set is provided, MARC-8 is assumed. + + If pz:nativesyntax is not specified, pazpar2 will attempt to determine + the value based on the response from the server. + + + + pz:queryencoding + + + The encoding of the search terms that a target accepts. Most + targets do not honor UTF-8, in which case this needs to be specified. + Each term in a query will be converted if this setting is given. 
+ + + + pz:xslt @@ -722,6 +847,20 @@ Provides the path of an XSLT stylesheet which will be used to map incoming records to the internal representation. + + When mapping MARC XML records, XSLT can be bypassed for increased + performance with the alternate "MARC map" format. Provide the + path of a file with extension ".mmap" containing on each line: + + <field> <subfield> <metadata element> + For example: + + 245 a title + 500 $ description + 773 * citation + To map the field value, specify a subfield of '$'. To store a + concatenation of all subfields, specify a subfield of '*'. + @@ -749,7 +888,7 @@ Controls the maximum number of records to be retrieved from a - server. The default is 100 (not yet implemented). + server. The default is 100. @@ -784,20 +923,66 @@ + + + pz:sru + + + This setting enables SRU/SRW support. It has three possible values. + 'get' enables SRU access through GET requests. 'post' enables SRU/POST + support, less commonly supported, but useful if very large requests are + to be submitted. 'srw' enables the SRW variation of the protocol. + + + + + + pz:sru_version + + + This allows the SRU version to be specified. If unset, Pazpar2 + will use the default of YAZ (currently 1.2). Should be set + to 1.1 or 1.2. + + + + + + pz:pqf_prefix + + + Allows you to specify an arbitrary PQF query language substring. The provided + string is prefixed to the user's query after it has been normalized to PQF + internally in pazpar2. This allows you to attach complex 'filters' to + queries for a given target, sometimes necessary to select sub-catalogs + in union catalog systems, etc. + + + + + + pz:sort + + + Specifies sort criteria to be applied to the result set. Only works for targets + which support the sort service. + + + SEE ALSO - Pazpar2: pazpar2 8 - - - Pazpar2 protocol: + + yaz-icu + 1 + pazpar2_protocol 7