X-Git-Url: http://lists.indexdata.com/cgi-bin?a=blobdiff_plain;f=doc%2Fpazpar2_conf.xml;h=57427cae7b4350530859df5539f510e63bdd53db;hb=23a2402edc299e4ec53b5deabce34fc306cbe848;hp=9605df3dc4f69cf9926fbff9b8beb6a832eea83b;hpb=4f6b54c65f9ac5b0760bc11eb51747857c1d5589;p=pazpar2-moved-to-github.git
diff --git a/doc/pazpar2_conf.xml b/doc/pazpar2_conf.xml
index 9605df3..57427ca 100644
--- a/doc/pazpar2_conf.xml
+++ b/doc/pazpar2_conf.xml
@@ -133,7 +133,7 @@
mergekey and facets - for the server. These definitions serves as
default for services that don't have these given. For the meaning
of these settings refer to the
- "icu_chain" element inside service.
+ element inside service.
@@ -163,7 +163,7 @@
- service
+ service
This nested element controls the behavior of Pazpar2 with
@@ -262,13 +262,35 @@
Specifies that this element is to be used to
help rank
records against the user's query (when ranking is
- requested). The value is an integer, used as a
- multiplier against the basic TF*IDF score. A value of
- 1 is the base, higher values give additional
- weight to
+ requested).
+ The value is of the form
+
+ M [F N]
+
+ where M is an integer, used as a
+ weight against the basic TF*IDF score. A value of
+ 1 is the base, higher values give additional weight to
elements of this type. The default is '0', which
excludes this element from the rank calculation.
+
+ F is a CCL field and N is the multiplier for terms
+ that match that part of the CCL field in the search.
+ The F+N combo allows the system to use a different
+ multiplier for a certain field. For example, a rank value of
+ "1 au 3" gives a multiplier of 3 for
+ all terms that are part of the au(thor) terms and 1 for everything else.
+
+
+ For Pazpar2 1.6.13 and later, the rank may also be defined
+ "per-document", by the normalization stylesheet.
+
+
+ The per field rank was introduced in Pazpar2 1.6.15. Earlier
+ releases only allowed a rank value M (simple integer).
+
+ See for more
+ about ranking.
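As an illustration of the rank syntax described in this hunk, a metadata definition might look like the sketch below. Only the value "1 au 3" comes from the text; the metadata names and the plain value "2" are assumptions chosen for the example.

```xml
<!-- Sketch only: metadata names are illustrative assumptions. -->
<!-- M=1 for ordinary terms; terms searched in the CCL field "au" get multiplier 3. -->
<metadata name="author" rank="1 au 3"/>
<!-- Simple integer form (M only), the only form accepted before 1.6.15. -->
<metadata name="title" rank="2"/>
```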
@@ -300,6 +322,11 @@
all elements), or 'no' (don't merge; this is the
default);
+
+ Pazpar2 1.6.24 also offers a new value for merge, 'first', which
+ is like 'all' but only takes all from the first database that returns
+ the particular metadata field.
+
@@ -326,6 +353,55 @@
+
+
+ facetrule
+
+
+ Specifies the ICU rule set to be used for normalizing
+ facets. If facetrule is omitted from metadata, the
+ rule set 'facet' is used.
+
+
+
+
+
+ limitcluster
+
+
+ Allow a limit on merged metadata. The value of this attribute
+ is the name of actual metadata content to be used for matching
+ (most often same name as metadata name).
+
+
+
+ Requires Pazpar2 1.6.23 or later.
+
+
+
+
+
+
+ limitmap
+
+
+ Specifies a default limitmap for this field. This is to avoid mass
+ configuring of targets. However, it is important to review or set this on a
+ per-target basis, since it is usually target-specific. See limitmap for format.
+
+
+
+
+
+ facetmap
+
+
+ Specifies a default facetmap for this field. This is to avoid mass
+ configuring of targets. However, it is important to review or set this on a
+ per-target basis, since it is usually target-specific. See facetmap for format.
+
+
+
setting
@@ -358,7 +434,21 @@
- icu_chain
+ xslt
+
+
+ Defines an XSLT stylesheet. The xslt
+ element takes exactly one attribute id
+ which names the stylesheet. This can be referred to in target
+ settings .
+
+
+ The content of the xslt element is the embedded stylesheet XML.
+
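A minimal sketch of an embedded stylesheet, under stated assumptions: the id "example-normalize", the source element "title" and the target ID "example-target" are hypothetical, and the stylesheet body only hints at what a real normalization stylesheet would do.

```xml
<!-- Sketch only: id, target and matched element names are assumptions. -->
<xslt id="example-normalize">
  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:pz="http://www.indexdata.com/pazpar2/1.0">
    <xsl:template match="/">
      <pz:record>
        <pz:metadata type="title">
          <xsl:value-of select="//title"/>
        </pz:metadata>
      </pz:record>
    </xsl:template>
  </xsl:stylesheet>
</xslt>

<!-- A target setting can then refer to the stylesheet by its id. -->
<set target="example-target" name="pz:xslt" value="example-normalize"/>
```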
+
+
+
+ icu_chain
Specifies a named ICU rule set. The icu_chain element must include
@@ -368,8 +458,10 @@
Rule set 'relevance' is used to normalize
terms for relevance ranking. Rule set 'sort' is used to
normalize terms for sorting. Rule set 'mergekey' is used to
- normalize terms for making a mergekey and, finally, 'facet'
- is used to normalize facet terms (AKA termlists).
+ normalize terms for making a mergekey. Finally, rule set 'facet'
+ is normally used to normalize facet terms, unless
+ facetrule is given for a
+ metadata field.
The icu_chain element must also include a 'locale'
@@ -494,7 +586,138 @@
+
+
+ ccldirective
+
+
+ Customizes the CCL parsing (interpretation of query parameter
+ in search).
+ The name and value of the CCL directive are given by attributes
+ 'name' and 'value' respectively. Refer to the list of possible names
+ in the
+
+ YAZ manual
+ .
+
+
+
+
+
+ rank
+
+
+ Customizes the ranking (relevance) algorithm. Also known as
+ rank tweaks. The rank element
+ accepts the following attributes - all being optional:
+
+
+
+ cluster
+
+
+ Attribute 'cluster' is a boolean
+ that controls whether Pazpar2 should boost ranking for merged
+ records. The default is 'yes'. A value of 'no' will make
+ Pazpar2 average the ranking of each record in a cluster.
+
+
+
+
+ debug
+
+
+ Attribute 'debug' is a boolean
+ that controls whether Pazpar2 should include details
+ about ranking for each document in the show command's
+ response. Enable by using value "yes", disable by using
+ value "no" (default).
+
+
+
+
+ follow
+
+
+ Attribute 'follow' is a floating point number greater than
+ or equal to 0. A positive number will boost weight for terms
+ that occur close to each other (proximity, distance).
+ A value of 1 will double the weight if two terms are in
+ proximity distance of 1 (next to each other). The default
+ value of 'follow' is 0 (order will not affect weight).
+
+
+
+
+ lead
+
+
+ Attribute 'lead' is a floating point number.
+ It controls whether term weight should be reduced by position
+ from the start of a metadata field. A positive value of 'lead'
+ will reduce the weight of a term as it appears further away from the lead
+ of the field. Default value is 0 (no reduction of weight by
+ position).
+
+
+
+
+ length
+
+
+ Attribute 'length' determines how/if term weight should be
+ divided by the length of the metadata field. A value of "linear"
+ will divide by length. A value of "log" will divide by log2(length).
+ A value of "none" will leave term weight as is (no division).
+ Default value is "linear".
+
+
+
+
+
+ Refer to to see how
+ these tweaks are used in computation of score.
+
+
+ Customization of ranking algorithm was introduced with
+ Pazpar2 1.6.18. The semantics of some of the fields changed
+ in versions up to 1.6.22.
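The attributes listed above can be combined on a single rank element. The first line of this sketch simply spells out the documented defaults; the second shows one possible tweak whose values are assumptions picked for illustration, not recommendations.

```xml
<!-- Equivalent to omitting the element: all documented defaults. -->
<rank cluster="yes" debug="no" follow="0" lead="0" length="linear"/>

<!-- Illustrative tweak (values are assumptions): report ranking details,
     boost adjacent terms, and divide term weight by log2(length). -->
<rank debug="yes" follow="1" length="log"/>
```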
+
+
+
+
+ sort-default
+
+
+ Specifies the default sort criteria (default 'relevance'),
+ which previously was hard-coded as the default criteria in search.
+ This is a fix/work-around to avoid re-searching when using
+ target-based sorting. In order for this to work efficiently,
+ the search must also have the sort criteria parameter; otherwise
+ pazpar2 will do re-searching on sort criteria changes, if
+ changed between search and show command.
+
+
+ This configuration was added in pazpar2 1.6.20.
+
+
+
+
+
settings
@@ -556,7 +779,7 @@
type="year" merge="range" termlist="yes"/>
-
+
@@ -578,7 +801,7 @@
The XML configuration may be partitioned into multiple files by using
the include element which takes a single attribute,
- src. The of the src attribute is
+ src. The src attribute is a
regular Shell like glob-pattern. For example,
@@ -652,29 +875,31 @@
- The following discussion of practical issues related to session and settings
- management are cast in terms of a user interface based on Ajax/Javascript
- technology. It would apply equally well to many other kinds of browser-based logic.
+ The following discussion of practical issues related to session
+ and settings management is cast in terms of a user interface based on
+ Ajax/Javascript technology. It would apply equally well to many other
+ kinds of browser-based logic.
- Typically, a Javascript client is not allowed to directly alter the parameters
- of a session. There are two reasons for this. One has to do with access
- to information; typically, information about a user will be stored in a
- system on the server side, or it will be accessible in some way from the server.
- However, since the Javascript client cannot be entirely trusted (some hostile
- agent might in fact 'pretend' to be a regular ws client), it is more robust
- to control session settings from scripting that you run as part of your
- webserver. Typically, this can be handled during the session initialization,
- as follows:
+ Typically, a Javascript client is not allowed to directly alter the
+ parameters of a session. There are two reasons for this. One has to do
+ with access to information; typically, information about a user will
+ be stored in a system on the server side, or it will be accessible in
+ some way from the server. However, since the Javascript client cannot
+ be entirely trusted (some hostile agent might in fact 'pretend' to be
+ a regular ws client), it is more robust to control session settings
+ from scripting that you run as part of your webserver. Typically, this
+ can be handled during the session initialization, as follows:
- Step 1: The Javascript client loads, and asks the webserver for a new Pazpar2
- session ID. This can be done using a Javascript call, for instance. Note that
- it is possible to submit Ajax HTTPXmlRequest calls either to Pazpar2 or to the
- webserver that Pazpar2 is proxying for. See (XXX Insert link to Pazpar2 protocol).
+ Step 1: The Javascript client loads, and asks the webserver for a
+ new Pazpar2 session ID. This can be done using a Javascript call, for
+ instance. Note that it is possible to submit Ajax XMLHttpRequest calls
+ either to Pazpar2 or to the webserver that Pazpar2 is proxying
+ for. See (XXX Insert link to Pazpar2 protocol).
@@ -684,19 +909,20 @@
- Step 3: The webserver initializes a new Pazpar2 settings, and sets user-specific
- parameters as necessary, using the init webservice command. A new session ID is
- returned.
+ Step 3: The webserver initializes a new Pazpar2 session, and sets
+ user-specific parameters as necessary, using the init webservice
+ command. A new session ID is returned.
- Step 4: The webserver returns this session ID to the Javascript client, which then
- uses the session ID to submit searches, show results, etc.
+ Step 4: The webserver returns this session ID to the Javascript
+ client, which then uses the session ID to submit searches, show
+ results, etc.
- Step 5: When the Javascript client ceases to use the session, Pazpar2 destroys
- any session-specific information.
+ Step 5: When the Javascript client ceases to use the session,
+ Pazpar2 destroys any session-specific information.
@@ -704,8 +930,8 @@
Each file contains a root element named <settings>. It may
contain one or more <set> elements. The settings and set
- elements may contain the following attributes. Attributes in the set node
- overrides those in the setting root element. Each set node must
+ elements may contain the following attributes. Attributes in the set
+ node override those in the settings root element. Each set node must
specify (directly, or inherited from the parent node) at least a
target, name, and value.
@@ -734,6 +960,11 @@
multiple overlapping settings with the same name and target
value, the 'precedence' attribute determines what happens.
+
+ For Pazpar2 1.6.4 or later, the target ID may be user-defined, in
+ which case the actual host, port, etc. is given by setting
+ .
+
@@ -970,13 +1201,21 @@
- pz:xslt
+ pz:xslt
- Is a comma separated list of of files that specifies
+ Is a comma separated list of stylesheet names that specify
how to convert incoming records to the internal representation.
+ For each name, the embedded stylesheets (XSL) that come with the
+ service definition are consulted first and take precedence over
+ external files; see
+ of service definition.
+ If the name does not match an embedded stylesheet it is
+ considered a filename.
+
+
The suffix of each file specifies the kind of tranformation.
Suffix ".xsl" makes an XSL transform. Suffix
".mmap" will use the MMAP transform (described below).
@@ -1035,6 +1274,35 @@
+ pz:extendrecs
+
+
+ If a show command goes to the boundary of a result set for a
+ database (depending on sorting) and pz:extendrecs is set to a positive
+ value, then Pazpar2 waits for show to fetch pz:extendrecs more
+ records. This setting is best used if a database does native
+ sorting, because the result set otherwise may be completely
+ re-sorted during extended fetch.
+ The default value of pz:extendrecs is 0 (no extended fetch).
+
+
+
+ The pz:extendrecs setting appeared in Pazpar2 version 1.6.26,
+ but the behavior changed with the release of Pazpar2 1.6.29.
+
+
+
+
+
+ pz:presentchunk
+
+
+ Controls the chunk size in present requests. Pazpar2 will
+ make (maxrecs / chunk) request(s). The default is 20.
+
+
+
+
pz:id
@@ -1071,7 +1339,7 @@
This setting enables
- SRU/SOLR
+ SRU/Solr
support.
It has four possible settings.
'get', enables SRU access through GET requests. 'post' enables SRU/POST
@@ -1080,7 +1348,7 @@
the protocol.
- A value of 'solr' anables SOLR client support. This is supported
+ A value of 'solr' enables Solr client support. This is supported
for Pazpar version 1.5.0 and later.
@@ -1092,7 +1360,7 @@
This allows SRU version to be specified. If unset Pazpar2
will the default of YAZ (currently 1.2). Should be set
- to 1.1 or 1.2. For SOLR, the current supported/tested version is 1.4
+ to 1.1 or 1.2. For Solr, the currently supported/tested versions are 1.4 and 3.x.
@@ -1102,7 +1370,7 @@
Allows you to specify an arbitrary PQF query language substring.
- The provided string is prefixed the user's query after it has been
+ The provided string is prefixed to the user's query after it has been
normalized to PQF internally in pazpar2.
This allows you to attach complex 'filters' to queries for a given
target, sometimes necessary to select sub-catalogs
@@ -1125,6 +1393,17 @@
@and @attr 1=30 @attr 2=3 %Y %%
would search for current year combined with the original PQF (%%).
+
+ This setting can also be used as a more general alternative to
+ pz:pqf_prefix: a way of embedding the submitted query
+ anywhere in the string rather than appending it to a prefix. For
+ example, if it is desired to omit all records satisfying the
+ query @attr 1=pica.bib 0007 then this
+ subquery can be combined with the submitted query as the second
+ argument of an and-not (@not) by using the
+ pz:pqf_strftime value @not %% @attr 1=pica.bib
+ 0007.
+
@@ -1143,12 +1422,13 @@
Specifies a filter which allows Pazpar2 to only include
- records that meet a certain criteria in a result. Unmatched records
- will be ignored. The filter takes the form name, name~value, or name=value, which
+ records that meet certain criteria in a result.
+ Unmatched records will be ignored.
+ The filter takes the form name, name~value, or name=value, which
will include only records with metadata element (name) that has the
- substring (~value) given, or matches exactly (=value). If value is omitted all records
- with the named
- metadata element present will be included.
+ substring (~value) given, or matches exactly (=value).
+ If value is omitted all records with the named metadata element
+ present will be included.
@@ -1157,18 +1437,43 @@
pz:preferred
- Specifies that a target is preferred, e.g. possible local, faster target. Using block=pref on show command
- will wait for all these targets to return records before releasing the block. If no target is preferred,
- the block=pref will identical to block=1, which release when one target has returned records.
+ Specifies that a target is preferred, e.g. a possibly local, faster
+ target. Using block=pref on the show command will wait for all these
+ targets to return records before releasing the block.
+ If no target is preferred, block=pref is identical to block=1,
+ which releases the block when one target has returned records.
-
pz:block_timeout
- (Not yet implemented). Specifies the time for which a block should be released anyway.
+ (Not yet implemented).
+ Specifies the time after which a block should be released anyway.
+
+
+
+
+ pz:termlist_term_count
+
+
+ Specifies number of facet terms to be requested from the target.
+ The default is unspecified, i.e. server-decided. Also see pz:facetmap.
+
+
+
+
+ pz:termlist_term_factor
+
+
+ Specifies whether to use a factor for pazpar2 generated facets (1) or not (0).
+ When mixing locally generated facets (based on the downloaded (pz:maxrecs) samples)
+ with native (target-generated) facets, the latter will dominate the facet list
+ since they are generated based on the complete result set.
+ By scaling up the facet count using the ratio between total hit count and the sample size,
+ the total facet count can be approximated and thus better compared with native facets.
+ This is not enabled by default.
@@ -1183,39 +1488,86 @@
- At this point only SOLR targets have been tested with this
+ At this point only Solr targets have been tested with this
facility.
-
+
pz:limitmap:name
Specifies attributes for limiting a search to a field - using
- the limit parameter for search. In some cases the mapping of
+ the limit parameter for search. It can be used to filter locally
+ or remotely (search in a target). In some cases the mapping of
a field to a value is identical to an existing cclmap field; in
other cases the field must be specified in a different way - for
example to match a complete field (rather than parts of a subfield).
- The value of limitmap may have one of two forms: referral to
- an exisiting CCL field or a raw PQF string. Leading string
- determines type; either ccl: for CCL field or
- rpn: for PQF/RPN.
+ The value of limitmap may have one of three forms: referral to
+ an existing CCL field, a raw PQF string or a local limit. Leading string
+ determines type; either ccl: for CCL field,
+ rpn: for PQF/RPN, or local:
+ for filtering in Pazpar2. The local filtering may be followed
+ by the name of a metadata field (default is to use the name of the
+ limitmap itself).
+
+
+ For Pazpar2 version 1.6.23 and later the limitmap may include multiple
+ specifications, separated by , (comma).
+ For example:
+ ccl:title,local:ltitle,rpn:@attr 1=4.
The limitmap facility is supported for Pazpar2 version 1.6.0.
+ Local filtering is supported in Pazpar2 1.6.6.
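The three limitmap forms might be configured as in the following sketch. The multi-specification value is the example given in the text; the limit names, the wildcard target "*", the CCL field "au" and the empty local: form (relying on the stated default of using the limitmap's own name) are assumptions for illustration.

```xml
<!-- Sketch only: limit names and field choices are assumptions. -->
<settings target="*">
  <!-- Refer to an existing CCL field. -->
  <set name="pz:limitmap:author" value="ccl:au"/>
  <!-- Filter locally in Pazpar2 on the metadata field named "date". -->
  <set name="pz:limitmap:date" value="local:"/>
  <!-- Multiple specifications, comma separated (Pazpar2 1.6.23 and later). -->
  <set name="pz:limitmap:title" value="ccl:title,local:ltitle,rpn:@attr 1=4"/>
</settings>
```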
+
+
+
+
+
+
+ pz:url
+
+
+ Specifies URL for the target and overrides the target ID.
+
+
+
+ pz:url is only recognized for
+ Pazpar2 1.6.4 and later.
+
+
+
+
+
+
+ pz:sortmap:field
+
+
+ Specifies native sorting for a target where
+ field is a sort criterion (see command
+ show). The value has two components separated by a colon: strategy and
+ native-field. Strategy is one of z3950,
+ type7, cql,
+ sru11, or embed.
+ The second component, native-field, is the field that is recognized
+ by the target.
+
+
+
+ Only supported for Pazpar2 1.6.4 and later.
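A sketch of how pz:url and pz:sortmap might be combined for a target with a user-defined ID. The target ID "mylib", the host and database in the URL, and the native field name "title" are all hypothetical; only the setting names and the strategy:native-field shape come from the text.

```xml
<!-- Sketch only: target ID, URL and native field are assumptions. -->
<settings target="mylib">
  <!-- User-defined target ID; the actual address is given by pz:url. -->
  <set name="pz:url" value="z3950.example.org:210/Default"/>
  <!-- Sort criterion "title" is handled natively: strategy z3950, native field "title". -->
  <set name="pz:sortmap:title" value="z3950:title"/>
</settings>
```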
-
+