<chapter id="introduction">
 <!-- $Id: introduction.xml,v 1.4 2002-04-09 19:20:22 adam Exp $ -->
 <title>Introduction</title>
 <title>Overview</title>
 <para>
  The Zebra system is a fielded free-text indexing and retrieval engine with a
  Z39.50 front-end. You can use any commercial or freeware Z39.50 client
  to access data stored in Zebra.
 </para>
 <para>
  The Zebra server is our first step towards the development of a fully
  configurable, open information system. Eventually, it will be paired
  with a powerful Z39.50 client to support complex information
  management tasks within almost any application domain. We're making
  the server available now because it's no fun to be in the open
  information retrieval business all by yourself. We want to allow
  people with interesting data to make it available in interesting
  ways, without first having to implement yet another protocol stack
  from scratch.
 </para>
 <para>
  This document is an introduction to the Zebra system. It explains
  how to compile the software and how to prepare your first database.
  It also explains how the server can be configured to give you the
  functionality that you need.
 </para>
 <para>
  If you find the software interesting, you should join the support
  mailing list by sending email to
  <literal>zebra-request@indexdata.dk</literal>.
 </para>
 <title>Features</title>
 <para>
  This is a list of some of the most important features of the
  Zebra system:
 </para>
 <itemizedlist>
  <listitem>
   <para>
    Supports updating: records can be added and deleted without
    rebuilding the index from scratch.
    The update procedure tolerates crashes and hard interrupts
    during register updating; registers can be reconstructed following
    a crash.
    Registers can be safely updated even while users are accessing
    the server.
   </para>
  </listitem>
  <listitem>
   <para>
    Supports large databases: files for indices, etc. can be
    automatically partitioned over multiple disks.
   </para>
  </listitem>
  <listitem>
   <para>
    Supports arbitrarily complex records: the base input format is an
    SGML-like syntax which allows nested (structured) data elements, as
    well as variant forms of data.
   </para>
  </listitem>
  <listitem>
   <para>
    Supports a wide range of storage formats. A system of input filters
    driven by regular expressions allows you to easily process most
    ASCII-based data formats. SGML, XML, ISO2709 (MARC), and raw text
    are also supported.
   </para>
  </listitem>
  <listitem>
   <para>
    Supports boolean queries as well as relevance-ranking (free-text)
    searching. Right truncation and masking in terms are supported, as
    well as full regular expressions.
   </para>
  </listitem>
  <listitem>
   <para>
    Supports multiple concrete syntaxes
    for record exchange (depending on the configuration): GRS-1, SUTRS,
    XML, ISO2709 (*MARC). Records can be mapped between record syntaxes
    and schemas on the fly.
   </para>
  </listitem>
  <listitem>
   <para>
    Supports approximate matching in registers (e.g. spelling mistakes).
   </para>
  </listitem>
  <listitem>
   <para>
    Protocol facilities: Init, Search, Retrieve, Delete, Browse and Sort.
   </para>
  </listitem>
  <listitem>
   <para>
    Piggy-backed presents are honored in the search request.
   </para>
  </listitem>
  <listitem>
   <para>
    Named result sets are supported.
   </para>
  </listitem>
  <listitem>
   <para>
    Easily configured to support different application profiles, with
    tables for attribute sets, tag sets, and abstract syntaxes.
    Additional tables control facilities such as element mappings to
    different schemas (e.g., GILS-to-USMARC).
   </para>
  </listitem>
  <listitem>
   <para>
    Complex composition specifications using Espec-1 are partially
    supported (simple element requests only).
   </para>
  </listitem>
  <listitem>
   <para>
    Element Set Names are defined using the Espec-1 capability of the
    system, and are given in configuration files as simple element
    requests (and possibly variant requests).
   </para>
  </listitem>
  <listitem>
   <para>
    Some variant support (not fully implemented yet).
   </para>
  </listitem>
  <listitem>
   <para>
    Zebra runs on most Unix-like systems as well as Windows NT; a binary
    distribution for Windows NT is available.
   </para>
  </listitem>
 </itemizedlist>
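 <para>
  To make a few of the features above more concrete (updating without a
  full rebuild, boolean searching, and right truncation), here is a
  minimal Python sketch of a toy in-memory inverted index. It is not
  Zebra's implementation and says nothing about Zebra's register format:
  the class <literal>TinyIndex</literal>, its method names, and the
  trailing <literal>*</literal> used to mark right truncation are all
  assumptions invented for this illustration.
 </para>

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index (hypothetical; not how Zebra stores registers)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of records containing it
        self.records = {}                 # record id -> set of its terms

    def add(self, rec_id, text):
        """Index one record; other records' postings are left untouched."""
        if rec_id in self.records:
            self.delete(rec_id)  # re-adding an id replaces the old record
        terms = set(text.lower().split())
        self.records[rec_id] = terms
        for term in terms:
            self.postings[term].add(rec_id)

    def delete(self, rec_id):
        """Remove one record's postings without rebuilding the index."""
        for term in self.records.pop(rec_id, set()):
            ids = self.postings[term]
            ids.discard(rec_id)
            if not ids:
                del self.postings[term]

    def lookup(self, term):
        """Exact lookup, or right truncation if the term ends in '*'.

        The '*' marker is an assumed notation for this sketch only.
        """
        term = term.lower()
        if term.endswith("*"):
            prefix = term[:-1]
            hits = set()
            for indexed_term, ids in self.postings.items():
                if indexed_term.startswith(prefix):
                    hits.update(ids)
            return hits
        return set(self.postings.get(term, set()))

    def search_and(self, *terms):
        """Boolean AND: ids of records that match every given term."""
        if not terms:
            return set()
        result = self.lookup(terms[0])
        for term in terms[1:]:
            result = result.intersection(self.lookup(term))
        return result


if __name__ == "__main__":
    index = TinyIndex()
    index.add(1, "open information retrieval")
    index.add(2, "information management tasks")
    index.add(3, "retrieval engine")
    print(sorted(index.search_and("information", "retriev*")))  # [1]
    index.delete(1)
    print(sorted(index.search_and("information", "retriev*")))  # []
```

 <para>
  Each record's term set is kept alongside the postings so that a delete
  only has to visit the terms of that one record. A real engine would
  additionally need on-disk storage and crash recovery, which this
  sketch deliberately leaves out.
 </para>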
 <title>Future Work</title>
 <para>
  These are some of the plans that we have for the software in the near
  and far future, listed roughly in order of importance.
  Items marked with an
  asterisk will be implemented before the
  others.
 </para>
 <itemizedlist>
  <listitem>
   <para>
    *Complete the support for variants.
   </para>
  </listitem>
  <listitem>
   <para>
    *Finalize the data element <emphasis>include</emphasis> facility
    to support multimedia data elements in records.
   </para>
  </listitem>
  <listitem>
   <para>
    Add more sophisticated relevance ranking mechanisms.
    Add support for soundex and stemming.
    Add relevance <emphasis>feedback</emphasis> support.
   </para>
  </listitem>
  <listitem>
   <para>
    Complete EXPLAIN support.
   </para>
  </listitem>
  <listitem>
   <para>
    Add support for very large records by implementing segmentation and/or
   </para>
  </listitem>
  <listitem>
   <para>
    Support the Item Update extended service of the protocol.
   </para>
  </listitem>
  <listitem>
   <para>
    We want to add a management system that allows you to
    control your databases and configuration tables from a graphical
    interface.
   </para>
  </listitem>
 </itemizedlist>
 <para>
  Programmers thrive on user feedback. If you are interested in a
  facility that you don't see mentioned here, or if there's something
  you think we could do better, please send us an email.
  If you think it's all really neat, you're welcome to drop us a line
  saying that, too. You'll find contact information at the end of this file.
 </para>
 <!-- Keep this comment at the end of the file
 sgml-minimize-attributes:nil
 sgml-always-quote-attributes:t
 sgml-parent-document: "zebra.xml"
 sgml-local-catalogs: nil
 sgml-namecase-general:t