Potential users of TSNLP results, or NLP developers who want to build on and extend the TSNLP test suites, should consult the User Manual before turning to the wealth of publications and project reports. In several cases, however, the User Manual does not duplicate the full information contained in a given project report but gives a reference instead; the complete set of TSNLP project reports is therefore preserved for technical details and as background information. The TSNLP User Manual comes in three volumes:
Volume 1 contains a background chapter that sketches some of the factors which influenced the design of the project. The methodology chapter gives a step-by-step account of how one can go about writing core data, that is, data that cover central phenomena of a language and that are intended to be applicable to a wide range of applications. The customisation chapter describes how the core data can be customised to a particular application (and sketches how it could be customised to a particular domain or text type). The chapter on testing gives an example of how the test suite can be applied to a real-life evaluation scenario.
Volume 2 contains a description of the annotation scheme on which the data was constructed, the construction tool tsct(1) used to create the data, the database tsdb(1) on which the data is mounted, and the automated import and consistency checking procedure from tsct(1) to tsdb(1).
Volume 2b documents the test suite generation (AutoTSG) and lexical replacement tools.
Volume 3 contains the detailed documentation that accompanies the data, and which is intended to make the data more accessible to users. It also contains the category and function labels used in the English, French, and German test data with examples for each language, in addition to the vocabulary list used for the test data by the three languages.
Please note that the title page of most of the TSNLP user manual volumes contains colour PostScript, which causes older versions of ghostview(1) to report an error. Even if your version of ghostview(1) does not support colour, you can browse the entire document (except for the title page) by advancing to the following pages; moreover, the documents print without problems on all black and white printers that we have access to.
University of Essex, UK
This volume consists of four chapters. The Background chapter
discusses the factors which influenced the design of the project.
The Methodology chapter gives a step by step account of how one can go
about writing core data, that is, data that cover central phenomena of
a language and that are intended to be applicable to a wide range of
applications.
The Customisation chapter describes how the core data can be customised
to a particular application (and sketches how it could be customised to
a particular domain or text type).
The chapter on Testing gives an example of how the test suite can be
applied to a real life evaluation scenario.
Available: `.ps' file and `.bib' entry.
Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI) GmbH
Because the construction of test data, as well as the customization
and application of a multi-purpose test suite to a specific NLP
system or domain, are laborious, cost-intensive and error-prone tasks,
TSNLP put strong emphasis on supplying suitable special-purpose
technology to facilitate both the development and the use of
the TSNLP test data.
The TSNLP core technology is designed to support both developers
of additional or new test data and users who plan to apply the TSNLP
data to a given system or domain.
It comprises the following software packages:
Available: `.ps' file and `.bib' entry.
University of Essex, UK
This volume of the TSNLP user manual describes the tools developed
during the project to automate the process of test item construction.
The volume contains three chapters:
Available: `.ps' file and `.bib' entry.
Istituto Dalle Molle per gli Studii Semantici e Cognitivi (ISSCO)
This volume of the TSNLP user manual describes the TSNLP test data
available through the TSNLP database. The documentation explains how
the various phenomena have been treated in the three languages
English, French and German. Not all the phenomena have been covered
to the same extent in all languages, and in some cases they have not
been covered at all in one or more languages.
The document provides an overview of the number of existing test
sentences for the different phenomena.
Furthermore, it includes two annexes which present:
Available: `.ps' file and `.bib' entry.
Background, Methodology, Customization, and Testing.
Lorna Balkan, Frederik Fouvry, Sylvie Regnier-Prost (editors).
Core Test Suite Technology.
Stephan Oepen, Frederik Fouvry, Klaus Netter, Tom Fettig,
Fred Oberhauser.
Since both the test suite construction tool and the test suite database
crucially build on the TSNLP annotation schema (the formal
specification of properties and values used in classifying and
organizing TSNLP test data), the abstract annotation schema is
reviewed in section 2 and then related to its implementations in
tsct(1) and tsdb(1).
Test Suite Tools.
Frederik Fouvry (editor).
Chapters 1 and 3, and this introduction, are written by Doug Arnold;
chapter 2 is written by Martin Rondell. The revisions (this volume is
a slightly revised and updated version of the
TSNLP deliverable WP 5.1)
are made by Frederik Fouvry.
As regards the Prolog and Lisp code, the code for the engine is due to
Dave Moffat (who did the original implementation), Doug Arnold, and
Martin Rondell.
The GUI code is by Martin Rondell, apart from the code dealing with the
viewing of trees.
The Lisp code for the lexical replacement tool is due to Doug Arnold
and Frederik Fouvry.
Data Documentation.
Sabine Lehmann, Dominique Estival, Kirsten Falkedal, Hervé Compagnion,
Lorna Balkan, Frederik Fouvry, Judith Baur, Judith Klein.
The complete set of TSNLP test data can be accessed through the TSNLP
home page
http://www.delph-in.net/tsnlp/.
last modified: 21-may-96