In addition, the project has produced a relatively extensive bibliography of work on NLP evaluation and diagnosis which, among others, can be searched on-line through the Essex Linguistic Bibliography server. Search results can be retrieved in several formats, including HTML, Refer, LaTeX, and BibTeX.
Language Research Engineering Convention, London (July 1994).
This paper describes the LRE project TSNLP (Test Suites for Natural
Language Processing), which is concerned with some central issues in
the design and use of test suites.
The project combines theoretical
research with practical implementations, aiming to provide generally
usable tools and test data together with reports discussing
the theoretical background. The paper begins by setting
out the motivation, aims, and present state of the project,
then examines the methodological issues behind it.
Available:
`.ps' file,
`.dvi' file, and
`.bib' entry.
Translation and the Computer, 16th ASLIB Conference (November 1994).
Available:
`.ps' file and
`.dvi' file.
Cranfield International Conference on Machine Translation
(November 1994)
Available:
`.ps' file and
`.dvi' file.
Conference on Linguistic Databases, Groningen (March 1994)
We present recent results from the LRE project TSNLP (Test Suites
for Natural Language Processing) which is concerned with central issues
in the design and use of test suites.
The paper focusses on (i) the motivation for test suites in
comparison to (annotated) corpora; (ii) the theory-neutral
annotation schema developed in TSNLP; and (iii) the construction of
a linguistic database as a (virtual) meta test suite to ease
applications in diagnosis and evaluation.
Available: `.ps' file, `.dvi' file, and `.bib' entry.
Fourth International Conference on Cognitive Science of Natural
Language Processing, Dublin (1995)
Test suites have long been accepted as a useful evaluation tool in
Natural Language Processing, since they provide a more or less
systematic collection of specially constructed linguistic examples
(e.g. sentences) with annotations and other information. However,
existing test suites tend to be relatively unsystematic and lack
generality (having been constructed with particular systems in
mind). Moreover, there is no established methodology which a system
developer or other evaluator can follow in constructing a test suite
of their own.
The paper describes the goals and achievements to date of the TSNLP
project (Test Suites for Natural Language Processing), an LRE project
funded by the CEC, which seeks to address these problems. In
particular, the project aims to produce realistic and general
guidelines for test suite construction, and to construct
substantial test data in three languages (English, French, and
German). The bulk of the data (several thousand test items) will cover
``core'' syntactic phenomena and will be suitable for testing any
syntax-based system, but some application-specific data (for
parsers, grammar checkers and controlled language checkers) will also
be written.
To enhance its use and reusability, the data is being mounted onto a
database, for ease of access and manipulation. The construction
methodology and the test suites will be validated by testing a number
of NLP products.
The paper concentrates on design issues (e.g. the need for
systematicity in both well-formed and ill-formed data, the
``exhaustive'' coverage of closed classes, and consistency of
annotation across languages) and discusses the tools that have been
designed to aid and semi-automate the construction process (namely a
generation tool and lexical replacement tool).
All the results of the project, including the actual test suites, are,
or will be, in the public domain.
Available:
`.ps' file and
`.dvi' file.
Language Engineering Convention, London (October 1995).
Available:
`.ps' file and
`.dvi' file.
Workshop on Controlled Language Applications (CLAW),
Leuven (March 1996).
TSNLP produced guidelines and methodology for test suite writing, as
well as a substantial amount of annotated test data. To validate the
guidelines, the data, and the methodology for controlled language
test suites, we used them to test the controlled language
checker developed during the SECC project (Simplified English
Controlled Language Checker). We report here on what test suites for
controlled language should contain, how we realised the actual test
suite, and on the conclusions we could draw from all this.
Available:
`.ps' file,
`.dvi' file, and
`.bib' entry.
Traitement Automatique du Langage Naturel (TALN) Conference,
Marseille (May 1996).
The number of applications in the field of NLP has grown steadily
in recent years.
This development goes hand in hand with an increasing demand for tools
to evaluate these applications.
The TSNLP project responds to this demand by proposing a methodology
and tools for evaluation by means of test suites.
Besides an elaborate methodology for the construction of test
material, TSNLP has created the largest test suite database
currently available for French, English, and German.
In addition, the project has developed tools that facilitate the
construction, storage, and retrieval of the data.
The results of TSNLP will be publicly available. The project thus
offers linguistic resources that could become a proposed standard
for an evaluation model for any user of NLP applications.
Available:
`.ps' file,
`.dvi' file, and
`.bib' entry.
Le traitement automatique du langage et les applications
industrielles
Test suites are a useful evaluation tool for developers and users of
NLP products. The TSNLP project sets new standards for test suite
design. The paper gives an overview of the TSNLP design and
methodology and describes how the TSNLP data and methodology can be
used in practice to provide a reliable assessment method of the
linguistic capabilities of NLP products.
Available:
`.ps'
file and
`.bib' entry.
COLING, Copenhagen (August 1996).
The growing language technology industry needs measurement
tools to allow researchers, engineers, managers, and customers
to track development, evaluate and assure quality, and assess
suitability for a variety of applications.
The TSNLP (Test Suites for Natural Language Processing)
project has investigated various aspects of the construction,
maintenance and application of systematic test suites as
diagnostic and evaluation tools for NLP applications.
The paper summarizes the motivation and main results of
TSNLP: besides the solid methodological foundation of the
project, TSNLP has produced substantial (i.e. larger than
any existing general test suites) multi-purpose and multi-user
test suites for three European languages together with a
set of specialized tools that facilitate the construction,
extension, maintenance, retrieval, and customization of the
test data.
The publicly available results of TSNLP represent a valuable
linguistic resource that has the potential of providing a wide-spread
pre-standard diagnostic and evaluation tool for both developers and
users of NLP applications.
Available:
`.ps' file and
`.bib' entry.
John Nerbonne (editor): Linguistic Databases. CSLI Lecture
Notes (forthcoming).
The objective of the TSNLP project is to construct test suites in three
different languages building on a common basis and methodology.
Specifically, TSNLP addresses a range of issues related to the
construction and use of test suites.
The main goals of the project are to:
In the present paper the authors take the opportunity to present some
of the recent outcomes of TSNLP to the community of language
technology developers as well as to potential users of NLP
systems.
Accordingly, the presentation puts emphasis on practical aspects of
applicability and plausibility rather than on theoretically demanding
research topics; the TSNLP results presented are of both
methodological and technological interest.
Available:
`.ps' file,
`.dvi' file (no graphics), and
`.bib' entry.
3rd ALEP User Group Workshop, Saarbrücken (February 1997).
A recent addition to the ALEP grammar engineering platform is
described: the test suite apparatus and test data produced in the TSNLP
project have been seamlessly integrated with the ALEP task executor.
The resulting test suite extension to ALEP is well-suited to substitute
for the existing naive testing environment, greatly increases testing
and report generation flexibility and fixes several (previously
unknown) errors in the timing and coverage measures computed by the
test suite processor.
For backward compatibility the previous testing functionality is
preserved in the current ALEP version (3.2 as of January 1997).
Available:
`.ps' file,
`.dvi' file, and
`.bib' entry.
University of Essex, March 1994
Available:
`.ps' file,
`.dvi' file, and
`.bib' entry.
University of Essex, September 1994
Available:
`.ps' file,
`.dvi' file, and
`.bib' entry.
University of Essex, September 1994
Available:
`.ps' file,
`.dvi' file, and
`.bib' entry.
University of Essex, September 1994
Available:
`.ps' file,
`.dvi' file, and
`.bib' entry.
University of Essex, December 1994
Available:
`.ps' file,
`.dvi' file, and
`.bib' entry.
University of Essex, January 1995
Available:
`.ps' file,
`.dvi' file, and
`.bib' entry.
University of Essex, April 1995
Available:
`.ps' file and
`.bib' entry.
DFKI Saarbrücken, June 1995
Design parameters and key desiderata of the linguistic database used to
store, maintain, and retrieve TSNLP test data are laid out, viz.
In addition, this report documents the test suite construction tool
(called tsct(1)) used in the writing of TSNLP test
data and sketches the automatic import procedure from tsct(1) into
tsdb(1) format.
Available:
`.ps' file,
`.dvi' file (no graphics), and
`.bib' entry.
ISSCO, Université de Genève, December 1995
Available:
`.ps' file,
`.dvi' file, and
`.bib' entry.
University of Essex, November 1995
Available:
`.ps' file and
`.bib' entry.
TSNLP. Test Suites for Natural Language Processing.
Lorna Balkan, Klaus Netter, Doug Arnold, Siety Meijer. Test Suites for Natural Language Processing.
Lorna Balkan, Doug Arnold, Siety Meijer. Test Suites: Some Issues in their Use and Design.
Lorna Balkan. TSNLP --- Test Suites for Natural Language Processing.
Stephan Oepen, Klaus Netter, Judith Klein. Test Suites for NLP.
Lorna Balkan, Douglas Arnold, Frederik Fouvry. Test Suites for Evaluation in Natural Language Engineering.
Lorna Balkan, Douglas Arnold, Frederik Fouvry. Test Suites for Controlled Language Checkers.
Frederik Fouvry, Lorna Balkan. TSNLP --- Des jeux de phrases-test pour l'évaluation
d'applications dans le domaine TALN.
Sabine Lehmann, Dominique Estival, Stephan Oepen. Test Suites for Quality Evaluation of NLP Products.
Frederik Fouvry and Lorna Balkan.
Natural Language Processing and Industrial Applications,
Moncton, New-Brunswick, Canada (June 1996).
TSNLP --- Test Suites for Natural Language Processing.
Sabine Lehmann, Stephan Oepen,
Sylvie Regnier-Prost, Klaus Netter, Veronika Lux, Judith Klein,
Kirsten Falkedal, Frederik Fouvry, Dominique Estival, Eva Dauphin,
Hervé Compagnion, Judith Baur, Lorna Balkan, Doug Arnold. TSNLP --- Test Suites for Natural Language Processing.
Stephan Oepen, Klaus Netter, Judith Klein.
Both the methodology and the test data developed are currently being
validated in a testing and application phase (section 1.2.4).
Towards Systematic Testing and Diagnosis.
Integrating TSNLP and ALEP
Stephan Oepen, Marius Groenendijk. Analysis of Existing Test Suites (D-WP1).
Dominique Estival,
Kirsten Falkedal, Lorna Balkan, Eva Dauphin, Siety Meijer,
Klaus Netter, Stephan Oepen, Sylvie Regnier-Prost. Test Suite Design: Guidelines and Methodology (D-WP2.1a).
Lorna Balkan, Siety Meijer,
Doug Arnold, Dominique Estival, Kirsten Falkedal, Sabine Lehmann,
Sylvie Regnier-Prost, Eva Dauphin. Issues in Test Suite Design (D-WP2.1b).
Lorna Balkan, Siety Meijer,
Doug Arnold, Eva Dauphin, Dominique Estival, Kirsten Falkedal,
Sabine Lehmann, Klaus Netter, Sylvie Regnier-Prost. Test Suite Design: Annotation Scheme (D-WP2.2).
Dominique Estival, Kirsten Falkedal, Sabine Lehmann,
Lorna Balkan, Siety Meijer, Doug Arnold,
Sylvie Régnier-Prost, Eva Dauphin,
Klaus Netter, Stephan Oepen. Design and Implementation of Test Suite Tools (D-WP5.1).
Doug Arnold, Martin Rondell, Frederik Fouvry. Corpus-Based Test Suite Generation (D-WP5.2).
Lorna Balkan, Frederik Fouvry. Checking Coverage Against Corpora (D-WP3.2).
Eva Dauphin, Veronika Lux, Sylvie Regnier-Prost,
Doug Arnold, Lorna Balkan, Frederik Fouvry,
Judith Klein, Klaus Netter, Stephan Oepen,
Dominique Estival, Kirsten Falkedal, Sabine Lehmann. The TSNLP Database: From tsct(1) to tsdb(1) (D-WP6.1).
Stephan Oepen,
Klaus Netter, Judith Baur, Tom Fettig,
Judith Klein, Fred Oberhauser.
Because the TSNLP database (called
tsdb(1)) takes a plain relational
approach, two parallel implementations could be carried out: (i) a
small and portable home-grown relational database engine in ANSI C and
(ii) a version building on the commercially available software package
MS FoxPro and its graphical interface capabilities.
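As a rough illustration of what a plain relational approach to test data can look like, the following minimal sketch stores a few test items and links them to a phenomenon through a join table. It uses Python's built-in sqlite3 module rather than either of the two implementations named above, and every table name, column name, and example sentence is a hypothetical assumption made for this sketch; none of it is taken from the actual tsdb(1) schema or the TSNLP data.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the actual tsdb(1) relations.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (
    i_id       INTEGER PRIMARY KEY,
    language   TEXT NOT NULL,
    input      TEXT NOT NULL,
    wellformed INTEGER NOT NULL      -- 1 = grammatical, 0 = ill-formed
);
CREATE TABLE phenomenon (
    p_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE item_phenomenon (
    i_id INTEGER REFERENCES item(i_id),
    p_id INTEGER REFERENCES phenomenon(p_id)
);
""")

# Invented example items (not TSNLP data).
conn.executemany("INSERT INTO item VALUES (?, ?, ?, ?)", [
    (1, "English", "The manager works.", 1),
    (2, "English", "The manager work.", 0),
    (3, "German", "Der Manager arbeitet.", 1),
])
conn.execute("INSERT INTO phenomenon VALUES (1, 'subject-verb agreement')")
conn.executemany("INSERT INTO item_phenomenon VALUES (?, ?)",
                 [(1, 1), (2, 1), (3, 1)])


def ill_formed_items(conn, language, phenomenon):
    """Return the ill-formed test items for one language and phenomenon."""
    rows = conn.execute("""
        SELECT item.input
        FROM item
        JOIN item_phenomenon USING (i_id)
        JOIN phenomenon USING (p_id)
        WHERE item.language = ?
          AND phenomenon.name = ?
          AND item.wellformed = 0
        ORDER BY item.i_id
    """, (language, phenomenon))
    return [row[0] for row in rows]


print(ill_formed_items(conn, "English", "subject-verb agreement"))
```

Keeping items, phenomena, and their association in separate relations means that retrieval by language, phenomenon, or well-formedness reduces to an ordinary join, which is one reason a relational layout can be carried across quite different database engines.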
The Construction of Test Material (D-WP3.1).
Dominique Estival, Kirsten Falkedal, Sabine Lehmann, Hervé Compagnion,
Lorna Balkan, Doug Arnold, Frederik Fouvry,
Judith Klein, Judith Baur, Klaus Netter, Stephan Oepen,
Sylvie Regnier-Prost, Eva Dauphin, Véronika Lux. Testing and Customisation of Test Items (D-WP4).
Eva Dauphin, Veronika Lux, Sylvie Regnier-Prost,
Lorna Balkan, Frederik Fouvry, Kirsten Falkedal, Stephan Oepen,
Doug Arnold, Judith Klein, Klaus Netter,
Dominique Estival, Sabine Lehmann.
last modified: 11-jun-96
(oe@cl.dfki.uni-sb.de)