TSNLP Publications, Reports, and Bibliography


Conference and Book Articles

Presentations on TSNLP in general and on various of its work packages in particular were given at the following conferences; additionally, there is a forthcoming book contribution on the project.


TSNLP Project Reports

TSNLP project reports (or deliverables) document the individual TSNLP work packages in detail. Please note that not all of the deliverables listed can be made available to the public yet (partly because of confidentiality considerations); moreover, all documents that can be obtained from this page should be regarded as preliminary versions that will undergo revision before the end of the project. If you want to refer to one of these reports in a publication of your own, please make sure (i) to contact the principal author(s) beforehand and (ii) to give a clear indication of the prefinal status of the work reported.


TSNLP Bibliography

Bibliographical references for the TSNLP papers and project reports are provided as BibTeX entries with the individual publications (see below). Additionally, the complete and up-to-date TSNLP bibliographical database is available in BibTeX format as the single file tsnlp.bib; please make sure always to use these references when quoting from or referring to any of the TSNLP publications.

In addition, the project has produced a relatively extensive bibliography of work on NLP evaluation and diagnosis that (among others) can be searched on-line through the Essex Linguistic Bibliography server. Search results can be retrieved in several different formats, including HTML, Refer, LaTeX, and BibTeX.


TSNLP. Test Suites for Natural Language Processing.

Lorna Balkan, Klaus Netter, Doug Arnold, Siety Meijer.

Language Research Engineering Convention, London (July 1994).

This paper describes the LRE project TSNLP (Test Suites for Natural Language Processing), which is concerned with some central issues in the design and use of test suites. The project combines theoretical research with practical implementations, aiming to provide generally usable tools and test data together with reports discussing the theoretical background. The paper begins by setting out the motivation, aims, and present state of the project, then examines the methodological issues behind it.

Available: `.ps' file, `.dvi' file, and `.bib' entry.


Test Suites for Natural Language Processing.

Lorna Balkan, Doug Arnold, Siety Meijer.

Translation and the Computer, 16th ASLIB Conference (November 1994).

Available: `.ps' file and `.dvi' file.


Test Suites: Some Issues in their Use and Design.

Lorna Balkan.

Cranfield International Conference on Machine Translation (November 1994).

Available: `.ps' file and `.dvi' file.


TSNLP --- Test Suites for Natural Language Processing.

Stephan Oepen, Klaus Netter, Judith Klein.

Conference on Linguistic Databases, Groningen (March 1994).

We present recent results from the LRE project TSNLP (Test Suites for Natural Language Processing), which is concerned with central issues in the design and use of test suites. The paper focusses on (i) the motivation for test suites in comparison to (annotated) corpora, (ii) the theory-neutral annotation schema developed in TSNLP, and (iii) the construction of a linguistic database as a (virtual) meta test suite to ease applications in diagnosis and evaluation.

Available: `.ps' file, `.dvi' file, and `.bib' entry.


Test Suites for NLP.

Lorna Balkan, Douglas Arnold, Frederik Fouvry.

Fourth International Conference on Cognitive Science of Natural Language Processing, Dublin (1995).

Test suites have long been accepted as a useful evaluation tool in Natural Language Processing, since they provide a more or less systematic collection of specially constructed linguistic examples (e.g. sentences) with annotations and other information. However, existing test suites tend to be relatively unsystematic and lack generality (having been constructed with particular systems in mind). Moreover, there is no established methodology which a system developer or other evaluator can follow in constructing a test suite of their own.

The paper describes the goals and achievements to date of the TSNLP project (Test Suites for Natural Language Processing), an LRE project funded by the CEC, which seeks to address these problems. In particular, the project aims to produce realistic and general guidelines for test suite construction, and to construct substantial test data in three languages (English, French, and German). The bulk of the data (several thousand test items) will cover ``core'' syntactic phenomena and will be suitable for testing any syntactic-based system, but some application-specific data (for parsers, grammar checkers and controlled language checkers) will also be written.

To enhance its use and reusability, the data is being mounted onto a database for ease of access and manipulation. The construction methodology and the test suites will be validated by testing a number of NLP products.

The paper concentrates on design issues (e.g. the need for systematicity in both well-formed and ill-formed data, the ``exhaustive'' coverage of closed classes, and consistency of annotation across languages) and discusses the tools that have been designed to aid and semi-automate the construction process (namely a generation tool and a lexical replacement tool).

All the results of the project, including the actual test suites, are, or will be, in the public domain.

Available: `.ps' file and `.dvi' file.


Test Suites for Evaluation in Natural Language Engineering.

Lorna Balkan, Douglas Arnold, Frederik Fouvry.

Language Engineering Convention, London (October 1995).

Available: `.ps' file and `.dvi' file.


Test Suites for Controlled Language Checkers.

Frederik Fouvry, Lorna Balkan.

Workshop on Controlled Language Applications (CLAW), Leuven (March 1996).

TSNLP produced guidelines and methodology for test suite writing, as well as a substantial amount of annotated test data. To test the guidelines, the created data, and the methodology for controlled language test suites, they were used to test the controlled language checker developed during the SECC project (Simplified English Controlled Language Checker). We report here on what test suites for controlled language should contain, how we realised the actual test suite, and on the conclusions we could draw from all this.

Available: `.ps' file, `.dvi' file, and `.bib' entry.


TSNLP --- Des jeux de phrases-test pour l'évaluation d'applications dans le domaine TALN.

Sabine Lehmann, Dominique Estival, Stephan Oepen.

Traitement Automatique du Langage Naturel (TALN) Conference, Marseille (May 1996).

The number of applications in the field of NLP has grown steadily in recent years. This development goes hand in hand with a growing demand for tools to evaluate these applications. The TSNLP project responds to this demand by proposing a methodology and tools for evaluation by means of test suites.

Besides an elaborate methodology for the construction of test material, TSNLP has created the largest database of test suites currently available for French, English, and German. In addition, the project has developed tools that facilitate the construction, storage, and retrieval of the data.

The results of TSNLP will be public. The project thus offers linguistic resources that could become a proposed standard for an evaluation model for any user of applications in the NLP field.

Available: `.ps' file, `.dvi' file, and `.bib' entry.


Test Suites for Quality Evaluation of NLP Products.

Frederik Fouvry, Lorna Balkan.

Le traitement automatique du langage et les applications industrielles
Natural Language Processing and Industrial Applications,
Moncton, New-Brunswick, Canada (June 1996).

Test suites are a useful evaluation tool for developers and users of NLP products. The TSNLP project sets new standards for test suite design. The paper gives an overview of the TSNLP design and methodology and describes how the TSNLP data and methodology can be used in practice to provide a reliable assessment method of the linguistic capabilities of NLP products.

Available: `.ps' file and `.bib' entry.


TSNLP --- Test Suites for Natural Language Processing.

Sabine Lehmann, Stephan Oepen,
Sylvie Regnier-Prost, Klaus Netter, Veronika Lux, Judith Klein,
Kirsten Falkedal, Frederik Fouvry, Dominique Estival, Eva Dauphin,
Hervé Compagnion, Judith Baur, Lorna Balkan, Doug Arnold.

COLING, Copenhagen (August 1996).

The growing language technology industry needs measurement tools to allow researchers, engineers, managers, and customers to track development, evaluate and assure quality, and assess suitability for a variety of applications.

The TSNLP (Test Suites for Natural Language Processing) project has investigated various aspects of the construction, maintenance and application of systematic test suites as diagnostic and evaluation tools for NLP applications. The paper summarizes the motivation and main results of TSNLP: besides the solid methodological foundation of the project, TSNLP has produced substantial (i.e. larger than any existing general test suites) multi-purpose and multi-user test suites for three European languages together with a set of specialized tools that facilitate the construction, extension, maintenance, retrieval, and customization of the test data.

The publicly available results of TSNLP represent a valuable linguistic resource that has the potential of providing a wide-spread pre-standard diagnostic and evaluation tool for both developers and users of NLP applications.

Available: `.ps' file and `.bib' entry.


TSNLP --- Test Suites for Natural Language Processing.

Stephan Oepen, Klaus Netter, Judith Klein.

John Nerbonne (editor): Linguistic Databases. CSLI Lecture Notes (forthcoming).

The objective of the TSNLP project is to construct test suites in three different languages building on a common basis and methodology. Specifically, TSNLP addresses a range of issues related to the construction and use of test suites. The main goals of the project are to:

Both the methodology and the test data developed are currently being validated in a testing and application phase (section 1.2.4).

In the present paper the authors take the opportunity to present some of the recent results of TSNLP to the community of language technology developers as well as to potential users of NLP systems. Accordingly, the presentation puts emphasis on practical aspects of applicability and plausibility rather than on theoretically demanding research topics; the TSNLP results presented are of both methodological and technological interest.

Available: `.ps' file, `.dvi' file (no graphics), and `.bib' entry.


Towards Systematic Testing and Diagnosis. Integrating TSNLP and ALEP.

Stephan Oepen, Marius Groenendijk.

3rd ALEP User Group Workshop, Saarbrücken (February 1997).

A recent addition to the ALEP grammar engineering platform is described: the test suite apparatus and test data produced in the TSNLP project have been seamlessly integrated with the ALEP task executor.

The resulting test suite extension to ALEP is well-suited to substitute for the existing naive testing environment, greatly increases testing and report generation flexibility, and fixes several (previously unknown) errors in the timing and coverage measures computed by the test suite processor. For downward compatibility the previous testing functionality is preserved in the current ALEP version (3.2 as of January 1997).

Available: `.ps' file, `.dvi' file, and `.bib' entry.


Analysis of Existing Test Suites (D-WP1).

Dominique Estival,
Kirsten Falkedal, Lorna Balkan, Eva Dauphin, Siety Meijer,
Klaus Netter, Stephan Oepen, Sylvie Regnier-Prost.

University of Essex, March 1994

Available: `.ps' file, `.dvi' file, and `.bib' entry.


Test Suite Design: Guidelines and Methodology (D-WP2.1a).

Lorna Balkan, Siety Meijer,
Doug Arnold, Dominique Estival, Kirsten Falkedal, Sabine Lehmann,
Sylvie Regnier-Prost, Eva Dauphin.

University of Essex, September 1994

Available: `.ps' file, `.dvi' file, and `.bib' entry.


Issues in Test Suite Design (D-WP2.1b).

Lorna Balkan, Siety Meijer,
Doug Arnold, Eva Dauphin, Dominique Estival, Kirsten Falkedal,
Sabine Lehmann, Klaus Netter, Sylvie Regnier-Prost.

University of Essex, September 1994

Available: `.ps' file, `.dvi' file, and `.bib' entry.


Test Suite Design: Annotation Scheme (D-WP2.2).

Dominique Estival, Kirsten Falkedal, Sabine Lehmann,
Lorna Balkan, Siety Meijer, Doug Arnold,
Sylvie Régnier-Prost, Eva Dauphin, Klaus Netter, Stephan Oepen.

University of Essex, September 1994

Available: `.ps' file, `.dvi' file, and `.bib' entry.


Design and Implementation of Test Suite Tools (D-WP5.1).

Doug Arnold, Martin Rondell, Frederik Fouvry.

University of Essex, December 1994

Available: `.ps' file, `.dvi' file, and `.bib' entry.


Corpus-Based Test Suite Generation (D-WP5.2).

Lorna Balkan, Frederik Fouvry.

University of Essex, January 1995

Available: `.ps' file, `.dvi' file, and `.bib' entry.


Checking Coverage Against Corpora (D-WP3.2).

Eva Dauphin, Veronika Lux, Sylvie Regnier-Prost,
Doug Arnold, Lorna Balkan, Frederik Fouvry,
Judith Klein, Klaus Netter, Stephan Oepen,
Dominique Estival, Kirsten Falkedal, Sabine Lehmann.

University of Essex, April 1995

Available: `.ps' file and `.bib' entry.


The TSNLP Database: From tsct(1) to tsdb(1) (D-WP6.1).

Stephan Oepen,
Klaus Netter, Judith Baur, Tom Fettig, Judith Klein, Fred Oberhauser.

DFKI Saarbrücken, June 1995

Design parameters and key desiderata of the linguistic database used to store, maintain, and retrieve TSNLP test data are laid out, viz.

Because the TSNLP database (called tsdb(1)) takes a plain relational approach, two parallel implementations could be carried out: (i) a small and portable home-grown relational database engine in ANSI C and (ii) a version building on the commercially available software package MS FoxPro and its graphical interface capabilities.

In addition, this report documents the test suite construction tool (called tsct(1)) used in the writing of TSNLP test data and sketches the automatic import procedure from tsct(1) into tsdb(1) format.

Available: `.ps' file, `.dvi' file (no graphics), and `.bib' entry.


The Construction of Test Material (D-WP3.1).

Dominique Estival, Kirsten Falkedal, Sabine Lehmann, Hervé Compagnion,
Lorna Balkan, Doug Arnold, Frederik Fouvry,
Judith Klein, Judith Baur, Klaus Netter, Stephan Oepen,
Sylvie Regnier-Prost, Eva Dauphin, Véronika Lux.

ISSCO, Université de Genève, December 1995

Available: `.ps' file, `.dvi' file, and `.bib' entry.


Testing and Customisation of Test Items (D-WP4).

Eva Dauphin, Veronika Lux, Sylvie Regnier-Prost,
Lorna Balkan, Frederik Fouvry, Kirsten Falkedal, Stephan Oepen,
Doug Arnold, Judith Klein, Klaus Netter, Dominique Estival, Sabine Lehmann.

University of Essex, November 1995

Available: `.ps' file and `.bib' entry.


last modified: 11-jun-96 (oe@cl.dfki.uni-sb.de)