this page is under construction; links may be broken (or missing);

a brief summary of the [incr tsdb()] approach and package;

more information and user-supplied advice on the DELPH-IN wiki.

[incr tsdb()] user manual, background reading, and bibliography;

obtaining the [incr tsdb()] package (data and software).

Lexicalized constraint-based grammars (e.g. implementations within the HPSG framework) with wide grammatical and lexical coverage exhibit a large conceptual and computational complexity. As (worst case) complexity theory accounts cannot accurately predict the practical behaviour of parsers and generators based on unification grammars, system developers and grammar writers alike need to rely on empirical data in diagnosis and evaluation.

At the same time, there is little existing methodology (or available reference data and tools) to facilitate empirical assessment and progress evaluation as part of the regular development cycle. Hence, isolated case studies, introspection, and intuitions still play a crucial role in typical large-scale development efforts. Yet subtle decisions in the system implementation or unexpected interactions within the grammar can have drastic effects on overall system performance.

The [incr tsdb()] package implements a novel methodology that makes the precise and systematic empirical study of system competence and performance a focal point in system and grammar development. This approach can be seen as an adaptation of the profiling metaphor (known from software development) to constraint-based language processing systems. Based on

developers are enabled to obtain an accurate snapshot of current system behaviour (a profile) with minimal effort. Profiles can then be analysed and visualized at highly variable granularity, reflecting different aspects of system competence and performance. Since profiles are stored in a database, comparison to earlier versions or among different parameter settings is straightforward.
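To make the idea of database-backed profiles concrete, here is a minimal sketch in Python using SQLite. It is not the [incr tsdb()] schema or interface: the table layout, the column names (run, item_id, readings, seconds), and the functions store_profile, summarize, and changed_items are all hypothetical and chosen only for illustration. The sketch stores per-item results for two runs, summarizes each run, and lists the items whose number of readings differs between runs.

```python
import sqlite3

# Hypothetical schema for illustration only; not the [incr tsdb()] format.
SCHEMA = """
CREATE TABLE IF NOT EXISTS profile (
    run      TEXT,     -- label of the profile, e.g. a grammar version
    item_id  INTEGER,  -- identifier of the test item
    readings INTEGER,  -- number of analyses the system found
    seconds  REAL      -- processing time for this item
)
"""

def store_profile(conn, run, rows):
    """Store (item_id, readings, seconds) tuples under the label `run`."""
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO profile VALUES (?, ?, ?, ?)",
        [(run, item_id, readings, seconds) for item_id, readings, seconds in rows],
    )

def summarize(conn, run):
    """Return coverage (share of items with at least one reading) and total time."""
    total, parsed, seconds = conn.execute(
        "SELECT COUNT(*), SUM(readings > 0), SUM(seconds) "
        "FROM profile WHERE run = ?",
        (run,),
    ).fetchone()
    return {"coverage": parsed / total, "seconds": seconds}

def changed_items(conn, old, new):
    """List (item_id, old_readings, new_readings) where the two runs differ."""
    return conn.execute(
        "SELECT o.item_id, o.readings, n.readings "
        "FROM profile o JOIN profile n ON o.item_id = n.item_id "
        "WHERE o.run = ? AND n.run = ? AND o.readings != n.readings "
        "ORDER BY o.item_id",
        (old, new),
    ).fetchall()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store_profile(conn, "old", [(1, 2, 0.5), (2, 0, 0.25), (3, 1, 0.25)])
    store_profile(conn, "new", [(1, 2, 0.25), (2, 1, 0.25), (3, 1, 0.5)])
    print(summarize(conn, "old"))
    print(summarize(conn, "new"))
    print(changed_items(conn, "old", "new"))
```

Keeping every run in one table under a run label means that comparing versions or parameter settings reduces to a self-join on the item identifier.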

The profiling methodology and tool were developed in close cooperation with grammar and system development efforts at CSLI Stanford and DFKI Saarbrücken. The software (in source code) and data are made available to the general public, free of royalties, for academic or other non-commercial use, including deployment in corporate environments. The [incr tsdb()] developers hope to contribute to a commonly accepted (pre-standard) diagnostic and evaluation methodology and technology that will facilitate system diagnosis and comparison and thus enable researchers to evaluate and exchange methods and technology.

last modified: 22-aug-99