This page collects pointers to ongoing activities, research groups, and general topics that relate to the DELPH-IN effort in one way or another. We list both DELPH-IN members' projects and other ongoing R&D programmes working towards precise, practical natural language processing.

Selected Projects

This is a small and somewhat arbitrary selection of projects designed to give a brief introduction to what has been and is being done within DELPH-IN. It will be changed from time to time. A fuller list is given below.

  • LOGON: 2002–2006. Oslo, Bergen, NTNU. The consortium developed a Norwegian-to-English machine translation system, based on a semantic transfer approach and using MRS and DELPH-IN technology for transfer and generation. (Funding: NRC)

DELPH-IN Member Projects

This is a reverse chronological listing (by final year) of DELPH-IN members' projects on related topics. The selected projects listed above are duplicated here. Funded projects show their funding source as far as possible.

  • AGGREGATION: 2012–2015. UW. Investigating the automatic creation of Matrix-derived grammars on the basis of collections of interlinear glossed text. (Funding: NSF Documenting Endangered Languages grant)
  • Deependance: 2012–2014. LT-Lab, DFKI. The aim of this project is to improve existing methodology for generic deep linguistic analysis, i.e. the syntactic and semantic analysis needed for many language technology applications. A dependency grammar model will be developed that extends the representations of successful data-driven dependency parsing schemes with additional elements of linguistic and cognitive sophistication, such as a typed feature system, explicit soft constraints, the use of both semantic and syntactic dependencies, and means for incrementally producing partial results. (Funding: BMBF contract 01IW11003)
  • Developing an MT System through Deep Language Processing: 2011–2013. Kyunghee, NTU, UW. Developing grammars to support semantic-transfer-based MT between English, Japanese and Korean. (Funding: NRFK)
  • Revealing Meaning Using Multiple Languages: 2010–2012. NTU, NICT. This project looked at using one language to disambiguate the other in bitexts, using MRS-based transfer and wordnets to find equivalences. (Funding: JSPS, NTU, Erasmus)
  • The Grammar Matrix: Computational Linguistic Typology: 2007–2012. UW. Developing the Grammar Matrix core grammar and customization system to support the development of new DELPH-IN style grammars. (Funding: NSF CAREER grant)
  • Automatically determining meaning by comparing a text to its translation: 2009–2011. NTU. This project looked at ways of modeling meaning across different languages. It was followed up by Revealing Meaning. (Funding: NTU)
  • Online Linguistic Exploration: Deeper, Faster, Broader Language Documentation: 2009–2011. Melbourne. Development of means for fast-tracking language resource development and visualising language resources (incl. treebank search). (Funding: Australian Research Council)
  • Multilingual Unsupervised Parse Selection: 2009–2010. Melbourne. Exploration of unsupervised models for parse selection, targeted at languages without treebanks. (Funding: Microsoft Research)
  • SciBorg: Extracting the Science from Scientific Publications: 2005–2009. Cambridge. (Funding: EPSRC)
  • Information Delivery from Segmented Textual Data Streams: 2006–2008. Melbourne. Applications of NLP to web user forum analysis to improve information access, including through the use of supertagging and parsing. (Funding: Australian Research Council)
  • Utterance-level interface for DELPH-IN: Cambridge. We looked at standardisation of the text interface, allowing for markup and ambiguity in tokenisation. (Funding: Boeing)
  • Scalable Japanese Analysis: 2006–2008. NTT, Melbourne. Investigating methods of deep lexical acquisition for Japanese. (Funding: NTT)
  • HANDON: 2007. Oslo, Bergen, CSLI. This was a follow-up project to LOGON, investigating scalability. (Funding: NRC)
  • Typology of Prepositions and their Semantic Equivalents: 2007. NTNU, NICT. Investigation into the representation of prepositions, with a view to implementation in the Norsource and Jacy grammars. (Funding: NRC)
  • JaNoGram: Japanese and Norwegian Computational Linguistics: 2006. NTNU, NICT. Investigation into Japanese-Norwegian MT, using the transfer system developed in LOGON. (Funding: NTNU)
  • LOGON: 2002–2006. Oslo, Bergen, NTNU. The consortium developed a Norwegian-to-English machine translation system, based on a semantic transfer approach and using MRS and DELPH-IN technology for transfer and generation. (Funding: NRC)
  • Robust Precise Japanese Parsing: 2005–2006. NTT, DFKI. Grammar engineering to increase the robustness of Jacy. (Funding: NTT)
  • Scalable Deep Language Processing: 2005–2006. NTT, Melbourne. Investigating methods of deep lexical acquisition, particularly looking at MWEs. (Funding: NTT)
  • Contrastive Study of Syntax and Semantics between Korean and Japanese and Feasibility of Porting and Cross-Development of Grammars between LFG/XLE and HPSG/LKB Frameworks: 2004–2006. Kyunghee, Waseda. (Funding: JSPS, NSFK)
  • Stochastic Parsing with Rich Grammars: 2004–2005. CSLI, NTT. Developing the original parse ranking models. (Funding: NTT)
  • Modeling politeness in a Greek HPSG: 2003–2004. Cambridge. Integrating pragmatic insights with HPSG was a follow-on project to this. (Funding: British Academy funded small research)
  • Deep Thought: 2002–2004. Saarbrücken, NTNU (Norway), Sussex, Cambridge, CELI (Italy) and Xtramind (Germany). Hybrid Deep and Shallow Methods for Knowledge-intensive Information Extraction. This project led to the development of Robust Minimal Recursion Semantics (see RmrsTop) and the Heart of Gold (see HogTop). (Funding: EU)
  • Multiword expressions: 2001–2004. CSLI, NTT. The aims of the project were to acquire and formally represent multiword expressions, including idioms, compound nouns, phrasal verbs and collocations. The results are incorporated into the DELPH-IN work in a variety of ways. It led to a series of workshops that are still continuing. (Funding: NTT; NSF)
  • WhiteBoard: 2000–2002. DFKI Saarbrücken. Basic research into architectures and methodologies for the combination of ‘deep’ and ‘shallow’ approaches to natural language analysis; building an XML-based software environment for multi-layer linguistic annotation. (Funding: Federal Ministry of Education and Research, Germany)

Related Activities

This section lists some of the many projects with similar goals; it is by no means a comprehensive list. Projects are listed in alphabetical order.

  • The Attribute Logic Engine (ALE), developed by Bob Carpenter and Gerald Penn since the early 1990s: one of the early widespread computational tools (based on Prolog) for the development of typed feature structure grammars, and still in active use in several research efforts.
  • The Algorithms for Linguistic Processing (ALPINO) project at Groningen University (The Netherlands): building a development and processing environment for HPSG implementations, a comprehensive grammar of Dutch, a dependency treebank (of Dutch newspaper text), and related technology.
  • The Core Grammar Project (CoreGram): a multilingual grammar engineering project that develops HPSG grammars for several typologically diverse languages that share a common core. The system is open-source and can be downloaded as a bootable CD-ROM with the grammar development system, the test environment, and grammars for German, Chinese, Danish, Maltese and Persian.
  • The MiLCA project, involving Tübingen (Germany), Ohio State (US), and Toronto (Canada) Universities, among others: developing an extension to ALE (see above) as a development environment for HPSG grammars using ‘rich’ constraints and porting the LinGO ERG into this formalism; focusing on linguistic adequacy more than on processing efficiency.
  • The Natural Language Theory and Technology (NLTT) group at the Palo Alto Research Center (PARC) and associated partners: working within the LFG framework but in several ways similar to DELPH-IN; developing the XLE grammar development and processing software and, in the Parallel Grammar (ParGram) project, implementing grammars of several languages. NLTT and ParGram resources are not publicly available, though.
  • The Robust Accurate Statistical Parsing (RASP) project at Cambridge and Sussex Universities (UK): integrating and extending several strands of research on robust statistical parsing and automated grammar and lexicon induction, in order to develop and distribute a new parsing toolkit for English.

