draft schedule of course and exercise topics;
background information on the LinGO project at CSLI Stanford;
obtaining the LKB package (source code and binaries for certain platforms);
From machine translation to speech recognition and web-based search engines, a wide range of applications demand increasing accuracy and robustness from natural language processing. Meeting these demands requires hand-built, linguistic grammars of human languages (combined with sophisticated statistical processing methods).
In this course we will introduce the fundamental concepts of formal and computational models of natural language grammar and gain practical experience in grammar implementation, i.e. draw on a combination of contemporary grammatical theory and hands-on engineering skills. We will work in the framework of unification-based (or constraint-based) grammar and acquire a solid understanding of the formalism of typed feature structures and their use for linguistic description. Selected chapters from Sag, Wasow, & Bender (2003) will provide the linguistic background knowledge, while we will use the Linguistic Knowledge Builder (LKB; Copestake, 2001) as the implementation environment.
A combination of lectures and in-class exercises will enable students to investigate the implementation of constraints in morphology, syntax, and semantics, working within a unification-based lexicalist framework. While most of the course work will focus on developing small grammars for English, we will apply our jointly acquired grammar engineering expertise to another language towards the end of the term.
Some preliminary knowledge of syntactic theory and phrase structure grammar will be helpful, but no prior programming skills are required. There will be four hands-on exercises assigned throughout the course (see the draft schedule below) that will form the basis for joint laboratory sessions; we will not try to complete each exercise during the laboratory hours, but instead expect students to continue implementation work individually outside of class hours. Each assignment is expected to take between two and six hours to complete, and students will be asked to submit their solutions to each assignment electronically.
Exercises will be graded and contribute substantially towards the final course assessment; exercise results will be complemented by a 90-minute written exam in December (exact date TBA). In addition to the four regular exercises that are part of the course schedule below, there will be one optional exercise towards the end of the course, essentially asking students to adapt the grammar of English that we will have implemented by then to Swedish. Completion and submission of the additional Swedish exercise will be a prerequisite for consideration of a VG grade.
Date | Time | Room | Topic |
---|---|---|---|
Thu, October 21 | 10:00 – 12:00 | E230 | Lecture: Course Overview and Motivation; Phrase Structure Grammar |
Thu, October 21 | 13:00 – 15:00 | E230 | Lecture: Formal Syntax — Unification-Based Grammar |
Wed, November 3 | 09:00 – 10:00 | Mac | Lecture: Structured Categories |
Wed, November 3 | 10:00 – 12:00 | Mac | Laboratory: Exercise 2 (due Thu, November 4; 18:00 h) |
Thu, November 4 | 09:00 – 12:00 | E230 | Lecture: Agreement, Government, Modification |
Thu, November 4 | 15:00 – 17:00 | Mac | Laboratory: Exercise 2 (due Thu, November 4; 18:00 h) |
Fri, November 5 | 09:00 – 10:00 | Mac | Lecture: Generalisations in Typed Feature Structures |
Fri, November 5 | 10:00 – 12:00 | Mac | Laboratory: Exercise 3 (due Fri, November 19; 18:00 h) |
Thu, November 11 | 10:00 – 12:00 | Mac | Laboratory: Exercise 3 (due Fri, November 19; 18:00 h) |
Tue, December 7 | 09:00 – 10:00 | Mac | Lecture: Lexical Rules and Morphology |
Tue, December 7 | 10:00 – 12:00 | Mac | Laboratory: Exercise 4 (due Thu, December 9; 18:00 h) |
Wed, December 8 | 09:00 – 10:00 | Mac | Lecture: Semantics in Typed Feature Structures |
Wed, December 8 | 10:00 – 12:00 | Mac | Laboratory: Exercise 5 (due Tue, December 14; 18:00 h) |
Wed, December 8 | 13:00 – 15:00 | Mac | Laboratory: Exercise 5 (due Tue, December 14; 18:00 h) |
Thu, December 9 | 09:00 – 12:00 | E230 | Lecture: Natural Language Processing; Summary |
Mon, December 13 | 10:00 – 12:00 | Mac | Laboratory: Exercise 5 (due Tue, December 14; 18:00 h) |
Sat, December 18 | 09:00 – 12:00 | | Written Exam: 120 Minutes, Nice & Simple |
Fri, December 24 | 12:00 | | Submission Deadline for (Optional) Swedish Exercise |
Mon, January 24 | 09:00 – 12:00 | | Written Exam (2nd Slot): 120 Minutes, Nice & Simple |
Slides | Grammar | Exercises | Solution |
---|---|---|---|
Overview | no grammar | Exercise | Solution |
Categories | Grammar | Exercise | Solution |
Typed Feature Structures | Grammar | Exercise | Solution |
Lexical Rules | Grammar | Exercise | Solution |
Sample Exam | Exercise | | |
Basic & Optional Data | Exercise | | |
The remaining two references, viz. Shieber (1986) and Pollard & Sag (1994), serve mainly for historical completeness and thus constitute optional reading.