A System for Modularly Constructing Efficient Natural Language Processors

A system based on a general top-down parsing algorithm has been developed that allows language processors to be created as executable specifications of arbitrary attribute grammars. Attribute grammars allow languages to be defined declaratively: syntax is defined through context-free grammar rules, and meaning is defined by associated semantic rules. The executable specifications are highly modular. Innovative techniques enable the efficient accommodation of left-recursive rules, ambiguity, and arbitrary semantic dependencies. A new technique allows parses to be pruned by arbitrary semantic constraints. This technique is useful in modelling natural-language phenomena: it can impose unification-like restrictions and accommodate long-distance and cross-serial dependencies, which cannot be handled by context-free rules alone.

Keywords: Parser combinators, Lazy evaluation, Top-down parsing, Attribute grammars, Natural-language processing
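
To make the idea of pruning parses by semantic constraints concrete, the following is a minimal sketch in Haskell, not the system's actual implementation or API. It assumes a plain list-of-successes parser type, and every name in it (`Parser`, `andThen`, `orElse`, `suchThat`, `countOf`, `anbncn`) is hypothetical and chosen for illustration. Each parser returns a synthesized attribute alongside the remaining input, and a semantic constraint over those attributes filters the candidate parses. The example recognises the language a^n b^n c^n, which context-free rules alone cannot define. The sketch uses only right-recursive rules and does not attempt the left-recursion handling or the efficiency techniques described above.

```haskell
-- A parser maps the input to every (attribute, remaining input) pair,
-- so an ambiguous rule simply yields more than one result.
-- Hypothetical type for illustration only.
newtype Parser a = Parser { runParser :: String -> [(a, String)] }

-- Succeed without consuming input, yielding a fixed attribute value.
succeed :: a -> Parser a
succeed v = Parser $ \s -> [(v, s)]

-- Recognise one character satisfying a predicate.
satisfy :: (Char -> Bool) -> Parser Char
satisfy p = Parser $ \s -> case s of
  (c:cs) | p c -> [(c, cs)]
  _            -> []

-- Alternation: keep the results of both alternatives.
orElse :: Parser a -> Parser a -> Parser a
orElse (Parser p) (Parser q) = Parser $ \s -> p s ++ q s

-- Sequencing: the function argument plays the role of a semantic rule
-- that combines the attributes of the two sub-parses.
andThen :: (a -> b -> c) -> Parser a -> Parser b -> Parser c
andThen f (Parser p) (Parser q) = Parser $ \s ->
  [ (f a b, rest') | (a, rest) <- p s, (b, rest') <- q rest ]

-- Apply a semantic rule to a single attribute.
mapAttr :: (a -> b) -> Parser a -> Parser b
mapAttr f (Parser p) = Parser $ \s -> [ (f a, rest) | (a, rest) <- p s ]

-- Semantic constraint: discard every parse whose attribute fails the test.
suchThat :: Parser a -> (a -> Bool) -> Parser a
suchThat (Parser p) ok = Parser $ \s -> [ r | r@(a, _) <- p s, ok a ]

-- Zero or more copies of a character; the attribute is the count.
countOf :: Char -> Parser Int
countOf c = andThen (\_ n -> n + 1) (satisfy (== c)) (countOf c)
            `orElse` succeed 0

-- a^n b^n c^n: the syntax alone accepts a* b* c*, and the semantic
-- constraint prunes every parse whose three counts differ.
anbncn :: Parser Int
anbncn = mapAttr (\(a, _, _) -> a)
                 (triple `suchThat` \(a, b, c) -> a == b && b == c)
  where
    triple = andThen (\a (b, c) -> (a, b, c)) (countOf 'a')
               (andThen (,) (countOf 'b') (countOf 'c'))

-- Attributes of the parses that consume the whole input.
parseAll :: Parser a -> String -> [a]
parseAll p s = [ a | (a, "") <- runParser p s ]

main :: IO ()
main = do
  print (parseAll anbncn "aabbcc")  -- prints [2]
  print (parseAll anbncn "aabbc")   -- prints []
```

In this sketch the constraint is applied as a filter once a candidate parse of the rule has been built. The design point it illustrates is only that syntax and semantic restrictions can be stated separately and composed modularly.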

  • Prototype Haskell Implementation
  • Related Attribute Types
  • Sample Natural Language Processing Application
  • Example Applications