Top-down Parser Combinators for Ambiguous Left-recursive grammars

In functional and logic programming, parsers can be built as modular executable specifications of grammars, using parser combinators and definite clause grammars respectively. These techniques are based on top-down backtracking search. Commonly used implementations are inefficient for ambiguous languages, cannot accommodate left-recursive grammars, and require exponential space to represent parse trees for highly ambiguous input. Memoization is known to improve efficiency, and work by other researchers has had some success in accommodating left recursion.

We integrate Norvig's technique with aspects of existing techniques for dealing with left recursion. In particular:

  • we make use of the length of the remaining input as does Kuno (1965)
  • we keep a record of how many times each parser is applied to each input position in a way that is similar to the use of cancellation sets by Nederhof and Koster (1993)
  • we integrate memoization with a technique for dealing with left recursion as does Johnson (1995)

To accommodate left-recursion in general,

  • our algorithm stores 'left-recursion counts' in the memotable, encapsulates the memoization process (in a programming construct called a monad) for refraining redundant computations and curtails an ever-growing left-recursive parse by imposing depth restrictions.
  • Our method includes a new technique for accommodating indirect left recursion which ensures correct reuse of stored results created through curtailment of left-recursive parsers by comparing its computed-context with current-context.
  • We also modify the memoization process so that the memotable represents the potentially exponential number of parse trees in a compact polynomial sized form as a 'densely forest of branches' using a technique derived from the chart parsing methods of Kay (1980) and Tomita (1986).

Papers, describing the algorithm and related work, can be found here. The proof of termination and formal complexity analysis of the algorithm are also documented . This page has a demonstration of Haskell implementation of the algorithm. Some experimental results are listed in this page.