Sage Journals: Discover world-class research

Abstract

Recently, distributional learning has yielded theoretical learnability results for a variety of context-free grammar subclasses [1]. However, these results have not yet been extended to probabilistic grammars. In this work, we give a practical algorithm, with some proven properties, that learns a subclass of probabilistic grammars from positive data. A minimum satisfiability solver is used to direct the search towards small grammars. Experiments on well-known context-free languages and artificial natural language grammars give positive results. Moreover, our analysis shows that the type of grammar induced by our algorithm is, in theory, capable of modelling context-free features of natural language syntax. One of our experiments shows that our algorithm can potentially outperform the state of the art in unsupervised parsing on the WSJ10 corpus.

Keywords

Grammatical Inference; Probabilistic Context-Free Grammars; Minimum Satisfiability; Congruential Grammars; Small Grammars

Grammatical Inference of PCFGs Applied to Language Modelling and Unsupervised Parsing