Package nltk.parse

Classes and interfaces for producing tree structures that represent the internal organization of a text. This task is known as parsing the text, and the resulting tree structures are called the text's parses. Typically, the text is a single sentence, and the tree structure represents the syntactic structure of the sentence. However, parsers can also be used in other domains. For example, parsers can be used to derive the morphological structure of the morphemes that make up a word, or to derive the discourse structure for a set of utterances.

Sometimes, a single piece of text can be represented by more than one tree structure. Texts represented by more than one tree structure are called ambiguous texts. Note that there are actually two ways in which a text can be ambiguous:

However, the parser module does not distinguish these two types of ambiguity.
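To make the notion of an ambiguous text concrete, here is a minimal sketch that enumerates every tree a toy grammar assigns to a sentence. It uses only the standard library and is not the package's own implementation; the grammar, the lexicon, and the names `BINARY`, `LEXICON`, and `all_parses` are hypothetical and exist only for this illustration.

```python
from functools import lru_cache

# Hypothetical toy grammar: binary rules plus a lexicon (illustration only).
BINARY = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("NP", "PP")],
    "VP": [("V", "NP"), ("VP", "PP")],
    "PP": [("P", "NP")],
}
LEXICON = {
    "I": ["NP"], "saw": ["V"], "the": ["Det"], "a": ["Det"],
    "man": ["N"], "telescope": ["N"], "with": ["P"],
}


def all_parses(tokens, start="S"):
    """Return every tree (as nested tuples) that covers all of the tokens."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def build(symbol, i, j):
        trees = []
        # A single word can be covered directly by a lexical category.
        if j - i == 1 and symbol in LEXICON.get(tokens[i], ()):
            trees.append((symbol, tokens[i]))
        # Otherwise try every binary rule at every split point.
        for left, right in BINARY.get(symbol, ()):
            for k in range(i + 1, j):
                for left_tree in build(left, i, k):
                    for right_tree in build(right, k, j):
                        trees.append((symbol, left_tree, right_tree))
        return tuple(trees)

    return list(build(start, 0, len(tokens)))


parses = all_parses("I saw the man with a telescope".split())
print(len(parses))  # 2: the PP attaches either to the noun phrase or to the verb phrase
```

A text is ambiguous under this grammar exactly when `all_parses` returns more than one tree.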

The parser module defines ParserI, a standard interface for parsing texts, and two simple implementations of that interface, ShiftReduceParser and RecursiveDescentParser. It also contains three sub-modules for specialized kinds of parsing:

Submodules

Classes
FeatureIncrementalBottomUpLeftCornerChartParser
LeftCornerChartParser
IncrementalTopDownChartParser
FeatureIncrementalBottomUpChartParser
FeatureEarleyChartParser
FeatureBottomUpChartParser
EarleyChartParser
IncrementalBottomUpChartParser
FeatureTopDownChartParser
BottomUpChartParser
A ChartParser using a bottom-up parsing strategy.
IncrementalLeftCornerChartParser
TopDownChartParser
A ChartParser using a top-down parsing strategy.
IncrementalBottomUpLeftCornerChartParser
BottomUpLeftCornerChartParser
A ChartParser using a bottom-up left-corner parsing strategy.
FeatureBottomUpLeftCornerChartParser
FeatureIncrementalChartParser
SteppingChartParser
A ChartParser that allows you to step through the parsing process, adding a single edge at a time.
FeatureIncrementalTopDownChartParser
IncrementalChartParser
An incremental chart parser implementing Jay Earley's parsing algorithm.
LongestChartParser
A bottom-up parser for PCFGs that tries longer edges before shorter ones.
UnsortedChartParser
A bottom-up parser for PCFGs that tries edges in arbitrary (unsorted) order.
BottomUpProbabilisticChartParser
An abstract bottom-up parser for PCFGs that uses a Chart to record partial results.
RandomChartParser
A bottom-up parser for PCFGs that tries edges in random order.
SteppingRecursiveDescentParser
A RecursiveDescentParser that allows you to step through the parsing process, performing a single operation at a time.
RecursiveDescentParser
A simple top-down CFG parser that parses texts by recursively expanding the fringe of a Tree, and matching it against a text.
SteppingShiftReduceParser
A ShiftReduceParser that allows you to step through the parsing process, performing a single operation at a time.
ShiftReduceParser
A simple bottom-up CFG parser that uses two operations, "shift" and "reduce", to find a single parse for a text.
FeatureChartParser
InsideChartParser
A bottom-up parser for PCFGs that tries edges in descending order of the inside probabilities of their trees.
ChartParser
A generic chart parser.
ViterbiParser
A bottom-up PCFG parser that uses dynamic programming to find the single most likely parse for a text.
ProbabilisticProjectiveDependencyParser
A probabilistic, projective dependency parser.
ProjectiveDependencyParser
A projective, rule-based, dependency parser.
NaiveBayesDependencyScorer
A dependency scorer built around a Naive Bayes classifier.
ProbabilisticNonprojectiveParser
A probabilistic non-projective dependency parser.
NonprojectiveDependencyParser
A non-projective, rule-based, dependency parser.
ParserI
A processing class for deriving trees that represent possible structures for a sequence of tokens.
DependencyGraph
A container for the nodes and labelled edges of a dependency structure.
MaltParser
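The ShiftReduceParser entry above describes a bottom-up parser built from two operations, "shift" and "reduce", that finds a single parse. The following is a minimal standard-library sketch of that general technique, not the class's actual code. The grammar `RULES` and the helper names are hypothetical, and the greedy policy (reduce whenever any rule matches, trying rules in list order) is an assumption made for the illustration.

```python
# Hypothetical toy grammar: (left-hand side, right-hand side) pairs.
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("VP", ("V", "NP")),
    ("Det", ("the",)),
    ("N", ("dog",)),
    ("N", ("cat",)),
    ("V", ("chased",)),
]


def label(node):
    """Return the category of a tree node, or the word itself for a token."""
    return node[0] if isinstance(node, tuple) else node


def reduce_once(stack):
    """Apply the first rule whose right-hand side matches the top of the stack."""
    for lhs, rhs in RULES:
        n = len(rhs)
        if len(stack) >= n and tuple(label(x) for x in stack[-n:]) == rhs:
            children = stack[-n:]
            del stack[-n:]
            stack.append((lhs, *children))
            return True
    return False


def shift_reduce_parse(tokens, start="S"):
    """Return one tree as nested tuples, or None if the greedy search fails."""
    stack = []
    for token in tokens:
        stack.append(token)        # shift: move the next token onto the stack
        while reduce_once(stack):  # reduce: keep combining while a rule matches
            pass
    if len(stack) == 1 and label(stack[0]) == start:
        return stack[0]
    return None


tree = shift_reduce_parse("the dog chased the cat".split())
```

Because this sketch never backtracks, it can return None for a sentence that the grammar does license; that is the usual trade-off of a greedy shift-reduce strategy.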
Functions
load_parser(grammar_url, trace=0, parser=None, chart_class=None, beam_size=0, **load_args)
Load a grammar from a file, and build a parser based on that grammar.
nx_graph(self)
Convert the data in a nodelist into a networkx labeled directed graph. Returns an XDigraph.
Function Details

load_parser(grammar_url, trace=0, parser=None, chart_class=None, beam_size=0, **load_args)

Load a grammar from a file, and build a parser based on that grammar. The parser depends on the grammar format, and might also depend on properties of the grammar itself.

The following grammar formats are currently supported:

Parameters:
  • grammar_url (str) - A URL specifying where the grammar is located. The default protocol is "nltk:", which searches for the file in the NLTK data package.
  • trace (int) - The level of tracing that should be used when parsing a text. A value of 0 produces no tracing output; higher values produce more verbose output.
  • parser - The class used for parsing; should be ChartParser or a subclass. If None, the class depends on the grammar format.
  • chart_class - The class used for storing the chart; should be Chart or a subclass. Only used for CFGs and feature CFGs. If None, the chart class depends on the grammar format.
  • beam_size (int) - The maximum length for the parser's edge queue. Only used for probabilistic CFGs.
  • load_args - Keyword parameters used when loading the grammar. See data.load for more information.
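The beam_size parameter caps the length of the parser's edge queue. The parameter list does not say how the queue is trimmed, so the sketch below rests on two assumptions: the lowest-probability edges are the ones dropped, and a beam_size of 0 (the default) means no limit. The function `prune_queue` and the (probability, edge) pair layout are hypothetical and are not part of the package.

```python
def prune_queue(queue, beam_size):
    """Keep at most beam_size items, preferring higher probabilities.

    queue is a list of (probability, edge) pairs.
    Assumption: beam_size <= 0 disables pruning.
    """
    if beam_size <= 0 or len(queue) <= beam_size:
        return list(queue)
    # Sort by probability, highest first, and keep the best beam_size items.
    ranked = sorted(queue, key=lambda item: item[0], reverse=True)
    return ranked[:beam_size]


edges = [(0.1, "NP -> Det N"), (0.5, "S -> NP VP"), (0.3, "VP -> V NP")]
kept = prune_queue(edges, beam_size=2)
```

A smaller beam trades completeness for speed: edges that are pruned can never contribute to a parse, so the most likely tree may be missed.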