nltk: NLTK -- the Natural Language Toolkit -- is a suite of open source
Python modules, data sets and tutorials supporting research and
development in natural language processing.
nltk.app.wordnet_app: A WordNet Browser application which launches the default browser
(if it is not already running) and opens a new tab with a
connection to http://localhost:port/ .
nltk.classify: Classes and interfaces for labeling tokens with category labels (or
class labels).
nltk.classify.api: Interfaces for labeling tokens with category labels (or class
labels).
nltk.classify.decisiontree: A classifier model that decides which label to assign to a token on
the basis of a tree structure, where branches correspond to
conditions on feature values, and leaves correspond to label
assignments.
nltk.classify.mallet: A set of functions used to interface with the external Mallet machine
learning package.
nltk.classify.maxent: A classifier model based on the maximum entropy modeling framework.
nltk.classify.megam: A set of functions used to interface with the external megam
maxent optimization package.
nltk.corpus.reader.ycoe: Corpus reader for the York-Toronto-Helsinki Parsed Corpus of Old
English Prose (YCOE), a 1.5 million word syntactically-annotated
corpus of Old English prose texts.
nltk.data: Functions to find and load NLTK resource
files, such as corpora, grammars, and saved processing objects.
nltk.decorators: Decorator module by Michele Simionato <michelesimionato@libero.it>
Copyright Michele Simionato, distributed under the terms of the BSD License (see below).
nltk.lazyimport: Helper to enable simple lazy module import.
nltk.metrics: Classes and methods for scoring processing modules.
nltk.metrics.agreement: Implementations of inter-annotator agreement coefficients surveyed by Artstein
and Poesio (2007), Inter-Coder Agreement for Computational Linguistics.
nltk.misc.sort: This module provides a variety of list sorting algorithms, to
illustrate the many different algorithms (recipes) for solving a
problem, and how to analyze algorithms experimentally.
nltk.parse.earleychart: Data classes and parser implementations for incremental
chart parsers, which use dynamic programming to efficiently parse a
text.
nltk.parse.featurechart: Extension of chart parsing implementation to handle grammars with
feature structures as nodes.
nltk.sem: This package contains classes for representing semantic structure
in formulas of first-order logic and for evaluating such formulas
in set-theoretic models.
nltk.sem.logic: A version of first-order predicate logic, built on top of the typed
lambda calculus.
nltk.sem.relextract: Code for extracting relational triples from the ieer and conll2002
corpora.
nltk.sem.util: Utility functions for batch-processing sentences: parsing and
extraction of the semantic representation of the root node of the
syntax tree, followed by evaluation of the semantic
representation in a first-order model.
nltk.sourcedstring: Sourced strings are strings that are annotated with information
about the location in a document where they were originally found.
nltk.stem: Interfaces used to remove morphological affixes from words, leaving
only the word stem.
nltk.tag.crf: An interface to Mallet's Linear Chain Conditional Random Field
(LC-CRF) implementation.
nltk.tag.hmm: Hidden Markov Models (HMMs), largely used to assign the correct
label sequence to sequential data or to assess the probability of a
given label and data sequence.
nltk.tag.hunpos: A module for interfacing with the HunPos open-source POS-tagger.
nltk.tag.sequential: Classes for tagging sentences sequentially, left to right.
nltk.tokenize.regexp: Tokenizers that divide strings into substrings using regular
expressions that can match either tokens or separators between
tokens.
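
The two strategies named in the nltk.tokenize.regexp entry (a pattern that
matches the tokens themselves, or a pattern that matches the separators
between tokens) can be sketched with the standard-library re module alone.
This is a minimal illustration, not NLTK's implementation: the function
names tokenize_by_tokens and tokenize_by_gaps, the default patterns, and the
sample sentence are all assumptions made for this example.

```python
import re


def tokenize_by_tokens(text, pattern=r"\w+|[^\w\s]+"):
    """Treat the pattern as a description of tokens: keep every match."""
    # The pattern must not contain capture groups, otherwise re.findall
    # would return the groups instead of the whole matches.
    return re.findall(pattern, text)


def tokenize_by_gaps(text, pattern=r"\s+"):
    """Treat the pattern as a description of separators: split on it."""
    # Drop the empty strings that re.split yields at the string edges.
    return [tok for tok in re.split(pattern, text) if tok]


# Hypothetical sample input, chosen only to show how the two views differ.
sentence = "Good muffins cost $3.88 in New York."
token_view = tokenize_by_tokens(sentence)
gap_view = tokenize_by_gaps(sentence)
```

With these assumed patterns, the token-matching view breaks "$3.88" into
several pieces, while the separator-matching view keeps it as one token and
leaves the final period attached to "York.". Which behavior is preferable
depends on the downstream task.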