type EMIBMModel1
object --+
         |
        EMIBMModel1
This class implements the Expectation Maximization algorithm for
IBM Model 1. The algorithm runs on a sentence-aligned parallel corpus
and generates word alignments for the aligned sentence pairs.
The process is divided into 2 main stages.
Stage 1: Learns word-to-word translation probabilities by collecting
evidence, from the parallel corpus, of an English word being the
translation of a foreign word.
Stage 2: Based on the translation probabilities from Stage 1,
generates word alignments for the aligned sentence pairs.
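The two stages above can be illustrated with a minimal, self-contained sketch. It is not the code of this class: the names train_model1 and align, the max_iterations cap, the use of plain word lists instead of AlignedSent objects, and the omission of a NULL word are all assumptions made for illustration.

```python
from collections import defaultdict


def train_model1(pairs, threshold=0.01, max_iterations=100):
    """Stage 1 (sketch): estimate t[(e, f)] = P(e | f) with EM.

    pairs is a list of (english_words, foreign_words) tuples.
    Returns the probability table and the number of iterations run.
    """
    english_vocab = {e for english, _ in pairs for e in english}
    # Uniform starting point: every English word is equally likely.
    t = defaultdict(lambda: 1.0 / len(english_vocab))

    for iteration in range(1, max_iterations + 1):
        count = defaultdict(float)  # expected count of each (e, f) link
        total = defaultdict(float)  # expected count of f being linked

        # E-step: spread each English word over the foreign words
        # of its sentence, in proportion to the current t values.
        for english, foreign in pairs:
            for e in english:
                norm = sum(t[(e, f)] for f in foreign)
                for f in foreign:
                    delta = t[(e, f)] / norm
                    count[(e, f)] += delta
                    total[f] += delta

        # M-step: renormalise and remember the largest change.
        largest_change = 0.0
        for (e, f), c in count.items():
            new_t = c / total[f]
            largest_change = max(largest_change, abs(new_t - t[(e, f)]))
            t[(e, f)] = new_t

        if largest_change < threshold:
            break

    return t, iteration


def align(english, foreign, t):
    """Stage 2 (sketch): link each English word to its most probable
    foreign word, returning (english_index, foreign_index) pairs."""
    return [
        (i, max(range(len(foreign)), key=lambda j: t[(e, foreign[j])]))
        for i, e in enumerate(english)
    ]


# Hypothetical toy corpus, used only to exercise the sketch.
corpus = [
    (["the", "house"], ["das", "haus"]),
    (["the", "book"], ["das", "buch"]),
    (["a", "book"], ["ein", "buch"]),
]
table, iterations = train_model1(corpus)
print(iterations, align(["the", "house"], ["das", "haus"], table))
```

On this toy corpus the sketch links "the" to "das" and "house" to "haus", because "das" co-occurs with "the" in two sentence pairs while "haus" appears in only one.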
__init__(self, aligned_sents, convergent_threshold=0.01, debug=False)
Initialize a new EMIBMModel1.
train(self)
The train() function implements the Expectation Maximization training
stage, which learns word-to-word translation probabilities.
aligned(self)
Returns a list of AlignedSent objects with alignments calculated using
IBM Model 1.
__init__(self, aligned_sents, convergent_threshold=0.01, debug=False)
(Constructor)
Initialize a new EMIBMModel1.
- Parameters:
aligned_sents (list of AlignedSent objects) - The parallel text corpus. An iterable containing AlignedSent
instances of aligned sentence pairs from the corpus.
convergent_threshold (float) - The convergence threshold. An entry is considered
converged if the change from old_t to new_t is less than this
value. The algorithm terminates when all entries have converged.
This parameter is optional; the default is 0.01.
- Overrides:
object.__init__
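The convergence rule described for convergent_threshold can be sketched as a small helper. The function name has_converged and the dictionary layout (keys are word pairs, values are probabilities) are assumptions for illustration; only the names old_t and new_t and the default of 0.01 come from the parameter description.

```python
def has_converged(old_t, new_t, convergent_threshold=0.01):
    """Return True when every entry changed by less than the threshold.

    old_t and new_t map (english_word, foreign_word) pairs to
    probabilities; an entry missing from old_t is treated as 0.0.
    """
    return all(
        abs(new_t[key] - old_t.get(key, 0.0)) < convergent_threshold
        for key in new_t
    )


# A change of 0.005 is below the default threshold, 0.1 is not.
print(has_converged({("the", "das"): 0.5}, {("the", "das"): 0.505}))
print(has_converged({("the", "das"): 0.5}, {("the", "das"): 0.6}))
```

Checking every entry, rather than an average, means a single slowly moving probability keeps training running.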
train(self)
The train() function implements the Expectation Maximization training
stage, which learns word-to-word translation probabilities.
- Returns:
- Number of iterations taken to converge