Phrase to phrase joint probability model for statistical machine translation
Abstract
A machine translation (MT) system may utilize a phrase-based joint probability model. The model may be used to generate source and target language sentences simultaneously. In an embodiment, the model may learn phrase-to-phrase alignments from word-to-word alignments generated by a word-to-word statistical MT system. The system may utilize the joint probability model for both source-to-target and target-to-source translation applications.
25 Claims
1. A method comprising:
receiving a phrase $\vec{e}_i$ in a first language; and
generating a joint probability model from a parallel corpus, the generating based on at least one generated pair of phrases ($\vec{e}_i$, $\vec{f}_i$) wherein $\vec{e}_i$ comprises a first number of words and $\vec{f}_i$ comprises a second number of words, the first number being different from the second number.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 24
9. A method comprising:
generating a phrase-to-phrase probabilistic dictionary from a parallel corpus using word-for-word alignments in the parallel corpus and a phrase-based model generated from the parallel corpus, the generating based on a generated pair of phrases ($\vec{e}_i$, $\vec{f}_i$) wherein $\vec{e}_i$ comprises a first number of words and $\vec{f}_i$ comprises a second number of words, the first number being different from the second number.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17, 18, 25
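As an illustration of how a phrase-to-phrase probabilistic dictionary might be built from word-for-word alignments, the following is a minimal sketch and not the claimed training procedure. It assumes the common "consistency" heuristic (a phrase pair is kept only if no alignment link crosses its boundary) and estimates joint probabilities by simple relative frequency. The function names `extract_phrase_pairs` and `joint_phrase_table`, the `max_len` limit, and the corpus layout are all hypothetical choices made for this example.

```python
from collections import Counter


def extract_phrase_pairs(e_words, f_words, alignment, max_len=3):
    """Return phrase pairs consistent with a word alignment.

    alignment is a set of (i, j) links joining e_words[i] to f_words[j].
    A pair of spans is kept if at least one link lies inside both spans
    and no link has exactly one end inside (assumed heuristic).
    """
    pairs = []
    for e_start in range(len(e_words)):
        for e_end in range(e_start, min(e_start + max_len, len(e_words))):
            for f_start in range(len(f_words)):
                for f_end in range(f_start, min(f_start + max_len, len(f_words))):
                    in_e = lambda i: e_start <= i <= e_end
                    in_f = lambda j: f_start <= j <= f_end
                    has_link = any(in_e(i) and in_f(j) for i, j in alignment)
                    crossing = any(in_e(i) != in_f(j) for i, j in alignment)
                    if has_link and not crossing:
                        pairs.append((tuple(e_words[e_start:e_end + 1]),
                                      tuple(f_words[f_start:f_end + 1])))
    return pairs


def joint_phrase_table(corpus):
    """Estimate joint probabilities p(e_phrase, f_phrase) by relative frequency.

    corpus is an iterable of (e_words, f_words, alignment) triples.
    """
    counts = Counter()
    for e_words, f_words, alignment in corpus:
        counts.update(extract_phrase_pairs(e_words, f_words, alignment))
    total = sum(counts.values())
    return {pair: count / total for pair, count in counts.items()}
```

With a one-sentence corpus in which "not" is linked to both "ne" and "pas", the only consistent pair is the one-word phrase ("not") paired with the two-word phrase ("ne", "pas"), which is the unequal-length case the claim describes.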
19. A method comprising:
(1) receiving an input string including a plurality of words in a first language;
from an initial hypothesis in a second language,
(2) selecting a sequence from said plurality of words in the input string,
(3) selecting a possible phrase translation in the second language for said selected sequence,
(4) attaching the possible phrase translation to the current hypothesis to produce an updated hypothesis,
(5) marking the words in said selected sequence as translated,
(6) storing the hypothesis sequence in a stack, and
(7) updating a probability cost of the updated hypothesis;
(8) repeating steps (2) to (7) based on a size of the stack to produce one or more possible translations for the input string; and
(9) selecting one of said possible translations in the stack having a highest probability.
Dependent claims: 20, 21, 22, 23
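The decoding steps of claim 19 can be pictured with a short sketch. This is a simplified illustration under stated assumptions, not the patented decoder: it keeps one stack per number of translated source words, prunes each stack to `stack_size` hypotheses, and scores a hypothesis only by the product of phrase translation probabilities (no language model or distortion cost). The name `stack_decode`, its parameters, and the `phrase_table` layout are hypothetical.

```python
import heapq
import math


def stack_decode(words, phrase_table, stack_size=10, max_len=3):
    """Translate `words` with a toy stack decoder.

    phrase_table maps a tuple of source words to a list of
    (target_phrase, probability) entries (assumed layout).
    Returns (translation, probability), or None if no full cover exists.
    """
    n = len(words)
    # stacks[k] holds hypotheses that have translated k source words.
    # A hypothesis is (log_prob, output_phrases, covered_positions).
    stacks = [[] for _ in range(n + 1)]
    stacks[0].append((0.0, (), frozenset()))  # initial empty hypothesis

    for k in range(n):
        # Prune: expand only the best `stack_size` hypotheses in this stack.
        for log_prob, output, covered in heapq.nlargest(
                stack_size, stacks[k], key=lambda h: h[0]):
            for start in range(n):
                for end in range(start + 1, min(start + max_len, n) + 1):
                    span = range(start, end)
                    if any(i in covered for i in span):
                        continue  # some words already marked as translated
                    source = tuple(words[start:end])
                    for target, prob in phrase_table.get(source, []):
                        new_covered = covered | frozenset(span)
                        stacks[len(new_covered)].append(
                            (log_prob + math.log(prob),  # updated cost
                             output + (target,),         # attached phrase
                             new_covered))               # marked words

    if not stacks[n]:
        return None
    best_log_prob, output, _ = max(stacks[n], key=lambda h: h[0])
    return " ".join(output), math.exp(best_log_prob)
```

Because the sketch has no language model, hypotheses that differ only in output word order can tie; a practical decoder would add further cost terms to separate them.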
Specification