WORD PROBABILITY DETERMINATION
First Claim
1. A computer-implemented method, comprising:
identifying a word corpus;
associating a word probability value with each word in the word corpus;
identifying a sentence;
determining candidate segmentations of the sentence based on the word corpus; and
iteratively adjusting the associated word probability value for each word in the word corpus based on the associated word probability values and the candidate segmentations.
2 Assignments
0 Petitions
Abstract
A word corpus is identified and a word probability value is associated with each word in the word corpus. A sentence is identified, candidate segmentations of the sentence are determined based on the word corpus, and the associated probability value for each word in the word corpus is iteratively adjusted based on the probability values associated with the words and the candidate segmentations.
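The step of determining candidate segmentations based on the word corpus can be illustrated with a short sketch. It assumes the word corpus is available as a plain set of strings and that a candidate segmentation is any split of the sentence into pieces that all appear in that set. The function name `candidate_segmentations` and the `max_len` parameter are hypothetical and do not come from the patent text.

```python
def candidate_segmentations(sentence, words, max_len=None):
    """Return every split of `sentence` into pieces that all occur in `words`.

    Illustrative only: `words` stands in for the word corpus, and the
    exhaustive search is an assumption, not the patent's stated procedure.
    """
    if max_len is None:
        # No piece can be longer than the longest word in the corpus.
        max_len = max((len(w) for w in words), default=0)
    results = []

    def extend(start, prefix):
        # A complete segmentation has consumed the whole sentence.
        if start == len(sentence):
            results.append(list(prefix))
            return
        last_end = min(len(sentence), start + max_len)
        for end in range(start + 1, last_end + 1):
            piece = sentence[start:end]
            if piece in words:
                prefix.append(piece)
                extend(end, prefix)
                prefix.pop()

    extend(0, [])
    return results


if __name__ == "__main__":
    corpus = {"a", "b", "ab", "c", "bc"}
    for segmentation in candidate_segmentations("abc", corpus):
        print(segmentation)
```

The exhaustive search grows quickly with sentence length, so a practical system would likely prune or use a lattice; the sketch only shows what "candidate segmentations" means for a toy corpus.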
191 Citations
25 Claims
1. A computer-implemented method, comprising:
identifying a word corpus;
associating a word probability value with each word in the word corpus;
identifying a sentence;
determining candidate segmentations of the sentence based on the word corpus; and
iteratively adjusting the associated word probability value for each word in the word corpus based on the associated word probability values and the candidate segmentations.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15

16. A computer-implemented method comprising:
determining word probability values associated with words of a word corpus;
determining candidate segmentations of sentences of documents in a document corpus;
iteratively determining a segmentation probability value for each candidate segmentation of each sentence based on the word probability values associated with the words in the candidate segmentation; and
iteratively adjusting the word probability value for each word based on the segmentation probability values for the candidate segmentations that include the word.
Dependent claims: 17, 18, 19, 20, 21

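The two iterative steps of claim 16 resemble an expectation-maximization loop, and a minimal sketch under that reading is given here. The claim does not state any formulas, so the following are assumptions made for illustration: a segmentation's score is the product of its word probabilities, scores are normalized per sentence to give segmentation probabilities, and each word's new probability is its share of the expected word counts. The name `refine_word_probabilities` and the `iterations` parameter are hypothetical.

```python
from collections import defaultdict
from math import prod


def refine_word_probabilities(word_probs, sentence_segmentations, iterations=10):
    """Alternate between scoring segmentations and re-estimating word probabilities.

    `sentence_segmentations` holds one list of candidate segmentations per
    sentence; each segmentation is a list of words. All formulas here are
    assumptions, not taken from the claim.
    """
    probs = dict(word_probs)
    for _ in range(iterations):
        expected = defaultdict(float)
        for segmentations in sentence_segmentations:
            # Segmentation score: product of the current word probabilities.
            scores = [prod(probs.get(w, 0.0) for w in seg) for seg in segmentations]
            total = sum(scores)
            if total == 0.0:
                continue  # no candidate is supported by the current values
            for seg, score in zip(segmentations, scores):
                weight = score / total  # segmentation probability value
                for w in seg:
                    expected[w] += weight
        norm = sum(expected.values())
        if norm == 0.0:
            break  # nothing to re-estimate from
        # Word probability: share of the expected counts across all words.
        probs = {w: expected.get(w, 0.0) / norm for w in probs}
    return probs


if __name__ == "__main__":
    initial = {w: 0.2 for w in ("a", "b", "ab", "c", "bc")}
    candidates = [[["a", "b", "c"], ["a", "bc"], ["ab", "c"]]]
    for word, p in sorted(refine_word_probabilities(initial, candidates).items()):
        print(word, round(p, 4))
```

In this toy input the three-word segmentation is penalized because its score multiplies three values below one, so words that appear only in it lose probability mass over successive passes.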
22. A method, comprising:
establishing a dictionary that includes words and associated word probability values that are determined using an iterative process, the iterative process including iteratively determining segmentation probability values for candidate segmentations of sentences of documents, and iteratively adjusting the word probability values for the word based on the segmentation probability values; and
providing an input method editor which is configured to select words from the dictionary.

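How an input method editor might select words from such a dictionary can be sketched as a simple ranking. The sketch assumes the dictionary is a mapping from word to word probability value and that some earlier stage, not shown, has already produced the words matching the user's input. The function `select_candidates` and its `top_n` parameter are hypothetical and are not described in the claim.

```python
def select_candidates(dictionary, candidates, top_n=3):
    """Rank the candidate words that exist in `dictionary` by probability.

    `dictionary` maps word -> word probability value; `candidates` is the
    list of words an input method editor is considering for the current
    input. Both shapes are assumptions made for illustration.
    """
    known = [w for w in candidates if w in dictionary]
    # Highest probability first; ties keep their original order.
    ranked = sorted(known, key=lambda w: dictionary[w], reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    dictionary = {"x": 0.5, "y": 0.3, "z": 0.2}
    print(select_candidates(dictionary, ["z", "x", "q"], top_n=2))
```

Words missing from the dictionary are dropped rather than given a default score; a real editor might instead fall back to a smoothing value.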
23. A system, comprising:
a data store to store a word corpus and a document corpus;
a processing engine stored in computer readable medium and comprising instructions executable by a processing device that upon such execution cause the processing device to:
associate a word probability value with each word in the word corpus;
determine candidate segmentations of each sentence of each document in the document corpus based on the word corpus; and
iteratively adjust the associated word probability value for each word in the word corpus based on the associated word probability values and the candidate segmentations.
Dependent claims: 24

25. A system, comprising:
means for associating a word probability value with words in a word corpus;
means for identifying sentences in a plurality of documents;
means for determining candidate segmentations of each of the sentences based on the word corpus; and
means for iteratively adjusting the associated word probability value for each word in the word corpus based on the associated word probability values and the candidate segmentations.

Specification