Technique for Searching Out New Words That Should Be Registered in Dictionary For Speech Processing

  • US 20080162118A1
  • Filed: 12/14/2007
  • Published: 07/03/2008
  • Est. Priority Date: 12/15/2006
  • Status: Active Grant
First Claim

1. A system for searching out a new word to be newly registered in a dictionary included in a segmentation device for segmenting an inputted text into a plurality of words, the system comprising:

  • a segmentation candidate generating unit for generating a plurality of segmentation candidates by inputting a training text into the segmentation device to cause the segmentation device to segment the training text into words, the segmentation candidates containing mutually different combinations of words resulting from the segmentation of the training text, and being associated with certainty factors of the results of the segmentation;

  • a sum calculating unit for computing a likelihood that each word is a new word by summing up the certainty factors associated with the plurality of segmentation candidates containing the word; and

  • a searching unit for searching combinations of words contained in at least any one of the segmentation candidates and containing words with which the entire training text can be written, in order to find out a combination that minimizes an information entropy of words assuming that each word belonging to the combinations appears in the training text at a frequency according to the likelihood corresponding to the word, and thereafter for outputting the found-out combination as the combination of words including the new word.
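
The three units recited in claim 1 can be illustrated with a small sketch. This is not the patented implementation; it is a brute-force reading of the claim language under stated assumptions. The function names (`word_likelihoods`, `can_write`, `entropy`, `search_new_words`) are hypothetical, the segmentation candidates are supplied by hand rather than produced by a real segmentation device, a candidate is assumed to contribute its certainty factor once per distinct word, and the likelihoods are normalized within each combination to act as frequencies. The exhaustive subset search is exponential and is only meant for toy inputs.

```python
import math
from collections import defaultdict
from itertools import combinations


def word_likelihoods(candidates):
    """Sum the certainty factors of every candidate that contains each word."""
    scores = defaultdict(float)
    for words, certainty in candidates:
        for word in set(words):  # assumption: one contribution per candidate
            scores[word] += certainty
    return dict(scores)


def can_write(text, vocab):
    """Return True if text is a concatenation of words drawn from vocab."""
    reachable = [True] + [False] * len(text)
    for end in range(1, len(text) + 1):
        reachable[end] = any(
            reachable[start] and text[start:end] in vocab
            for start in range(end)
        )
    return reachable[len(text)]


def entropy(vocab, likelihood):
    """Entropy in bits, treating normalized likelihoods as word frequencies.

    Assumes every likelihood is strictly positive.
    """
    total = sum(likelihood[w] for w in vocab)
    return sum(
        -(likelihood[w] / total) * math.log2(likelihood[w] / total)
        for w in vocab
    )


def search_new_words(text, candidates):
    """Exhaustively find the covering word combination with minimum entropy."""
    likelihood = word_likelihoods(candidates)
    words = sorted(likelihood)
    best, best_h = None, math.inf
    for size in range(1, len(words) + 1):
        for vocab in combinations(words, size):
            if not can_write(text, set(vocab)):
                continue  # the whole training text must be writable
            h = entropy(vocab, likelihood)
            if h < best_h:
                best, best_h = set(vocab), h
    return best, best_h


if __name__ == "__main__":
    # Hand-made candidates for the toy training text "abab":
    # (segmentation, certainty factor)
    toy_candidates = [
        (["ab", "ab"], 0.6),
        (["a", "b", "a", "b"], 0.3),
        (["aba", "b"], 0.1),
    ]
    print(search_new_words("abab", toy_candidates))
```

In this toy input, the word found in the most certain segmentation accumulates the largest likelihood, and a combination that covers the text with fewer, higher-likelihood words yields lower entropy than one that spreads probability over many short words. A practical system would need a pruned or heuristic search in place of the subset enumeration.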
