Indexing with translation model for feature regularization
Abstract
An audio indexing system including, in addition to a speech recognition subsystem for converting the audio information into a textual form and an indexing subsystem for extracting the features to be used for searching and browsing, a statistical machine translation model, trained on a parallel or comparable corpus of automatically and by-hand transcribed data, for processing the output of the speech recognition system.
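The flow described in the abstract (recognizer output, then a translation model trained on paired automatic and by-hand transcripts, then indexing) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the function names `train_model1`, `translate`, and `build_index`, the toy transcripts, and the choice of a bare IBM Model 1 style word-translation table with a word-by-word decoder (no NULL word, no language model, no feature extraction step).

```python
from collections import defaultdict

def train_model1(pairs, iterations=10):
    """Estimate t[noisy][clean] with IBM Model 1 style EM (toy version)."""
    # Start with equal weight for every co-occurring (noisy, clean) word pair.
    t = {}
    for noisy, clean in pairs:
        for n in noisy:
            for c in clean:
                t.setdefault(n, {})[c] = 1.0
    for _ in range(iterations):
        count = defaultdict(lambda: defaultdict(float))
        # E-step: share each clean word among the noisy words of its sentence.
        for noisy, clean in pairs:
            for c in clean:
                z = sum(t[n][c] for n in noisy)
                for n in noisy:
                    count[n][c] += t[n][c] / z
        # M-step: renormalize so each noisy word's probabilities sum to 1.
        t = {n: {c: v / sum(cs.values()) for c, v in cs.items()}
             for n, cs in count.items()}
    return t

def translate(t, words):
    """Replace each recognized word by its most probable clean word."""
    # Words never seen in training pass through unchanged.
    return [max(t[w], key=t[w].get) if w in t else w for w in words]

def build_index(t, docs):
    """Inverted index (word -> set of document ids) over translated text."""
    index = defaultdict(set)
    for doc_id, words in docs.items():
        for w in translate(t, words):
            index[w].add(doc_id)
    return index

# Hypothetical parallel corpus: (recognizer output, by-hand transcript).
parallel = [
    ("the whether report".split(), "the weather report".split()),
    ("whether today".split(), "weather today".split()),
    ("the report".split(), "the report".split()),
    ("today".split(), "today".split()),
    ("the".split(), "the".split()),
]
model = train_model1(parallel)

# Hypothetical recognizer output for two audio clips.
docs = {
    "clip-1": "the whether report".split(),
    "clip-2": "report today".split(),
}
index = build_index(model, docs)
print(sorted(index["weather"]))
```

In this sketch the recognition error "whether" is mapped to "weather" before indexing, so a query for "weather" can reach clip-1 even though the recognizer never produced that word. A practical system would need a far larger corpus and a fuller decoder.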
19 Claims
1. A system for providing clean data for indexing, said system comprising:
a recognizer which recognizes words;
an indexing database; and
a translator, having been trained automatically, which accepts textual input, having originated from said recognizer, and which is adapted to automatically improve the quality of said textual input for entry into said indexing database, whereby errors produced by said recognizer and which are detrimental to indexing performance are reduced, and wherein, immediately prior to reconfiguration, said textual input appears as a feature-extracted transformation of at least one word recognized by said recognizer.
(Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9)

10. A method of providing clean data for indexing, said method comprising the steps of:
providing an indexing database;
providing a recognizer which recognizes words; and
providing an automatically trained translator which accepts textual input having originated from said recognizer;
said method further comprising the steps of:
with said recognizer, recognizing words; and
with said translator, accepting textual input having originated from said recognizer, and automatically improving the quality of said textual input for entry into said indexing database, whereby errors produced by said recognizer and which are detrimental to indexing performance are reduced, and wherein, immediately prior to reconfiguration, said textual input appears as a feature-extracted transformation of at least one word recognized by said recognizer.
(Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18)

19. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for providing clean data for indexing, said method comprising the steps of:
providing an indexing database;
providing a recognizer which recognizes words; and
providing an automatically trained translator which accepts textual input having originated from said recognizer;
said method further comprising the steps of:
with said recognizer, recognizing words; and
with said translator, accepting textual input having originated from said recognizer, and automatically improving the quality of said textual input for entry into said indexing database, whereby errors produced by said recognizer and which are detrimental to indexing performance are reduced, and wherein, immediately prior to reconfiguration, said textual input appears as a feature-extracted transformation of at least one word recognized by said recognizer.

Specification