INDEXING AND SEARCHING AUDIO USING TEXT INDEXERS
Abstract
A full-text lattice indexing and searching system and method for indexing word lattices using a text indexer to enhance searching of audio content. The system and method utilize a Time-Anchored Lattice Expansion (TALE) method that represents word lattices such that they can be indexed with existing text indexers with little or no modification. Embodiments of the system and method include an indexing module for generating and indexing word lattices based on audio content and a searching module for allowing searching of a full-text index containing indexed word lattices. The indexing module includes a custom IFilter and a custom Wordbreaker. Embodiments of the searching module include an ExpandQuery function for decorating an input query and a custom Stemmer. Embodiments of the searching module also include a GenerateSnippets module that extracts information from the indexed word lattices to enable the creation of clickable snippets.
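The idea of placing several word candidates into one word slot of a text index can be illustrated with a toy positional inverted index. This is only a sketch under stated assumptions, not the patented TALE implementation: the slot data, the `build_index` and `phrase_positions` helpers, and the confidence values are all hypothetical and do not correspond to any real search-engine API.

```python
from collections import defaultdict

# Hypothetical input: a lattice already flattened into word slots, where each
# slot holds the competing word candidates and their confidence levels.
slots = [
    [("indexing", 0.9)],
    [("audio", 0.6), ("radio", 0.4)],
    [("content", 0.8), ("contents", 0.2)],
]


def build_index(slots):
    """Map each word to the slot positions where it appears as a candidate."""
    index = defaultdict(dict)
    for position, candidates in enumerate(slots):
        for word, confidence in candidates:
            index[word][position] = confidence
    return index


def phrase_positions(index, phrase):
    """Return start positions where the phrase words fill consecutive slots."""
    words = phrase.split()
    if not words:
        return []
    starts = []
    for start in sorted(index.get(words[0], {})):
        # Every later word must be a candidate in the next slot along.
        if all(start + i in index.get(w, {}) for i, w in enumerate(words)):
            starts.append(start)
    return starts


index = build_index(slots)
print(phrase_positions(index, "radio content"))
```

Because alternative candidates share a position, a phrase query can match a lower-ranked recognition hypothesis (here "radio content") that a single best transcript would have missed.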
20 Claims
1. A computer-implemented method for processing audio content, comprising:
- generating word lattices for the audio content, the word lattices containing a plurality of word candidates and nodes that are connections between the word candidates; and
- indexing the word lattices using a text indexer of a full-text search engine such that the audio content is contained in a full-text index as indexed audio content.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method for fitting a word lattice into a word slot of a text indexer of a full-text search engine, comprising:
- generating a word lattice that is a weighted directed acyclic graph having arcs that represent a plurality of word candidates with associated confidence levels and nodes that are connections between the word candidates;
- finding time boundaries for each of the plurality of word candidates and using these time boundaries as anchor points; and
- binning the plurality of word candidates by aligning each node in the lattice to its nearest anchor point in time to enable the indexing of the word lattice by the text indexer.
Dependent claims: 12, 13, 14, 15
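The binning step recited in claim 11 can be sketched as follows. This is a minimal illustration under assumptions, not the claimed method itself: the anchor times are supplied as a ready-made sorted list, each arc is a hypothetical `(word, start, end, confidence)` tuple, and only the start node of each arc is aligned to its nearest anchor. The names `nearest_anchor` and `bin_candidates` are invented for this example.

```python
from bisect import bisect_left


def nearest_anchor(anchors, t):
    """Return the index of the anchor closest in time to t (anchors sorted)."""
    i = bisect_left(anchors, t)
    if i == 0:
        return 0
    if i == len(anchors):
        return len(anchors) - 1
    # Pick whichever neighbour is closer; ties go to the earlier anchor.
    return i if anchors[i] - t < t - anchors[i - 1] else i - 1


def bin_candidates(arcs, anchors):
    """Group word candidates by the anchor slot their start node aligns to."""
    bins = {}
    for word, start, _end, confidence in arcs:
        slot = nearest_anchor(anchors, start)
        bins.setdefault(slot, []).append((word, confidence))
    return bins


# Hypothetical anchor times (seconds) and lattice arcs.
anchors = [0.0, 0.5, 1.1]
arcs = [
    ("indexing", 0.0, 0.5, 0.9),
    ("audio", 0.5, 1.1, 0.6),
    ("radio", 0.55, 1.1, 0.4),
    ("content", 1.1, 1.6, 0.8),
    ("contents", 1.05, 1.6, 0.2),
]

bins = bin_candidates(arcs, anchors)
for slot in sorted(bins):
    print(slot, bins[slot])
```

Candidates whose start times differ slightly, such as "audio" at 0.5 s and "radio" at 0.55 s, land in the same bin, so each bin can then be treated as one word slot by a text indexer.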
16. A full-text lattice indexing and searching system for indexing and searching audio content of a document using a text indexer of a full-text search engine, comprising:
- an indexing module that uses standard and custom functionality to place a word lattice in a searchable index, the indexing module further comprising:
  - a custom index function that uses a time-anchored lattice expansion technique to enable a word lattice of the audio content to be indexed by the text indexer; and
- a searching module that uses standard and custom functionality to search the searchable index for indexed word lattices.
Dependent claims: 17, 18, 19, 20
Specification