Indexing and searching audio using text indexers
Abstract
A full-text lattice indexing and searching system and method for indexing word lattices using a text indexer to enable enhanced searching of audio content. The system and method utilize a Time-Anchored Lattice Expansion (TALE) method that represents word lattices such that they can be indexed with existing text indexers with little or no modification. Embodiments of the system and method include an indexing module for generating and indexing word lattices based on audio content and a searching module for allowing searching of a full-text index containing indexed word lattices. The indexing module includes a custom IFilter and a custom Wordbreaker. Embodiments of the searching module include an ExpandQuery function for decorating an input query and a custom Stemmer. Embodiments of the searching module also include a GenerateSnippets module that extracts information from the indexed word lattices to enable the creation of clickable snippets.
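The abstract does not spell out how TALE lays a lattice out for a text indexer, so the following Python sketch is only an illustration of the general idea: each word candidate is flattened into a plain token that carries a confidence suffix and keeps its start time as an anchor. The `Arc` structure, the `_c<level>` suffix format, the four-level quantization and its thresholds are all assumptions made for this example, not details taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    """One word candidate in a lattice (hypothetical structure)."""
    word: str
    start: float      # start time in seconds
    end: float        # end time in seconds
    posterior: float  # recognizer confidence in [0, 1]


# Assumed quantization: level 1 is the best confidence, level 4 the worst.
LEVEL_THRESHOLDS = [(0.75, 1), (0.50, 2), (0.25, 3)]
WORST_LEVEL = 4


def confidence_level(posterior: float) -> int:
    """Map a posterior probability onto a discrete confidence level."""
    for threshold, level in LEVEL_THRESHOLDS:
        if posterior >= threshold:
            return level
    return WORST_LEVEL


def lattice_to_tokens(arcs):
    """Flatten arcs into (token, start_time) pairs that a text indexer could ingest."""
    ordered = sorted(arcs, key=lambda a: (a.start, a.word))
    return [
        (f"{arc.word.lower()}_c{confidence_level(arc.posterior)}", arc.start)
        for arc in ordered
    ]


# Two competing candidates for the first time span, one for the second.
lattice = [
    Arc("search", 0.0, 0.4, 0.9),
    Arc("church", 0.0, 0.4, 0.3),
    Arc("audio", 0.4, 0.9, 0.6),
]
tokens = lattice_to_tokens(lattice)
print(tokens)
# [('church_c3', 0.0), ('search_c1', 0.0), ('audio_c2', 0.4)]
```

Because competing candidates such as "search" and "church" share the same time anchor, both remain searchable, and the suffix lets a query decide how much recognition uncertainty it is willing to accept.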
20 Claims
1. A computer-implemented method for processing audio content, comprising:
generating word lattices for the audio content, the word lattices containing a plurality of word candidates and nodes that represent connections between the word candidates and encoding times;
expanding a decorated query that was decorated with desired confidence levels into a plurality of decorated queries corresponding to the desired confidence levels; and
indexing the word lattices using a text indexer of a full-text search engine and the plurality of decorated queries such that the audio content is contained in a full-text index as indexed audio content.
Dependent claims: 4, 5, 11, 12, 16, 17, 18, 19, 20
2. A computer-implemented method for processing audio content, comprising:
generating word lattices for the audio content, the word lattices containing a plurality of word candidates and nodes that are connections between the word candidates;
inputting a query from a user containing at least one search term and a desired confidence level; and
decorating the query with the desired confidence level and all confidence levels better than the desired confidence level to obtain a decorated query;
expanding the decorated query into a plurality of decorated queries based on the desired confidence level; and
obtaining search results corresponding to the decorated query and the desired confidence level and displaying the search results to the user.
Dependent claims: 6, 7, 8, 9, 13, 14
3. A computer-implemented method for processing audio content, comprising:
generating word lattices for the audio content, the word lattices containing a plurality of word candidates and nodes that are connections between the word candidates;
indexing the word lattices using a text indexer of a full-text search engine such that the audio content is contained in a full-text index as indexed audio content;
inputting a query from a user containing at least one search term and a desired confidence level;
searching the indexed audio content to obtain search results corresponding to the query and the desired confidence level;
decorating the query with the desired confidence level and all confidence levels better than the desired confidence level by appending to the search term a suffix corresponding to the desired confidence level and all confidence levels better than the desired confidence level to obtain a decorated query;
inputting the decorated query into a Wordbreaker of the full-text search engine; and
expanding the decorated query into a plurality of decorated queries whose number is equal to a sum of the desired confidence level plus a number of all confidence levels better than the desired confidence level.
Dependent claims: 10, 15
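The query-side steps recited in the claims (appending a confidence suffix to the search term, then expanding into one decorated query per acceptable level) can be illustrated with a minimal Python sketch. It assumes integer confidence levels in which 1 is the best, a hypothetical `_c<level>` suffix, and a toy dictionary standing in for the full-text index; none of these specifics come from the claims, and the functions here are not the patent's ExpandQuery or Wordbreaker.

```python
def decorate(term: str, level: int) -> str:
    """Append a confidence suffix to a search term, e.g. 'audio' -> 'audio_c2'."""
    return f"{term.lower()}_c{level}"


def expand_query(term: str, desired_level: int) -> list[str]:
    """Return one decorated query per level, from the best (1) to the desired one."""
    if desired_level < 1:
        raise ValueError("desired_level must be >= 1")
    return [decorate(term, level) for level in range(1, desired_level + 1)]


def search(index: dict[str, list[tuple[str, float]]], term: str, desired_level: int):
    """Collect hits for every expanded query from a toy inverted index."""
    hits = []
    for token in expand_query(term, desired_level):
        hits.extend(index.get(token, []))
    return sorted(hits)


# Toy index: decorated token -> list of (audio file, time offset in seconds).
index = {
    "audio_c1": [("talk.wav", 12.5)],
    "audio_c3": [("memo.wav", 3.0)],
    "audio_c4": [("noise.wav", 7.2)],
}

queries = expand_query("audio", 3)
results = search(index, "audio", 3)
print(queries)   # ['audio_c1', 'audio_c2', 'audio_c3']
print(results)   # [('memo.wav', 3.0), ('talk.wav', 12.5)]
```

With a desired level of 3, the expansion yields three decorated queries: the desired level itself plus the two better levels. The level-4 hit is excluded because it falls below the confidence the user asked for.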
Specification