Method and system for retrieving documents with spoken queries
Abstract
A method indexes and retrieves documents stored in a database. A document feature vector is extracted from each document and the documents are then indexed according to the feature vectors. A spoken query is converted to an intermediate representation representing likelihoods of possible sequential combinations of terms in the spoken query. A query certainty vector is generated from the intermediate representation. Other information is acquired. The other information is combined with the query certainty vector. The query certainty vector and the other information are then compared to each of the document feature vectors to retrieve a ranked result set of documents.
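The flow described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the function names, the L2-normalized term-frequency weighting for documents, the choice to score a term by the normalized weight of the n-best hypotheses that contain it, and the dot-product comparison. The step that combines "other information" with the query certainty vector is omitted because the abstract does not say how it is done.

```python
import math
from collections import Counter


def document_feature_vector(text):
    """Hypothetical feature extractor: L2-normalized term frequencies."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {term: c / norm for term, c in counts.items()}


def query_certainty_vector(n_best):
    """Build a term -> certainty map from an n-best list.

    n_best is a list of (hypothesis_text, score) pairs with non-negative
    scores. Assumed rule: a term's certainty is the share of the total
    score carried by the hypotheses in which the term occurs.
    """
    total = sum(score for _, score in n_best)
    certainty = Counter()
    for text, score in n_best:
        for term in set(text.lower().split()):
            certainty[term] += score / total
    return dict(certainty)


def rank(query_vec, index):
    """Score each document by a dot product and sort best-first."""
    scored = []
    for doc_id, doc_vec in index.items():
        score = sum(w * doc_vec.get(term, 0.0) for term, w in query_vec.items())
        scored.append((doc_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy corpus and a two-hypothesis n-best list (illustrative data only).
documents = {
    "d1": "speech recognition systems recognize speech",
    "d2": "a nice beach holiday",
}
index = {doc_id: document_feature_vector(text) for doc_id, text in documents.items()}

n_best = [("recognize speech", 0.6), ("wreck a nice beach", 0.4)]
query_vec = query_certainty_vector(n_best)
ranking = rank(query_vec, index)
```

In this toy run both documents receive a non-zero score, because the lower-ranked hypothesis still contributes its terms with reduced certainty. That is the practical difference from retrieving with only the single best transcription.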
13 Claims
1. A computer implemented method for indexing and retrieving documents stored in a database, comprising the steps of:
extracting a document feature vector from each of a plurality of documents;
indexing each of the plurality of documents according to the associated document feature vector;
converting, using a processor, a spoken query to an intermediate representation representing possible sequential combinations of terms in the spoken query, the intermediate representation is selected from a group consisting of a lattice of terms, an n-best list, or combination thereof;
generating a query certainty vector from the intermediate representation; and
comparing the query certainty vector to each of the document feature vectors to retrieve a ranked result set of documents.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12.
13. A system for indexing and retrieving documents, comprising:
a plurality of documents, each document having an associated document feature vector;
a database indexing each of the plurality of documents according to the associated document feature vector;
a speech recognition engine for converting a spoken query to an intermediate representation representing possible sequential combinations of terms in the spoken query;
a query module, comprising a processor, the query module configured for generating a query certainty vector from the intermediate representation; and
a comparator configured to compare the query certainty vector to each of the document feature vectors to retrieve a ranked result set of documents.
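Claim 1 allows the intermediate representation to be a lattice of terms. One common way to turn a word lattice into per-term certainties is the forward-backward computation of edge posteriors, sketched below. This is offered only as an illustration under stated assumptions. The claims do not specify this computation, and the function name `term_certainties`, the edge tuple layout, and the requirement that node ids be topologically ordered are all hypothetical.

```python
from collections import defaultdict


def term_certainties(edges, start, end):
    """Sum forward-backward edge posteriors per term.

    edges is a list of (src, dst, term, weight) tuples describing a
    directed acyclic lattice. Assumptions: node ids are integers with
    src < dst on every edge, and weights are non-negative (they need
    not be normalized).
    """
    # Forward pass: total weight of all partial paths from start to a node.
    forward = defaultdict(float)
    forward[start] = 1.0
    for src, dst, _, weight in sorted(edges, key=lambda e: e[0]):
        forward[dst] += forward[src] * weight

    # Backward pass: total weight of all partial paths from a node to end.
    backward = defaultdict(float)
    backward[end] = 1.0
    for src, dst, _, weight in sorted(edges, key=lambda e: e[1], reverse=True):
        backward[src] += weight * backward[dst]

    total = forward[end]  # weight of all complete paths
    certainty = defaultdict(float)
    for src, dst, term, weight in edges:
        # Share of the total path weight that passes through this edge.
        certainty[term] += forward[src] * weight * backward[dst] / total
    return dict(certainty)


# Two competing words in each of two slots (illustrative weights only).
lattice = [
    (0, 1, "recognize", 3.0),
    (0, 1, "wreck", 2.0),
    (1, 2, "speech", 7.0),
    (1, 2, "beach", 3.0),
]
certainties = term_certainties(lattice, start=0, end=2)
```

Here the weights are deliberately unnormalized to show that the division by the total path weight yields values that behave like probabilities within each slot. If a term can occur more than once on the same path, the summed value is an expected count rather than a probability and may exceed 1.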
Specification