Concept based cross media indexing and retrieval of speech documents

US 20070299838A1
Filed: 06/01/2007
Published: 12/27/2007
Est. Priority Date: 06/02/2006
Status: Active Grant

Abstract

Indexing, searching, and retrieving the content of speech documents (including but not limited to recorded books, audio broadcasts, and recorded conversations) is accomplished by finding and retrieving speech documents that are related to a query term at a conceptual level, even if the speech documents do not contain the spoken (or textual) query terms. Concept-based cross-media information retrieval is used. A term-phoneme/document matrix is constructed from a training set of documents. Documents are then added to the matrix constructed from the training data. Singular Value Decomposition is used to compute a vector space from the term-phoneme/document matrix. The result is a lower-dimensional numerical space where term-phoneme and document vectors are related conceptually as nearest neighbors. A query engine computes a cosine value between the query vector and all other vectors in the space and returns a list of those term-phonemes and/or documents with the highest cosine values.
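
The matrix construction and decomposition described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the toy corpus, the `ph:` prefix for phoneme symbols, the unit-length column normalization, and the choice to scale vectors by singular values are all assumptions made for the example, since the text here does not specify them.

```python
import numpy as np

# Hypothetical toy corpus: each document is a bag of word terms plus
# phoneme symbols (the "ph:" prefix is an illustrative convention only).
docs = {
    "text_1": ["storm", "rain", "storm"],
    "speech_1": ["ph:s", "ph:t", "ph:ao", "ph:r", "ph:m", "rain"],
    "speech_2": ["ph:s", "ph:t", "ph:ao", "ph:r", "ph:m"],
    "text_2": ["market", "stock"],
}

def build_matrix(docs):
    """One row per distinct term or phoneme, one column per document."""
    rows = sorted({tok for toks in docs.values() for tok in toks})
    row_index = {tok: i for i, tok in enumerate(rows)}
    names = list(docs)
    m = np.zeros((len(rows), len(names)))
    for j, name in enumerate(names):
        for tok in docs[name]:
            m[row_index[tok], j] += 1.0  # raw occurrence count
    return rows, names, m

def normalize_columns(m):
    """Assumed normalization: scale each document column to unit length."""
    norms = np.linalg.norm(m, axis=0)
    norms[norms == 0] = 1.0  # leave empty documents untouched
    return m / norms

def concept_space(m, k):
    """Truncated SVD: keep the k largest singular values and vectors."""
    u, s, vt = np.linalg.svd(m, full_matrices=False)
    term_vecs = u[:, :k] * s[:k]      # one k-dimensional vector per row
    doc_vecs = vt[:k, :].T * s[:k]    # one k-dimensional vector per column
    return term_vecs, doc_vecs

rows, names, counts = build_matrix(docs)
normalized = normalize_columns(counts)
term_vecs, doc_vecs = concept_space(normalized, k=2)
```

In this sketch, terms and phonemes share one row space, which is what lets a textual query land near speech documents that were indexed only by their phonemes.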

13 Claims

1. A method of cross media indexing, registering and retrieving speech documents comprising the steps of:
- registering a set of training documents;
- pre-processing each training document;
- constructing a terms-phonemes/document matrix from the training document metadata where a row is created for each term and each phoneme in the training documents and a column is created for each training document;
- normalizing entries in the terms-phonemes/document matrix;
- computing a concept vector space from the training documents by computing from the terms-phonemes/document matrix;
- computing vectors for new documents and adding the vectors to the vector space;
- searching the computed vector space for vectors that are close to a vector computed for a query term or phoneme; and
- providing a list of those speech and/or text documents with the highest values.

2. A method as set forth in claim 1, wherein said pre-processing comprises creating a record for each training document including creating metadata for each training document.

3. A method as set forth in claim 1, wherein said pre-processing comprises:
- transcribing phonetically each speech document into an intermediate representative language;
- converting each document from native format to UTF-8 format;
- segmenting each document; and
- enqueuing each document for cataloging.

4. A method as set forth in claim 3, wherein said segmenting comprises tokenizing each phonetic transcription and converted text so that counts for index terms and phonemes are obtained.

5. A method as set forth in claim 1, wherein said computing a concept vector space comprises using a Singular Value Decomposition technique.

6. A method as set forth in claim 1, wherein said computing vectors for new documents and adding the vectors to the vector space comprises creating a vector for each document by summing the term-phoneme vectors for words and phonemes the document contains, each term-phoneme vector weighted by its respective word or phoneme count.

7. A method as set forth in claim 1, wherein said searching the computed vector space for vectors that are close to a vector computed for query terms or phonemes comprises computing a cosine value between a query vector and all other vectors in the space, and returning a list of textual and/or speech documents with the highest cosine values.
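
The weighted-sum document vector of claim 6 and the cosine ranking of claim 7 might look something like the sketch below. The two-dimensional `term_vectors` values, the document names, and the `ph:` prefix are invented for illustration; in practice the vectors would come from the previously computed concept space.

```python
import math

# Hypothetical term-phoneme vectors standing in for rows of a concept
# space computed earlier (the numbers are made up for this example).
term_vectors = {
    "storm": (0.9, 0.1),
    "rain": (0.8, 0.2),
    "ph:s": (0.7, 0.0),
    "stock": (0.1, 0.9),
}

def fold_in(counts, term_vectors):
    """Sum the vectors of the tokens a document contains, weighted by count."""
    dims = len(next(iter(term_vectors.values())))
    vec = [0.0] * dims
    for token, count in counts.items():
        if token not in term_vectors:
            continue  # assumption: tokens unseen in training are skipped
        for i, x in enumerate(term_vectors[token]):
            vec[i] += count * x
    return vec

def cosine(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)

def rank(query_vec, doc_vectors, top_n=3):
    """Return (name, cosine) pairs for the top_n closest documents."""
    scored = [(cosine(query_vec, v), name) for name, v in doc_vectors.items()]
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored[:top_n]]

# New documents are described only by their token counts.
doc_vectors = {
    "weather_audio": fold_in({"ph:s": 3, "rain": 1}, term_vectors),
    "finance_text": fold_in({"stock": 2}, term_vectors),
}
query_vec = fold_in({"storm": 1}, term_vectors)
ranked = rank(query_vec, doc_vectors)
```

Here the query "storm" never occurs in `weather_audio`, yet that document ranks first because its phoneme and term vectors point in a similar direction.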

8. A system for cross media indexing, registering and retrieving speech documents comprising the steps of:
- document collection means for registering a set of training documents, preparing the set of training documents for cataloging; and
- indexing the set of training documents, including document terms and phonemes;
- pre-processor for pre-processing each training document and computing vectors forming a concept-vector space from the training documents by computing vectors from the set of training documents;
- terms-phonemes/document matrix constructed from the training document metadata where a row is created for each term and each phoneme in the training documents and a column is created for each training document, and entries are normalized in the terms-phonemes/document matrix;
- singular value decomposition means for computing a vector space from the terms-phonemes/document matrix;
- said pre-processor also pre-processing each new document and computing vectors from the new documents and adding the vectors to the vector space; and
- query engine for searching the computed vector space for vectors that are close to a vector computed for one or more query terms or phonemes; and
- providing a list of those textual and/or speech documents with the highest values.

9. A system as set forth in claim 8, wherein said pre-processor creates a record for each training document including creating metadata for each training document.

10. A system as set forth in claim 8, wherein said pre-processor:
- transcribes phonetically each speech document into an intermediate representative language;
- converts each document from native format to UTF-8 format;
- segments each document; and
- enqueues each document for cataloging.

11. A system as set forth in claim 10, wherein said pre-processor segments each document by tokenizing each phonetic transcription and converted text so that counts for index terms and phonemes are obtained.

12. A system as set forth in claim 8, wherein said pre-processor further computes vectors for new documents and adds the vectors to the vector space for each document by summing the term or phoneme vectors for words or phonemes the document contains, each term and phoneme vector being weighted by its respective word or phoneme count.

13. A system as set forth in claim 8, wherein said search engine searches the computed vector space for vectors that are close to a vector computed for a query term or phoneme by computing a cosine value between a query vector and all other vectors in the space, and returning a list of textual and/or speech documents with the highest cosine values.
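
The pre-processing steps recited in claims 3, 4, 10 and 11 (UTF-8 conversion, tokenizing, counting, enqueuing) could be sketched as below. This is only an illustration under stated assumptions: the phonetic transcription is taken as already produced by some external recognizer and supplied as space-separated symbols, and the function names, the `ph:` prefix, and the record layout are hypothetical.

```python
import re
from collections import Counter
from queue import Queue

def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Convert a document from its native encoding to UTF-8 bytes."""
    return raw.decode(source_encoding).encode("utf-8")

def tokenize_text(text: str) -> list:
    """Split converted text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())

def tokenize_phonemes(transcription: str) -> list:
    """Split a space-separated phonetic transcription into phoneme tokens.

    The "ph:" prefix keeps phonemes distinct from word terms; it is an
    illustrative convention, not something the source specifies.
    """
    return ["ph:" + p for p in transcription.split()]

def preprocess(doc_id, raw, source_encoding, phonetic_transcription=""):
    """Build a record holding term and phoneme counts for one document."""
    text = to_utf8(raw, source_encoding).decode("utf-8")
    counts = Counter(tokenize_text(text))
    counts.update(tokenize_phonemes(phonetic_transcription))
    return {"id": doc_id, "counts": counts}

# Documents waiting to be cataloged.
catalog_queue = Queue()
catalog_queue.put(
    preprocess(
        "doc-1",
        "Café storm, storm!".encode("latin-1"),
        "latin-1",
        phonetic_transcription="S T AO R M",
    )
)
```

The resulting counts are the per-document weights that a later step would use when filling a matrix column or summing term-phoneme vectors.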

Current Assignee: Nytell Software LLC (Intellectual Ventures LLC)
Original Assignee: Telcordia Licensing Co LLC (Telefonaktiebolaget LM Ericsson)
Inventors: Devasis Bassu, Dennis Egan, Clifford Behrens
Granted Patent: US 7,716,221 B2
US Class Current: 1/1
CPC Class Codes:
- G06F 16/3343 using phonetics
- G06F 16/685 using automatically derived...
