Speech recognition training method for audio and video file indexing on a search engine

  • US 7,272,558 B1
  • Filed: 03/23/2007
  • Issued: 09/18/2007
  • Est. Priority Date: 12/01/2006
  • Status: Active Grant
First Claim

1. A method for indexing audio/video documents through the use of a search engine, the method comprising:

  • providing, to the search engine, a source of training documents comprising textual content;

    the search engine retrieving at least some of the training documents from the source of training documents;

    the search engine extracting the textual content from the retrieved training documents;

    the search engine indexing the textual content;

    training a speech recognition profile using the indexed textual content;

    providing, to the search engine, a source for the audio/video documents each of which comprise an associated audio content;

    the search engine retrieving at least some of the audio/video documents from the source of documents;

    the search engine extracting the associated content from the audio/video documents;

    converting the associated audio content into transcriptions using the trained speech recognition profile;

    the search engine indexing the transcriptions thereby resulting in an indexing of the audio/video documents; and

    saving the indexed transcriptions;

    wherein the training of the speech recognition profile comprises using summary sentences and comparing the number of sentences to a threshold to determine if all sentences will be kept for the training.
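
The final "wherein" clause can be illustrated with a minimal sketch. The claim does not say which way the threshold comparison goes or how summary sentences are produced, so this sketch assumes that a document with no more sentences than the threshold keeps all of them, and a longer one keeps only its summary sentences. The names `split_sentences`, `select_training_sentences`, and `first_n_summary` are hypothetical and do not come from the patent.

```python
import re


def split_sentences(text):
    """Naive sentence splitter (an assumption; the claim names no method)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def select_training_sentences(text, threshold, summarize):
    """Choose the sentences used to train the speech recognition profile.

    Assumed rule: if the document has at most `threshold` sentences, keep
    all of them; otherwise keep only what `summarize` returns.
    """
    sentences = split_sentences(text)
    if len(sentences) <= threshold:
        return sentences
    return summarize(sentences)


def first_n_summary(n):
    """Hypothetical stand-in summarizer: keep the first n sentences."""
    def summarize(sentences):
        return sentences[:n]
    return summarize


if __name__ == "__main__":
    doc = "One. Two. Three. Four."
    print(select_training_sentences(doc, threshold=3, summarize=first_n_summary(2)))
```

Passing the summarizer in as a parameter keeps the threshold decision separate from the choice of summarization technique, which the claim leaves open.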
