Subtitle generation and retrieval combining document with speech recognition

  • US 7,739,116 B2
  • Filed: 01/23/2006
  • Issued: 06/15/2010
  • Est. Priority Date: 12/21/2004
  • Status: Expired due to Fees
First Claim

1. An apparatus for recognizing speech in a presentation to generate a subtitle corresponding to the speech, said apparatus comprising:

    a text extraction unit that receives presentation text and its attributes from a presentation document, and stores said text and attributes in the text attribute database on a page-by-page basis, wherein the attributes comprise a title, character size, character underlining, or boldface character;

    a morphological analysis unit that morphologically analyzes the presentation text stored in the text attribute database, decomposes said presentation text into words, and stores the words in a word attribute database;

    a common keyword generation unit that extracts the words and their attributes from the word attribute database, determines whether or not a word has been successfully extracted, initializes attribute weights of the words and extracts the attribute weights from an attribute weight database and sums them if it is determined that the word extraction is successful, extracts keywords that are found in the presentation document and assigns weights to the keywords, then selects as an additional keyword to add to the keyword database any word that has been determined, based on time and attribute weight, to represent a high level of importance among the words contained in the presentation;

    a dictionary registration unit that adds the keywords registered in a keyword database to a dictionary database that is consulted at time of speech recognition;

    a voice recognition unit that recognizes the speech in the presentation in consultation with the dictionary database by: acquiring correspondence between a lapse of time from a start of the presentation and a result of voice recognition every moment, stores a correspondence between the time and the result of voice recognition in a subtitle database;

    a page-time recording unit that detects a page-changing event and stores the events as timestamps in a page-time database;

    a common keyword regeneration unit that initializes the keyword database, extracts a word, an attribute of the word and information about the page where the word appeared from the word attribute database, and further assigns weight depending on a number of times the keyword appeared as the voice in the presentation;

    a display control unit that reads a correspondence between the time and the result of speech recognition from the subtitle database, and displays said correspondence on a subtitle candidate display region, causes keywords stored in the keyword database, presentation text stored in the text attribute database, and a master subtitle stored in a master subtitle database to cooperate together for display as a subtitle to the presentation, and accesses the page-time database and specifies the page corresponding to the result of voice recognition on the basis of the time information;

    a display unit comprising: the subtitle candidate display region, a common keyword list display region, a presentation text display region, and a master subtitle display region;

    a speaker note generation unit that generates speaker notes from subtitles stored in the subtitle database and embeds them in presentation documents;

    the text attribute database;

    the word attribute database that stores the words obtained as a result of the decomposition performed by the morphological analysis unit, and their attributes;

    the attribute weight database that stores presentation word attributes and their assigned weights;

    the keyword database that stores the weighted words as keywords;

    the dictionary database;

    the subtitle database that stores, together with the time, the result of speech-recognition as the subtitle;

    the page-time database that records a time that the page is turned and a time when the next page is turned, and calculates the weight of the keywords in the page based on a duration during which the page in question is displayed in the presentation, when it is determined that extraction of the word has been successful; and

    a master subtitle database that stores master subtitles on a page-by-page basis.
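
As an illustration only, the following is a minimal Python sketch of two mechanisms the claim recites: mapping a recognition timestamp to a page via the page-time records, and weighting keywords by attribute, page display duration, and spoken frequency. The claim does not disclose numeric weights or a combining formula, so the values in ATTRIBUTE_WEIGHTS, the multiplicative combination, and the names page_at and keyword_weights are all assumptions made for this example, not the patented method.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical attribute weights: the claim names the attributes
# (title, character size, underlining, boldface) but gives no values.
ATTRIBUTE_WEIGHTS = {"title": 3.0, "large": 2.0, "underline": 1.5, "bold": 1.5}


def page_at(page_turn_times, t):
    """Return the 1-based page displayed at time t.

    page_turn_times is a sorted list in which entry i is the time
    (seconds from the start of the presentation) at which page i + 1
    was first shown, as a page-time database might record it.
    """
    return max(1, bisect_right(page_turn_times, t))


def keyword_weights(words, page_turn_times, end_time, subtitles):
    """Score each word from the presentation text.

    words:     list of (word, page, attributes) tuples
    subtitles: list of (time, recognized_text) tuples
    The score multiplies attribute weight by page display duration and
    then by (1 + spoken count); this formula is an assumption.
    """
    # Display duration of each page: time until the next page turn.
    bounds = list(page_turn_times) + [end_time]
    durations = {i + 1: bounds[i + 1] - bounds[i]
                 for i in range(len(page_turn_times))}

    # Count how often each token occurs in the recognized speech.
    spoken = defaultdict(int)
    for _, text in subtitles:
        for token in text.lower().split():
            spoken[token] += 1

    # Sum attribute weights per occurrence, scaled by page duration.
    weights = defaultdict(float)
    for word, page, attrs in words:
        attr_weight = 1.0 + sum(ATTRIBUTE_WEIGHTS.get(a, 0.0) for a in attrs)
        weights[word.lower()] += attr_weight * durations.get(page, 0.0)

    # Boost words that were actually spoken during the presentation.
    return {w: score * (1 + spoken[w]) for w, score in weights.items()}


page_turns = [0.0, 60.0, 150.0]
words = [("Subtitle", 1, ["title"]), ("speech", 2, ["bold"]), ("speech", 3, [])]
subtitles = [(10.0, "this talk covers subtitle generation"),
             (70.0, "speech recognition uses speech models")]
weights = keyword_weights(words, page_turns, 200.0, subtitles)
```

In this sketch a subtitle recognized at 70.0 seconds is assigned to page 2 by page_at(page_turns, 70.0), and words on pages that stay on screen longer, carry emphasis attributes, or are spoken more often receive higher scores.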
