Subtitle generation and retrieval combining document processing with voice processing

US 8,155,969 B2
Filed: 08/11/2009
Issued: 04/10/2012
Est. Priority Date: 12/21/2004
Status: Active Grant

7 Assignments

0 Petitions

Abstract

An apparatus for retrieving a character string includes storage for storing first text data obtained by recognizing a voice in a presentation, second text data extracted from document data used in the presentation, and associated information of the first text data and the second text data. The apparatus also includes a retrieval unit for retrieving, by use of the associated information, the character string from text data composed of the first text data and the second text data.

18 Citations

14 Claims

1. An apparatus for retrieving a character string, said apparatus comprising:
- a central processing unit configured for:
  - recognizing speech in a presentation to produce subtitles;
  - initializing a keyword database;
  - extracting keywords from document data used in the presentation;
  - adding the extracted keywords to a voice recognition dictionary; and
  - assigning word attribute weights to the keywords extracted from the presentation in accordance with the word attributes of said keywords, wherein said word attributes describe how the keywords appear in the document data;
  - recording a time that the page is turned and a time when a next page is turned to determine time spent on the page;
  - calculating a weight of the keywords on a page based on a combination of:
    - a duration during which said page is displayed in the presentation when it is determined that the extraction of the keyword has been successful; and
    - the word attribute weights assigned to the keywords, the word attributes comprising at least one of: a title, character size, character underlining, and boldface character;
- storage for storing:
  - the subtitles;
  - the keywords;
  - associated information of the subtitles and the keywords;
  - the word attributes read from a word attribute database; and
  - the time spent on the page when it is determined that the extraction of the keywords has been successful, wherein the time spent on the page is stored as a timestamp;
- the word attribute database storing the word attributes; and
- the central processing unit used as a retrieval device for retrieving, by use of the associated information, the character string from text data composed of the subtitles and the keywords.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
  - 2. The apparatus according to claim 1, further comprising a display for displaying the result of retrieval conducted by the retrieval device, together with the associated information about the result of retrieval.
  - 3. The apparatus of claim 1 wherein the central processing unit is further configured for setting a dictionary which belongs to a category suitable for the keyword as the dictionary to be consulted at a time of recognizing the speech.
  - 4. The apparatus of claim 1 wherein the central processing unit is further configured for displaying the keyword that has been extracted together with the subtitle.
  - 5. The apparatus of claim 4 wherein the subtitle is displayed together with information about a specific page where the subtitle appears.
  - 6. The apparatus of claim 5 wherein the information is text data contained in the specific page.
  - 7. The apparatus of claim 5 wherein the information concerns speech generated with reference to a specific page in the past.
  - 8. The apparatus of claim 7 wherein the central processing unit is further configured for embedding the specific subtitle in the specific page of the document.
  - 9. The apparatus of claim 7 wherein the central processing unit is further configured for retrieving character strings, with a retrieval target range extended from the specific subtitle to text data contained in the specific page.
  - 10. The apparatus of claim 1, wherein calculating the weight of the keywords on the page is further based on a number of times the keyword appeared in the speech of the presentation.
  - 11. The apparatus of claim 1 wherein the central processing unit is further configured for registering the subtitle that has been created so that the subtitle can be consulted at the presentation.
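
The keyword weighting recited in claim 1 can be illustrated with a minimal sketch. The claims name the word attributes (title, character size, underlining, boldface) and say the weight is based on "a combination of" display duration and attribute weights, but they give no numeric values and no formula. The weight table, the `PageVisit` type, the function names, and the choice of a simple product are all assumptions made here for illustration only.

```python
from dataclasses import dataclass

# Hypothetical attribute weights: the claims list the attributes but no values.
ATTRIBUTE_WEIGHTS = {"title": 3.0, "large": 2.0, "underline": 1.5, "bold": 1.5}


@dataclass
class PageVisit:
    page: int
    turned_at: float       # time (seconds) at which this page was turned to
    next_turned_at: float  # time (seconds) at which the next page was turned to

    @property
    def duration(self) -> float:
        # Time spent on the page, derived from the two recorded page turns.
        return self.next_turned_at - self.turned_at


def attribute_weight(attributes: set[str]) -> float:
    """Sum the weights of a keyword's attributes; plain text falls back to 1.0."""
    return sum(ATTRIBUTE_WEIGHTS.get(a, 0.0) for a in attributes) or 1.0


def keyword_weights(keywords: dict[str, set[str]], visit: PageVisit) -> dict[str, float]:
    """Combine attribute weight and display duration (assumed here: a product)."""
    return {kw: attribute_weight(attrs) * visit.duration for kw, attrs in keywords.items()}


visit = PageVisit(page=2, turned_at=60.0, next_turned_at=90.0)
weights = keyword_weights({"subtitle": {"title", "bold"}, "dictionary": set()}, visit)
print(weights)
```

A keyword that appears in a bold title on a page shown for 30 seconds ends up weighted more heavily than a plain-text keyword on the same page. Claim 10's additional factor, the number of times the keyword was spoken, could be folded in as another multiplier under the same assumptions.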

12. A non-transitory storage device comprising a program product which allows a computer to realize:
- using a processor device configured to perform:
  - recognizing speech in a presentation to produce subtitles;
  - extracting keywords from document data used in the presentation;
  - adding the extracted keywords to a voice recognition dictionary; and
  - assigning word attribute weights to the keywords extracted from the presentation based on word attributes read from a word attribute database, wherein said word attributes comprise at least one of: a title, character size, character underlining, and boldface character;
  - recording a time that the page is turned and a time when a next page is turned to determine time spent on the page;
  - storing the time spent as a timestamp;
  - calculating a weight of the keywords on a page based on a combination of:
    - a duration during which said page is displayed in the presentation when it is determined that the extraction of the keyword has been successful; and
    - the word attribute weights assigned to the keywords;
- a function of determining, among the subtitles obtained by recognizing speech generated with reference to a predetermined document, a specific subtitle obtained by recognizing the speech generated with reference to a specific page of the document; and
- a function of storing a correspondence between the specific subtitle and the specific page.
- Dependent claims (13, 14)
  - 13. The program product according to claim 12, further allowing the computer to realize a function of displaying the specific subtitle together with specific information about the specific page.
  - 14. The program product according to claim 12, further allowing the computer to realize a function of retrieving character strings, with the retrieval target range extended from the specific subtitle to text data contained in the specific page.
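
The subtitle-to-page correspondence of claim 12 and the extended retrieval range of claim 14 can be sketched as follows. This is one possible reading, not the patented implementation: it assumes each subtitle carries a start time and that a subtitle belongs to whichever page was on screen at that time. The function names `assign_pages` and `search` and the sample data are hypothetical.

```python
from bisect import bisect_right


def assign_pages(subtitles, page_turns):
    """Map each subtitle to the page displayed when it was spoken.

    subtitles:  list of (time_seconds, text)
    page_turns: list of (time_seconds, page_number), sorted by time
    """
    turn_times = [t for t, _ in page_turns]
    mapping = []
    for time, text in subtitles:
        # Index of the last page turn at or before the subtitle's time.
        idx = bisect_right(turn_times, time) - 1
        page = page_turns[idx][1] if idx >= 0 else None
        mapping.append((text, page))
    return mapping


def search(query, mapping, page_text):
    """Match the query against each subtitle and against the text of its page."""
    q = query.lower()
    hits = []
    for text, page in mapping:
        in_subtitle = q in text.lower()
        in_page = page is not None and q in page_text.get(page, "").lower()
        if in_subtitle or in_page:
            hits.append((page, text))
    return hits


page_turns = [(0.0, 1), (45.0, 2)]
subtitles = [(5.0, "Welcome to the talk"), (50.0, "Now the results")]
page_text = {1: "Introduction", 2: "Recognition accuracy"}

mapping = assign_pages(subtitles, page_turns)
hits = search("accuracy", mapping, page_text)
print(hits)
```

In this example the query "accuracy" never occurs in the recognized speech, yet the second subtitle is still returned because the word appears on the page it was spoken against.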

Current Assignee: Cerence Operating Company (Cerence Inc.)
Original Assignee: International Business Machines Corporation
Inventors: Miyamoto, Kohtaroh; Nagishi, Noriko; Arakawa, Kenichi
Primary Examiner(s): Chawan, Vijay B
Assistant Examiner(s): Borsetti, Greg
Application Number: US12/538,944
Publication Number: US 20100036664A1
Time in Patent Office: 973 Days
Field of Search: 704/3, 704/270, 704/246, 704/275, 707/3, 707/736, 707/748, 707/711, 707/725, 715/203, 715/204
US Class Current: 704/270
CPC Class Codes:
- G06F 40/258   Heading extraction; Automat...
- G10L 15/26   Speech to text systems G10L...
- G10L 2015/088   Word spotting
- H04N 21/4884   for displaying subtitles
- H04N 5/44504   Circuit details of the addi...
