APPARATUS FOR SYNCHRONOUSLY PROCESSING TEXT DATA AND VOICE DATA

US 20150379996A1
Filed: 06/29/2015
Published: 12/31/2015
Est. Priority Date: 06/30/2014
Status: Active Grant

Abstract

The apparatus for synchronously processing text data and voice data comprises: a storing unit for storing text data and voice data; a text data dividing section for dividing the text data; a text data phoneme converting section for phonemically converting the divided text data; a text data phoneme conversion accumulated value calculating section for calculating accumulated values of text data phoneme conversion values; a voice data dividing section for dividing the voice data; a reading data phoneme converting section for phonemically converting the divided voice data; a voice data phoneme conversion accumulated value calculating section for calculating accumulated values of voice data phoneme conversion values; a phrase corresponding data producing section for producing phrase corresponding data; and an output section for synchronously outputting the text data and the divided voice data.

6 Claims

1. An apparatus for synchronously processing text data and voice data, comprising:
- a storing unit for storing text data constituted by a plurality of phrases and voice data of the text data;
  
  a text data dividing section for dividing the text data stored in the storing unit into the phrases and storing the divided text data, with identifiers which respectively correspond to the divided text data and indicate the division order, in the storing unit;
  
  a text data phoneme converting section for phonemically converting the divided text data, phrase by phrase, to obtain text data phoneme conversion values and storing the text data phoneme conversion values, which respectively correspond to the phrases, in the storing unit;
  
  a text data phoneme conversion accumulated value calculating section for calculating accumulated values of the text data phoneme conversion value of each phrase of the divided text data and storing the accumulated values, which respectively correspond to the phrases of the divided text data, in the storing unit;
  
  a voice data dividing section for extracting a silent segment, from the voice data, on the basis of a predetermined silent segment decision datum, dividing the voice data in the extracted silent segment, and storing the divided voice data, with identifiers which respectively correspond to the divided voice data and indicate the division order, in the storing unit;
  
  a reading data phoneme converting section for phonemically converting the divided voice data, which have been divided division range by division range, to obtain voice data phoneme conversion values and storing the voice data phoneme conversion values, which respectively correspond to the division ranges, in the storing unit;
  
  a voice data phoneme conversion accumulated value calculating section for calculating accumulated values of the voice data phoneme conversion value of each division range of the divided voice data and storing the accumulated values, which respectively correspond to the division ranges of the divided voice data, in the storing unit;
  
  a phrase corresponding data producing section for extracting the nearest approximate values of the voice data phoneme accumulated values with respect to the text data phoneme conversion accumulated values corresponding to the phrases of the divided text data, and producing phrase corresponding data, in which the voice data phoneme conversion accumulated values respectively corresponding to the phrases of the divided text data are associated with identifiers indicating playback order of the phrases of the divided text data; and
  
  an output section for outputting the corresponding phrases of the text data and the divided voice data, which correspond to each other, on the basis of the phrase corresponding data.
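
The matching step recited in claim 1 can be illustrated with a minimal sketch. It assumes that a "phoneme conversion value" is simply the number of phonemes in a phrase or in a voice division range, which the claim itself does not state. The names `nearest_index` and `match_phrases`, the dictionary layout, and the sample counts are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate


def nearest_index(sorted_values, target):
    """Return the index of the value in sorted_values closest to target."""
    pos = bisect_left(sorted_values, target)
    if pos == 0:
        return 0
    if pos == len(sorted_values):
        return len(sorted_values) - 1
    before, after = sorted_values[pos - 1], sorted_values[pos]
    # Ties go to the earlier value.
    return pos - 1 if target - before <= after - target else pos


def match_phrases(text_counts, voice_counts):
    """Build phrase corresponding data from per-phrase and per-range counts.

    Assumption: each count is the number of phonemes in one phrase
    (text side) or in one division range (voice side).
    """
    if not voice_counts:
        raise ValueError("voice_counts must not be empty")
    text_accum = list(accumulate(text_counts))
    voice_accum = list(accumulate(voice_counts))
    rows = []
    for phrase_id, target in enumerate(text_accum):
        idx = nearest_index(voice_accum, target)
        rows.append({
            "phrase_id": phrase_id,        # playback order of the phrase
            "text_accum": target,          # accumulated text-side value
            "voice_range_id": idx,         # range whose accumulated value is nearest
            "voice_accum": voice_accum[idx],
        })
    return rows


# Hypothetical counts: three phrases, four voice division ranges.
for row in match_phrases([10, 8, 12], [9, 4, 5, 11]):
    print(row)
```

With these sample counts the accumulated text values are 10, 18, 30 and the accumulated voice values are 9, 13, 18, 29, so the phrases map to voice ranges 0, 2, and 3. Accumulating before matching lets one phrase absorb several voice ranges when the reader pauses inside it.
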
- - 2. The apparatus according to claim 1, further comprising:
    - a duplicately-associated phoneme conversion accumulated value extracting section for detecting existence of duplicate association of the voice data phoneme conversion accumulated values in the phrase corresponding data; and
      
      a resetting section for resetting the phrase corresponding data so as to eliminate the duplicate association of the voice data phoneme conversion accumulated values in the phrase corresponding data, wherein the resetting section defines all of the divided voice data, whose voice data phoneme conversion accumulated values are duplicately-associated, as resetting segment data when the duplicate association of the voice data phoneme conversion accumulated values is detected, the resetting section performs:
      
      a process of extracting a second silent segment, from the resetting segment data, on the basis of a second silent segment decision datum whose condition is more restricted than that of the silent segment decision datum;
      
      a process of producing second divided voice data, which are obtained by dividing the resetting segment data on the basis of a result of extracting the second silent segment;
      
      a process of calculating second phoneme conversion values, which are obtained by phonemically converting the divided segments of the second divided voice data, and calculating a voice data phoneme conversion accumulated value of the resetting segment data, which is accumulated, in division order, in the resetting segment data;
      
      a process of extracting the nearest approximate values of the voice data phoneme conversion accumulated values of the resetting segment data with respect to the text data phoneme conversion accumulated values corresponding to the phrases of the divided text data in the resetting segment data, and making the extracted values correspond to the phrases of the divided text data in the resetting segment data;
      
      a process of producing phrase corresponding data of the resetting segment, in which accumulated values of the divided voice data phoneme conversion in the resetting segment data are respectively corresponded to the phrases of the divided text data in the resetting segment data; and
      
      a process of producing corrected phrase corresponding data by integrating the phrase corresponding data with the phrase corresponding data of the resetting segment on the basis of the identifiers in the divided text data.
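
The duplicate detection and re-division of claim 2 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the voice data are modeled as a list of amplitude samples, a "silent segment decision datum" is modeled as an amplitude threshold plus a minimum run length, and the "more restricted" second datum is interpreted as a shorter minimum run so that more split points are found. The names `find_duplicates` and `split_on_silence` are hypothetical.

```python
from collections import defaultdict


def find_duplicates(voice_range_ids):
    """Map each voice range id to the phrase ids sharing it (two or more)."""
    owners = defaultdict(list)
    for phrase_id, range_id in enumerate(voice_range_ids):
        owners[range_id].append(phrase_id)
    return {r: p for r, p in owners.items() if len(p) > 1}


def split_on_silence(samples, threshold, min_silence_len):
    """Split samples at runs of at least min_silence_len quiet samples.

    Returns (start, end) index pairs of the non-silent division ranges.
    Assumption: a sample is "quiet" when abs(sample) <= threshold.
    """
    ranges, start, quiet_run = [], None, 0
    for i, s in enumerate(samples):
        if abs(s) <= threshold:
            quiet_run += 1
            if start is not None and quiet_run == min_silence_len:
                # Close the open range at the sample where the silence began.
                ranges.append((start, i - min_silence_len + 1))
                start = None
        else:
            if start is None:
                start = i
            quiet_run = 0
    if start is not None:
        ranges.append((start, len(samples)))
    return ranges


# Hypothetical data: phrases 1 and 2 were both matched to voice range 2.
print(find_duplicates([0, 2, 2, 3]))

samples = [5, 5, 0, 5, 5, 0, 0, 0, 5, 5]
print(split_on_silence(samples, threshold=1, min_silence_len=3))  # first datum
print(split_on_silence(samples, threshold=1, min_silence_len=1))  # second datum
```

In this toy input the first datum finds one pause and yields two ranges, while the second datum also catches the single quiet sample and yields three. The extra ranges give the nearest-value matching more candidate boundaries inside the resetting segment data.
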
  - 3. The apparatus according to claim 2, further comprising a forcible processing section, wherein, when the duplicately-associated phoneme conversion accumulated value extracting section detects the duplicate association of the voice data phoneme conversion accumulated values of the resetting segment data in the corrected phrase corresponding data, the forcible processing section performs:
    - a process of producing forcible processing object data, which include the voice data phoneme conversion accumulated value of the resetting segment data from which the duplicate association has been detected by the duplicately-associated phoneme conversion accumulated value extracting section, the second divided voice data being corresponded thereto and the divided text data;
      
      a process of calculating a total value of the text data phoneme conversion values of the divided text data of the forcible processing object data, and calculating a ratio of the text data phoneme conversion values of the divided text data of the forcible processing object data to the total value;
      
      a process of forming forcibly-divided segments in the second voice data according to the calculated ratio of the text data phoneme conversion values, and calculating voice data phoneme conversion accumulated values of the resetting segments in the forcibly-divided segments;
      
      a process of extracting voice data phoneme conversion accumulated values of the forcible process object data, each of which is the nearest to the text data phoneme conversion accumulated values of the phrases of the divided text data in the forcible process object data, and making the extracted values correspond to the phrases of the divided text data in the forcible process object data;
      
      a process of producing phrase corresponding data of the forcible process object data, in which the voice data phoneme conversion accumulated values in the forcible process object data are respectively associated with the phrases of the divided text data in the forcible process object data; and
      
      a process of producing forcibly-corrected phrase corresponding data by integrating the phrase corresponding data, the phrase corresponding data in the resetting segments and the phrase corresponding data in the forcible process object data on the basis of the identifiers in the divided text data.
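
The ratio-based forcible division of claim 3 can be sketched in a few lines. The sketch assumes that the ratio is applied to the duration of the voice segment, expressed here as sample indices, which the claim does not specify. The function name `forcible_split` and the sample numbers are hypothetical.

```python
def forcible_split(start, end, text_counts):
    """Split the sample range [start, end) in proportion to text_counts.

    Assumption: text_counts holds the phoneme conversion value of each
    phrase that still shares this voice segment after resetting.
    """
    total = sum(text_counts)
    if total <= 0:
        raise ValueError("text_counts must sum to a positive value")
    length = end - start
    bounds, running, prev = [], 0, start
    for count in text_counts:
        running += count
        # Integer arithmetic on the running total, so the last cut equals end.
        cut = start + length * running // total
        bounds.append((prev, cut))
        prev = cut
    return bounds


# Three phrases with 6, 10 and 4 phonemes share samples 1000..2000.
print(forcible_split(1000, 2000, [6, 10, 4]))
# [(1000, 1300), (1300, 1800), (1800, 2000)]
```

Cutting on the running total rather than on each phrase's own share avoids rounding drift, so the forcibly-divided segments always cover the whole range.
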
  - 4. The apparatus according to claim 1, wherein the voice data phoneme conversion accumulated value calculating section converts the voice data into text data once by voice recognition processing, and phonemically converts the text data of the voice data.
  - 5. The apparatus according to claim 2, wherein the voice data phoneme conversion accumulated value calculating section converts the voice data into text data once by voice recognition processing, and phonemically converts the text data of the voice data.
  - 6. The apparatus according to claim 3, wherein the voice data phoneme conversion accumulated value calculating section converts the voice data into text data once by voice recognition processing, and phonemically converts the text data of the voice data.
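
The two-stage conversion in claims 4 to 6 (voice to text, then text to phonemes) can be sketched with injected callables. Nothing here refers to a real recognizer or phonemizer: `recognize` and `to_phonemes` are hypothetical stand-ins, and the whitespace-separated tokens in the sample transcripts merely play the role of phonemes.

```python
from typing import Callable, Iterable, List


def voice_phoneme_counts(
    segments: Iterable[bytes],
    recognize: Callable[[bytes], str],
    to_phonemes: Callable[[str], List[str]],
) -> List[int]:
    """Phoneme count per divided voice segment, via an intermediate transcript."""
    counts = []
    for audio in segments:
        transcript = recognize(audio)                # voice data -> text data
        counts.append(len(to_phonemes(transcript)))  # text data -> phonemes
    return counts


# Stand-ins for illustration only; neither is a real recognizer or phonemizer.
fake_transcripts = {b"seg0": "ko n ni chi wa", b"seg1": "se ka i"}
counts = voice_phoneme_counts(
    [b"seg0", b"seg1"],
    recognize=lambda audio: fake_transcripts[audio],
    to_phonemes=lambda text: text.split(),
)
print(counts)  # [5, 3]
```

Running the text side and the voice side through the same `to_phonemes` step would keep the two sets of accumulated values on a comparable scale.
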

Current Assignee
Shinano Kenshi Kabushiki Kaisha
Original Assignee
Shinano Kenshi Kabushiki Kaisha Address 386-0498 Japan
Inventors
KODAIRA, Tomoki, NISHIZAWA, Tatsuo

Granted Patent

US 9,679,566 B2
US Class Current

1/1
CPC Class Codes

G10L 15/26   Speech to text systems G10L...

G10L 2015/025   Phonemes, fenemes or fenone...

G10L 25/48   specially adapted for parti...

G10L 25/87   Detection of discrete point...
