Speech retrieval method, speech retrieval apparatus, and program for speech retrieval apparatus

US 9,626,958 B2
Filed: 05/27/2016
Issued: 04/18/2017
Est. Priority Date: 04/21/2014
Status: Active Grant

2 Assignments

0 Petitions

Abstract

A method for speech retrieval includes acquiring a keyword designated by a character string, and a phoneme string or a syllable string, detecting one or more coinciding segments by comparing a character string that is a recognition result of word speech recognition with words as recognition units performed for speech data to be retrieved and the character string of the keyword, calculating an evaluation value of each of the one or more segments by using the phoneme string or the syllable string of the keyword to evaluate a phoneme string or a syllable string that is recognized in each of the detected one or more segments and that is a recognition result of phoneme speech recognition with phonemes or syllables as recognition units performed for the speech data, and outputting a segment in which the calculated evaluation value exceeds a predetermined threshold.
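The flow in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the patented implementation. The `Word` and `Phone` records, the `retrieve` function, the 0.1-second margin, the 0.7 threshold, and the normalized similarity score are all assumptions introduced here. It also assumes the keyword matches a single recognized word, and it converts an edit distance into a similarity so that a higher value is better and "exceeds a threshold" keeps the good segments.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Word:
    """One unit of the word-level recognition result (hypothetical record)."""
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Phone:
    """One unit of the phoneme-level recognition result (hypothetical record)."""
    symbol: str
    start: float
    end: float


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance computed by dynamic programming, row by row."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def similarity(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Assumed evaluation value: 1.0 for identical strings, lower otherwise."""
    longest = max(len(ref), len(hyp))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(ref, hyp) / longest


def retrieve(keyword_text: str,
             keyword_phones: Sequence[str],
             words: List[Word],
             phones: List[Phone],
             margin: float = 0.1,
             threshold: float = 0.7) -> List[Tuple[float, float, float]]:
    """Return (start, end, score) for segments whose score exceeds threshold."""
    hits = []
    for w in words:
        # Step 1: character-string match against the word recognition result.
        if w.text != keyword_text:
            continue
        # Step 2: expand the segment at both ends by a fixed margin and
        # collect the phonemes recognized entirely inside the window.
        lo, hi = w.start - margin, w.end + margin
        heard = [p.symbol for p in phones if p.start >= lo and p.end <= hi]
        # Step 3: score the recognized phonemes against the keyword's phonemes.
        score = similarity(keyword_phones, heard)
        # Step 4: output only segments whose score exceeds the threshold.
        if score > threshold:
            hits.append((w.start, w.end, score))
    return hits


words = [Word("tokyo", 1.0, 1.5), Word("tokyo", 3.0, 3.5)]
phones = [
    Phone("t", 0.98, 1.1), Phone("o", 1.1, 1.2), Phone("k", 1.2, 1.3),
    Phone("y", 1.3, 1.4), Phone("o", 1.4, 1.52),
    Phone("k", 3.0, 3.1), Phone("y", 3.1, 3.2), Phone("o", 3.2, 3.3),
    Phone("t", 3.3, 3.4), Phone("o", 3.4, 3.5),
]
hits = retrieve("tokyo", ["t", "o", "k", "y", "o"], words, phones)
print(hits)
```

In this made-up data the word recognizer reports the keyword twice, but the phoneme result confirms only the first segment, so the second is filtered out as a likely word-recognition error. The first segment's phonemes start slightly before and end slightly after the word boundaries, which is why the window is widened before the comparison.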

13 Claims

1. A method for speech retrieval, comprising:

  detecting one or more coinciding segments for speech data by comparing a character string of a recognition result and a character string of a keyword, the keyword being designated by the character string and a phoneme string or a syllable string;

  calculating an evaluation value of each of the one or more coinciding segments using the phoneme string or the syllable string of the keyword to evaluate a phoneme string or a syllable string recognized in each of the one or more coinciding segments and that is a recognition result of phoneme speech recognition, wherein the phoneme string or the syllable string associated with each of the coinciding segments is a phoneme string or a syllable string associated with a segment in which a start and an end of the segment is expanded by a predetermined time; and

  outputting a segment in which the calculated evaluation value exceeds a predetermined threshold.

  - 2. The method according to claim 1, wherein the recognition result of word speech recognition includes words as recognition units performed for the speech data.
  - 3. The method according to claim 1, wherein the recognition result of phoneme speech recognition includes phonemes or syllables as recognition units performed for the speech data.
  - 4. The method according to claim 1, wherein calculating comprises comparing a phoneme string or a syllable string that is an N-best recognition result of phoneme speech recognition with phonemes or syllables as recognition units performed for speech data associated with each of the detected one or more coinciding segments and the phoneme string of the keyword to set a rank of the coinciding N-best recognition result as the evaluation value.
  - 5. The method according to claim 1, wherein calculating comprises setting, as the evaluation value, an edit distance between a phoneme string or a syllable string that is a 1-best recognition result of phoneme speech recognition with phonemes or syllables as recognition units performed for speech data associated with each of the detected one or more coinciding segments and the phoneme string or the syllable string of the keyword.
  - 6. The method according to claim 5, wherein the edit distance is a distance matched by matching based on dynamic programming.
  - 7. The method according to claim 1, further comprising performing word speech recognition of the speech data to be retrieved, with words as recognition units.
  - 8. The method according to claim 1, further comprising performing phoneme speech recognition of the speech data associated with each of the detected one or more coinciding segments, with phonemes or syllables as recognition units.
  - 9. The method according to claim 1, further comprising performing phoneme speech recognition of the speech data to be retrieved, with phonemes or syllables as recognition units.
  - 10. The method of claim 1, wherein the calculating the evaluation value of each of the one or more coinciding segments further includes using the character string of the keyword to evaluate the character string in each of the detected one or more coinciding segments.
  - 11. The method of claim 1, further comprising adjusting the predetermined threshold to alter at least one of a precision value and a recall value of the output segment, the precision value being positively correlated with the predetermined threshold and the recall value being negatively correlated with the predetermined threshold.
  - 12. The method of claim 11, wherein the precision value is a ratio of retrieval results satisfying a retrieval request to all documents satisfying the retrieval request.
  - 13. The method of claim 11, wherein the recall value is a ratio of retrieval results satisfying a retrieval request to all retrieval results.
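The threshold trade-off described in claim 11 can be illustrated with a small sweep. This is a sketch under stated assumptions: the `precision_recall` helper, the scores, and the relevance labels are invented for illustration. The code follows the conventional information-retrieval definitions (precision is the share of output segments that are correct, recall is the share of all correct segments that are output), which differ from the wording of claims 12 and 13.

```python
from typing import List, Tuple


def precision_recall(scored: List[Tuple[float, bool]],
                     total_relevant: int,
                     threshold: float) -> Tuple[float, float]:
    """Compute (precision, recall) for segments whose score exceeds threshold.

    scored holds (evaluation value, is the segment truly relevant) pairs.
    total_relevant counts all relevant segments in the data, including any
    that the detection step never proposed.
    """
    output = [relevant for score, relevant in scored if score > threshold]
    correct = sum(output)
    # Precision is undefined when nothing is output; 1.0 is an assumed convention.
    precision = correct / len(output) if output else 1.0
    recall = correct / total_relevant if total_relevant else 1.0
    return precision, recall


# Hypothetical evaluation values for five detected segments.
scored = [(0.9, True), (0.8, True), (0.6, False), (0.5, True), (0.3, False)]
total_relevant = 4  # one relevant segment was never detected

for threshold in (0.2, 0.55, 0.7):
    precision, recall = precision_recall(scored, total_relevant, threshold)
    print(f"threshold={threshold:.2f} precision={precision:.2f} recall={recall:.2f}")
```

On this data, raising the threshold from 0.2 to 0.7 raises precision from 0.60 to 1.00 and lowers recall from 0.75 to 0.50, which matches the direction of the correlation stated in claim 11.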

Current Assignee: SINOEAST CONCEPT LIMITED (Tencent Holdings Limited)
Original Assignee: SINOEAST CONCEPT LIMITED (Tencent Holdings Limited)
Inventors: Kurata, Gakuto; Nagano, Tohru; Nishimura, Masafumi
Primary Examiner(s): ALBERTALLI, BRIAN LOUIS
Application Number: US15/167,683
Publication Number: US 20160275940A1
Time in Patent Office: 326 Days

CPC Class Codes

G10L 15/02   Feature extraction for spee...

G10L 15/04   Segmentation; Word boundary...

G10L 15/08   Speech classification or se...

G10L 15/187   Phonemic context, e.g. pron...

G10L 2015/025   Phonemes, fenemes or fenone...

G10L 2015/027   Syllables being the recogni...

G10L 2015/088   Word spotting

G10L 25/51   for comparison or discrimin...
