Generating speech data collection prompts

US 8,700,396 B1
Filed: 10/08/2012
Issued: 04/15/2014
Est. Priority Date: 09/11/2012
Status: Active Grant

Abstract

This document generally describes computer technologies relating to generating speech data collection prompts, such as textual scripts and/or textual scenarios. Speech data collection prompts for a particular language can be generated based on a variety of factors, including the frequency with which linguistic elements (e.g., phonemes, syllables, words, phrases) in the particular language occur in one or more corpora of textual information associated with the particular language. Textual prompts can also and/or alternatively be generated based on statistics for previously recorded speech data.
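
As a rough illustration of the frequency counting described above, the sketch below tallies relative frequencies of words and two-word phrases in a small in-memory corpus. The function name `feature_frequencies`, the lowercased whitespace tokenizer, and the sample corpus are assumptions made for illustration; the abstract does not prescribe a tokenizer or a data layout.

```python
from collections import Counter


def feature_frequencies(corpus):
    """Return relative frequencies of words and two-word phrases.

    `corpus` is an iterable of text strings (for example, search queries).
    Lowercased whitespace tokenization is an illustrative assumption.
    """
    counts = Counter()
    for text in corpus:
        tokens = text.lower().split()
        counts.update(tokens)  # single words
        # adjacent word pairs, joined into two-word phrases
        counts.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {feature: n / total for feature, n in counts.items()}


# Hypothetical corpus of three short queries.
corpus = ["weather in boston", "weather tomorrow", "pizza near boston"]
freqs = feature_frequencies(corpus)
```

Phonemes and syllables, which the abstract also lists, would need a pronunciation lexicon or syllabifier in place of the whitespace split; that part is left out of the sketch.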

128 Citations

19 Claims

1. A computer-implemented method comprising:
- receiving, at a computer system, a request to generate a textual prompt to provide to a user for generating speech data in a particular language;
  
  in response to receiving the request, determining frequencies of occurrence of linguistic features of the particular language in one or more corpora that are associated with the particular language, wherein the one or more corpora include content that was generated by people who use the particular language and that reflects current use of the particular language;
  
  identifying, by the computer system, quantities of speech samples that include the linguistic features from a repository of previously recorded speech samples;
  
  weighting the frequencies of occurrence of the linguistic features based on the quantities of speech samples that include the linguistic features, wherein the weighting generates weighted frequencies for the linguistic features, wherein a first linguistic feature is determined to have a weighted frequency that is greater than a weighted frequency for a second linguistic feature as a result of the computer system executing computer code that includes both of the following conditions and determining that one or more of the following conditions are satisfied:
  
  (i) the first linguistic feature has a same or greater frequency of occurrence in the one or more corpora and has fewer speech samples in the repository of previously recorded speech samples than the second linguistic feature, and (ii) the first linguistic feature has a greater frequency of occurrence in the one or more corpora and has the same or fewer speech samples in the repository of previously recorded speech samples than the second linguistic feature;
  
  generating, by the computer system, one or more textual prompts based on the weighted frequencies for the linguistic features, wherein each of the one or more textual prompts comprises a combination of two or more of the linguistic features; and
  
  providing, by the computer system, the generated one or more textual prompts.
- Dependent claims:
  - 2. The computer-implemented method of claim 1, wherein the request identifies a particular user to whom the request pertains;
    - the method further comprising:
      
      identifying, by the computer system, one or more characteristics of the particular user's voice from a speech sample for the particular user; and
      
      selecting, from the repository of previously recorded speech samples, a subset of the previously recorded speech samples that include voices that have one or more characteristics that match, within a threshold value, the one or more characteristics of the particular user's voice;
      
      wherein the quantities of speech samples are identified from the subset of the repository of previously recorded speech samples.
  - 3. The computer-implemented method of claim 2, wherein the one or more characteristics of the particular user's voice include one or more of:
    - a pitch of the particular user's voice, a vocal tract length of the particular user's voice, an accent of the particular user with which the particular user speaks, and a cadence with which the particular user speaks.
  - 4. The computer-implemented method of claim 2, wherein the generated one or more textual prompts are provided to a computing device that is associated with the particular user.
  - 5. The computer-implemented method of claim 1, wherein the request identifies a particular acoustic environment to which the request pertains;
    - the method further comprising:
      
      selecting, from the repository of previously recorded speech samples, a subset of the previously recorded speech samples that were recorded in acoustic environments that match, within a threshold value, the particular acoustic environment;
      
      wherein the quantities of speech samples are identified from the subset of the repository of previously recorded speech samples.
  - 6. The computer-implemented method of claim 5, wherein the particular acoustic environment comprises a mobile telephone device into which a user is speaking and from which audio signals are being received.
  - 7. The computer-implemented method of claim 1, wherein generating the one or more textual prompts comprises:
    - repeatedly performing the following until the one or more textual prompts have been generated:
      
      selecting a combination of candidate linguistic features from the linguistic features based on the weighted frequencies; and
      
      grammar checking and spell checking the combination of candidate linguistic features, wherein the combination of candidate linguistic features is identified as one of the one or more textual prompts when the combination of candidate linguistic features passes the grammar checking and the spell checking.
  - 8. The computer-implemented method of claim 7, wherein the combination of candidate linguistic features are selected based on the candidate linguistic features having weighted frequencies that are at a threshold level or greater.
  - 9. The computer-implemented method of claim 7, wherein the combination of candidate linguistic features are selected based on the candidate linguistic features having weighted frequencies that are greatest among the weighted frequencies for the linguistic features that have not yet been considered in combination together.
  - 10. The computer-implemented method of claim 1, wherein the linguistic features include one or more of:
    - phonemes, syllables, words, and phrases.
  - 11. The computer-implemented method of claim 1, wherein the one or more textual prompts comprise one or more textual scripts that are generated for users to read aloud without modification when providing a speech sample.
  - 12. The computer-implemented method of claim 1, wherein the one or more textual prompts comprise one or more scenarios that include incomplete information regarding the one or more scenarios so that users providing speech samples from the one or more scenarios ad lib at least a portion of the speech samples.
  - 13. The computer-implemented method of claim 1, wherein the one or more corpora include, at least, a corpus of search query logs that include user-generated search queries in the particular language.
  - 14. The computer-implemented method of claim 1, wherein the one or more corpora include, at least, a corpus of electronic documents that include text in the particular language.
  - 15. The computer-implemented method of claim 1, wherein the one or more corpora include, at least, a corpus of user-generated textual content on one or more social networks, the user-generated textual content being in the particular language.
  - 16. The computer-implemented method of claim 1, wherein the one or more corpora includes information that identifies amounts of time that have elapsed since portions of the content were added to the one or more corpora, and wherein the frequencies of occurrence of the linguistic features are weighted further based on the amounts of time.
  - 17. The computer-implemented method of claim 1, further comprising selecting the one or more corpora from among a plurality of corpora based on amounts of time that have elapsed since portions of the content were added to the one or more corpora.
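
Conditions (i) and (ii) of claim 1 constrain how weighted frequencies are ordered but do not fix a formula. One formula that produces those orderings whenever the corpus frequencies are positive is `frequency / (1 + samples)`; the sketch below uses it purely as an assumption, and the function name `weighted_frequencies` and the sample numbers are hypothetical.

```python
def weighted_frequencies(corpus_freq, sample_counts):
    """Weight each feature's corpus frequency by how under-recorded it is.

    Assumed formula (not taken from the claims): frequency / (1 + samples).
    A feature that is common in the corpora but rare in the repository of
    recorded speech ends up with a high weight.
    """
    return {
        feature: freq / (1 + sample_counts.get(feature, 0))
        for feature, freq in corpus_freq.items()
    }


# Hypothetical corpus frequencies and recorded-sample counts.
corpus_freq = {"boston": 0.20, "weather": 0.20, "pizza": 0.10}
sample_counts = {"boston": 9, "weather": 3, "pizza": 3}
weights = weighted_frequencies(corpus_freq, sample_counts)

# Condition (i): same frequency, fewer samples, so "weather" outranks "boston".
assert weights["weather"] > weights["boston"]
# Condition (ii): greater frequency, same samples, so "weather" outranks "pizza".
assert weights["weather"] > weights["pizza"]
```

Restricting `sample_counts` to a subset of the repository, as claims 2 and 5 describe for matching voices or acoustic environments, would change only the counts passed in, not the weighting itself.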

18. A computer system comprising:
- one or more computing devices;
  
  an interface of the one or more computing devices that is programmed to receive requests to generate a textual prompt to provide to a user for generating speech data in a particular language;
  
  one or more corpora that are accessible to the one or more computing devices and that include content that was generated by people who use the particular language and that reflects current use of the particular language;
  
  a frequency module that is installed on the one or more computing devices and that is programmed to determine frequencies of occurrence of linguistic features of the particular language in the one or more corpora;
  
  a repository of previously recorded speech samples that are accessible to the one or more computing devices and that is separate from the one or more corpora;
  
  a quantity module that is installed on the one or more computing devices and that is programmed to identify quantities of speech samples that include the linguistic features from the repository of previously recorded speech samples;
  
  a weighting module that is installed on the one or more computing devices and that is programmed to weight the frequencies of occurrence of the linguistic features based on the quantities of speech samples that include the linguistic features, wherein the weighting generates weighted frequencies for the linguistic features; and
  
  a textual prompt generator that is installed on the one or more computing devices and that is programmed to generate one or more textual prompts based on the weighted frequencies for the linguistic features, wherein each of the one or more textual prompts comprises a combination of two or more of the linguistic features, wherein the weighting module is further programmed to generate a weighted frequency for a first linguistic feature that is greater than a weighted frequency for a second linguistic feature as a result of executing computer code that includes both of the following conditions and determining that one or more of the following conditions are satisfied:
  
  (i) the first linguistic feature has a same or greater frequency of occurrence in the one or more corpora and has fewer speech samples in the repository of previously recorded speech samples than the second linguistic feature, and (ii) the first linguistic feature has a greater frequency of occurrence in the one or more corpora and has the same or fewer speech samples in the repository of previously recorded speech samples than the second linguistic feature.

19. A computer program product embodied in a non-transitory computer-readable storage device storing instructions that, when executed, cause a computer system with one or more processors to perform operations comprising:
- receiving a request to generate a textual prompt to provide to a user for generating speech data in a particular language;
  
  in response to receiving the request, determining frequencies of occurrence of linguistic features of the particular language in one or more corpora that are associated with the particular language, wherein the one or more corpora include content that was generated by people who use the particular language and that reflects current use of the particular language;
  
  identifying quantities of speech samples from a repository of previously recorded speech samples that include the linguistic features;
  
  weighting the frequencies of occurrence of the linguistic features based on the quantities of speech samples that include the linguistic features, wherein the weighting generates weighted frequencies for the linguistic features, wherein a first linguistic feature is determined to have a weighted frequency that is greater than a weighted frequency for a second linguistic feature as a result of executing computer code that includes both of the following conditions and determining that one or more of the following conditions are satisfied:
  
  (i) the first linguistic feature has a same or greater frequency of occurrence in the one or more corpora and has fewer speech samples in the repository of previously recorded speech samples than the second linguistic feature, and (ii) the first linguistic feature has a greater frequency of occurrence in the one or more corpora and has the same or fewer speech samples in the repository of previously recorded speech samples than the second linguistic feature;
  
  generating one or more textual prompts based on the weighted frequencies for the linguistic features, wherein each of the one or more textual prompts comprises a combination of two or more of the linguistic features; and
  
  providing the generated one or more textual prompts.
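
The generating step, read together with the select-then-check loop of claim 7, might be sketched as follows. Everything here is an assumption for illustration: the function name `generate_prompts`, ranking combinations by summed weight, joining features with spaces, and the caller-supplied `passes_checks` callable that stands in for grammar and spell checking.

```python
from itertools import combinations


def generate_prompts(weights, passes_checks, max_prompts=2, size=2):
    """Try feature combinations in descending order of summed weight.

    `passes_checks` is a hypothetical stand-in for grammar and spell
    checking: it takes a candidate string and returns True or False.
    Each prompt combines `size` features (two or more, per the claims).
    """
    ranked = sorted(
        combinations(sorted(weights), size),
        key=lambda combo: sum(weights[f] for f in combo),
        reverse=True,
    )
    prompts = []
    for combo in ranked:
        if len(prompts) == max_prompts:
            break
        candidate = " ".join(combo)
        if passes_checks(candidate):  # keep only candidates that pass
            prompts.append(candidate)
    return prompts


# Hypothetical weighted frequencies for three word features.
weights = {"weather": 0.05, "pizza": 0.025, "boston": 0.02}
# Toy checker: accept only candidates that mention "boston".
prompts = generate_prompts(
    weights, lambda text: "boston" in text.split(), max_prompts=1
)
```

A real checker would have to reorder or inflect the selected features into a grammatical sentence; the toy checker above only filters candidates, which keeps the loop structure visible.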

Current Assignee
Google LLC (Alphabet Inc.)
Original Assignee
Google Inc. (Alphabet Inc.)
Inventors
Mengibar, Pedro J. Moreno; Weinstein, Eugene
Primary Examiner(s)
Desir, Pierre-Louis
Assistant Examiner(s)
Kovacek, David M.

Application Number

US13/647,021
Time in Patent Office

554 Days
Field of Search

704/1-10, 704/231-245, 704/246-257, 704/258-261, 704/270-270.1, 704/276, 704/E17.001-E17.014, 704/E15.001-E15.05, 704/E19.001-E19.049, 704/E13.001-E13.014
US Class Current

704/235
CPC Class Codes

G10L 13/027   Concept to speech synthesis...

G10L 15/063   Training

G10L 2015/0638   Interactive procedures
