CONTENT SELECTION USING SPEECH RECOGNITION

US 20080130699A1
Filed: 12/05/2006
Published: 06/05/2008
Est. Priority Date: 12/05/2006
Status: Abandoned Application

Abstract

Disclosed are a method and wireless device for selecting a content file using speech recognition. The method includes establishing a set of tagged text items wherein each tagged text item is uniquely associated with one content file of the set of content files. At least one audible utterance (226) is received (804) from a user. A phoneme lattice (302) is generated (808) based on the audible utterance (226). A phoneme lattice statistical model is generated (810) based on the phoneme lattice (302). A score is assigned (1008) to the tagged text items based on probabilistic estimates in the phoneme lattice statistical model. A list of high scoring tagged text items is presented (1014) so that a selection of a content file may be made. A word lattice (402) and a word lattice statistical model are also used in some embodiments.
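
As a rough illustration of the lattice-to-model step described in the abstract, the following Python sketch estimates phoneme probabilities from a small lattice using expected counts. The arc format, the example phonemes and arc probabilities, the function name `build_bigram_model`, and the choice of a bigram (N = 2) model estimated by relative frequency are all assumptions made for illustration; the text above does not state how the statistical model is computed.

```python
from collections import defaultdict

# Hypothetical lattice: (start_node, end_node, phoneme, arc_probability).
# Node ids are integers in topological order: 0 is the start, 3 is the end.
ARCS = [
    (0, 1, "k", 0.7), (0, 1, "g", 0.3),
    (1, 2, "ae", 1.0),
    (2, 3, "t", 0.6), (2, 3, "d", 0.4),
]


def build_bigram_model(arcs, n_nodes):
    """Return (unigram, bigram) probability tables estimated from the lattice."""
    # Forward pass: total probability of reaching each node from the start.
    alpha = [0.0] * n_nodes
    alpha[0] = 1.0
    for s, e, _, p in sorted(arcs, key=lambda a: a[0]):
        alpha[e] += alpha[s] * p
    # Backward pass: total probability of reaching the end from each node.
    beta = [0.0] * n_nodes
    beta[-1] = 1.0
    for s, e, _, p in sorted(arcs, key=lambda a: a[1], reverse=True):
        beta[s] += p * beta[e]
    total = alpha[-1]

    # Expected counts of single phonemes and of adjacent phoneme pairs.
    uni = defaultdict(float)
    bi = defaultdict(float)
    for s, e, x, p in arcs:
        uni[x] += alpha[s] * p * beta[e] / total
    for s1, e1, x, p1 in arcs:
        for s2, e2, y, p2 in arcs:
            if e1 == s2:  # the second arc starts where the first one ends
                bi[(x, y)] += alpha[s1] * p1 * p2 * beta[e2] / total

    n = sum(uni.values())
    unigram = {x: c / n for x, c in uni.items()}               # p(x | L)
    bigram = {(x, y): c / uni[x] for (x, y), c in bi.items()}  # p(y | x, L)
    return unigram, bigram


unigram, bigram = build_bigram_model(ARCS, n_nodes=4)
print(round(bigram[("ae", "t")], 3))
```

The forward and backward passes avoid enumerating every path through the lattice, which matters once the lattice has more than a handful of alternatives per position.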

20 Claims

1. A method used with a wireless communication device for selecting a content file from a set of content files using speech recognition, the method comprising:
- establishing a set of tagged text items wherein each tagged text item is uniquely associated with one content file of the set of content files;
  
  receiving at least one audible utterance from a user;
  
  identifying a set of phonemes associated with the received audible utterance;
  
  generating a phoneme lattice based on the identified set of phonemes;
  
  generating a phoneme lattice statistical model based on the phoneme lattice;
  
  assigning a score to each tagged text item in a subset of the set of tagged text items based on the phoneme lattice statistical model; and
  
  presenting one or more of the tagged text items having a score that is above a threshold.
  - 2. The method of claim 1, wherein the subset of the set of tagged text items is the entire set of tagged text items.
  - 3. The method of claim 2, wherein the score assigned to each tagged text item is determined from an estimated probability, $p(x_1 x_2 \ldots x_M \mid L) = p(x_1 \mid L)\, p(x_2 \mid x_1, L) \cdots p(x_M \mid x_{M-1}, \ldots, x_{M+1-N}, L)$, where $p(x_1 x_2 \ldots x_M \mid L)$ is the estimated probability that a tagged text item having a phoneme string $x_1 x_2 \ldots x_M$ occurred in the utterance from which phoneme lattice (L) was generated, and is determined from the probabilistic estimates $p(x_1 \mid L)$, $p(x_2 \mid x_1, L)$, $\ldots$, $p(x_M \mid x_{M-1}, \ldots, x_{M+1-N}, L)$ included in the phoneme lattice statistical model.
  - 4. The method of claim 1, wherein the subset of the set of tagged text items is determined by:
    - generating a set of indexing N-grams from the set of tagged text items, wherein each indexing N-gram is a subset of at least one of the tagged text items;
      
      assigning a score to each indexing N-gram in the set of indexing N-grams based on the phoneme lattice statistical model; and
      
      including in the subset of the tagged text items those tagged text items that include indexing N-grams having an assigned score greater than a first threshold.
  - 5. The method of claim 4, wherein each indexing N-gram in the set of indexing N-grams is unique and is a sequential subset of at least one tagged text item.
  - 6. The method of claim 4, wherein assigning a score to each indexing N-gram in a set of indexing N-grams further comprises:
    - transcribing each indexing N-gram into a corresponding phoneme string; and
      
      assigning a score to each indexing N-gram based on probabilistic estimates obtained from the phoneme lattice statistical model.
  - 7. The method of claim 6, wherein the score assigned to each indexing N-gram is determined from an estimated probability, $p(x_1 x_2 \ldots x_N \mid L) = p(x_1 \mid L)\, p(x_2 \mid x_1, L) \cdots p(x_N \mid x_{N-1}, \ldots, x_{N-M}, L)$, where $p(x_1 x_2 \ldots x_N \mid L)$ is the estimated probability that an indexing N-gram having a phoneme string $x_1 x_2 \ldots x_N$ occurred in the utterance from which phoneme lattice (L) was generated, and is determined from the probabilistic estimates $p(x_1 \mid L)$, $p(x_2 \mid x_1, L)$, $\ldots$, $p(x_M \mid x_{M-1}, \ldots, x_{M+1-N}, L)$ included in the phoneme lattice statistical model.
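
The chain-rule score recited in claims 3 and 7, followed by the thresholded presentation step of claim 1, can be sketched in Python as below. This is only an illustration under stated assumptions: the model table `MODEL`, its probabilities, the phoneme transcriptions in `TAGGED`, the floor value for unseen events, and the helper names `log_score` and `present` are hypothetical and do not come from the claims. Log probabilities are used to avoid numerical underflow on long phoneme strings.

```python
import math

# Hypothetical phoneme lattice statistical model for N = 2.
# Keys are (history, phoneme); an empty history gives p(x_1 | L).
MODEL = {
    ((), "k"): 0.7, ((), "g"): 0.3,
    (("k",), "ae"): 1.0, (("g",), "ae"): 1.0,
    (("ae",), "t"): 0.6, (("ae",), "d"): 0.4,
}

# Hypothetical tagged text items with their phoneme transcriptions.
TAGGED = {
    "cat": ["k", "ae", "t"],
    "gad": ["g", "ae", "d"],
    "dog": ["d", "ao", "g"],
}


def log_score(phonemes, model, order=2, floor=1e-6):
    """log p(x_1 ... x_M | L) by the chain rule with an (order - 1)-phoneme history."""
    total = 0.0
    for i, x in enumerate(phonemes):
        # Truncate the history to at most the previous (order - 1) phonemes.
        history = tuple(phonemes[max(0, i - order + 1):i])
        # Unseen events receive a small floor probability (an assumption).
        total += math.log(model.get((history, x), floor))
    return total


def present(tagged, model, threshold):
    """Return item names scoring above the threshold, best first."""
    scored = {name: log_score(ph, model) for name, ph in tagged.items()}
    kept = [name for name, s in scored.items() if s > threshold]
    return sorted(kept, key=lambda name: scored[name], reverse=True)


results = present(TAGGED, MODEL, threshold=math.log(0.1))
print(results)
```

With these made-up values, "cat" scores 0.7 × 1.0 × 0.6 = 0.42 and "gad" scores 0.3 × 1.0 × 0.4 = 0.12, so both clear the 0.1 threshold, while "dog" falls to the floor probability and is dropped.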

8. A method used with a wireless communication device for selecting a content file from a set of content files, the method comprising:
- establishing a set of tagged text items wherein each tagged text item is uniquely associated with one content file of the set of content files;
  
  generating a set of indexing N-grams from the set of tagged text items;
  
  receiving at least one audible utterance from a user;
  
  generating a phoneme lattice based on the received at least one audible utterance;
  
  generating a phoneme lattice statistical model based on the phoneme lattice;
  
  assigning a score to each indexing N-gram in the set of indexing N-grams based on the phoneme lattice statistical model;
  
  determining a subset of the set of indexing N-grams, wherein the indexing N-grams in the subset have an assigned score greater than a first threshold;
  
  generating a word lattice based on the subset of indexing N-grams;
  
  generating a word lattice statistical model based on the word lattice;
  
  assigning a score to each tagged text item in a subset of the set of tagged text items, wherein the subset comprises tagged text items that are associated with the subset of indexing N-grams, and wherein the score assigned to each tagged text item is based on the word lattice statistical model; and
  
  presenting one or more of the tagged text items having scores above a second threshold.
  - 9. The method of claim 8, wherein each indexing N-gram in the set of indexing N-grams is unique and is a sequential subset of at least one tagged text item.
  - 10. The method of claim 8, wherein assigning a score to each indexing N-gram in a set of indexing N-grams further comprises:
    - transcribing each N-gram into a corresponding phoneme string; and
      
      assigning a score to each indexing N-gram based on probabilistic estimates obtained from the phoneme lattice statistical model.
  - 11. The method of claim 8, wherein the score assigned to each indexing N-gram is determined from an estimated probability, $p(x_1 x_2 \ldots x_M \mid L) = p(x_1 \mid L)\, p(x_2 \mid x_1, L) \cdots p(x_M \mid x_{M-1}, \ldots, x_{M+1-N}, L)$, where $p(x_1 x_2 \ldots x_M \mid L)$ is the estimated probability that an indexing N-gram having a phoneme string $x_1 x_2 \ldots x_M$ occurred in the utterance from which phoneme lattice (L) was generated, and is determined from probabilistic estimates $p(x_1 \mid L)$, $p(x_2 \mid x_1, L)$, $\ldots$, $p(x_M \mid x_{M-1}, \ldots, x_{M+1-N}, L)$ included in the phoneme lattice statistical model.
  - 12. The method of claim 8, wherein the score assigned to each tagged text item is determined from an estimated probability $p(x_1 x_2 \ldots x_M \mid W) = p(x_1 \mid W)\, p(x_2 \mid x_1, W) \cdots p(x_M \mid x_{M-1}, \ldots, x_{M+1-N}, W)$, where $p(x_1 x_2 \ldots x_M \mid W)$ is the estimated probability that a tagged text item having a word string $x_1 x_2 \ldots x_M$ occurred in the utterance from which word lattice (W) was generated, and is determined from the probabilistic estimates $p(x_1 \mid W)$, $p(x_2 \mid x_1, W)$, $\ldots$, $p(x_M \mid x_{M-1}, \ldots, x_{M+1-N}, W)$ of the word lattice statistical model.
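
The indexing N-gram steps of claims 8 and 9 (generate unique, sequential N-grams from the tagged text items, keep those scoring above a first threshold, and collect the associated tagged text items) might look like the following Python sketch. The example titles, the `SCORES` table standing in for scores obtained from the phoneme lattice statistical model, the threshold value, and the function names `indexing_ngrams` and `shortlist` are assumptions for illustration only. The later word lattice stage is not shown, since the claims do not say how the word lattice is constructed.

```python
# Hypothetical tagged text items (e.g. song titles), one per content file.
TAGGED_ITEMS = ["Blue Moon", "Moon River", "Yellow Submarine"]

# Hypothetical N-gram scores; a real system would derive these from the
# phoneme lattice statistical model after transcribing each N-gram.
SCORES = {
    ("moon",): 0.8,
    ("blue", "moon"): 0.6,
    ("river",): 0.1,
}


def indexing_ngrams(tagged_items, max_n=2):
    """Map each unique word N-gram to the set of tagged text items containing it."""
    index = {}
    for item in tagged_items:
        words = item.lower().split()
        for n in range(1, max_n + 1):
            # Sequential (contiguous) word subsequences of length n.
            for i in range(len(words) - n + 1):
                gram = tuple(words[i:i + n])
                index.setdefault(gram, set()).add(item)
    return index


def shortlist(index, ngram_score, first_threshold):
    """Keep N-grams scoring above the threshold and collect their tagged items."""
    kept = {g for g in index if ngram_score(g) > first_threshold}
    items = set()
    for g in kept:
        items |= index[g]
    return kept, items


index = indexing_ngrams(TAGGED_ITEMS)
kept, candidates = shortlist(index, lambda g: SCORES.get(g, 0.0), first_threshold=0.5)
print(sorted(kept), sorted(candidates))
```

Because the dictionary keys are the N-grams themselves, each indexing N-gram is stored once even when it appears in several tagged text items, which keeps the scoring pass proportional to the number of distinct N-grams rather than to the total text length.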

13. A wireless communication device comprising:
- a memory;
  
  a processor communicatively coupled to the memory; and
  
  a speech responsive search engine communicatively coupled to the memory and the processor, the speech responsive search engine for:
  
  establishing a set of tagged text items wherein each tagged text item is uniquely associated with one content file of the set of content files;
  
  receiving at least one audible utterance from a user;
  
  identifying a set of phonemes associated with the received audible utterance;
  
  generating a phoneme lattice based on the identified set of phonemes;
  
  creating a phoneme lattice statistical model based on the phoneme lattice;
  
  assigning a score to each tagged text item in a subset of the set of tagged text items based on the phoneme lattice statistical model; and
  
  presenting one or more of the tagged text items having a score that is above a threshold.
  - 14. The wireless communication device of claim 13, wherein the subset of the set of tagged text items is the entire set of tagged text items.
  - 15. The wireless communication device of claim 13, wherein the score assigned to each tagged text item is determined from an estimated probability, $p(x_1 x_2 \ldots x_M \mid L) = p(x_1 \mid L)\, p(x_2 \mid x_1, L) \cdots p(x_M \mid x_{M-1}, \ldots, x_{M+1-N}, L)$, where $p(x_1 x_2 \ldots x_M \mid L)$ is the estimated probability that a tagged text item having a phoneme string $x_1 x_2 \ldots x_M$ occurred in the utterance from which phoneme lattice (L) was generated, and is determined from the probabilistic estimates $p(x_1 \mid L)$, $p(x_2 \mid x_1, L)$, $\ldots$, $p(x_M \mid x_{M-1}, \ldots, x_{M+1-N}, L)$ included in the phoneme lattice statistical model.
  - 16. The wireless communication device of claim 13, wherein the subset of the set of tagged text items is determined by:
    - generating a set of indexing N-grams from the set of tagged text items, wherein each indexing N-gram is a subset of at least one of the tagged text items;
      
      assigning a score to each indexing N-gram in the set of indexing N-grams based on the phoneme lattice statistical model;
      
      including in the subset of the tagged text items those tagged text items that include indexing N-grams having an assigned score greater than a first threshold.
  - 17. The wireless communication device of claim 16, wherein each indexing N-gram in the set of indexing N-grams is unique and is a sequential subset of at least one tagged text item.
  - 18. The wireless communication device of claim 16, wherein assigning a score to each indexing N-gram in a set of indexing N-grams further comprises:
    - transcribing each indexing N-gram into a corresponding phoneme string; and
      
      assigning a score to each indexing N-gram based on probabilistic estimates obtained from the phoneme lattice statistical model.
  - 19. The wireless communication device of claim 18, wherein the score assigned to each indexing N-gram is determined from an estimated probability, $p(x_1 x_2 \ldots x_N \mid L) = p(x_1 \mid L)\, p(x_2 \mid x_1, L) \cdots p(x_N \mid x_{N-1}, \ldots, x_{N-M}, L)$, where $p(x_1 x_2 \ldots x_N \mid L)$ is the estimated probability that an indexing N-gram having a phoneme string $x_1 x_2 \ldots x_N$ occurred in the utterance from which phoneme lattice (L) was generated, and is determined from the probabilistic estimates $p(x_1 \mid L)$, $p(x_2 \mid x_1, L)$, $\ldots$, $p(x_M \mid x_{M-1}, \ldots, x_{M+1-N}, L)$ included in the phoneme lattice statistical model.
  - 20. The wireless communication device of claim 18, wherein the score assigned to each tagged text item in the subset of tagged text items is determined from an estimated probability, $p(x_1 x_2 \ldots x_M \mid L) = p(x_1 \mid L)\, p(x_2 \mid x_1, L) \cdots p(x_M \mid x_{M-1}, \ldots, x_{M+1-N}, L)$, where $p(x_1 x_2 \ldots x_M \mid L)$ is the estimated probability that a tagged text item having a phoneme string $x_1 x_2 \ldots x_M$ occurred in the utterance from which phoneme lattice (L) was generated, and is determined from the probabilistic estimates $p(x_1 \mid L)$, $p(x_2 \mid x_1, L)$, $\ldots$, $p(x_M \mid x_{M-1}, \ldots, x_{M+1-N}, L)$ included in the phoneme lattice statistical model.

Current Assignee: Motorola Mobility, Inc. (Lenovo Group Ltd.)
Original Assignee: Motorola, Inc. (Motorola Solutions, Inc.)
Inventors: Cheng, Yan M.; Ma, Changxue C.
Application Number: US11/566,832
Publication Number: US 20080130699A1
US Class Current: 372/50.12
CPC Class Codes: G06F 16/433 using audio data
