Systems and methods for hands-free voice control and voice search

US 8,700,399 B2
Filed: 07/06/2010
Issued: 04/15/2014
Est. Priority Date: 07/06/2009
Status: Active Grant

1 Assignment

0 Petitions

Abstract

In one embodiment the present invention includes a method comprising receiving an acoustic input signal and processing the acoustic input signal with a plurality of acoustic recognition processes configured to recognize the same target sound. Different acoustic recognition processes start processing different segments of the acoustic input signal at different time points in the acoustic input signal. In one embodiment, initial states in the recognition processes may be configured on each time step.
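
The abstract's idea of many recognition processes that start at different time points can be folded into one frame-synchronous pass. The sketch below is illustrative only and is not taken from the patent. It assumes a hypothetical left-to-right model in which state 0 is the only initial state and the last state is the only final state, and it takes precomputed per-frame log-scores `emis[t][s]` as input; all function names are made up for this example. It compares one search per start frame with a single search whose initial state may be re-entered with a fresh score of 0 on every frame.

```python
import math

NEG_INF = -math.inf


def viterbi_from(emis, start):
    """Final-state score per frame for ONE process that starts at frame `start`."""
    n_states = len(emis[0])
    prev = [NEG_INF] * n_states
    out = [NEG_INF] * len(emis)
    for t in range(start, len(emis)):
        cur = [NEG_INF] * n_states
        for s in range(n_states):
            if t == start:
                # A process may only begin in the initial state, with score 0.
                best_in = 0.0 if s == 0 else NEG_INF
            else:
                # Left-to-right topology: stay in s or advance from s - 1.
                best_in = max(prev[s], prev[s - 1] if s > 0 else NEG_INF)
            cur[s] = best_in + emis[t][s]
        prev = cur
        out[t] = cur[-1]
    return out


def many_processes(emis):
    """One process per possible start frame; keep the best score per end frame."""
    runs = [viterbi_from(emis, t0) for t0 in range(len(emis))]
    return [max(run[t] for run in runs) for t in range(len(emis))]


def single_pass(emis):
    """Same result from one search: re-seed the initial state on every frame."""
    n_states = len(emis[0])
    prev = [NEG_INF] * n_states
    out = []
    for frame in emis:
        cur = [NEG_INF] * n_states
        for s in range(n_states):
            if s == 0:
                # Either continue an earlier process or start a new one at 0.
                best_in = max(prev[0], 0.0)
            else:
                best_in = max(prev[s], prev[s - 1])
            cur[s] = best_in + frame[s]
        prev = cur
        out.append(cur[-1])
    return out
```

For any input, `many_processes(emis)` and `single_pass(emis)` should return the same list, but the first costs one full search per start frame while the second costs one search in total. Note that `max(prev[0], 0.0)` is the same as resetting the initial state to 0 whenever its previous score is below 0, which is one simple instance of a threshold-based reset.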

30 Citations

21 Claims

1. A method comprising:
- receiving, in a processor, an acoustic input signal; and
  
  processing the acoustic input signal with a plurality of acoustic recognition processes to recognize a predetermined target sound within the acoustic input signal, the acoustic input signal being temporally divided into a plurality of frames, the plurality of acoustic recognition processes being implemented using a single Viterbi search of a plurality of states corresponding to acoustic units of an acoustic model of the predetermined target sound, the plurality of states including initial states and final states of the acoustic model, wherein

  the initial states are reset on each frame if a score for the initial state on a previous frame is below a first threshold,

  said score is calculated for a plurality of said states on each frame, the calculating comprising increasing the score by a per frame offset so that scores of different durations are comparable, and

  a result is generated when a score of a final state increases above a second threshold.
- Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. The method of claim 1 wherein the score for an initial state is reset to a predetermined value on each frame if the score is less than the first threshold before calculating a current score for the initial state based on a received acoustic unit for a current frame.
  - 3. The method of claim 2 wherein said predetermined value is a constant.
  - 4. The method of claim 2 wherein different initial states are reset to different predetermined values.
  - 5. The method of claim 1 wherein the offset is a constant.
  - 6. The method of claim 1 wherein different states have different associated offsets.
  - 7. The method of claim 1 wherein the plurality of acoustic recognition processes operate on multiple words in parallel.
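
The steps recited in claims 1 to 3 and 5 can be sketched as a small frame-by-frame loop. This is a minimal illustration under stated assumptions, not the patentee's implementation: the function name `spot`, the left-to-right topology with a single initial state (state 0) and a single final state (the last one), the input format (one list of per-state log-likelihoods per frame), and every numeric value are hypothetical. The reset is applied only to the initial state's own update, which is one possible reading of the claim language.

```python
import math


def spot(frames, offset=1.0, first_threshold=0.0,
         reset_value=0.0, second_threshold=2.0):
    """Return the frame indices at which a result is generated.

    frames[t][s] is an assumed log-likelihood of state s on frame t.
    """
    if not frames:
        return []
    n_states = len(frames[0])
    prev = [-math.inf] * n_states
    hits = []
    for t, loglik in enumerate(frames):
        # Reset the initial state before scoring the current frame if its
        # previous score fell below the first threshold (claims 1 and 2).
        entry = prev[0] if prev[0] >= first_threshold else reset_value
        # Every score is increased by a constant per frame offset (claim 5).
        cur = [entry + loglik[0] + offset]
        for s in range(1, n_states):
            # Viterbi step: stay in state s or advance from state s - 1.
            cur.append(max(prev[s], prev[s - 1]) + loglik[s] + offset)
        # A result is generated when the final-state score rises above the
        # second threshold, i.e. on the frame where it crosses it.
        if prev[-1] <= second_threshold < cur[-1]:
            hits.append(t)
        prev = cur
    return hits


# Toy input: two noise frames, then two frames matching each of three states.
noise = [-3.0, -3.0, -3.0]
a = [-0.5, -3.0, -3.0]
b = [-3.0, -0.5, -3.0]
c = [-3.0, -3.0, -0.5]
print(spot([noise, noise, a, a, b, b, c, c, noise]))  # → [6]
```

On noise frames the initial state keeps dropping below the first threshold and is reset, so stale partial matches do not accumulate; once the target starts, the score survives the threshold test and propagates toward the final state. Because the result is tied to the crossing, the frame after the detection does not trigger a second result even though the final score is still high.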

8. A non-transitory computer readable storage medium having stored thereon program code executable by a processor, said program code comprising:
- code that causes the processor to receive an acoustic input signal; and
  
  code that causes the processor to process the acoustic input signal with a plurality of acoustic recognition processes to recognize a predetermined target sound within the acoustic input signal, the acoustic input signal being temporally divided into a plurality of frames, the plurality of acoustic recognition processes being implemented using a single Viterbi search of a plurality of states corresponding to acoustic units of an acoustic model of the predetermined target sound, the plurality of states including initial states and final states of the acoustic model, wherein

  the initial states are reset on each frame if a score for the initial state on a previous frame is below a first threshold,

  said score is calculated for a plurality of said states on each frame, the calculating comprising increasing the score by a per frame offset so that scores of different durations are comparable, and

  a result is generated when a score of a final state increases above a second threshold.
- Dependent claims (9, 10, 11, 12, 13, 14):
  - 9. The non-transitory computer readable medium of claim 8 wherein the score for an initial state is reset to a predetermined value on each frame if the score is less than the first threshold before calculating a current score for the initial state based on a received acoustic unit for a current frame.
  - 10. The non-transitory computer readable medium of claim 9 wherein the predetermined value is a constant.
  - 11. The non-transitory computer readable medium of claim 9 wherein different initial states are reset to different predetermined values.
  - 12. The non-transitory computer readable medium of claim 8 wherein the offset is a constant.
  - 13. The non-transitory computer readable medium of claim 8 wherein different states have different associated offsets.
  - 14. The non-transitory computer readable medium of claim 8 wherein the plurality of acoustic recognition processes operate on multiple words in parallel.

15. A computer system comprising:
- a processor; and
  
  a non-transitory computer readable medium having stored thereon instructions that, when executed by the processor, causes the processor to:
  
  receive an acoustic input signal; and
  
  process the acoustic input signal with a plurality of acoustic recognition processes to recognize a predetermined target sound within the acoustic input signal, the acoustic input signal being temporally divided into a plurality of frames, the plurality of acoustic recognition processes being implemented using a single Viterbi search of a plurality of states corresponding to acoustic units of an acoustic model of the predetermined target sound, the plurality of states including initial states and final states of the acoustic model, wherein

  the initial states are reset on each frame if a score for the initial state on a previous frame is below a first threshold,

  said score is calculated for a plurality of said states on each frame, the calculating comprising increasing the score by a per frame offset so that scores of different durations are comparable, and

  a result is generated when a score of a final state increases above a second threshold.
- Dependent claims (16, 17, 18, 19, 20, 21):
  - 16. The computer system of claim 15 wherein the score for an initial state is reset to a predetermined value on each frame if the score is less than the first threshold before calculating a current score for the initial state based on a received acoustic unit for a current frame.
  - 17. The computer system of claim 16 wherein the predetermined value is a constant.
  - 18. The computer system of claim 16 wherein different initial states are reset to different predetermined values.
  - 19. The computer system of claim 15 wherein the offset is a constant.
  - 20. The computer system of claim 15 wherein different states have different associated offsets.
  - 21. The computer system of claim 15 wherein the plurality of acoustic recognition processes operate on multiple words in parallel.
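
The per frame offset in the independent claims, and its constant and per-state variants in claims 5, 6, 12, 13, 19 and 20, can be illustrated with a little arithmetic. The example below is a sketch with made-up numbers: the record does not say how offsets are chosen, so setting each offset to the negative of a state's typical matching log-likelihood is purely an assumption, as are the names `path_score`, `fast` and `slow`.

```python
def path_score(frames, offsets):
    """Sum, over the frames of one path, the frame's log-likelihood plus
    the offset of the state occupied on that frame.

    frames  : list of (state_index, log_likelihood) pairs, one per frame
    offsets : offsets[s] is the per frame offset for state s
    """
    return sum(loglik + offsets[state] for state, loglik in frames)


# The same three-state sound spoken quickly (3 frames) and slowly (6 frames).
fast = [(0, -0.5), (1, -1.0), (2, -0.5)]
slow = [(0, -0.5)] * 2 + [(1, -1.0)] * 2 + [(2, -0.5)] * 2

no_offset = [0.0, 0.0, 0.0]
constant = [0.5, 0.5, 0.5]    # one offset shared by all states
per_state = [0.5, 1.0, 0.5]   # a different offset for each state

print(path_score(fast, no_offset), path_score(slow, no_offset))  # → -2.0 -4.0
print(path_score(fast, constant), path_score(slow, constant))    # → -0.5 -1.0
print(path_score(fast, per_state), path_score(slow, per_state))  # → 0.0 0.0
```

Without an offset, the slower utterance scores lower only because it spans more frames. A constant offset narrows that gap, and in this toy case per-state offsets remove it entirely, so a single second threshold can be applied to matches of either duration.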

Current Assignee: Sensory Incorporated
Original Assignee: Sensory Incorporated
Inventors: Vermeulen, Pieter J.; Shaw, Jonathan; Mozer, Todd F.
Primary Examiner(s): WOZNIAK, JAMES S
Application Number: US12/831,051
Publication Number: US 20110166855A1
Time in Patent Office: 1,379 Days
Field of Search: 704/231, 704/236, 704/242, 704/251, 704/254, 704/248
US Class Current: 704/242
CPC Class Codes:
G10L 15/22   Procedures used during a sp...
G10L 15/32   Multiple recognisers used i...
G10L 2015/081   Search algorithms, e.g. Bau...
G10L 2015/088   Word spotting
