Method For Processing Speech Data For A Distributed Recognition System

US 20080215327A1
Filed: 05/19/2008
Published: 09/04/2008
Est. Priority Date: 11/12/1999
Status: Active Grant

Abstract

Speech signal information is formatted, processed and transported in accordance with a format adapted for TCP/IP protocols used on the Internet and other communications networks. NULL characters are used to indicate the end of a voice segment. The method is useful for distributed speech recognition systems such as a client-server system, typically implemented on an intranet or over the Internet, in which a user submits queries through a speech input interface on a computer, a PDA, or a workstation.
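
The NULL-terminated framing described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `frame_segment` and `SegmentDecoder` are hypothetical, and Base64 encoding of the payload is an assumption made here so that a 0x00 byte can never occur inside a segment.

```python
import base64


def frame_segment(payload: bytes) -> bytes:
    """Encode one voice segment and terminate it with a single NULL byte."""
    # Assumption: Base64 output is printable ASCII, so 0x00 never appears in
    # the payload and stays unambiguous as the end-of-segment marker.
    return base64.b64encode(payload) + b"\x00"


class SegmentDecoder:
    """Reassemble segments from arbitrarily sized chunks of a byte stream."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> list[bytes]:
        """Add a received chunk and return every segment completed by it."""
        self._buffer.extend(chunk)
        segments = []
        while True:
            end = self._buffer.find(b"\x00")
            if end < 0:
                break  # no complete segment left in the buffer
            encoded = bytes(self._buffer[:end])
            del self._buffer[: end + 1]  # drop the segment and its NULL
            if encoded:  # ignore empty segments from back-to-back NULLs
                segments.append(base64.b64decode(encoded))
        return segments
```

Because TCP delivers a byte stream without message boundaries, the decoder buffers input and only emits a segment once its terminating NULL has arrived, whatever chunk sizes the network produces.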

20 Claims

1. A method of processing speech data from an utterance for a distributed speech query recognition system comprising the steps of:
- establishing a network connection between a server computing system and a client device suitable for transporting a streaming communication;
  
  receiving a continuous speech byte data stream containing speech data processed by a first component of the distributed speech query recognition system situated in the client device;
  
  wherein said speech data is characterized by a form and data content representing only a partial recognition of an utterance;
  
  further wherein said data stream includes NULL data used to identify a silence in speech data from said client device;
  
  further processing said speech data at a second component of the distributed speech query recognition system situated at said server computing system to generate additional speech related content and complete recognition of words in said speech data.
- Dependent claims (2 to 18):
  - 2. The method of claim 1, further including a step of processing said words using a natural language engine at said server computing system to determine a meaning of said utterance.
  - 3. The method of claim 1, wherein speech recognition tasks required by the distributed speech recognition system for recognizing words are allocated to said server computing device on a connection by connection basis.
  - 4. The method of claim 1, wherein speech recognition tasks used by the distributed speech recognition system for recognizing words are allocated to said client computing device on a connection by connection basis.
  - 5. The method of claim 1, further including a step:
    - transmitting a spoken answer or response in the form of answer speech data from said server computing system to said client device in response to a spoken query presented at said client device.
  - 6. The method of claim 1, wherein said speech data only includes NULL data during periods of silence.
  - 7. The method of claim 1, wherein said NULL data information is appended to an end of speech data in said speech byte stream.
  - 8. The method of claim 7, wherein said NULL data information is a single NULL character.
  - 9. The method of claim 1, wherein said speech processing is initiated by depressing a dedicated button on said client device.
  - 10. The method of claim 1, wherein said client computing device is used to formulate a speech based query to an Internet based search engine.
  - 11. The method of claim 1, wherein an amount of said speech data is configured in response to a real-time performance requirement set for the distributed speech recognition system during a speech utterance session.
  - 12. The method of claim 1, wherein said words in said speech data are recognized in real time.
  - 13. The method of claim 2, wherein said meaning of said words is recognized in real time.
  - 14. The method of claim 13, wherein said meaning is determined before a speech utterance representing said sentence is completed.
  - 15. The method of claim 12, further including performing a query operation based on said meaning to identify an answer to said words in real-time.
  - 16. The method of claim 13 wherein a confidence threshold can be specified for determining said meaning.
  - 17. The method of claim 12 further including a step:
    - specifying that a natural language engine should return multiple results for the speech data from different servers.
  - 18. The method of claim 1 further including a step:
    - calibrating speech and silence components of said speech data.
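
Claim 18 recites calibrating the speech and silence components of the speech data but does not say how. One common approach, shown here purely as an assumption, is to estimate a noise floor from frames known to contain no speech and to label later frames by comparing their energy against a multiple of that floor. The function names and the `margin` parameter are hypothetical.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of PCM samples."""
    return sum(s * s for s in frame) / len(frame)


def calibrate_threshold(noise_frames, margin=4.0):
    """Derive a speech/silence energy threshold from background-noise frames.

    `margin` is an illustrative safety factor above the measured noise floor.
    """
    floor = sum(frame_energy(f) for f in noise_frames) / len(noise_frames)
    return floor * margin


def label_frames(frames, threshold):
    """Label each frame as 'speech' or 'silence' using the threshold."""
    return [
        "speech" if frame_energy(f) > threshold else "silence"
        for f in frames
    ]
```

In a system like the one claimed, frames labeled as silence are the ones that could be replaced by NULL data in the outgoing stream.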

19. A method of processing speech data for a distributed speech query recognition system comprising the steps of:
- establishing a network connection between a server computing system and a client device suitable for transporting a streaming communication;
  
  receiving a data stream containing speech vector data from the client device, said speech vector data representing acoustic features of speech data and being characterized by a form and data content insufficient to recognize words;
  
  wherein said data stream includes NULL data information used to identify a silence in speech data from said client device;
  
  further processing said speech vector data at said server computing system to generate additional speech feature related content and identify words in said speech data.
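
To make the idea of "speech vector data" in claim 19 concrete, the following sketch computes a small feature vector per frame on the client side. The two features used (log energy and zero-crossing rate) are simple stand-ins chosen for this illustration, not the features specified by the patent, and the text encoding is an assumption that keeps 0x00 out of the payload. Vectors like these describe acoustic properties but are not enough on their own to recognize words, which is the server's job in the claim.

```python
import math


def split_frames(samples, frame_len, hop):
    """Cut a list of PCM samples into fixed-length, possibly overlapping frames."""
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def frame_features(frame):
    """Return (log_energy, zero_crossing_rate) for a frame of at least 2 samples."""
    energy = sum(s * s for s in frame) / len(frame)
    log_energy = math.log(energy + 1e-10)  # small offset avoids log(0)
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return (log_energy, crossings / (len(frame) - 1))


def encode_vectors(vectors):
    """Serialize vectors as ASCII text, one comma-separated vector per line."""
    lines = (",".join(f"{v:.4f}" for v in vec) for vec in vectors)
    return "\n".join(lines).encode("ascii")
```

A client would call `split_frames`, map `frame_features` over the result, and send the output of `encode_vectors` to the server for further processing.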

20. A method of processing speech data for a distributed speech query recognition system comprising the steps of:
- establishing a network connection suitable for transporting a streaming communication between a server computing system and a client device;
  
  configuring speech processing operations to be performed by said client device and server computing system respectively;
  
  wherein said speech processing operations are automatically configured based on computing capabilities of said client device and server computing system respectively, and such that said server computing system supports a number of client devices having different computing capabilities;
  
  receiving a data stream containing speech vector data from the client device, said speech vector data representing acoustic features of speech data and being characterized by a data content insufficient to recognize words;
  
  wherein said data stream includes at least some NULL data used to identify a silence in speech data from said client device;
  
  further processing said speech vector data at said server computing system to generate additional speech feature related content and identify words in said speech data.
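
Claim 20 describes automatically dividing speech processing between client and server according to their computing capabilities. A minimal sketch of such a division is given below. The stage names, the integer capability score, and the `plan_split` function are all hypothetical; the only constraints taken from the claim are that the client sends feature vectors and that word identification happens on the server.

```python
# Hypothetical processing stages, in pipeline order.
CLIENT_ELIGIBLE = ("feature_extraction", "silence_detection")
SERVER_ONLY = ("word_decoding", "natural_language")


def plan_split(client_capability: int) -> dict:
    """Assign stages to client and server from a capability score.

    `client_capability` is an assumed score: the number of client-eligible
    stages the device can run. It is clamped so that the client always
    extracts features and the server always identifies the words.
    """
    n = max(1, min(client_capability, len(CLIENT_ELIGIBLE)))
    return {
        "client": CLIENT_ELIGIBLE[:n],
        "server": CLIENT_ELIGIBLE[n:] + SERVER_ONLY,
    }
```

With this scheme, one server can serve both a weak device (`plan_split(1)`) and a stronger one (`plan_split(2)`) by picking up whichever stages the client leaves undone.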

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: Ian M. Bennett
Inventors: Bennett, Ian M.
Granted Patent: US 7,672,841 B2
US Class Current: 704/251
CPC Class Codes

G06F 16/243   Natural language query form...

G06F 16/24522   Translation of natural lang...

G06F 16/3344   using natural language anal...

G06F 40/216   using statistical methods

G06F 40/237   Lexical tools

G06F 40/30   Semantic analysis

G06F 40/42   Data-driven translation

G06F 40/44   Statistical methods, e.g. p...

G09B 5/04   with audible presentation o...

G09B 7/00   Electrically-operated teach...

G10L 15/142   Hidden Markov Models [HMMs]

G10L 15/18   using natural language mode...

G10L 15/1815   Semantic context, e.g. disa...

G10L 15/183   using context dependencies,...

G10L 15/22   Procedures used during a sp...

G10L 15/285   Memory allocation or algori...

G10L 15/30   Distributed recognition, e....

H04M 2250/74   with voice recognition mean...

Y10S 707/99933   Query processing, i.e. sear...

Y10S 707/99935   Query augmenting and refini...
