Speech recognition and transcription among users having heterogeneous protocols

US 9,142,217 B2
Filed: 06/27/2013
Issued: 09/22/2015
Est. Priority Date: 11/27/2001
Status: Expired due to Term

Abstract

A system for facilitating free form dictation, including directed dictation and constrained recognition and/or structured transcription among users having heterogeneous protocols for generating, transcribing, and exchanging recognized and transcribed speech. The system includes a system transaction manager having a “system protocol,” to receive a speech information request from an authorized user. The speech information request is generated using a user interface capable of bi-directional communication with the system transaction manager and supporting dictation applications. A speech recognition and/or transcription engine (ASR), in communication with the system transaction manager, receives the speech information request, generates a transcribed response, and transmits the response to the system transaction manager. The system transaction manager routes the response to one or more of the users. In another embodiment, the system employs a virtual sound driver for streaming free form dictation to any ASR.

41 Claims

1. A system for facilitating the exchange of speech recognition and transcription among users, the system comprising:
- (a) at least one system transaction manager and at least one post processing manager, both using a uniform system protocol wherein the transaction manager is i) adapted to receive a speech information request from at least one user employing a first user legacy protocol and flag the information request as requiring post processing, and, ii) configured to route a response to the speech information request to one or more users employing a second user legacy protocol, the speech information request comprised of spoken text and commands, including spoken commands, wherein the response comprises at least a transcription of spoken text and the post processed information requested, and wherein the post processing manager is configured to i) receive structured transcription from a speech recognition and/or transcription engine, ii) operate upon the transcribed response, including spoken commands in accordance with the speech information request, and, iii) route the requested response to a post processing application if specified in the speech information request;
  
  (b) at least one application service adapter configured to provide bi-directional communication conversion between the first user legacy protocol and the uniform system protocol and between the second user legacy protocol and the uniform system protocol and capable of bi-directional communication with the system transaction manager; and
  
  (c) at least one speech recognition and/or transcription engine communicating with the system transaction manager, wherein the speech recognition and/or transcription engine is configured to receive the flagged speech information request containing spoken text and commands, including spoken commands, from the system transaction manager to generate a transcription in response to the speech information request and to route the response comprised of transcribed spoken text and commands, including transcribed spoken commands to the post processing manager.
- - 2. The system of claim 1 wherein said first user legacy protocol is the same as or different than the second user legacy protocol.
  - 3. The system of claim 1 wherein the speech information request is received by the transaction manager through an applications portal, wherein the request is flagged for post processing.
  - 4. The system of claim 1 wherein at least one embedded non-spoken command in the speech information request directs the post processing manager to flag the request for post processing.
  - 5. The system of claim 1 wherein at least one command in the speech information request directs the post processing manager to use a post processing application to generate content in response to a spoken command.
  - 6. The system of claim 1 wherein at least one command in the speech information request directs the speech recognition and/or transcription engine to load a specific vocabulary.
  - 7. The system of claim 1 wherein the post processing manager processes the flagged information request in real time or near real time.
  - 8. The system of claim 1 wherein the speech information request comprises previously transcribed formatted spoken text.
  - 9. The system of claim 1 wherein the at least one post processor i) operates on the transcribed spoken command, including executing the spoken commands to generate the information requested in the information request, and ii) routes the requested response to the transaction manager.
  - 10. The system of claim 1 wherein said at least one application service adapter comprises i) a first user application service adapter, the first user application service adapter communicating with at least one of the users that employ the first user legacy protocol and with the system transaction manager, and (ii) a second user application service adapter, the second user application service adapter communicating with the one or more users that employ the second user legacy protocol and with the system transaction manager.
  - 11. The system of claim 1 further comprising at least one speech recognition service adapter, wherein the speech recognition service adapter communicates with one or more speech recognition and/or transcription engines that employ at least one legacy protocol and with the system transaction manager, wherein the speech recognition and service adapter is configured to receive the flagged speech information request from the system transaction manager, route the request to a speech recognition and/or transcription engine, and to route the response generated by the speech recognition and/or transcription engine to the post processing manager.
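
The data flow recited in claim 1 can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and does not come from the patent: the pipe-delimited string standing in for the first user legacy protocol, the dictionary standing in for the second, the `UniformMessage` type, the `cmd:` token convention for spoken commands, and the stub engine that treats text as if it were audio.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class UniformMessage:
    """Hypothetical message in the uniform system protocol."""
    sender: str
    recipient: str
    payload: str            # text stands in for audio in this sketch
    flagged: bool = False   # True when post processing is required


def adapter_in(raw: str) -> UniformMessage:
    """Application service adapter: first legacy protocol -> uniform."""
    sender, recipient, payload = raw.split("|", 2)
    return UniformMessage(sender, recipient, payload)


def adapter_out(msg: UniformMessage) -> Dict[str, str]:
    """Application service adapter: uniform -> second legacy protocol."""
    return {"from": msg.sender, "to": msg.recipient, "text": msg.payload}


def stub_engine(msg: UniformMessage) -> Tuple[str, List[str]]:
    """Placeholder for an ASR engine: separates text from 'cmd:' tokens."""
    words = msg.payload.split()
    text = " ".join(w for w in words if not w.startswith("cmd:"))
    commands = [w[len("cmd:"):] for w in words if w.startswith("cmd:")]
    return text, commands


def post_processing_manager(text: str, commands: List[str]) -> str:
    """Operates on the transcription according to the transcribed commands."""
    for command in commands:
        if command == "uppercase":   # the only command this sketch knows
            text = text.upper()
    return text


def transaction_manager(raw: str) -> Dict[str, str]:
    """Receives a request, flags it, and routes the response to the recipient."""
    msg = adapter_in(raw)
    msg.flagged = "cmd:" in msg.payload
    text, commands = stub_engine(msg)
    msg.payload = post_processing_manager(text, commands) if msg.flagged else text
    return adapter_out(msg)
```

In this sketch the sender and the recipient never see each other's format; only the two adapter functions know the legacy protocols, which is the role the claim assigns to the application service adapter.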

12. A method of providing transcribed speech, including spoken text, embedded and spoken commands among users, the method comprising:
- (a) generating a speech information request wherein the speech information request is comprised of spoken text and commands, including spoken commands obtained using a first user legacy protocol;
  
  (b) routing the speech information request through a user application service adapter capable of bi-directional communication with a system transaction manager using a uniform system protocol, wherein the system transaction manager comprises at least one post processing manager;
  
  (c) flagging information requests requiring post processing;
  
  (d) generating a response to the speech information request using at least one speech recognition and/or transcription engine, the response comprised of a transcription of spoken text and spoken commands;
  
  (e) routing the requested response to the post processing manager, wherein the post processing manager is configured to receive the transcription from a speech recognition and/or transcription engine;
  
  (f) operating upon the transcribed speech, including spoken commands in accordance with the speech information request;
  
  (g) routing the requested response to a post processing application, if designated in the speech information request;
  
  (h) generating a post processed response wherein the response is comprised of transcribed spoken text and content in accordance with the request, including transcribed spoken commands; and
  
  (i) routing the post processed response to a user having a second legacy protocol through a user application service adapter capable of bi-directional communication with a system transaction manager.
- - 13. The method of claim 12 wherein the first user legacy protocol is the same as or different than the second user legacy protocol.
  - 14. The method of claim 12 wherein the speech information request is received by the transaction manager through an application portal, wherein the request is flagged for post processing.
  - 15. The method of claim 12 wherein the speech information request further comprises at least one embedded command specifying that the speech information request be flagged for post processing.
  - 16. The method of claim 12 wherein at least one command in the speech information request directs the post processing manager to use a post processing application to generate content.
  - 17. The method of claim 12 wherein at least one command in the speech information request directs the speech recognition and/or transcription engine to load a specific vocabulary to constrain recognition.
  - 18. The method of claim 12 wherein the post processing manager processes the flagged information request in real time or near real time.
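
Claim 17 recites a command that directs the engine to load a specific vocabulary to constrain recognition. The patent text here does not say how the constraint is applied, so the following is only one possible reading, sketched under stated assumptions: the `VOCABULARIES` table, the `constrain` function, and the 0.7 similarity cutoff are all hypothetical, and fuzzy string matching from Python's standard `difflib` module is swapped in for whatever a real engine would do at the acoustic or language-model level.

```python
import difflib
from typing import Dict, List

# Hypothetical named vocabularies that a request could ask the engine to load.
VOCABULARIES: Dict[str, List[str]] = {
    "cardiology": ["stent", "angioplasty", "arrhythmia"],
}


def constrain(hypothesis: List[str], vocabulary_name: str,
              cutoff: float = 0.7) -> List[str]:
    """Snap each recognized word to a close vocabulary term, if one exists."""
    vocabulary = VOCABULARIES[vocabulary_name]
    result = []
    for word in hypothesis:
        # get_close_matches returns the best matches at or above the cutoff.
        match = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        result.append(match[0] if match else word)
    return result
```

With the cardiology vocabulary loaded, a misrecognized "stint" is replaced by "stent", while words with no close vocabulary term pass through unchanged.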

19. A system for facilitating the exchange of streamed speech recognition and transcription among users, the system comprising:
- (a) at least one system transaction manager using a uniform system protocol, including at least one post processing manager, wherein the transaction manager is i) adapted to receive a streamed speech information request from at least one user employing a first user legacy protocol and flag the information request as requiring post processing, and, ii) configured to route a requested response to a speech information request to one or more users employing a second user legacy protocol, the speech information request comprised of spoken text and commands, including spoken commands, wherein the requested response comprises a transcription of spoken text and the post processed information requested, and wherein the post processing manager is configured to i) receive structured transcription from a speech recognition and/or transcription engine, ii) operate upon the transcribed speech, including spoken commands in accordance with the speech information request, and, iii) route the requested response to a post processing application, if designated in the speech information request;
  
  (b) at least one application service adapter configured to provide bi-directional communication between the first user legacy protocol and the uniform system protocol, and between the second user legacy protocol and the uniform system protocol, and capable of bi-directional communication with the system transaction manager; and
  
  (c) at least one speech recognition and/or transcription engine communicating with the system transaction manager, wherein the speech recognition and/or transcription engine is configured to receive the flagged streamed speech information request containing spoken text and commands, including spoken commands, from the system transaction manager, to generate a transcription in response to the speech information request and to route the response comprised of transcribed spoken text and transcribed spoken commands to the post processing manager.
- - 20. The system of claim 19 wherein said first user legacy protocol is the same as or different than the second user legacy protocol.
  - 21. The system of claim 19 wherein the speech information request is received by the transaction manager through a post processing applications portal, wherein the request is flagged for post processing.
  - 22. The system of claim 19 wherein at least one embedded command in the speech information request directs the post processing manager to flag the request for post processing.
  - 23. The system of claim 19 wherein at least one command in the speech information request directs the post processing manager to use a post processing application to generate content in response to the command.
  - 24. The system of claim 19 wherein at least one command in the speech information request directs the speech recognition and/or transcription engine to load a specific vocabulary.
  - 25. The system of claim 19 wherein the post processing manager processes the flagged information request in real time or near real time.
  - 26. The system of claim 19 wherein the speech information request comprises previously transcribed formatted spoken text.
  - 27. The system of claim 19 wherein the at least one post processor i) operates on the transcribed spoken command, including executing the spoken commands to generate the information requested in the information request, and ii) routes the requested response to the transaction manager.
  - 28. The system of claim 19 wherein said at least one application service adapter comprises i) a first user application service adapter, the first user application service adapter communicating with at least one of the users that employ the first user legacy protocol and with the system transaction manager, and (ii) a second user application service adapter, the second user application service adapter communicating with one or more users that employ the second user legacy protocol and with the system transaction manager.
  - 29. The system of claim 19 further comprising at least one speech recognition service adapter, wherein the speech recognition service adapter communicates with one or more speech recognition and/or transcription engines that employ at least one user legacy protocol and with the system transaction manager, wherein the speech recognition and service adapter is configured to receive the flagged speech information request from the system transaction manager, route the request to a speech recognition and/or transcription engine, and to route the response generated by the speech recognition and/or transcription engine to the post processing manager.

30. A system for facilitating streamed speech recognition and/or structured transcription among users having heterogeneous system protocols, the system comprising:
- (a) at least one system transaction manager and at least one post processing manager, both using a uniform system protocol, wherein the transaction manager is i) adapted to receive a streamed speech information request from at least one user employing a first user legacy protocol, and flag the information request requiring post processing, and, ii) configured to route a requested response to one or more users employing a second user legacy protocol, the speech information request comprised of free form dictation of speech, including spoken text and commands, including spoken commands, wherein the requested response comprises a transcription of spoken text and the post processed information requested, and, wherein the post processing manager is configured to i) receive structured transcription from a speech recognition and/or transcription engine, ii) operate upon the transcribed speech, including spoken commands in accordance with the speech information request, and, iii) route the requested response to a post processing application, if requested in the speech information request;
  
  (b) a user interface, including an application service adapter configured to provide bi-directional conversion between the first user legacy protocol and the uniform system protocol and between the second user legacy protocol and the uniform system protocol, and, capable of bi-directional communication with the system transaction manager and supporting dictation applications, including prompts to direct user dictation in response to user system protocol commands and system transaction manager commands, the user interface being in bi-directional communication with the system transaction manager; and
  
  (c) at least one speech recognition and/or transcription engine for constrained speech recognition, communicating with the system transaction manager, wherein the speech recognition and/or transcription engine is configured to receive the flagged speech information request containing spoken text and commands, including spoken commands from the system transaction manager, to generate a structured transcription in response to the speech information request and to route the response comprised of structured transcribed spoken text and transcribed spoken commands to the post processing manager.
- - 31. The system of claim 30 wherein said first user legacy protocol is the same as or different than the second user legacy protocol.
  - 32. The system of claim 30 wherein the requested response comprises the transcription of the structured spoken text and the transcribed spoken commands.
  - 33. The system of claim 30 wherein the streamed speech information request is received by the transaction manager through an application portal wherein the request is flagged for post processing.
  - 34. The system of claim 30 wherein the streamed speech information request further comprises an embedded command specifying that the streamed speech information request be flagged for post processing.
  - 35. The system of claim 30 wherein at least one embedded command in the streamed speech information request directs the post processing manager to use a post processing application to generate content in response to a spoken command.
  - 36. The system of claim 30 wherein at least one embedded command in the streamed speech information request directs the speech recognition and/or transcription engine to load a specific vocabulary to constrain recognition.
  - 37. The system of claim 30 wherein the post processing manager processes the flagged information request in real time or near real time.
  - 38. The system of claim 30 further comprising a cache in communication with the system transaction manager for i) receiving selected portions of the speech information request and retaining such portions in response to embedded and/or system transaction manager commands, and, ii) transmitting the retained portions to a system component, including the speech recognition and/or transcription engine on command.
  - 39. The system of claim 30 further comprising an audio pre-process service adaptor in communication with the system transaction manager and the speech recognition and/or transcription engine for streaming the audio component of a free formed dictation speech information request to the speech recognition and/or transcription engine, wherein the audio pre-process service adapter outputs the speech information request in a data format which is compatible with the speech recognition and/or transcription engine input format.
  - 40. The system of claim 30, wherein said at least one application service adapter comprises i) a first user application service adapter, the first user application service adapter communicating with the at least one of the users that employ the first user legacy protocol and with the system transaction manager, and ii) a second user application service adapter, the second user application service adapter communicating with the one or more users that employ the second user legacy protocol and with the system transaction manager.
  - 41. The system of claim 30 further comprising at least one speech recognition service adapter, wherein the speech recognition service adapter communicates with one or more speech recognition and/or transcription engines that employ at least one user legacy protocol and with the system transaction manager, wherein the speech recognition and service adapter is configured to receive the speech information request from the system transaction manager, route the request to a speech recognition and/or transcription engine, and to route the response generated by the speech recognition and/or transcription engine to the post processing manager.
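
Claim 39 describes an audio pre-process service adapter that streams the audio of a dictation request to the engine in a data format compatible with the engine's input. As a rough sketch only, assume the incoming stream is 16-bit little-endian PCM and the engine expects floating-point samples in the range [-1.0, 1.0); neither format is specified in the claims, and the `adapt_stream` name is hypothetical.

```python
import struct
from typing import Iterable, Iterator, List


def adapt_stream(chunks: Iterable[bytes]) -> Iterator[List[float]]:
    """Convert streamed PCM16 chunks to float samples, one chunk at a time.

    Assumes every chunk has an even number of bytes; struct.iter_unpack
    raises struct.error otherwise.
    """
    for chunk in chunks:
        # "<h" reads one signed 16-bit little-endian integer per sample.
        yield [sample / 32768.0 for (sample,) in struct.iter_unpack("<h", chunk)]
```

Because the function is a generator, each converted chunk can be handed to the engine as soon as it arrives rather than after the whole dictation has been received.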

Current Assignee: Advanced Voice Recognition Systems, Inc.
Original Assignee: Advanced Voice Recognition Systems, Inc.
Inventors: Miglietta, Joseph H.; Davis, Michael K.
Primary Examiner(s): Chawan, Vijay B
Application Number: US13/928,383
Publication Number: US 20130339016A1
Time in Patent Office: 817 Days
Field of Search: 704/235, 704/270, 704/270.1, 704/9, 704/243, 704/254, 704/10, 704/272, 704/252, 704/257, 709/228, 709/230, 719/328
US Class Current: 1/1
CPC Class Codes:
G06F 16/685   using automatically derived...
G06F 40/40   Processing or translation o...
G10L 15/20   Speech recognition techniqu...
G10L 15/26   Speech to text systems G10L...
G10L 15/30   Distributed recognition, e....
H04M 3/4938   comprising a voice browser ...
