Methods and apparatus for proofing of a text input

US 9,236,045 B2
Filed: 05/23/2012
Issued: 01/12/2016
Est. Priority Date: 05/23/2011
Status: Active Grant

Abstract

Techniques for presenting data input as a plurality of data chunks including a first data chunk and a second data chunk. The techniques include converting the plurality of data chunks to a textual representation comprising a plurality of text chunks including a first text chunk corresponding to the first data chunk and a second text chunk corresponding to the second data chunk, respectively, and providing a presentation of at least part of the textual representation such that the first text chunk is presented differently than the second text chunk to, when presented, assist a user in proofing the textual representation.
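The presentation step described in the abstract can be illustrated with a minimal sketch. It assumes a recognizer has already produced one text string per user turn. The names `TextChunk`, `Transcript`, `add_chunk`, and `render`, and the use of square brackets as the distinguishing mark, are illustrative assumptions and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TextChunk:
    """Text recognized from one user turn (hypothetical structure)."""
    text: str


@dataclass
class Transcript:
    """Ordered text chunks plus the index of the active chunk."""
    chunks: List[TextChunk] = field(default_factory=list)
    active: Optional[int] = None

    def add_chunk(self, text: str) -> None:
        # The most recently dictated chunk automatically becomes active.
        self.chunks.append(TextChunk(text))
        self.active = len(self.chunks) - 1

    def render(self) -> str:
        # Present the active chunk differently (here: in square brackets).
        parts = []
        for i, chunk in enumerate(self.chunks):
            parts.append(f"[{chunk.text}]" if i == self.active else chunk.text)
        return " ".join(parts)


transcript = Transcript()
for recognized in ["please call me", "when you get home", "thanks"]:
    transcript.add_chunk(recognized)
    print(transcript.render())
```

Each pass through the loop stands in for one user turn: the newest chunk is bracketed and the earlier chunks are shown plainly. A real interface might use color, font weight, or audio cues instead of brackets.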

27 Claims

1. A method for assisting a user verify accuracy of and/or correct text obtained by performing automatic speech recognition on speech input by the user, the method comprising:
- using at least one computer hardware processor to perform:
  
  receiving speech input by the user over a course of multiple user turns as a plurality of speech chunks, each of the plurality of speech chunks comprising speech spoken by the user during a respective single user turn, the plurality of speech chunks including a first speech chunk comprising data corresponding to at least two words spoken by the user;
  
  converting, by performing automatic speech recognition, the plurality of speech chunks to a textual representation comprising a plurality of text chunks, each of the plurality of speech chunks corresponding to a respective one of the plurality of text chunks, the plurality of text chunks comprising a first text chunk corresponding to the first speech chunk and comprising at least two recognized words corresponding to the at least two words; and
  
  for each text chunk in the plurality of text chunks:
  
  automatically designating the text chunk of the plurality of text chunks as an active text chunk, whenever the text chunk corresponds to a last speech chunk input by the user; and
  
  providing a visual presentation of the active text chunk and at least one other text chunk in the plurality of text chunks such that the active text chunk is visually presented differently than the at least one other text chunk to assist the user, when presented, in proofing the textual representation.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9):
  - 2. The method of claim 1, further comprising:
    - designating another of the plurality of text chunks as the active text chunk in response to user input indicating that the user would like to select a different one of the plurality of text chunks to be the active text chunk; and
      
      modifying the visual presentation to highlight the newly designated active text chunk.
  - 3. The method of claim 1, further comprising deleting at least a portion of the active text chunk from the textual representation in response to receiving user input to delete the at least a portion of the active text chunk.
  - 4. The method of claim 1, further comprising replacing at least a portion of the active text chunk with different text converted from further speech input from the user in response to receiving user input to replace the at least a portion of the active text chunk.
  - 5. The method of claim 1, wherein the visual presentation includes a visual presentation of each of the plurality of text chunks.
  - 6. The method of claim 5, further comprising visually rendering the visual presentation to the user via a display.
  - 7. The method of claim 1, wherein the textual representation is formed, at least in part, of a plurality of words, the method further comprising:
    - designating one of the plurality of words as an active word in response to a user selecting a word mode;
      
      designating another of the plurality of words as the active word in response to user input indicating that the user would like to select a different one of the plurality of words to be the active word; and
      
      modifying the visual presentation to highlight the newly designated active word.
  - 8. The method of claim 1, wherein the textual representation is formed, at least in part, of a plurality of characters, the method further comprising:
    - designating one of the plurality of characters as an active character in response to a user selecting a character mode;
      
      designating another of the plurality of characters as the active character in response to user input indicating that the user would like to select a different one of the plurality of characters to be the active character; and
      
      modifying the visual presentation to highlight the newly designated active character.
  - 9. The method of claim 1, wherein the active text chunk comprises at least two words.
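The editing operations of claims 2 to 4 (selecting a different active chunk, deleting, and replacing) might be sketched as follows. This is only an illustration under stated assumptions: `ChunkEditor` and its methods are hypothetical names, chunks are plain strings, and the sketch deletes or replaces the whole active chunk, whereas the claims also cover acting on only a portion of it.

```python
class ChunkEditor:
    """Hypothetical editor over a list of recognized text chunks."""

    def __init__(self, chunks):
        self.chunks = list(chunks)
        # The last dictated chunk starts out active (claim 1).
        self.active = len(self.chunks) - 1 if self.chunks else None

    def move(self, step):
        # Claim 2: user input selects a different active chunk.
        if self.active is not None:
            last = len(self.chunks) - 1
            self.active = max(0, min(last, self.active + step))

    def delete_active(self):
        # Claim 3: remove the active chunk from the textual representation.
        if self.active is None:
            return
        del self.chunks[self.active]
        if not self.chunks:
            self.active = None
        else:
            self.active = min(self.active, len(self.chunks) - 1)

    def replace_active(self, new_text):
        # Claim 4: replace the active chunk with newly recognized text.
        if self.active is not None:
            self.chunks[self.active] = new_text

    def render(self):
        # Highlight the active chunk with brackets (an arbitrary choice).
        return " ".join(
            f"[{c}]" if i == self.active else c
            for i, c in enumerate(self.chunks)
        )


editor = ChunkEditor(["meet me at", "the train station", "at noon"])
editor.move(-1)                        # select the previous chunk
editor.replace_active("the airport")   # text from a new dictation turn
print(editor.render())
editor.delete_active()
print(editor.render())
```

Re-rendering after every operation corresponds to "modifying the visual presentation to highlight the newly designated active text chunk".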

10. A system for assisting a user verify accuracy of and/or correct text obtained by performing automatic speech recognition on speech input by the user, the system comprising:
- at least one computer hardware processor configured to perform:
  
  receiving speech input by the user over a course of multiple user turns as a plurality of speech chunks, each of the plurality of speech chunks comprising speech spoken by the user during a respective single user turn, the plurality of speech chunks including a first speech chunk comprising data corresponding to at least two words spoken by the user;
  
  converting, by performing automatic speech recognition, the plurality of speech chunks to a textual representation comprising a plurality of text chunks, each of the plurality of speech chunks corresponding to a respective one of the plurality of text chunks, the plurality of text chunks comprising a first text chunk corresponding to the first speech chunk and comprising at least two recognized words corresponding to the at least two words; and
  
  for each text chunk in the plurality of text chunks:
  
  automatically designating the text chunk of the plurality of text chunks as an active text chunk, whenever the text chunk corresponds to a last speech chunk input by the user; and
  
  providing a visual presentation of the active text chunk and at least one other text chunk in the plurality of text chunks such that the active text chunk is visually presented differently than the at least one other text chunk to assist the user, when presented, in proofing the textual representation.
- Dependent claims (11, 12, 13, 14, 15, 16, 17, 18):
  - 11. The system of claim 10, wherein the at least one computer hardware processor is configured to designate another of the plurality of text chunks as the active text chunk in response to user input indicating that the user would like to select a different one of the plurality of text chunks to be the active text chunk, and modifying the visual presentation to highlight the newly designated active text chunk.
  - 12. The system of claim 10, wherein the at least one computer hardware processor is configured to remove at least a portion of the active text chunk from the textual representation in response to receiving an indication from the user to delete the at least a portion of the active text chunk.
  - 13. The system of claim 10, wherein the at least one computer hardware processor is configured to replace at least a portion of the active text chunk in response to receiving user input to replace at least a portion of the active text chunk with different text converted from further data input from the user.
  - 14. The system of claim 10, wherein the at least one computer hardware processor is configured to generate a visual presentation of each of the plurality of text chunks.
  - 15. The system of claim 14, further comprising at least one display coupled to the at least one computer hardware processor to display the visual presentation to the user.
  - 16. The system of claim 10, wherein the textual representation is formed, at least in part, of a plurality of words, and wherein the at least one hardware processor is configured to designate one of the plurality of words as an active word in response to a user selecting a word mode, designate another of the plurality of words as the active word in response to user input indicating that the user would like to select a different one of the plurality of words to be the active word, and modify the visual presentation to highlight the newly designated active word.
  - 17. The system of claim 10, wherein the textual representation is formed, at least in part, of a plurality of characters, and wherein the at least one hardware processor is configured to designate one of the plurality of characters as an active character in response to a user selecting a character mode, designate another of the plurality of characters as the active character in response to user input indicating that the user would like to select a different one of the plurality of characters to be the active character, and modify the visual presentation to highlight the newly designated active character to the user.
  - 18. The system of claim 10, wherein the active text chunk comprises at least two words.

19. At least one non-transitory computer readable medium storing instructions that, when executed on at least one computer, cause the at least one computer to perform a method for assisting a user verify accuracy of and/or correct text obtained by performing automatic speech recognition on speech input by the user, the method comprising:
- receiving speech input by the user over a course of multiple user turns as a plurality of speech chunks, each of the plurality of speech chunks comprising speech spoken by the user during a respective single user turn, the plurality of speech chunks including a first speech chunk comprising data corresponding to at least two words spoken by the user;
  
  converting, by performing automatic speech recognition, the plurality of speech chunks to a textual representation comprising a plurality of text chunks, each of the plurality of speech chunks corresponding to a respective one of the plurality of text chunks, the plurality of text chunks comprising a first text chunk corresponding to the first speech chunk and comprising at least two recognized words corresponding to the at least two words; and
  
  for each text chunk in the plurality of text chunks:
  
  automatically designating the text chunk of the plurality of text chunks as an active text chunk, whenever the text chunk corresponds to a last speech chunk input by the user; and
  
  providing a visual presentation of the active text chunk and at least one other text chunk in the plurality of text chunks such that the active text chunk is visually presented differently than the at least one other text chunk to assist the user, when presented, in proofing the textual representation.
- Dependent claims (20, 21, 22, 23, 24, 25, 26, 27):
  - 20. The at least one non-transitory computer readable medium of claim 19, the method further comprising:
    - designating another of the plurality of text chunks as the active text chunk in response to user input indicating that the user would like to select a different one of the plurality of text chunks to be the active text chunk; and
      
      modifying the visual presentation to highlight the newly designated active text chunk.
  - 21. The at least one non-transitory computer readable medium of claim 19, the method further comprising deleting at least a portion of the active text chunk from the textual representation in response to receiving user input to delete the at least a portion of the active text chunk.
  - 22. The at least one non-transitory computer readable medium of claim 19, the method further comprising replacing at least a portion of the active text chunk with different text converted from further speech input from the user in response to receiving user input to replace at least a portion of the active text chunk.
  - 23. The at least one non-transitory computer readable medium of claim 19, wherein the visual presentation includes a visual presentation of each of the plurality of text chunks.
  - 24. The at least one non-transitory computer readable medium of claim 23, further comprising visually rendering the visual presentation to the user via a display.
  - 25. The at least one non-transitory computer readable medium of claim 19, wherein the textual representation is formed, at least in part, of a plurality of words, the method further comprising:
    - designating one of the plurality of words as an active word in response to a user selecting a word mode;
      
      designating another of the plurality of words as the active word in response to user input indicating that the user would like to select a different one of the plurality of words to be the active word; and
      
      modifying the visual presentation to highlight the newly designated active word.
  - 26. The at least one non-transitory computer readable medium of claim 19, wherein the textual representation is formed, at least in part, of a plurality of characters, the method further comprising:
    - designating one of the plurality of characters as an active character in response to a user selecting a character mode;
      
      designating another of the plurality of characters as the active character in response to user input indicating that the user would like to select a different one of the plurality of characters to be the active character; and
      
      modifying the visual presentation to highlight the newly designated active character.
  - 27. The at least one non-transitory computer readable medium of claim 19, wherein the active text chunk comprises at least two words.
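The word mode and character mode recited in the dependent claims can be pictured as changing the unit that is highlighted. The sketch below is an assumption-laden illustration, not the patented implementation: the function names `units` and `render`, the mode strings, and the bracket highlighting are all hypothetical, and words are found by simple whitespace splitting.

```python
def units(chunks, mode):
    """Split the textual representation into selectable units."""
    text = " ".join(chunks)
    if mode == "chunk":
        return list(chunks)
    if mode == "word":
        return text.split()
    if mode == "character":
        return list(text)
    raise ValueError(f"unknown mode: {mode}")


def render(chunks, mode, active):
    """Return the text with the active unit wrapped in brackets."""
    items = units(chunks, mode)
    marked = [f"[{u}]" if i == active else u for i, u in enumerate(items)]
    # Characters already include the spaces, so join them without a separator.
    return ("" if mode == "character" else " ").join(marked)


chunks = ["send the report", "by friday"]
print(render(chunks, "chunk", 1))      # send the report [by friday]
print(render(chunks, "word", 2))       # send the [report] by friday
print(render(chunks, "character", 0))  # [s]end the report by friday
```

Moving the active index up or down within the chosen mode would correspond to designating another word or character as active.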

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Inventors: Labsky, Martin; Kleindienst, Jan; Macek, Tomas; Nahamoo, David; Curin, Jan; Koenig, Lars; Quast, Holger
Primary Examiner(s): Jackson, Jakieda
Application Number: US13/478,930
Publication Number: US 20120310643A1
Time in Patent Office: 1,329 Days
Field of Search: 704/231, 704/235, 704/246, 704/251, 704/260, 379/88.01
US Class Current: 1/1
CPC Class Codes:
G06F 40/10   Text processing natural lan...
G06F 40/30   Semantic analysis
G10L 13/00   Speech synthesis; Text to s...
G10L 13/08   Text analysis or generation...
G10L 15/01   Assessment or evaluation of...
G10L 15/02   Feature extraction for spee...
G10L 15/06   Creation of reference templ...
G10L 15/14   using statistical models, e...
G10L 15/1822   Parsing for meaning underst...
G10L 15/26   Speech to text systems G10L...
G10L 15/28   Constructional details of s...
G10L 15/30   Distributed recognition, e....
G10L 15/32   Multiple recognisers used i...
G10L 17/00   Speaker identification or v...
G10L 21/06   Transformation of speech in...
