Speech to text for assisted form completion

US 7,907,705 B1
Filed: 10/10/2006
Issued: 03/15/2011
Est. Priority Date: 10/10/2006
Status: Active Grant

Abstract

A method for capturing information from a live conversation between an operator and a customer, involving monitoring the live conversation between the operator and the customer, recognizing at least one portion of the live conversation as a text portion upon converting the live conversation to text, interpreting a cue in the live conversation, relating the cue to an information field associated with a context for the live conversation, and storing information obtained from the text portion into the information field, wherein the information obtained from the text portion includes at least one word spoken after the cue.
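The cue-then-capture idea in the abstract can be illustrated with a minimal sketch. It assumes the conversation has already been converted to a text transcript and that the cue is a spoken phrase; the names `Form`, `capture_after_cue`, and `fill_field` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Form:
    """Hypothetical stand-in for the 'context': a form with named fields."""
    fields: dict = field(default_factory=dict)


def _tokens(text: str) -> list:
    # Lowercase, split on whitespace, and drop surrounding punctuation.
    stripped = (w.strip(".,;:!?") for w in text.lower().split())
    return [w for w in stripped if w]


def capture_after_cue(transcript: str, cue: str, max_words: int = 1) -> Optional[str]:
    """Return up to max_words words spoken right after the cue phrase, if any."""
    words = _tokens(transcript)
    cue_words = _tokens(cue)
    n = len(cue_words)
    for i in range(len(words) - n + 1):
        if words[i:i + n] == cue_words:
            following = words[i + n:i + n + max_words]
            return " ".join(following) if following else None
    return None


def fill_field(form: Form, field_name: str, transcript: str, cue: str,
               max_words: int = 1) -> bool:
    """Relate the cue to a field and store the captured words in it."""
    value = capture_after_cue(transcript, cue, max_words)
    if value is None:
        return False
    form.fields[field_name] = value
    return True
```

For example, `fill_field(form, "zip", "Okay, and your zip code is 94043, thanks.", "zip code is")` would store `94043` in the hypothetical `zip` field.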

78 Citations

25 Claims

1. A method for capturing information from a live conversation between an operator and a customer, comprising:
- designating a context for the live conversation between the operator and the customer wherein the context is a form viewed on a computer screen and actively in use by the operator;
  
  setting, by the operator during the live conversation, a visual cue by physically moving a cursor to an information field in the form;
  
  monitoring the live conversation in response to setting the visual cue, wherein the visual cue triggers the conversion of the live conversation to text;
  
  recognizing at least one portion of the live conversation as a text portion after converting the live conversation to text;
  
  interpreting one or more cues in the live conversation, wherein the one or more cues comprises at least the visual cue;
  
  relating the one or more cues to the information field associated with the context for the live conversation; and
  
  storing information obtained from the text portion of the live conversation into the information field, wherein the information obtained from the text portion comprises at least one word spoken after the one or more cues.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9):
  - 2. The method of claim 1, wherein the at least one word spoken after the visual cue is spoken by the customer.
  - 3. The method of claim 1, wherein the operator and the customer are individuals.
  - 4. The method of claim 1, wherein at least one of the one or more cues is a pre-defined verbal cue spoken by the operator.
  - 5. The method of claim 4, wherein at least one of the one or more cues is designated by a phrase preceding the information field in the context.
  - 6. The method of claim 1, wherein the operator confirms an accuracy of the information stored before the context is saved.
  - 7. The method of claim 6, wherein the operator corrects errors in the stored information while continuing the live conversation.
  - 8. The method of claim 1, wherein the information field is designated as a particular type of information, and wherein the particular type of information is at least one selected from a group consisting of a number, a date, a phrase, a formatted number, and a formatted phrase.
  - 9. The method of claim 1, wherein the live conversation occurs using at least one selected from a group consisting of an analog phone, a digital phone, and a computer.
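As a rough illustration of claims 1 and 8, the sketch below treats focusing a form field as the visual cue that starts capture, then coerces the recognized text to the field's declared type. Everything here is an assumption made for illustration: the `FormSession` class, the `coerce` helper, the type names, the 10-digit phone format, and the `MM/DD/YYYY` date format are hypothetical, and real speech recognition is replaced by a plain string argument.

```python
import re
from datetime import datetime


def coerce(text, field_type):
    """Convert recognized text to the field's type; return None if it does not fit."""
    digits = re.sub(r"\D", "", text)
    if field_type == "number":
        return int(digits) if digits else None
    if field_type == "formatted_number":
        # Assumed format: a 10-digit US phone number.
        if len(digits) != 10:
            return None
        return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"
    if field_type == "date":
        # Assumed format: MM/DD/YYYY, stored as ISO 8601.
        try:
            return datetime.strptime(text.strip(), "%m/%d/%Y").date().isoformat()
        except ValueError:
            return None
    if field_type == "phrase":
        return text.strip() or None
    raise ValueError(f"unknown field type: {field_type}")


class FormSession:
    """Hypothetical form context in which focusing a field is the visual cue."""

    def __init__(self, field_types):
        self.field_types = field_types  # e.g. {"phone": "formatted_number"}
        self.values = {}
        self.active_field = None        # None means no cue is currently set

    def focus(self, field_name):
        """Operator moves the cursor to a field; capture starts for that field."""
        if field_name not in self.field_types:
            raise KeyError(field_name)
        self.active_field = field_name

    def on_recognized_text(self, text):
        """Store text recognized after the cue; ignore speech when no cue is set."""
        if self.active_field is None:
            return False
        value = coerce(text, self.field_types[self.active_field])
        if value is None:
            return False                # keep listening for a usable value
        self.values[self.active_field] = value
        self.active_field = None        # wait for the next visual cue
        return True
```

In this sketch, speech that arrives before any field is focused is dropped, and text that does not fit the field's type leaves the field active so a later utterance can still fill it.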

10. A system for capturing information from a live conversation between an operator and a customer, comprising:
- a context designator executing on a processor and configured to:
  - designate a context for the live conversation between the operator and the customer wherein the context is a form viewed on a computer screen and actively in use by the operator;
  - obtain a visual cue setting from the operator during the live conversation, wherein the operator sets the visual cue by physically moving a cursor to an information field in the form;
  - monitor the live conversation in response to setting the visual cue, wherein the visual cue triggers the conversion of the live conversation to text;
- a speech recognition engine executing on the processor and configured to recognize the live conversation as a text portion after converting the live conversation to text; and
- a document completion engine executing on the processor and configured to:
  - interpret one or more cues in the live conversation, wherein the one or more cues comprises at least the visual cue;
  - relate the one or more cues to the information field associated with the context for the live conversation; and
  - store information obtained from the text portion into the information field within the context, wherein the information obtained from the text portion comprises at least one word spoken after the one or more cues.
- Dependent claims (11, 12, 13, 14, 15, 16, 17, 18):
  - 11. The system of claim 10, wherein the at least one word spoken after the visual cue is spoken by the customer.
  - 12. The system of claim 10, wherein the speech recognition engine is further configured to improve accuracy of recognizing the live conversation as the text portion based on corrections made by the operator.
  - 13. The system of claim 10, wherein the speech recognition engine recognizes the live conversation as the text portion using a restricted grammar.
  - 14. The system of claim 10, wherein the information field is designated as a particular type of information, and wherein the particular type of information is at least one selected from a group consisting of a number, a date, a phrase, a formatted number, and a formatted phrase.
  - 15. The system of claim 10, wherein the live conversation occurs using at least one selected from a group consisting of an analog phone, a digital phone, and a computer.
  - 16. The system of claim 10, wherein the operator manually corrects errors in the stored information while continuing the live conversation.
  - 17. The system of claim 16, wherein the speech recognition engine learns from the manually corrected errors.
  - 18. The system of claim 10, wherein the one or more cues comprises a pre-defined verbal phrase recognized by the speech recognition engine.
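Claims 12, 13, 16, and 17 mention a restricted grammar and learning from operator corrections. The sketch below is one very simplified way to picture both. It is not the patent's mechanism: a real speech engine would typically adapt its acoustic or language models, whereas this hypothetical `CorrectingRecognizer` only filters already-recognized tokens against an allowed vocabulary and remembers operator corrections in a lookup table.

```python
class CorrectingRecognizer:
    """Hypothetical post-processor applied to tokens from a speech recognizer."""

    def __init__(self, grammar):
        # Restricted grammar, modeled here as a set of allowed words.
        self.grammar = {w.lower() for w in grammar}
        # Operator corrections: misrecognized token -> intended token.
        self.corrections = {}

    def recognize(self, raw_tokens):
        """Apply learned corrections, then keep only tokens in the grammar."""
        accepted = []
        for tok in raw_tokens:
            tok = tok.lower()
            tok = self.corrections.get(tok, tok)
            if tok in self.grammar:
                accepted.append(tok)
        return accepted

    def learn(self, heard, corrected):
        """Record a manual correction so the same error is fixed next time."""
        corrected = corrected.lower()
        if corrected not in self.grammar:
            raise ValueError(f"{corrected!r} is not in the restricted grammar")
        self.corrections[heard.lower()] = corrected
```

With a grammar of `["checking", "savings"]`, a misheard token such as `chicken` is dropped at first; after `learn("chicken", "checking")` the same token is mapped to `checking`.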

19. A computer readable storage medium comprising instructions embodied thereon to perform a method for capturing information from a live conversation between an operator and a customer, comprising:
- designating a context for the live conversation between the operator and the customer wherein the context is a form viewed on a computer screen and actively in use by the operator;
  
  setting, by the operator during the live conversation, a visual cue by physically moving a cursor to an information field in the form;
  
  monitoring the live conversation in response to setting the visual cue, wherein the visual cue triggers the conversion of the live conversation to text;
  
  recognizing at least one portion of the live conversation as a text portion after converting the live conversation to text;
  
  interpreting one or more cues in the live conversation, wherein the one or more cues comprises at least the visual cue;
  
  relating the one or more cues to the information field associated with the context for the live conversation; and
  
  storing information obtained from the text portion of the live conversation into the information field, wherein the information obtained from the text portion comprises at least one word spoken after the one or more cues.
- Dependent claims (20, 21, 22):
  - 20. The computer readable storage medium of claim 19, wherein the at least one word spoken after the visual cue is spoken by the customer.
  - 21. The computer readable storage medium of claim 19, wherein the information field is contained within an external computer application.
  - 22. The computer readable storage medium of claim 19, wherein at least one of the one or more cues is a pre-defined verbal cue spoken by the operator.

23. A computer readable storage medium organized in a library comprising instructions embodied thereon to provide a method for capturing information from a live conversation between an operator and a customer, comprising:
- designating a context for the live conversation between the operator and the customer wherein the context is a form viewed on a computer screen and actively in use by the operator;
  
  setting, by the operator during the live conversation, a visual cue by physically moving a cursor to an information field in the form;
  
  monitoring the live conversation in response to setting the visual cue, wherein the visual cue triggers the conversion of the live conversation to text;
  
  recognizing at least one portion of the live conversation as a text portion after converting the live conversation to text;
  
  interpreting one or more cues in the live conversation, wherein the one or more cues comprises at least the visual cue;
  
  relating the one or more cues to the information field associated with the context for the live conversation; and
  
  storing information obtained from the text portion of the live conversation into the information field, wherein the information obtained from the text portion comprises at least one word spoken after the one or more cues.
- Dependent claim (24):
  - 24. The computer readable storage medium of claim 23, wherein the library is used to write a separate computer application for capturing customer information.

25. A computer system for capturing information from a live conversation between an operator and a customer, comprising:
- a processor;
- a memory;
- a storage device; and
- software instruction stored in the memory for enabling the computer system under control of the processor to:
  - designate a context for the live conversation between the operator and the customer wherein the context is a form viewed on a computer screen and actively in use by the operator;
  - set, by the operator during the live conversation, a visual cue by physically moving a cursor to an information field in the form;
  - monitor the live conversation in response to setting the visual cue, wherein the visual cue triggers the conversion of the live conversation to text;
  - recognize at least one portion of the live conversation as a text portion after converting the live conversation to text;
  - interpret one or more cues in the live conversation, wherein the one or more cues comprises at least the visual cue;
  - relate the one or more cues to the information field associated with the context for the live conversation; and
  - store information obtained from the text portion of the live conversation into the information field, wherein the information obtained from the text portion comprises at least one word spoken after the one or more cues.

Current Assignee
Intuit, Inc.
Original Assignee
Intuit, Inc.
Inventors
Huff, Gerald; Goldman, Roy
Primary Examiner(s)
Gauthier, Gerald

Application Number

US11/545,937
Time in Patent Office

1,617 Days
Field of Search

235/380, 379/52, 379/68, 379/88.14, 379/221.86, 379/265.09, 379/266.01, 434/238, 455/41.1, 704/200, 704/270.1, 704/235, 705/1, 370/352, 709/242
US Class Current

379/88.14
CPC Class Codes

H04L 12/66 Arrangements for connecting...
