Computer implemented methods and apparatus for selectively interacting with a server to build a local database for speech recognition at a device

US 9,715,879 B2
Filed: 07/02/2013
Issued: 07/25/2017
Est. Priority Date: 07/02/2012
Status: Active Grant

Abstract

Disclosed are methods, apparatus, systems, and computer-readable storage media for selectively interacting with a server to build a local dictation database for speech recognition at a device. In some implementations, a computing device receives an audio sample. The computing device may determine that the received audio sample does not match any of one or more existing audio samples stored in the local dictation database of the computing device. The received audio sample may be transmitted to a remote server for detection of one or more words indicated by the received audio sample. The computing device may receive data identifying the one or more words, and update the local dictation database to store the received audio sample in association with the one or more words.
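
The flow described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: audio samples are assumed to be reduced to fixed-length feature vectors, matching is assumed to use a Euclidean distance threshold, and `LocalDictationDatabase`, `recognize`, `ask_server`, and `max_distance` are hypothetical names that do not appear in the patent.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple

Features = Tuple[float, ...]


@dataclass
class LocalDictationDatabase:
    """Hypothetical in-memory stand-in for the local dictation database."""

    # Each entry pairs a stored audio sample (as features) with its words.
    entries: List[Tuple[Features, str]] = field(default_factory=list)

    def match(self, sample: Sequence[float], max_distance: float) -> Optional[str]:
        """Return the words of the closest stored sample, or None if none is close enough."""
        best_words: Optional[str] = None
        best_distance = max_distance
        for stored, words in self.entries:
            if len(stored) != len(sample):
                continue  # samples of different length are not comparable here
            distance = sum((a - b) ** 2 for a, b in zip(stored, sample)) ** 0.5
            if distance <= best_distance:
                best_words, best_distance = words, distance
        return best_words

    def store(self, sample: Sequence[float], words: str) -> None:
        """Store the received sample in association with its words."""
        self.entries.append((tuple(sample), words))


def recognize(
    sample: Sequence[float],
    db: LocalDictationDatabase,
    ask_server: Callable[[Sequence[float]], str],
    max_distance: float = 0.5,  # assumed threshold, chosen for illustration only
) -> str:
    """Try the local database first; contact the server only on a miss."""
    words = db.match(sample, max_distance)
    if words is not None:
        return words
    words = ask_server(sample)  # remote detection of the words
    db.store(sample, words)     # the next similar sample is resolved locally
    return words
```

Under these assumptions, the server is contacted once per unfamiliar sample, and later samples that fall within the threshold are answered from the local database.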

16 Claims

1. A method for selectively interacting with a server to build a local database for speech recognition at a computing device, the method comprising:
- maintaining, at a computing device associated with a user, a local database comprising a plurality of audio samples, each audio sample being identified in association with:
  - any one of a plurality of voice command text files, each voice command text file being configured to store a text string of the audio sample, the text string being a transcription of a vocalization in the audio sample,
  - any one of a plurality of different applications, the plurality of different applications being executable at the computing device, and
  - any one of a plurality of different application command files, each application command file corresponding to a single respective application and associating at least one executable action to be performed within the application with at least one voice command text file;
- receiving, at the computing device, a first audio command;
- determining, using a local speech recognition algorithm at the computing device, that the first audio command does not match any of the plurality of audio samples of the local database within a margin of error;
- transmitting, responsive to the determining step, the first audio command from the computing device to a remote server for detection of one or more voice command text files associated with the first audio command;
- receiving, at the computing device from the remote server, the one or more detected voice command text files associated with the first audio command;
- identifying an application at the computing device in relation to the one or more detected voice command text files, the identifying comprising analyzing a plurality of application command files at the computing device to locate an application command file matching the detected voice command text to the application; and
- updating, at the computing device, the local database to:
  - include the first audio command in the plurality of audio samples of the local database,
  - associate the first audio command with the identified application, and
  - associate the first audio command with an application command file corresponding to the identified application.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
  - 2. The method of claim 1, the method further comprising:
    - determining that the one or more detected voice command text files are associated with a first executable action of the computing device.
  - 3. The method of claim 2, wherein one or more executable actions of the computing device are defined by and executable by the identified application.
  - 4. The method of claim 2, wherein one or more executable actions of the computing device are defined by an operating system of the computing device, and wherein the executable actions are executable by one or more applications installed on the computing device.
  - 5. The method of claim 2, the method further comprising:
    - while the computing device is not in communication with the remote server, receiving, at the computing device, a second audio command;
    - determining that the second audio command matches one of the plurality of audio samples of the local database;
    - identifying one or more voice command text files associated with the second audio command, the one or more identified voice command text files being stored in the local database;
    - identifying a second executable action associated with the one or more identified voice command text files; and
    - executing, at the computing device, the second executable action.
  - 6. The method of claim 2, wherein a plurality of executable actions are stored on the computing device, each executable action including an action text describing the executable action, and wherein the identifying an application at the computing device in relation to the one or more detected voice command text files comprises:
    - identifying a group of executable actions that a currently open application of the computing device is capable of executing; and
    - determining that the action text of a first executable action matches the one or more detected voice command text files.
  - 7. The method of claim 1, the method further comprising:
    - determining that the one or more detected voice command text files are not associated with an executable action of the computing device; and
    - displaying, in a user interface of a display at the computing device, an indication that the first audio command was not recognized.
  - 8. The method of claim 1, the method further comprising:
    - initializing the local database of the computing device.
  - 9. The method of claim 1, wherein each of the plurality of different application command files includes one or more accessibility labels for providing accessibility services to the visually impaired.
  - 10. The method of claim 1, the method further comprising:
    - determining that the one or more detected voice command text files is stored in the local database in association with a second audio command, such that the first audio command and the second audio command are associated with the one or more detected voice command text files.
  - 11. The method of claim 1, wherein the first audio command includes an audio recording of a voice command from the user.
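
The identifying and updating steps of claim 1 can be illustrated with a short Python sketch. It is only one possible reading, under stated assumptions: application command files are modeled as in-memory mappings rather than files, the match between detected text and command text is assumed to be a case-insensitive exact comparison, and every name here (`ApplicationCommandFile`, `AudioSampleRecord`, `identify_application`, `update_local_database`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Tuple


@dataclass
class ApplicationCommandFile:
    """Hypothetical model: one file per application, mapping command text to an action."""

    application: str
    actions: Dict[str, str]  # voice command text -> executable action name


@dataclass
class AudioSampleRecord:
    """One local database entry carrying the three associations recited in claim 1."""

    audio: bytes
    command_text: str
    application: str
    command_file: ApplicationCommandFile


def identify_application(
    detected_texts: Sequence[str],
    command_files: Sequence[ApplicationCommandFile],
) -> Optional[Tuple[ApplicationCommandFile, str]]:
    """Scan the command files for one that lists any of the detected texts."""
    for text in detected_texts:
        normalized = text.strip().lower()  # assumed matching rule
        for command_file in command_files:
            if normalized in command_file.actions:
                return command_file, normalized
    return None


def update_local_database(
    database: List[AudioSampleRecord],
    audio: bytes,
    detected_texts: Sequence[str],
    command_files: Sequence[ApplicationCommandFile],
) -> Optional[AudioSampleRecord]:
    """Add the first audio command with its application and command file, if identified."""
    found = identify_application(detected_texts, command_files)
    if found is None:
        # No command file matched; a caller could show a "not recognized"
        # indication here, in the spirit of claim 7.
        return None
    command_file, text = found
    record = AudioSampleRecord(audio, text, command_file.application, command_file)
    database.append(record)
    return record
```

For example, with command files for two hypothetical applications, a detected text of "New Note" would be stored against whichever file lists "new note", and an unlisted text would leave the database unchanged.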

12. A non-transitory computer-readable storage medium storing program code executable by one or more processors for selectively interacting with a server to build a local database for speech recognition at a device, the program code comprising instructions configured to cause:
- maintaining, at a computing device associated with a user, a local database comprising a plurality of audio samples, each audio sample being identified in association with:
  - any one of a plurality of voice command text files, each voice command text file being configured to store a text string of the audio sample, the text string being a transcription of a vocalization in the audio sample,
  - any one of a plurality of different applications, the plurality of different applications being executable at the computing device, and
  - any one of a plurality of different application command files, each application command file corresponding to a single respective application and associating at least one executable action to be performed within the application with at least one voice command text file;
- receiving, at the computing device, a first audio command;
- determining, using a local speech recognition algorithm at the computing device, that the first audio command does not match any of the plurality of audio samples of the local database within a margin of error;
- transmitting, responsive to the determining step, the first audio command from the computing device to a remote server for detection of one or more voice command text files associated with the first audio command;
- receiving, at the computing device from the remote server, the one or more detected voice command text files associated with the first audio command;
- identifying an application at the computing device in relation to the one or more detected voice command text files, the identifying comprising analyzing a plurality of application command files at the computing device to locate an application command file matching the detected voice command text to the application; and
- updating, at the computing device, the local database to:
  - include the first audio command in the plurality of audio samples of the local database,
  - associate the first audio command with the identified application, and
  - associate the first audio command with an application command file corresponding to the identified application.
- Dependent claims (13, 14)
  - 13. The non-transitory computer-readable storage medium of claim 12, the instructions further configured to cause:
    - determining that the one or more detected voice command text files are associated with a first executable action of the computing device.
  - 14. The non-transitory computer-readable storage medium of claim 13, the instructions further configured to cause:
    - while the computing device is not in communication with the remote server, receiving, at the computing device, a second audio command;
    - determining that the second audio command matches one of the plurality of audio samples of the local database;
    - identifying one or more voice command text files associated with the second audio command, the one or more identified voice command text files being stored in the local database;
    - identifying a second executable action associated with the one or more identified voice command text files; and
    - executing, at the computing device, the second executable action.

15. One or more computing devices for selectively interacting with a server to build a local database for speech recognition at a device, the one or more computing devices comprising:
- one or more processors configured to cause:
  - maintaining, at a computing device associated with a user, a local database comprising a plurality of audio samples, each audio sample being identified in association with:
    - any one of a plurality of voice command text files, each voice command text file being configured to store a text string of the audio sample, the text string being a transcription of a vocalization in the audio sample,
    - any one of a plurality of different applications, the plurality of different applications being executable at the computing device, and
    - any one of a plurality of different application command files, each application command file corresponding to a single respective application and associating at least one executable action to be performed within the application with at least one voice command text file;
  - receiving, at the computing device, a first audio command;
  - determining, using a local speech recognition algorithm at the computing device, that the first audio command does not match any of the plurality of audio samples of the local database within a margin of error;
  - transmitting, responsive to the determining step, the first audio command from the computing device to a remote server for detection of one or more voice command text files associated with the first audio command;
  - receiving, at the computing device from the remote server, the one or more detected voice command text files associated with the first audio command;
  - identifying an application at the computing device in relation to the one or more detected voice command text files, the identifying comprising analyzing a plurality of application command files at the computing device to locate an application command file matching the detected voice command text to the application; and
  - updating, at the computing device, the local database to:
    - include the first audio command in the plurality of audio samples of the local database,
    - associate the first audio command with the identified application, and
    - associate the first audio command with an application command file corresponding to the identified application.
- Dependent claims (16)
  - 16. The one or more computing devices of claim 15, the one or more processors further configured to cause:
    - while the computing device is not in communication with the remote server, receiving, at the computing device, a second audio command;
    - determining that the second audio command matches one of the plurality of audio samples of the local database;
    - identifying one or more voice command text files associated with the second audio command, the one or more identified voice command text files being stored in the local database;
    - identifying a second executable action associated with the one or more identified voice command text files; and
    - executing, at the computing device, the second executable action.
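
The offline path recited in claims 5, 14, and 16 amounts to a chain of local lookups with no server contact. The Python sketch below is illustrative only and rests on assumptions: the second audio command is assumed to have already been matched to the identifier of a stored sample, the local database is modeled as plain dictionaries, and `execute_offline`, `sample_to_text`, `text_to_action`, and `handlers` are hypothetical names.

```python
from typing import Callable, Dict, Optional


def execute_offline(
    sample_id: str,
    sample_to_text: Dict[str, str],
    text_to_action: Dict[str, str],
    handlers: Dict[str, Callable[[], str]],
) -> Optional[str]:
    """Resolve a locally matched audio sample to an action and run it."""
    # Step 1: voice command text stored locally for the matched sample.
    text = sample_to_text.get(sample_id)
    if text is None:
        return None
    # Step 2: executable action associated with that text.
    action = text_to_action.get(text)
    if action is None:
        return None
    # Step 3: execute the action on the device.
    handler = handlers.get(action)
    return handler() if handler is not None else None
```

Because every lookup reads from data already held on the device, this path works while the device is disconnected, which is the situation those claims describe.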


Current Assignee: Salesforce.com, Inc.
Original Assignee: Salesforce.com, Inc.
Inventors: Hu, Minzhi
Primary Examiner(s): Lerner, Martin
Application Number: US13/933,463
Publication Number: US 20140006028A1
Time in Patent Office: 1,484 Days
Field of Search: 704/244, 704/270.1, 704/271, 704/235, 704/236
CPC Class Codes:
G10L 15/10   using distance or distortio...
G10L 15/30   Distributed recognition, e....
G10L 17/04   Training, enrolment or mode...
G10L 2015/223   Execution procedure of a sp...
