Method and apparatus for uniterm discovery and voice-to-voice search on mobile device
First Claim
1. In an electronic device, a method comprising:
- generating, by the electronic device, one or more first phoneme lattices from audio data stored within an audio database;
determining, by the electronic device, one or more best paths from the one or more first phoneme lattices;
extracting, by the electronic device, one or more uniterms from the one or more first phoneme lattices; and
storing, by the electronic device, the one or more uniterms and the one or more best paths in a uniterm index database;
wherein extracting one or more uniterms comprises:
generating, by the electronic device, a next latent statistical lattice model from the one or more phoneme lattices generated from the audio data;
extracting, by the electronic device, phoneme strings with a length that is at least equal to a pre-set minimum length from the one or more phoneme as candidates for the one or more uniterms;
scoring, by the electronic device, the candidates for the one or more uniterms against the next latent statistical lattice model; and
identifying, by the electronic device, a preset number of candidates with best scores as the one or more uniterms selected to represent the phoneme lattice.
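The extraction steps recited in this claim (candidate phoneme strings of a minimum length, scoring against a lattice-derived model, keeping a preset number of best scorers) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: a lattice is simplified to a list of `(phoneme path, probability)` pairs, the "latent statistical lattice model" is approximated by a probability-weighted bigram table, and the function names and the `max_len` bound are hypothetical.

```python
import math
from collections import defaultdict

# ASSUMPTION: a lattice is simplified to a list of (phoneme_path, probability) pairs.

def best_path(lattice):
    """Return the highest-probability phoneme path of one lattice."""
    return max(lattice, key=lambda item: item[1])[0]

def build_bigram_model(lattices):
    """Probability-weighted bigram counts; a stand-in for the lattice model."""
    counts = defaultdict(lambda: defaultdict(float))
    for lattice in lattices:
        for path, prob in lattice:
            for a, b in zip(path, path[1:]):
                counts[a][b] += prob
    return counts

def score(candidate, model, floor=1e-6):
    """Length-normalized log-probability of a candidate under the bigram table."""
    total = 0.0
    for a, b in zip(candidate, candidate[1:]):
        row = model.get(a, {})
        denom = sum(row.values())
        p = row.get(b, 0.0) / denom if denom > 0 else 0.0
        total += math.log(max(p, floor))  # floor avoids log(0) for unseen pairs
    return total / max(len(candidate) - 1, 1)

def extract_candidates(lattices, min_len, max_len):
    """All contiguous phoneme strings with min_len <= length <= max_len."""
    candidates = set()
    for lattice in lattices:
        for path, _ in lattice:
            for n in range(min_len, max_len + 1):
                for i in range(len(path) - n + 1):
                    candidates.add(path[i:i + n])
    return candidates

def discover_uniterms(lattices, min_len=3, max_len=5, top_n=2):
    """Score every candidate and keep the top_n best as uniterms."""
    model = build_bigram_model(lattices)
    candidates = extract_candidates(lattices, min_len, max_len)
    ranked = sorted(candidates, key=lambda c: (-score(c, model), c))
    return ranked[:top_n]

# Toy lattice with two competing paths for the same utterance.
lattice = [(("k", "ae", "t", "s"), 0.7), (("k", "ae", "p", "s"), 0.3)]
index = {
    "best_paths": [best_path(lattice)],
    "uniterms": discover_uniterms([lattice], min_len=3, max_len=3, top_n=2),
}
```

The `index` dictionary plays the role of the uniterm index database in this toy setting: it holds both the best paths and the selected uniterms. A real system would score against a richer model than bigram counts.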
Abstract
A method, system, and communication device enable uniterm discovery from audio content and voice-to-voice searching of audio content stored on a device using the discovered uniterms. A received audio/voice input signal is sent to a uniterm discovery and search (UDS) engine within the device. The audio data may be associated with other content that is also stored within the device. The UDS engine retrieves a number of uniterms from the audio data and associates the uniterms with the stored content. When a voice search is initiated at the device, the UDS engine generates a statistical latent lattice model from the voice query and scores the uniterms from the audio database against the latent lattice model. Following a further refinement, the best group of uniterms is determined, and segments of the stored audio data and/or other content corresponding to the best group of uniterms are output.
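The search flow in the abstract (build a model from the voice query, score stored uniterms against it, output the associated content) might look roughly like the Python sketch below. It rests on assumptions that are not in the source: the query lattice is a list of `(phoneme path, probability)` pairs, the latent lattice model is approximated by a bigram table, the index is a plain dictionary from uniterm to content identifiers, and the "further refinement" step is omitted. All names and file identifiers are hypothetical.

```python
import math
from collections import defaultdict

def query_model(query_lattice):
    """Probability-weighted bigram table built from the query's lattice paths."""
    counts = defaultdict(lambda: defaultdict(float))
    for path, prob in query_lattice:
        for a, b in zip(path, path[1:]):
            counts[a][b] += prob
    return counts

def score_uniterm(uniterm, model, floor=1e-6):
    """Length-normalized log-probability of a stored uniterm under the query model."""
    total = 0.0
    for a, b in zip(uniterm, uniterm[1:]):
        row = model.get(a, {})
        denom = sum(row.values())
        p = row.get(b, 0.0) / denom if denom else 0.0
        total += math.log(max(p, floor))  # floor avoids log(0) for unseen pairs
    return total / max(len(uniterm) - 1, 1)

def voice_search(query_lattice, uniterm_index, top_k=1):
    """Rank indexed uniterms against the query and return their content ids."""
    model = query_model(query_lattice)
    ranked = sorted(uniterm_index, key=lambda u: (-score_uniterm(u, model), u))
    results = []
    for uniterm in ranked[:top_k]:
        for content_id in uniterm_index[uniterm]:
            if content_id not in results:  # keep order, drop repeats
                results.append(content_id)
    return results

# ASSUMPTION: hypothetical index mapping uniterms to stored content.
uniterm_index = {
    ("k", "ae", "t"): ["memo_cat.wav"],
    ("d", "ao", "g"): ["memo_dog.wav"],
}
query = [(("k", "ae", "t"), 0.8), (("k", "ae", "p"), 0.2)]
hits = voice_search(query, uniterm_index, top_k=1)
```

Scoring the stored uniterms against a model built from the query, rather than matching one decoded string against another, lets alternative recognition hypotheses in the query lattice contribute to the ranking.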
18 Claims
1. In an electronic device, a method comprising:
generating, by the electronic device, one or more first phoneme lattices from audio data stored within an audio database;
determining, by the electronic device, one or more best paths from the one or more first phoneme lattices;
extracting, by the electronic device, one or more uniterms from the one or more first phoneme lattices; and
storing, by the electronic device, the one or more uniterms and the one or more best paths in a uniterm index database;
wherein extracting one or more uniterms comprises:
generating, by the electronic device, a next latent statistical lattice model from the one or more phoneme lattices generated from the audio data;
extracting, by the electronic device, phoneme strings with a length that is at least equal to a pre-set minimum length from the one or more phoneme as candidates for the one or more uniterms;
scoring, by the electronic device, the candidates for the one or more uniterms against the next latent statistical lattice model; and
identifying, by the electronic device, a preset number of candidates with best scores as the one or more uniterms selected to represent the phoneme lattice.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A device comprising:
a processor;
an audio input device for receiving audio data including voice input data and voice queries;
a storage mechanism for storing content including the audio data; and
a uniterm discovery and search (UDS) engine executing on the processor and having functional components for completing the following functions:
generating one or more first phoneme lattices from audio data stored within an audio database;
determining one or more best paths from the one or more first phoneme lattices;
extracting one or more uniterms from the one or more first phoneme lattices; and
storing the one or more uniterms and the one or more best paths in a uniterm index database;
wherein the functional component for extracting one or more uniterms further performs the functions of:
generating a next latent statistical lattice model from the one or more phoneme lattices generated from the audio data;
extracting phoneme strings with a length that is at least equal to a pre-set minimum length from the one or more phoneme lattices as candidates for the one or more uniterms;
scoring the candidates for the one or more uniterms against the next latent statistical lattice model;
identifying a preset number of candidates with best scores as the one or more uniterms selected to represent the phoneme lattice;
storing the one or more uniterms in a uniterms phoneme tree structure; and
forwarding the uniterms phoneme tree structure and the one or more best paths to a coarse search function that scores the one or more uniterms of the uniterms phoneme tree structure against the statistical latent lattice model.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
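One way to picture the uniterms phoneme tree structure and the coarse search function of claim 10 is a prefix tree (trie) whose shared prefixes are scored only once. The Python sketch below is an illustration under stated assumptions, not the claimed implementation: the trie layout, the names `PhonemeTrieNode`, `build_trie`, and `coarse_search`, and the representation of the lattice model as a dictionary of bigram probabilities are all hypothetical.

```python
import math

class PhonemeTrieNode:
    """One node of a hypothetical uniterm prefix tree."""
    def __init__(self):
        self.children = {}       # phoneme -> PhonemeTrieNode
        self.is_uniterm = False  # True if the path from the root spells a uniterm

def build_trie(uniterms):
    """Insert each uniterm (a tuple of phonemes) into a shared prefix tree."""
    root = PhonemeTrieNode()
    for uniterm in uniterms:
        node = root
        for phoneme in uniterm:
            node = node.children.setdefault(phoneme, PhonemeTrieNode())
        node.is_uniterm = True
    return root

def coarse_search(root, bigram_prob, floor=1e-6):
    """Walk the trie depth-first and return {uniterm: summed log-probability}.

    ASSUMPTION: bigram_prob maps (previous phoneme, phoneme) to a probability
    and stands in for the statistical latent lattice model.
    """
    scores = {}
    stack = [(root, (), 0.0)]
    while stack:
        node, prefix, logp = stack.pop()
        if node.is_uniterm:
            scores[prefix] = logp
        for phoneme, child in node.children.items():
            step = 0.0
            if prefix:  # the first phoneme has no predecessor to condition on
                p = bigram_prob.get((prefix[-1], phoneme), 0.0)
                step = math.log(max(p, floor))
            stack.append((child, prefix + (phoneme,), logp + step))
    return scores

trie = build_trie([("k", "ae", "t"), ("k", "ae", "p")])
model = {("k", "ae"): 1.0, ("ae", "t"): 0.8, ("ae", "p"): 0.2}
scores = coarse_search(trie, model)
best = max(scores, key=scores.get)
```

Because both uniterms share the prefix `k ae`, the walk computes that prefix's score once and reuses it for each branch, which is the usual motivation for storing search terms in a tree.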
Specification