Voice-controlled data system
First Claim
1. A voice-controlled data system comprising:
a data storage unit configured to store a plurality of media files;
a vocabulary generating unit configured to generate multilingual phonetic data corresponding to each of the media files;
a speech recognition unit configured to recognize a speech control command;
a processor configured to select a media file from the media files based on a comparison of a part of the speech control command and the generated multilingual phonetic data corresponding to the media files, where the multilingual phonetic data corresponding to any one of the media files comprises a phonetic representation of each data field of multilingual file identification data in a header section of the media file, and where the phonetic representation of each data field is based on a language identifier associated with said each data field of the multilingual file identification data in the header section; and
a media playback unit configured to play the selected media file.
Abstract
A voice-controlled data system may include a data storage unit including media files having associated file identification data, and a vocabulary generating unit generating phonetic data corresponding to the file identification data, the phonetic data being supplied to a speech recognition unit as a recognition vocabulary, where one of the media files may be selected according to a recognized speech control command on the basis of the generated phonetic data, where the file identification data include a language identification part for identifying the language of the file identification data, and where the vocabulary generating unit generates the phonetic data for the file identification data of a media file based on its language identification part.
41 Claims
1. A voice-controlled data system comprising:
a data storage unit configured to store a plurality of media files;
a vocabulary generating unit configured to generate multilingual phonetic data corresponding to each of the media files;
a speech recognition unit configured to recognize a speech control command;
a processor configured to select a media file from the media files based on a comparison of a part of the speech control command and the generated multilingual phonetic data corresponding to the media files, where the multilingual phonetic data corresponding to any one of the media files comprises a phonetic representation of each data field of multilingual file identification data in a header section of the media file, and where the phonetic representation of each data field is based on a language identifier associated with said each data field of the multilingual file identification data in the header section; and
a media playback unit configured to play the selected media file.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method for the voice-controlled selection of a media file stored on a data storage unit, the data storage unit including a plurality of media files, the media files including respective file identification data, the method comprising:
receiving voice data indicative of selection of the media file from among the media files, and supplying the voice data to a speech recognition unit;
extracting, by a processor, a first language identification tag included in the respective file identification data of each of the media files, the first language identification tag indicating a first language associated with a first data field of the respective file identification data, where the respective file identification data is in a header section of each of the respective media files;
extracting, by the processor, a second language identification tag included in the respective file identification data of each of the media files, the second language identification tag indicating a second language associated with a second data field of the respective file identification data in the header section of each of the respective media files;
generating, by the processor, phonetic data corresponding to the file identification data for each of the media files, the generated phonetic data comprising phonetic representations of the first data field and the second data field that are generated based on the first language identification tag and the second language identification tag, respectively;
comparing, by the processor, the generated phonetic data to the received voice data by the speech recognition unit and generating a corresponding speech control command; and
selecting, by the processor, the media file from the data storage unit in accordance with the generated speech control command.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19
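By way of a non-limiting illustration, the steps of claim 11 might be sketched as follows. Several simplifications are assumed here and are not taken from the patent: the voice data is represented as an already transcribed phonetic string, the pronunciation lexicon `LEXICON` and the header layout in `MEDIA_FILES` are hypothetical, and the comparison uses a generic string similarity from Python's `difflib` in place of an actual speech recognition unit.

```python
from difflib import SequenceMatcher

# Hypothetical pronunciation lexicon keyed by (language tag, word).
LEXICON = {
    ("fr", "rose"): "Roz",
    ("fr", "piaf"): "pjaf",
    ("en", "rose"): "r@Uz",
    ("en", "queen"): "kwi:n",
}

# Assumed header layout: each data field carries its own language tag.
MEDIA_FILES = {
    "a.mp3": {
        "title": {"text": "Rose", "lang": "fr"},
        "artist": {"text": "Queen", "lang": "en"},
    },
    "b.mp3": {
        "title": {"text": "Rose", "lang": "en"},
        "artist": {"text": "Piaf", "lang": "fr"},
    },
}


def phonetize(text, lang):
    """Look up each word under the field's language tag; unknown words pass through."""
    return " ".join(LEXICON.get((lang, word), word) for word in text.lower().split())


def generate_phonetic_data(header):
    """Produce one phonetic representation per data field, using that field's tag."""
    return [phonetize(field["text"], field["lang"]) for field in header.values()]


def select_media_file(voice_phonetic, media_files):
    """Return the file whose best-matching field is most similar to the voice data."""
    def best_score(name):
        return max(
            SequenceMatcher(None, voice_phonetic, phonetic).ratio()
            for phonetic in generate_phonetic_data(media_files[name])
        )
    return max(media_files, key=best_score)
```

In this sketch both files share the title spelling "Rose", yet `select_media_file("Roz", MEDIA_FILES)` selects `a.mp3` and `select_media_file("r@Uz", MEDIA_FILES)` selects `b.mp3`, because each title is transcribed according to its own language tag rather than a single system-wide language.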
20. A system comprising:
a data storage unit configured to store a plurality of media files, where each media file includes file identification data in a header section, the file identification data includes a first data entry containing a first natural language term and the file identification data further includes a second data entry containing a second natural language term;
a language estimation unit configured to identify a first language of the first natural language term in the first data entry and further configured to add an association between a first language tag and the first data entry, where the first language tag is an indication of the first language;
the language estimation unit further configured to identify a second language of the second natural language term in the second data entry and further configured to add an association between a second language tag and the second data entry, where the second language tag is an indication of the second language;
a vocabulary generating unit configured to generate phonetic data associated with each respective media file of the media files, where the phonetic data contains phonetic representations of the first natural language term and the second natural language term based on the first language tag and the second language tag, respectively;
a speech recognition unit configured to receive a natural language voice input representative of a request to select a media file;
a processor configured to select the media file based on a comparison of the natural language voice input with the phonetic data generated for each of the media files based on the respective first and second language tags.
Dependent claims: 21, 22, 23
24. A method for automatically detecting the language of file identification data of a media file, the file identification data including a plurality of data fields in a header section of the media file allowing identification of the media file, the method comprising:
receiving, with a processor, a selection of at least one media file out of a group of media files;
retrieving, with the processor, a first data field and a second data field of the data fields in the file identification data;
estimating a first language of the first data field of the file identification data of the media file with the processor based on a statistical model;
estimating a second language of the second data field of the file identification data of the media file with the processor based on the statistical model;
adding, with the processor, the estimated first language and the estimated second language to the file identification data;
linking, with the processor, the estimated first language with the first data field and the estimated second language with the second data field; and
identifying, by the processor, the media file in response to receipt of a voice control command by comparing the voice control command to phonetic data comprising a first phonetic representation of the first data field in the estimated first language and a second phonetic representation of the second data field in the estimated second language.
Dependent claims: 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41
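By way of a non-limiting illustration, the estimating, adding, and linking steps of claim 24 might be sketched as follows. The claim states only that a statistical model is used; the character-bigram model with add-one smoothing shown here is one assumed choice, and the training sentences and the names `TRAINING_TEXT`, `estimate_language`, and `tag_fields` are hypothetical. A practical model would be trained on far larger corpora.

```python
import math
from collections import Counter

# Tiny assumed training corpora; far too small for real use.
TRAINING_TEXT = {
    "en": "the quick brown fox jumps over the lazy dog and the king of the world",
    "de": "der schnelle braune fuchs springt über den faulen hund und der könig der welt",
}


def bigrams(text):
    """Split text into overlapping lowercase character pairs."""
    text = text.lower()
    return [text[i:i + 2] for i in range(len(text) - 1)]


# One bigram count table per language, plus the number of distinct bigrams seen.
MODELS = {lang: Counter(bigrams(text)) for lang, text in TRAINING_TEXT.items()}
VOCAB_SIZE = len(set().union(*MODELS.values()))


def log_likelihood(text, lang):
    """Score text under one language model using add-one smoothed bigram counts."""
    counts = MODELS[lang]
    total = sum(counts.values())
    return sum(
        math.log((counts[bigram] + 1) / (total + VOCAB_SIZE))
        for bigram in bigrams(text)
    )


def estimate_language(text):
    """Return the language whose model gives the text the highest score."""
    return max(MODELS, key=lambda lang: log_likelihood(text, lang))


def tag_fields(file_identification):
    """Add an estimated language to each data field and link it to that field."""
    return {
        field: {"text": text, "language": estimate_language(text)}
        for field, text in file_identification.items()
    }
```

Each data field is scored independently, so a single media file may end up with fields tagged in different languages, for example `tag_fields({"title": "der könig", "artist": "the world"})`. The resulting per-field tags are what a vocabulary generating step would then consult when producing the phonetic representations.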
Specification