Multi-phoneme streamer and knowledge representation speech recognition system and method

US 20040215449A1
Filed: 06/30/2003
Published: 10/28/2004
Est. Priority Date: 06/28/2002
Status: Active Grant

3 Assignments

0 Petitions

Abstract

A system and method related to a new approach to speech recognition that reacts to concepts conveyed through speech. In its fullest implementation, the system and method shifts the balance of power in speech recognition from straight sound recognition and statistical models to a more powerful and complete approach determining and addressing conveyed concepts. This is done by using a probabilistically unbiased multi-phoneme recognition process, followed by a phoneme stream analysis process that builds the list of candidate words derived from recognized phonemes, followed by a permutation analysis process that produces sequences of candidate words with high potential of being syntactically valid, and finally, by processing targeted syntactic sequences in a conceptual analysis process to generate the utterance's conceptual representation that can be used to produce an adequate response. The invention can be employed for a myriad of applications, such as improving accuracy or automatically generating punctuation for transcription and dictation, word or concept spotting in audio streams, concept spotting in electronic text, customer support, call routing and other command/response scenarios.

124 Citations

401 Claims

1. A method of processing phonemes in speech, comprising:
- inputting an acoustic input of digitized speech;
  
  segmenting said digitized acoustic input into a plurality of time-slices;
  
  analyzing each time-slice to identify one or more candidate phonemes based on a plurality of reference cluster sets, each cluster set representing reference phonemes for a cluster type; and
  
  outputting a phoneme stream of identified candidate phonemes based on the analysis, wherein at least some time-slices are represented by alternative candidate phonemes based on said analyzing step.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 400, 401)
- - 2. The method of claim 1, wherein the reference cluster sets are specific to at least two of (a) a region or accent, (b) a gender, and (c) an age, age range, or child/adult distinction.
  - 3. The method of claim 1, wherein at least some of the identified candidate phonemes for a time-slice originate from different cluster sets.
  - 4. The method of claim 1, wherein the phoneme stream comprises candidate phonemes from different cluster sets, thereby enabling recognition that the acoustic input represents speech by more than one person.
  - 5. The method of claim 4, wherein the speakers are a male and a female.
  - 6. The method of claim 4, wherein the recognition is that the speakers have different accents.
  - 7. The method of claim 1, wherein the segmented acoustic input comprises non-overlapping time-slices.
  - 8. The method of claim 1, wherein the segmented acoustic input comprises overlapping time-slices.
  - 9. The method of claim 1, wherein the segmented acoustic input comprises both overlapping and nonoverlapping time-slices.
  - 10. The method of claim 9, wherein a subsequent time-slice is selected to be overlapping or nonoverlapping based on the results of the analysis of the previous time-slice.
  - 11. The method of claim 1, wherein the phoneme stream is further processed for transcription or dictation.
  - 12. The method of claim 1, wherein the phoneme stream is further processed to provide a response to a query represented by the acoustic input.
  - 13. The method of claim 12, wherein the response is an acoustic response.
  - 14. The method of claim 12, wherein the response is a text-based response.
  - 15. The method of claim 12, wherein the response is a system response based on the interpreted content of the acoustic input.
  - 16. The method of claim 1, wherein the plurality of reference cluster sets correspond to more than one language, thereby enabling detection of the language of the acoustic input.
  - 400. The methods of claims 1, 17, 143, 209, 245, 316, 354, 358, 362, 366, 370, 374, 382 and 396, further comprising digitizing a received analog input into the digitized input.
  - 401. The methods of claims 1, 17, 143, 209, 245, 316, 354, 358, 362, 366, 370, 374, 382 and 396, further comprising re-digitizing a received digitized input into the digitized input.
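
The segmenting step of claim 1, with the non-overlapping and overlapping variants of claims 7 to 9, can be illustrated with a minimal sketch. It assumes fixed-length slices measured in samples and a constant overlap fraction; the function name `segment_time_slices` and its parameters are hypothetical and do not come from the patent.

```python
from typing import List, Sequence


def segment_time_slices(samples: Sequence[float],
                        slice_len: int,
                        overlap: float = 0.0) -> List[Sequence[float]]:
    """Split digitized samples into fixed-length time-slices.

    `overlap` is the fraction (0 <= overlap < 1) shared by consecutive
    slices; 0.0 gives non-overlapping slices. Trailing samples that do
    not fill a whole slice are dropped (an assumption of this sketch).
    """
    if slice_len <= 0:
        raise ValueError("slice_len must be positive")
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    # Distance between the starts of consecutive slices.
    step = max(1, int(round(slice_len * (1.0 - overlap))))
    return [samples[start:start + slice_len]
            for start in range(0, len(samples) - slice_len + 1, step)]


# Usage: ten samples, four-sample slices, with and without 50% overlap.
samples = list(range(10))
plain = segment_time_slices(samples, 4)
half = segment_time_slices(samples, 4, overlap=0.5)
```

An adaptive scheme such as the one in claim 10 would pick the overlap per slice from the previous analysis result rather than use one constant value.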

17. A method of processing and recognizing speech, comprising:
- inputting an acoustic input of digitized speech;
  
  segmenting said digitized acoustic input into a plurality of time-slices;
  
  analyzing each time-slice to identify a candidate phoneme based on a plurality of reference cluster sets, each cluster set representing reference phonemes for that cluster type;
  
  wherein the step of analyzing includes determining a score or probability of each identified candidate phoneme; and
  
  outputting a phoneme stream of identified candidate phonemes based on the analysis, wherein at least some time-slices are represented by alternative candidate phonemes based on said analyzing step, and wherein the phoneme stream includes or is associated with the determined score or probability of each identified candidate phoneme.
- View Dependent Claims (18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51)
- - 18. The method of claim 17, wherein the phoneme stream is outputted for substantially contemporaneous processing.
  - 19. The method of claim 17, wherein the phoneme stream is stored for deferred processing.
  - 20. The method of claim 17, wherein the phoneme stream comprises a single stream of information.
  - 21. The method of claim 17, wherein the phoneme stream comprises multiple streams of information.
  - 22. The method of claim 17, wherein the reference cluster sets further include triphone variations of the reference phonemes used in order to identify a candidate phoneme based on a triphone pronunciation.
  - 23. The method of claim 17, wherein the step of analyzing comprises applying a neural network.
  - 24. The method of claim 17, wherein the step of analyzing comprises applying formant analysis.
  - 25. The method of claim 17, wherein the step of analyzing comprises applying a multivariate Gaussian classifier.
  - 26. The method of claim 17, wherein the candidate phonemes are identified based on application of a threshold.
  - 27. The method of claim 26, wherein the threshold is fixed.
  - 28. The method of claim 26, wherein the threshold is adaptive.
  - 29. The method of claim 26, wherein there is a threshold for each of the reference cluster sets, at least some of the thresholds being different.
  - 30. The method of claim 26, wherein there is a threshold for each reference phoneme for each cluster set, at least some of the reference phoneme thresholds for a given cluster set being different.
  - 31. The method of claim 26, wherein there is a threshold for each reference phoneme for each cluster set, at least some of the cluster sets having different thresholds for the same reference phoneme.
  - 32. The method of claim 17, further comprising the step of processing the phoneme stream to identify candidate words based on the candidate phonemes.
  - 33. The method of claim 32, wherein at least some of the candidate words are alternative candidate words representing candidate words from the same or an overlapping portion of the acoustic input.
  - 34. The method of claim 32, wherein the processing includes scoring the candidate words based on the scores or probabilities of the candidate phonemes making up the candidate words.
  - 35. The method of claim 32, wherein scoring the candidate words comprises aggregating or averaging the scores or probabilities of the candidate phonemes used to construct the candidate words.
  - 36. The method of claim 35, further comprising ranking the candidate words based on the scores of the candidate words.
  - 37. The method of claim 32, wherein processing the phoneme stream to identify candidate words comprises generating search paths representing a permutation of candidate phonemes among the time-slices, each search path potentially representing at least a partial valid pronunciation of a word in a dictionary.
  - 38. The method of claim 37, wherein a search path is dropped or treated as invalid when the addition of a candidate phoneme from a further time-slice would result in no at least partial valid pronunciation of a word in a dictionary.
  - 39. The method of claim 37, wherein a search path is dropped or treated as invalid upon the addition of at least two non-matching candidate phonemes, a first non-matching candidate phoneme resulting in no correspondence to at least a partial valid pronunciation of a word in a dictionary, and a second non-matching candidate phoneme resulting in no correspondence to at least a partial valid pronunciation of a word in a dictionary when ignoring the first phoneme.
  - 40. The method of claim 32, wherein processing the phoneme stream to identify candidate words accounts for bridging wherein the speech includes a phoneme which is effectively shared between two words.
  - 41. The method of claim 32, wherein processing the phoneme stream to identify candidate words accounts for bridging wherein the speech includes adjacent phonemes having similar pronunciations.
  - 42. The method of claim 32, wherein processing the phoneme stream to identify candidate words based on the candidate phonemes is implemented by processing candidate phonemes from a time-slice in a descending order of probability or score, thereby providing candidate words that are naturally sorted according to a descending order of score for the candidate words.
  - 43. The method of claim 32, wherein processing the phoneme stream to identify candidate words comprises permuting the candidate phonemes from different points in the acoustic input to construct combinations of phonemes comprising potential words.
  - 44. The method of claim 43, wherein the permutation is between different time-slices having identified candidate phonemes.
  - 45. The method of claim 43, wherein potential words are processed according to a dictionary in order to identify the candidate words.
  - 46. The method of claim 45, wherein the dictionary comprises a plurality of words and pronunciations of words.
  - 47. The method of claim 32, wherein the candidate words correspond to at least a two-dimensional array of candidate words, a first dimension corresponding to time across the acoustic input, and a second dimension corresponding to alternative candidate words for the same or an overlapping interval of time across the acoustic input.
  - 48. The method of claim 32, wherein the candidate words are constructed using candidate phonemes originating from the same reference cluster set.
  - 49. The method of claim 32, wherein the candidate words are capable of being constructed using candidate phonemes originating from differing cluster sets.
  - 50. The method of claim 17, wherein said method is implemented in an application for transcription or dictation.
  - 51. The method of claim 17, wherein said method is implemented in an application for generating a response to a query represented by said acoustic input.
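
Claims 17 and 26 to 29 describe scoring candidate phonemes against several reference cluster sets and keeping those that pass a threshold, possibly with a different threshold per cluster set. The sketch below assumes the per-slice scores have already been computed by some classifier; the cluster names, the `Candidate` type, and the `select_candidates` function are illustrative assumptions, not the patent's implementation.

```python
from typing import Dict, List, NamedTuple


class Candidate(NamedTuple):
    phoneme: str
    cluster: str   # reference cluster set the match came from
    score: float


def select_candidates(slice_scores: Dict[str, Dict[str, float]],
                      cluster_thresholds: Dict[str, float],
                      default_threshold: float = 0.5) -> List[Candidate]:
    """Keep every (phoneme, cluster) pair whose score meets its cluster's
    threshold, returned in descending order of score."""
    kept = [
        Candidate(phoneme, cluster, score)
        for cluster, scores in slice_scores.items()
        for phoneme, score in scores.items()
        if score >= cluster_thresholds.get(cluster, default_threshold)
    ]
    kept.sort(key=lambda c: c.score, reverse=True)
    return kept


# Usage: one time-slice scored against two hypothetical cluster sets.
scores = {
    "adult_male": {"b": 0.80, "p": 0.55, "d": 0.20},
    "adult_female": {"b": 0.65, "p": 0.70},
}
thresholds = {"adult_male": 0.50, "adult_female": 0.68}
alternatives = select_candidates(scores, thresholds)
```

Because several entries can survive for the same slice, the output naturally carries alternative candidate phonemes together with their scores.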

52. A system for processing an acoustic input of speech, comprising:
- an input device for inputting an acoustic input comprising digitized speech;
  
  a phoneme recognition processor for processing said digitized acoustic input based on a plurality of reference cluster sets to generate a plurality of candidate phonemes;
  
  wherein the phoneme recognition processor identifies a score or probability for each candidate phoneme; and
  
  wherein at least some of the candidate phonemes are alternative candidate phonemes corresponding to the portion of the acoustic input.
- View Dependent Claims (53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84)
- - 53. The system of claim 52, wherein at least some of the candidate phonemes originate from different reference cluster sets.
  - 54. The system of claim 52, wherein the phoneme recognition processor segments said digitized acoustic input into time-slices in order to identify candidate phonemes.
  - 55. The system of claim 54, wherein the time-slices are overlapping.
  - 56. The system of claim 54, wherein the time-slices are nonoverlapping.
  - 57. The system of claim 54, wherein the time-slices include both overlapping and nonoverlapping time-slices.
  - 58. The system of claim 55 or 57, wherein the overlapping time segments overlap within the range of approximately 40% to 60%.
  - 59. The system of claim 57, wherein a subsequent time-slice is selected to be overlapping or nonoverlapping based on the phoneme recognition result of the previous time-slice.
  - 60. The system of claim 52, wherein the plurality of reference cluster sets comprise sets of reference phonemes for a single language.
  - 61. The system of claim 52, wherein the plurality of reference cluster sets comprise sets of reference phonemes for multiple languages, thereby allowing the system to detect the language spoken by the person inputting the speech.
  - 62. The system of claim 52, wherein the plurality of reference cluster sets comprise reference triphones, thereby enabling the system to recognize candidate phonemes according to the triphone variations in the pronunciations of candidate phonemes.
  - 63. The system of claim 62, wherein the phoneme recognition processor is adapted to generate a candidate phoneme by mapping a detected triphone to the corresponding phoneme.
  - 64. The system of claim 52, wherein the phoneme recognition unit is further adapted to output a phoneme stream of the candidate phonemes including or associated with the identified score or probability of each identified candidate phoneme.
  - 65. The system of claim 64, wherein the phoneme stream is stored for further processing.
  - 66. The system of claim 64, wherein the phoneme stream is processed by a phoneme stream analyzer to identify candidate words corresponding to the candidate phonemes.
  - 67. The system of claim 66, wherein the candidate words are stored for further processing.
  - 68. The system of claim 66, wherein the phoneme stream data is stored for further processing.
  - 69. The system of claim 66, wherein the candidate words and the phoneme stream data are stored for further processing.
  - 70. The system of claim 66, wherein the candidate words are based on potential words constructed according to permutations of candidate phonemes from different time-slices.
  - 71. The system of claim 70, wherein the candidate words are generated by creating search paths reflecting permuted candidate phonemes from different time-slices matching at least a partial valid pronunciation of a word in a dictionary.
  - 72. The system of claim 71, wherein a search path is terminated or dropped upon the permutation with a further candidate phoneme resulting in no at least partial valid pronunciation of a word in the dictionary.
  - 73. The system of claim 71, wherein a search path is terminated or dropped upon the permutation with at least two further consecutive non-matching candidate phonemes, the first non-matching candidate phoneme resulting in no at least partial valid pronunciation of a word in the dictionary, and the second non-matching candidate phoneme resulting in no at least partial valid pronunciation when ignoring the first non-matching candidate phoneme, thereby providing an error tolerant system.
  - 74. The system of claim 70, wherein the candidate words are identified by processing the potential words according to a dictionary.
  - 75. The system of claim 70, wherein the candidate words are scored based on the scores or probabilities of the candidate phonemes used to construct the candidate words.
  - 76. The system of claim 66, wherein the candidate words are constructed based on candidate phonemes originating from the same reference cluster set.
  - 77. The system of claim 52, wherein the phoneme recognition processor is further adapted to process said digitized acoustic input to detect or derive at least one parameter in addition to (a) candidate phonemes which are identified and (b) score or probabilities which are identified, wherein the at least one additional parameter is used by a phoneme stream analyzer in analyzing the identified candidate phonemes to identify candidate words corresponding to the candidate phonemes.
  - 78. The system of claim 77, wherein the at least one additional parameter is derived through time domain processing.
  - 79. The system of claim 77, wherein the at least one additional parameter is derived through frequency domain processing.
  - 80. The system of claim 77, wherein the at least one additional parameter comprises pitch information, wherein the pitch information is used in conjunction with information contained in a dictionary to identify the candidate words.
  - 81. The system of claim 80, wherein the dictionary contains Chinese language words.
  - 82. The system of claim 77, wherein the acoustic input is segmented into time-slices, each time-slice being characterized by a pitch value.
  - 83. The system of claim 52, wherein said system is implemented in an application for transcription or dictation.
  - 84. The system of claim 52, wherein said system is implemented in an application for providing a response to a query represented by said acoustic input.
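
Claims 62 and 63 describe recognizing triphones and mapping each detected triphone back to its phoneme. As a rough sketch, assume triphone labels are written as `left-centre+right` (a common labeling convention, not something the claims specify) and that the best-scoring triphone decides the score of its centre phoneme. Both helper names are hypothetical.

```python
from typing import Dict


def triphone_to_phoneme(label: str) -> str:
    """Return the centre phoneme of a label such as 'k-ae+t'.

    Labels without a left or right context ('ae+t', 'k-ae', 'ae')
    are accepted as well.
    """
    centre = label.split("-", 1)[-1]   # drop the left context, if any
    return centre.split("+", 1)[0]     # drop the right context, if any


def candidates_from_triphones(detected: Dict[str, float]) -> Dict[str, float]:
    """Collapse scored triphones into scored candidate phonemes, keeping
    the highest score per phoneme (an assumption of this sketch)."""
    best: Dict[str, float] = {}
    for label, score in detected.items():
        phoneme = triphone_to_phoneme(label)
        best[phoneme] = max(score, best.get(phoneme, score))
    return best


# Usage: three detected triphones collapse into two candidate phonemes.
phonemes = candidates_from_triphones({"k-ae+t": 0.6, "b-ae+t": 0.7, "s-ih+t": 0.4})
```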

85. A method of processing and recognizing speech, comprising:
- generating a phoneme stream by processing a digitized speech sample to identify candidate phonemes including at least some alternative candidate phonemes;
  
  permuting candidate phonemes between different time-slices to generate potential words represented by the speech sample; and
  
  generating a list of candidate words for the phoneme stream based on the potential words.
- View Dependent Claims (86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116)
- - 86. The method of claim 85, wherein the phoneme stream is outputted to apparatus for performing said permuting step.
  - 87. The method of claim 85, wherein the phoneme stream is stored for performing the permuting step at a later time.
  - 88. The method of claim 85, wherein at least one of the phoneme stream and the candidate words is stored for further processing.
  - 89. The method of claim 85, further comprising processing the potential words according to a dictionary to identify candidate words.
  - 90. The method of claim 89, wherein permuting the candidate phonemes comprises permuting candidate phonemes between different time-slices to create a search path, and wherein processing according to a dictionary comprises processing the permutated phonemes of the search path to determine correspondence to at least a partial valid pronunciation for a word in the dictionary.
  - 91. The method of claim 90, wherein a search path is expanded by permuting the search path to add a candidate phoneme from a further time-slice.
  - 92. The method of claim 91, wherein an expanded search path is terminated or dropped when permutation with the further candidate phoneme results in no correspondence to at least a partial valid pronunciation of a word from the dictionary.
  - 93. The method of claim 91, wherein a search path is terminated or dropped upon the permutation with at least two further consecutive non-matching candidate phonemes, the first non-matching candidate phoneme resulting in no at least partial valid pronunciation of a word in the dictionary, and the second non-matching candidate phoneme resulting in no at least partial valid pronunciation when ignoring the first non-matching candidate phoneme, thereby providing an error tolerant system.
  - 94. The method of claim 90, wherein a separate search path is created for each candidate phoneme in a time-slice.
  - 95. The method of claim 94, wherein the separate search paths are created in a descending order beginning with the candidate phoneme in the time-slice with the highest score or probability, thereby naturally sorting potential words based on scores or probabilities.
  - 96. The method of claim 95, wherein further permuting a search path to add a candidate phoneme from a further time-slice comprises selecting candidate phonemes from the further time-slice in a descending order beginning with the candidate phoneme with the highest score or probability.
  - 97. The method of claim 85, wherein at least some of the candidate words are alternative candidate words for the same portion or an overlapping portion of the digitized speech sample.
  - 98. The method of claim 97, wherein the identified candidate words correspond to an at least two-dimensional array of candidate words, a first dimension corresponding to time across the speech sample, and a second dimension corresponding to alternative candidate words for the same or overlapping portions of the speech sample.
  - 99. The method of claim 97, wherein the identified candidate words are scored according to probabilities or scores of candidate phonemes making up the candidate words, and wherein alternative candidate words are ranked according to the scores of the alternative candidate words.
  - 100. The method of claim 85, wherein generating the phoneme stream further comprises computing or identifying scores or probabilities for the candidate phonemes.
  - 101. The method of claim 100, further comprising the step of scoring the candidate words based on the scores or probabilities of the candidate phonemes making up the candidate words.
  - 102. The method of claim 85, wherein permuting the candidate phonemes is implemented by processing candidate phonemes from each time-slice in a descending order of probability or score, thereby providing candidate words that are naturally sorted according to a descending order of score for the candidate words.
  - 103. The method of claim 85, wherein the phoneme stream is generated by deriving candidate words from an N-best list of potential words generated from the application of the Hidden Markov Model (HMM) technique to the speech sample, further deriving additional candidate words from combinations of two or more consecutive N-best list potential words, and deriving candidate phonemes from the candidate words.
  - 104. The method of claim 85, wherein the phoneme stream is generated by deriving candidate phonemes from the results generated by application of the Backus-Naur (BNF) technique to the speech sample.
  - 105. The method of claim 85, further comprising the step of permuting the candidate words to generate potential syntactic structures, the potential syntactic structures comprising sequences of words which are potentially syntactically valid.
  - 106. The method of claim 105, further comprising the step of permuting potential syntactic structures with at least one of (a) potential syntactic structures or (b) candidate words, to generate further potential syntactic structures.
  - 107. The method of claim 105 or 106, further comprising syntactically analyzing the potential syntactic structures to generate syntactically valid sequences of words.
  - 108. The method of claim 107, wherein the syntactic analysis is carried out to respect interjections, so that the presence of interjections does not result in invalidating an otherwise valid sequence of words.
  - 109. The method of claim 107, wherein the syntactic analysis is implemented as one of a bottom-up parsing process, a top-down parsing process, an Earley parsing process, a finite-state parsing process, and a CYK parsing process.
  - 110. The method of claim 107, wherein syntactically analyzing comprises applying syntactic transform scripts to the potential syntactic structures.
  - 111. The method of claim 107, further comprising deriving conceptual representations of at least some of the syntactically valid sequences of words.
  - 112. The method of claim 111, further comprising identifying at least one of the syntactically valid sequences of words as a sentence, and deriving a conceptual representation of the at least one sentence.
  - 113. The method of claim 85, wherein the list of candidate words is further processed for transcription or dictation.
  - 114. The method of claim 85, wherein the list of candidate words is further processed to provide a response to a query represented by the speech sample.
  - 115. The method of claim 85, wherein the candidate phonemes are identified through pattern recognition applied to cluster sets of reference phonemes.
  - 116. The method of claim 85, wherein the candidate phonemes are identified through pattern recognition applied to cluster sets including reference triphones.
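
The search-path mechanism of claims 90 to 96 can be sketched as a depth-first expansion over the time-slices: a path grows by one candidate phoneme per slice, tried in descending score order, and is dropped as soon as it no longer matches at least a partial valid pronunciation in the dictionary. This sketch makes several simplifying assumptions that the claims do not: each slice contributes exactly one phoneme to a word, all paths start at one slice, the dictionary has one pronunciation per word, and a word's score is the average of its phoneme scores. All names are hypothetical.

```python
from typing import Dict, List, Tuple

Slice = List[Tuple[str, float]]          # alternative (phoneme, score) pairs
Pronunciation = Tuple[str, ...]


def find_candidate_words(stream: List[Slice],
                         dictionary: Dict[str, Pronunciation],
                         start: int = 0) -> List[Tuple[str, float, int]]:
    """Return (word, score, end_slice) for each word reachable from `start`."""
    # Every prefix of every pronunciation counts as a partial valid match.
    prefixes = {pron[:i] for pron in dictionary.values()
                for i in range(1, len(pron) + 1)}
    words_by_pron: Dict[Pronunciation, List[str]] = {}
    for word, pron in dictionary.items():
        words_by_pron.setdefault(pron, []).append(word)

    found: List[Tuple[str, float, int]] = []

    def expand(index: int, path: Pronunciation, scores: List[float]) -> None:
        if index >= len(stream):
            return
        # Try the alternatives of this slice, best score first.
        for phoneme, score in sorted(stream[index], key=lambda c: c[1], reverse=True):
            new_path = path + (phoneme,)
            if new_path not in prefixes:
                continue  # no partial valid pronunciation: drop this path
            new_scores = scores + [score]
            for word in words_by_pron.get(new_path, []):
                found.append((word, sum(new_scores) / len(new_scores), index + 1))
            expand(index + 1, new_path, new_scores)

    expand(start, (), [])
    return found


# Usage: three slices with two alternatives each, and a four-word dictionary.
stream = [[("k", 0.9), ("g", 0.6)], [("ae", 0.8), ("ah", 0.5)], [("t", 0.7), ("p", 0.4)]]
dictionary = {"cat": ("k", "ae", "t"), "cap": ("k", "ae", "p"),
              "gut": ("g", "ah", "t"), "cup": ("k", "ah", "p")}
words = find_candidate_words(stream, dictionary)
```

A fuller version would start a search at every slice, tolerate phonemes that span several slices, and allow a single non-matching phoneme before dropping a path, as in claim 93.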

117. A system for processing speech, comprising:
- means for generating a phoneme stream by processing a digitized speech sample to identify candidate phonemes including at least some alternative candidate phonemes;
  
  a processor for (a) permuting the candidate phonemes to generate potential words represented by the speech sample; and
  
  (b) generating a list of candidate words based on the potential words.
- View Dependent Claims (118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142)
- - 118. The system of claim 117, wherein the processor is further adapted for (c) processing the potential words according to a dictionary to identify candidate words.
  - 119. The system of claim 118, wherein permuting the candidate phonemes is implemented through a search path created by permuting candidate phonemes from different time-slices and comparing the permutated candidate phonemes to the dictionary to determine if the search path corresponds to at least a partial valid pronunciation of a word.
  - 120. The system of claim 119, wherein the comparison is carried out based on symbols or values representing the permuted candidate phonemes which are compared to symbols or values in the dictionary representing partial or whole valid pronunciations of a word.
  - 121. The system of claim 119, wherein based on a favorable result of the comparison, the search path is expanded to permute one or more candidate phonemes from additional time-slices.
  - 122. The system of claim 121, wherein an expanded search path is terminated when an additional phoneme results in the expanded search path not corresponding to any at least partial valid pronunciation of a word in the dictionary.
  - 123. The system of claim 121, wherein an expanded search path is terminated or dropped upon the permutation with at least two further consecutive non-matching candidate phonemes, the first non-matching candidate phoneme resulting in no at least partial valid pronunciation of a word in the dictionary, and the second non-matching candidate phoneme resulting in no at least partial valid pronunciation when ignoring the first non-matching candidate phoneme, thereby providing an error tolerant system.
  - 124. The system of claim 117, wherein said means for generating a phoneme stream comprises a processor executing the Hidden Markov Model (HMM) technique to produce candidate words from which candidate phonemes are derived.
  - 125. The system of claim 117, wherein said means for generating a phoneme stream comprises a processor executing the Backus-Naur (BNF) technique to produce results from which candidate phonemes are derived.
  - 126. The system of claim 117, wherein the phoneme stream comprises a plurality of time-slices, at least some of the time-slices including a plurality of alternative candidate phonemes, and each candidate phoneme having a score or probability.
  - 127. The system of claim 117, wherein at least some of the candidate words are alternative candidate words corresponding to the same portion or an overlapping portion of the speech sample.
  - 128. The system of claim 117, wherein the means for generating a phoneme stream provides a score or probability for each of the candidate phonemes.
  - 129. The system of claim 128, wherein the processor is further adapted for (c) scoring the candidate words based on the scores or probabilities of the candidate phonemes making up the candidate words.
  - 130. The system of claim 117, further comprising a memory for storing at least a two-dimensional array of candidate words, the first dimension related to time and the second dimension corresponding to alternative candidate words for the same or an overlapping time period.
  - 131. The system of claim 117, wherein the processor is further adapted for (c) permuting the candidate words to generate potential syntactic structures, the potential syntactic structures comprising sequences of words which are potentially syntactically valid.
  - 132. The system of claim 131, wherein the processor is further adapted for (d) permuting potential syntactic structures with at least one of (i) potential syntactic structures or (ii) candidate words, to generate further potential syntactic structures.
  - 133. The system of claim 131 or 132, wherein the processor is further adapted for (d) syntactically analyzing the potential syntactic structures to generate syntactically valid sequences of words.
  - 134. The system of claim 133, wherein the syntactic analysis is carried out to respect interjections so that the presence of an interjection does not invalidate an otherwise syntactically valid sequence of words.
  - 135. The system of claim 133, wherein the syntactic analysis is implemented as one of a bottom-up parsing process, a top-down parsing process, an Earley parsing process, a finite-state parsing process, and a CYK parsing process.
  - 136. The system of claim 133, wherein the syntactic analysis includes the application of syntactic transform scripts to the potential syntactic structures.
  - 137. The system of claim 133, wherein the processor is further adapted for (e) deriving conceptual representations of at least some of the syntactically valid sequences of words.
  - 138. The system of claim 137, wherein the at least some syntactically valid sequences of words comprise sentences.
  - 139. The system of claim 117, wherein the processor is further adapted for (c) processing the candidate words for transcription or dictation.
  - 140. The system of claim 117, wherein the processor is further adapted for (c) processing the candidate words for formulating a response to a query represented by the speech sample.
  - 141. The system of claim 117, wherein the means for generating a phoneme stream identifies candidate phonemes by processing the speech sample based on cluster sets including reference phonemes.
  - 142. The system of claim 117, wherein the means for generating a phoneme stream identifies candidate phonemes by processing the speech sample based on cluster sets including reference triphones.
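
The two-dimensional array of candidate words in claim 130 can be pictured as rows ordered in time, each row holding the alternative words for one position. The sketch below groups words by their starting time-slice and ranks the alternatives by score; grouping by start slice is an assumption of this example (the claim speaks more broadly of the same or an overlapping time period), and `CandidateWord` and `build_word_grid` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class CandidateWord(NamedTuple):
    word: str
    start: int    # index of the first time-slice covered
    end: int      # index one past the last time-slice covered
    score: float


def build_word_grid(words: Iterable[CandidateWord]) -> List[List[CandidateWord]]:
    """First dimension: time (ascending start slice).
    Second dimension: alternatives for that start, best score first."""
    by_start: Dict[int, List[CandidateWord]] = defaultdict(list)
    for w in words:
        by_start[w.start].append(w)
    return [sorted(by_start[s], key=lambda w: w.score, reverse=True)
            for s in sorted(by_start)]


# Usage: "today" competes with "to" and "two" at the same starting slice.
grid = build_word_grid([
    CandidateWord("two", 0, 2, 0.60),
    CandidateWord("to", 0, 2, 0.70),
    CandidateWord("day", 2, 4, 0.80),
    CandidateWord("today", 0, 4, 0.75),
])
```

A later permutation stage could walk this grid row by row to assemble word sequences for syntactic analysis.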

143. A method of processing speech, comprising:
- inputting an acoustic input comprising digitized speech;
  
  processing said digitized acoustic input to identify a plurality of candidate phonemes;
  
  computing for each candidate phoneme a score or probability;
  
  aggregating at least some of said plurality of candidate phonemes into potential words; and
  
  processing the computed scores or probabilities of the candidate phonemes.
- View Dependent Claims (144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159)
- - 144. The method of claim 143, wherein the acoustic input comprises a plurality of time-slices, the time-slices being processed to identify candidate phonemes, and wherein at least some of the time-slices are processed to identify multiple candidate phonemes which represent alternative candidate phonemes.
  - 145. The method of claim 144, wherein the identified candidate phonemes are organized as a phoneme stream representing the candidate phonemes which were capable of being detected for the plurality of time-slices.
  - 146. The method of claim 143, wherein at least some of the potential words are alternative potential words comprising potential words for the same or an overlapping portion of time in the speech.
  - 147. The method of claim 146, wherein processing the computed scores or probabilities of the candidate phonemes comprises scoring the potential words based on the scores or probabilities of the candidate phonemes making up the potential words.
  - 148. The method of claim 147, wherein the scores of the alternative potential words are used to rank the alternative potential words.
  - 149. The method of claim 147, wherein the scores of the alternative potential words are evaluated to select the alternative potential word with the most favorable score.
  - 150. The method of claim 143, wherein processing the computed scores or probabilities of the candidate phonemes comprises using the computed scores or probabilities of the candidate phonemes to select the order in which candidate phonemes are aggregated into candidate words.
  - 151. The method of claim 143, wherein aggregating comprises permuting at least some of the candidate phonemes from different time-slices to generate possible combinations resulting in potential words.
  - 152. The method of claim 151, wherein processing the computed scores or probabilities of the candidate phonemes comprises using said scores or probabilities for purposes of ordering permutation of the candidate phonemes.
  - 153. The method of claim 151, wherein potential words are processed according to a dictionary in order to identify candidate words from the potential words.
  - 154. The method of claim 153, wherein the candidate words are identified without consideration of the scores or probabilities of the candidate phonemes making up the potential words.
  - 155. The method of claim 153, wherein the candidate words are identified by processing based on both the dictionary and the scores or probabilities of the candidate phonemes making up the potential words.
  - 156. The method of claim 143, wherein processing said acoustic input to identify candidate phonemes is based on a plurality of cluster sets having reference phonemes.
  - 157. The method of claim 143, wherein processing said acoustic input to identify candidate phonemes is based on a plurality of cluster sets having reference triphones.
  - 158. The method of claim 143, wherein the potential words are further processed for a transcription or dictation application.
  - 159. The method of claim 143, wherein the potential words are further processed for formulating a response to a query represented by the acoustic input.
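
The flow recited in claims 143 to 155 (candidate phonemes per time-slice, permutation into potential words, dictionary filtering, and score-based ranking) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the phoneme symbols, the probabilities, the toy dictionary, the one-phoneme-per-time-slice simplification, and the choice of multiplying phoneme probabilities to score a word. The claims do not prescribe any of these.

```python
from itertools import product

# Hypothetical phoneme stream: one entry per time-slice, each holding
# alternative candidate phonemes with an illustrative probability.
PHONEME_STREAM = [
    [("k", 0.7), ("g", 0.3)],
    [("ae", 0.6), ("eh", 0.4)],
    [("t", 0.8), ("d", 0.2)],
]

# Hypothetical pronunciation dictionary (phoneme tuple -> word).
DICTIONARY = {
    ("k", "ae", "t"): "cat",
    ("g", "eh", "t"): "get",
    ("k", "ae", "d"): "cad",
}


def candidate_words(stream, dictionary):
    """Permute one candidate phoneme per time-slice, keep the
    combinations found in the dictionary, and rank them by score."""
    results = []
    for combo in product(*stream):
        phonemes = tuple(symbol for symbol, _ in combo)
        word = dictionary.get(phonemes)
        if word is None:
            continue  # a potential word that is not a candidate word
        score = 1.0
        for _, probability in combo:
            score *= probability  # assumed scoring rule: product of probabilities
        results.append((word, score))
    # Most favorable (highest) score first.
    return sorted(results, key=lambda item: item[1], reverse=True)


ranked = candidate_words(PHONEME_STREAM, DICTIONARY)
```

In this sketch the alternative candidate words for the same stretch of speech are ranked, so the first entry of `ranked` is the word with the most favorable score. A system that ignores phoneme scores at the dictionary stage, as in claim 154, would simply skip the sort.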

160. A speech processing system, comprising:
- input means for inputting a digitized speech input;
  
  phoneme recognition means for identifying a plurality of candidate phonemes in said digitized speech input and providing a score or probability for each candidate phoneme;
  
  wherein at least some of the candidate phonemes are alternative candidate phonemes; and
  
  phoneme analysis means for processing said plurality of candidate phonemes into potential words.
- View Dependent Claims (161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 398, 399)
- - 161. The speech processing system of claim 160, wherein the phoneme analysis means is further adapted for scoring the potential words based on the scores or probabilities of the candidate phonemes making up the potential words.
  - 162. The speech processing system of claim 160, wherein said input means comprises a wired or wireless telephone or other wireless communication equipment.
  - 163. The speech processing system of claim 160, wherein said input means comprises a microphone operatively coupled to the Internet.
  - 164. The speech processing system of claim 160, wherein said input means comprises a means for playback of pre-recorded audio.
  - 165. The speech processing system of claim 160, wherein said digitizing means comprises a digitizer located at the speaker's location.
  - 166. The speech processing system of claim 165, wherein the digitizer is located in a personal computer or personal data assistant (PDA) device.
  - 167. The speech processing system of claim 165, wherein the input means comprises a wireless transceiver, and said wireless transceiver comprises said digitizer.
  - 168. The speech processing system of claim 160, wherein said digitizing means comprises a digitizer remotely located from the speaker.
  - 169. The speech processing system of claim 160, wherein said phoneme recognition means is adapted to output a phoneme stream comprising said candidate phonemes and said scores or probabilities.
  - 170. The speech processing system of claim 160, wherein the phoneme analysis means is adapted to identify alternative potential words from the same portion or an overlapping portion of the speech input.
  - 171. The speech processing system of claim 170, wherein the alternative potential words are examined to select the potential word with the most favorable score based on the scores or probabilities of the candidate phonemes making up the alternative potential words.
  - 172. The speech processing system of claim 160, further comprising dictionary processing means for processing the potential words according to a dictionary to thereby identify candidate words from the potential words.
  - 173. The speech processing system of claim 172, further comprising word aggregation means for processing said candidate words into syntactically valid sequences of words.
  - 174. The speech processing system of claim 173, wherein said word aggregation means is adapted to permute candidate words into potential syntactic structures.
  - 175. The speech processing system of claim 174, wherein said word aggregation means is adapted to permute potential syntactic structures with at least one of (a) potential syntactic structures or (b) candidate words, to generate further potential syntactic structures.
  - 176. The speech processing system of claim 174 or 175, wherein said word aggregation means is adapted to syntactically analyze the potential syntactic structures to generate syntactically valid sequences of words.
  - 177. The speech processing system of claim 176, wherein the syntactic analysis is implemented as one of a bottom-up parsing process, a top-down parsing process, an Earley parsing process, a finite-state parsing process, and a CYK parsing process.
  - 178. The speech processing system of claim 176, wherein said word aggregation means is adapted to apply syntactic transform scripts to the potential syntactic structures to generate syntactically valid sequences of words.
  - 179. The speech processing system of claim 176, further comprising means for deriving conceptual representations of at least some of the syntactically valid sequences of words.
  - 180. The speech processing system of claim 173, wherein at least some of the potential syntactic structures are scored based on the scores of the candidate phonemes making up the potential syntactic structures.
  - 181. The speech processing system of claim 180, wherein the scores of the potential syntactic structures are used in selecting at least one potential syntactic structure for further analysis.
  - 182. The speech processing system of claim 160, wherein the phoneme recognition means identifies the candidate phonemes based on reference cluster sets of reference phonemes.
  - 183. The speech processing system of claim 160, wherein the phoneme recognition means identifies the candidate phonemes based on reference cluster sets of reference triphones.
  - 398. The systems of claims 52, 160, 282, 353 and 390, further comprising digitizing means to digitize a received analog input into the digitized input.
  - 399. The systems of claims 52, 160, 282, 353 or 390, further comprising digitizing means to re-digitize a received digitized input into the digitized input.
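
Claim 177 names CYK parsing as one option for the syntactic analysis that turns potential syntactic structures into syntactically valid sequences of words. The following is a minimal sketch of a standard CYK recognizer, assuming a toy grammar in Chomsky normal form. The grammar, the lexicon, and the function name `cyk_accepts` are hypothetical and are not taken from the patent.

```python
from itertools import product

# Hypothetical toy lexicon: word -> parts of speech.
LEXICON = {
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "sees": {"V"},
}

# Hypothetical binary rules: (left child, right child) -> parent symbols.
RULES = {
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
    ("NP", "VP"): {"S"},
}


def cyk_accepts(words, start="S"):
    """Return True if the word sequence can be derived from `start`."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the symbols that derive words[i : i + j + 1].
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][0] = set(LEXICON.get(word, ()))
    for span in range(2, n + 1):            # length of the sub-sequence
        for i in range(n - span + 1):       # start of the sub-sequence
            for split in range(1, span):    # length of the left part
                left = table[i][split - 1]
                right = table[i + split][span - split - 1]
                for pair in product(left, right):
                    table[i][span - 1] |= RULES.get(pair, set())
    return start in table[0][n - 1]


valid = cyk_accepts("the dog sees the cat".split())
invalid = cyk_accepts("dog the sees cat the".split())
```

Under these assumptions, a permutation of candidate words would be kept as a syntactically valid sequence only when `cyk_accepts` returns `True`. Handling interjections or transform scripts, which other claims mention, is outside this sketch.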

184. A method of processing speech, comprising:
- processing a speech sample to identify a list of candidate words, wherein at least some of the candidate words are alternative candidate words corresponding to the same or an overlapping portion of the speech sample;
  
  permuting at least some of the candidate words to create a plurality of potential syntactic structures; and
  
  selecting one of the potential syntactic structures as corresponding to the speech sample.
- View Dependent Claims (185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200)
- - 185. The method of claim 184, wherein permuting the candidate words is carried out to give consideration to word pronunciation boundaries, thereby creating potential syntactic structures comprised of candidate words with beginning boundaries and end boundaries that do not conflict with the beginning boundaries and end boundaries of other candidate words' pronunciations.
  - 186. The method of claim 185, wherein the permutation is carried out only for combinations of candidate words without conflicting pronunciation boundaries.
  - 187. The method of claim 184, further comprising permuting potential syntactic structures with at least one of (a) potential syntactic structures or (b) candidate words, to generate further potential syntactic structures.
  - 188. The method of claim 184, further comprising syntactically analyzing the potential syntactic structures to generate syntactically valid sequences of words.
  - 189. The method of claim 188, wherein the syntactic analysis is carried out to respect interjections so that the presence of an interjection does not invalidate an otherwise syntactically valid sequence of words.
  - 190. The method of claim 188, wherein the syntactic analysis is implemented as a bottom-up parsing process, top-down parsing process, Earley parsing process, finite-state parsing process, or CYK parsing process.
  - 191. The method of claim 188, wherein syntactically analyzing comprises applying syntactic transform scripts to the potential syntactic structures.
  - 192. The method of claim 184, wherein each of the candidate words is assigned a score or probability.
  - 193. The method of claim 184, wherein each of the potential syntactic structures is assigned a score or probability.
  - 194. The method of claim 184, wherein each of the candidate words is assigned a score or probability, and further wherein each of the potential syntactic structures is assigned a score or probability based on the scores or probabilities of the candidate words used to construct the potential syntactic structure.
  - 195. The method of claim 184, wherein each candidate word is constructed from candidate phonemes, each candidate phoneme being assigned a score or probability, each candidate word being assigned a score or probability based on the scores or probabilities of the candidate phonemes making up the candidate word, and further wherein each of the potential syntactic structures is assigned a score or probability.
  - 196. The method of claim 184, wherein processing the speech sample comprises producing the candidate words from an N-best list of potential words produced by application of the Hidden Markov Model (HMM) technique to the speech sample and also from combinations of two or more consecutive N-best list potential words.
  - 197. The method of claim 184, wherein processing the speech sample comprises processing a series of time-slices to identify candidate phonemes, at least some of the time-slices including alternative candidate phonemes.
  - 198. The method of claim 184, further comprising the step of deriving a conceptual representation of at least one selected potential syntactic sequence.
  - 199. The method of claim 198, wherein the selected potential syntactic sequence is a syntactically valid sequence of words comprising a sentence.
  - 200. The method of claim 198, further comprising the step of using the conceptual representation to formulate a response to a speech sample comprising an inquiry.
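
Claims 184 to 186 and 192 to 194 describe permuting candidate words only where their pronunciation boundaries do not conflict, then scoring the resulting potential syntactic structures. A minimal sketch follows, assuming that each candidate word carries inclusive start and end time-slice indices, that a structure's score is the product of its word scores, and that the selected structure must cover every time-slice. The type `CandidateWord`, the helper names, and the sample data are all hypothetical.

```python
from typing import NamedTuple


class CandidateWord(NamedTuple):
    text: str
    start: int    # first time-slice covered by the pronunciation
    end: int      # last time-slice covered (inclusive)
    score: float  # illustrative word score


def potential_structures(words, prefix=(), next_start=0):
    """Yield every sequence of candidate words whose pronunciation
    boundaries do not conflict (each word starts after the previous ends)."""
    if prefix:
        yield prefix
    for word in words:
        if word.start >= next_start:
            yield from potential_structures(words, prefix + (word,), word.end + 1)


def score(structure):
    """Assumed scoring rule: product of the word scores."""
    total = 1.0
    for word in structure:
        total *= word.score
    return total


def covers(structure, n_slices):
    """True if the non-overlapping words account for every time-slice."""
    return sum(word.end - word.start + 1 for word in structure) == n_slices


WORDS = [
    CandidateWord("I", 0, 1, 0.9),
    CandidateWord("eye", 0, 1, 0.5),
    CandidateWord("ice", 0, 3, 0.7),
    CandidateWord("scream", 2, 6, 0.6),
    CandidateWord("cream", 4, 6, 0.8),
]

structures = list(potential_structures(WORDS))
complete = [s for s in structures if covers(s, 7)]
best = max(complete, key=score)
best_text = " ".join(word.text for word in best)
```

Because conflicting combinations such as "ice" followed by "scream" are never generated, the search space stays far smaller than a blind permutation of all candidate words.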

201. A speech processing system, comprising:
- a phoneme recognition unit for identifying candidate phonemes, wherein at least some of the candidate phonemes are alternative candidate phonemes;
  
  a phoneme stream analyzer for identifying a list of candidate words constructed from the candidate phonemes, wherein at least some of the candidate words are alternative candidate words corresponding to the same portion or an overlapping portion of a speech input;
  
  a word permutation unit for permuting the candidate words to create a plurality of potential syntactic structures;
  
  wherein one of the plurality of potential syntactic structures is selected as corresponding to the speech input.
- View Dependent Claims (202, 203, 204, 205, 206, 207, 208)
- - 202. The speech processing system of claim 201, wherein each of the candidate phonemes is assigned a score or probability.
  - 203. The speech processing system of claim 201, wherein the word permutation unit is further adapted for syntactically validating the potential syntactic structures to render syntactically valid sequences of words.
  - 204. The speech processing system of claim 203, further comprising means for extracting conceptual representations of syntactically valid sequences of words.
  - 205. The speech processing system of claim 204, wherein the conceptual representations are used to derive a response to an inquiry represented by one of the syntactically valid sequences of words.
  - 206. The speech processing system of claim 201, wherein the phoneme stream analyzer permutes the candidate phonemes in order to generate a list of potential words.
  - 207. The speech processing system of claim 206, wherein the list of potential words is selected as the list of candidate words.
  - 208. The speech processing system of claim 206, wherein the list of potential words is processed according to a dictionary to generate the list of candidate words.

209. A method of processing speech, comprising:
- inputting an acoustic input of digitized speech;
  
  segmenting said digitized acoustic input into a plurality of time-slices;
  
  analyzing each time-slice to identify one or more candidate triphones based on a plurality of reference cluster sets, each cluster set representing reference triphones for a cluster type; and
  
  outputting the identified candidate triphones.
- View Dependent Claims (210)
- - 210. The method of claim 209, further comprising processing the identified candidate triphones according to a triphone-based dictionary to identify candidate words.
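
Claim 210 refers to a triphone-based dictionary. One way to picture such a lookup is sketched below, assuming triphones are written in the common `left-center+right` notation, that `#` marks a word boundary, and that the dictionary is keyed by the full triphone sequence of each word. The notation, the boundary marker, and the sample pronunciations are illustrative assumptions, not details from the patent.

```python
def to_triphones(phonemes, boundary="#"):
    """Expand a phoneme sequence into context-dependent triphones."""
    padded = [boundary, *phonemes, boundary]
    return tuple(
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    )


# Hypothetical phoneme-level pronunciations used to build the dictionary.
PRONUNCIATIONS = {
    "cat": ["k", "ae", "t"],
    "cab": ["k", "ae", "b"],
}

# Triphone-based dictionary: triphone sequence -> word.
TRIPHONE_DICTIONARY = {
    to_triphones(phonemes): word for word, phonemes in PRONUNCIATIONS.items()
}


def lookup(candidate_triphones):
    """Return the candidate word for a triphone sequence, or None."""
    return TRIPHONE_DICTIONARY.get(tuple(candidate_triphones))


cat_triphones = to_triphones(["k", "ae", "t"])
found = lookup(cat_triphones)
missing = lookup(["#-k+ae", "k-ae+d", "ae-d+#"])
```

Since each triphone encodes its neighbors, the middle triphone alone (`k-ae+t` versus `k-ae+b`) already distinguishes the two sample words.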

211. A method of processing speech, comprising:
- processing a speech input to identify a plurality of syntactic sequences of words, the syntactic sequences of words comprising candidate words, the candidate words and the syntactic sequences of words having at least one associated part of speech;
  
  deriving one or more conceptual representations from at least one of the syntactic sequences of words; and
  
  formulating one or more responses to the speech input based on at least one conceptual representation.
- View Dependent Claims (212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245)
- - 212. The method of claim 211, wherein the step of formulating the response comprises processing the conceptual representation in relation to reference data.
  - 213. The method of claim 212, wherein the reference data comprises a database.
  - 214. The method of claim 212, wherein the reference data comprises a physical measurement.
  - 215. The method of claim 212, further comprising executing a command to communicate at least one of the responses.
  - 216. The method of claim 215, wherein the step of communicating the response comprises at least one of an audio response, a text response, a visual response or a mechanical response.
  - 217. The method of claim 216, further comprising identifying one or more inquiry anomalies in the speech input for at least one of the syntactic sequences of words.
  - 218. The method of claim 217, wherein the inquiry anomaly comprises an inconsistency between the conceptual representations and at least some of the reference data.
  - 219. The method of claim 218, wherein inquiry anomalies are given a scaled designation relating to the magnitude of the inquiry anomaly and ranked according to the scaled designation.
  - 220. The method of claim 219, further comprising associating one or more inquiry anomaly indicators relating to the rank of the inquiry anomaly with the conceptual representations.
  - 221. The method of claim 220, wherein the communicated response corresponds to the conceptual representation with the lowest ranked inquiry anomaly indicator.
  - 222. The method of claim 221, further comprising formulating responses only from the conceptual representations having the lowest ranked inquiry anomaly indicator.
  - 223. The method of claim 221, further comprising deriving one or more conceptual representations until a conceptual representation is derived that has an associated inquiry anomaly indicator of the lowest rank.
  - 224. The method of claim 217, wherein the inquiry anomaly comprises an inconsistency internally within the conceptual representation.
  - 225. The method of claim 224, wherein inquiry anomalies are given a scaled designation relating to the magnitude of the inquiry anomaly and ranked according to the scaled designation.
  - 226. The method of claim 225, further comprising associating one or more inquiry anomaly indicators relating to the rank of the inquiry anomaly with the conceptual representations.
  - 227. The method of claim 226, wherein the communicated response corresponds to the conceptual representation with the lowest ranked inquiry anomaly indicator.
  - 228. The method of claim 227, further comprising formulating responses only from the conceptual representations having the lowest ranked inquiry anomaly indicator.
  - 229. The method of claim 227, further comprising deriving one or more conceptual representations until a conceptual representation is derived that has an associated inquiry anomaly indicator of the lowest rank.
  - 230. The method of claim 211, wherein the step of deriving the conceptual representation comprises deriving one or more response conceptual representations.
  - 231. The method of claim 230, wherein the step of formulating one or more responses to the speech input comprises formulating one or more responses to the speech input based on one or more of the response conceptual representations.
  - 232. The method of claim 211, wherein at least one of the syntactic sequences of words comprises a sentence.
  - 233. The method of claim 211, wherein at least one of the syntactic sequences of words comprises any syntactic organization.
  - 234. The method of claim 211, further comprising associating semantic rules with each candidate word and each associated part of speech, and each syntactic sequence of words and each associated part of speech, wherein further the semantic rules relate to conceptual relationships between at least two of the candidate words and syntactic sequences of words.
  - 235. The method of claim 234, wherein the step of deriving the conceptual representation further comprises applying the semantic rules to the syntactic sequence of words, the candidate words or any combination thereof.
  - 236. The method of claim 235, wherein the semantic rules comprise an interpreted language.
  - 237. The method of claim 236, wherein the semantic rules comprise a predicate builder scripting language.
  - 238. The method of claim 237, wherein the semantic rules comprise a compiled language.
  - 239. The method of claim 211, wherein the candidate words comprising the syntactic sequences of words are assigned a score or probability, and the syntactic sequence of words is assigned a score or probability based on the scores or probabilities of the candidate words.
  - 240. The method of claim 211, wherein each of the candidate words is constructed based on candidate phonemes, each candidate phoneme being assigned a score or probability, each candidate word being assigned a score or probability based on the candidate phonemes making up the candidate word, and the syntactic sequence of words being assigned a score or probability based on the scores or probabilities of the candidate words making up the syntactic sequences of words.
  - 241. The method of claim 211, wherein processing the speech input comprises deriving candidate words from the result of the application of the Hidden Markov Model (HMM) technique to the speech input, the candidate words used to identify the syntactic sequences of words.
  - 242. The method of claim 211, wherein processing the speech input comprises deriving candidate phonemes from the result of the application of the Backus-Naur Form (BNF) technique to the speech input, the candidate phonemes being used to identify the list of candidate words.
  - 243. The method of claim 211, wherein processing the speech input comprises processing a series of time-slices to identify candidate phonemes, at least some of the time-slices including alternative candidate phonemes, and wherein the candidate phonemes are used to identify a list of candidate words, the candidate words being used to identify the syntactic sequences of words.
  - 244. The method of claim 211, wherein the step of deriving the conceptual representation comprises applying the principles of Conceptual Dependency to the syntactic sequences of words.
  - 245. The method of claim 211, wherein the step of processing the speech input to identify a plurality of syntactic sequences of words comprises inputting an acoustic input of digitized speech;
    - segmenting said digitized acoustic input into a plurality of time-slices;
      
      analyzing each time-slice to identify one or more candidate phonemes based on a plurality of reference cluster sets, each cluster set representing reference phonemes for a cluster type;
      
      outputting a phoneme stream of identified candidate phonemes based on the analysis, wherein at least some time-slices are represented by alternative candidate phonemes based on said analyzing step;
      
      permuting the candidate phonemes to generate potential words represented by the speech input;
      
      generating a list of candidate words based on the potential words;
      
      permuting the candidate words to generate potential syntactic structures while respecting pronunciation boundaries of the candidate words;
      
      permuting at least two or more of the candidate words and potential syntactic structures while respecting word pronunciation boundaries of the candidate words and potential syntactic structures; and
      
      generating syntactic sequences of words from the permuted candidate words and potential syntactic structures.
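
Claims 217 to 223 describe ranking inquiry anomalies and responding from the conceptual representation with the lowest ranked anomaly indicator, deriving representations only until one of the lowest rank is found. A minimal sketch of that selection loop follows, assuming an integer rank where 0 means no anomaly. The type `ConceptualRepresentation`, the rank scale, and the sample sequences are hypothetical.

```python
from typing import NamedTuple


class ConceptualRepresentation(NamedTuple):
    sequence: str      # syntactic sequence of words it was derived from
    anomaly_rank: int  # assumed scale: 0 = no anomaly, higher = more severe


def choose_representation(candidates, lowest_rank=0):
    """Consume candidates lazily and return the one with the lowest
    anomaly rank, stopping as soon as the lowest possible rank is seen."""
    best = None
    for representation in candidates:
        if best is None or representation.anomaly_rank < best.anomaly_rank:
            best = representation
        if representation.anomaly_rank == lowest_rank:
            break  # no later candidate can rank lower
    return best


derived = []  # records which sequences were actually processed


def derive():
    """Hypothetical generator standing in for conceptual analysis."""
    samples = [
        ("what is the wetter in boston", 2),
        ("what is the weather in boston", 0),
        ("what is the whether in boston", 1),
    ]
    for sequence, rank in samples:
        derived.append(sequence)
        yield ConceptualRepresentation(sequence, rank)


chosen = choose_representation(derive())
```

Feeding the function a generator means the third sequence is never analyzed once a representation of the lowest rank has been found, which mirrors the early stop in claim 223.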

246. A system for processing speech, comprising:
- means for identifying a plurality of syntactic sequences of words corresponding to a speech input, the syntactic sequences of words comprising candidate words, the candidate words and the syntactic sequences of words having at least one associated part of speech;
  
  means for deriving one or more conceptual representations from at least one of the syntactic sequences of words; and
  
  means for formulating one or more responses to the speech input based on one or more of the conceptual representations.
- View Dependent Claims (247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282)
- - 247. The system of claim 246, wherein the means for formulating the response comprises means for processing the conceptual representation in relation to reference data.
  - 248. The system of claim 247, wherein the reference data comprises a database.
  - 249. The system of claim 247, wherein the reference data comprises a physical measurement.
  - 250. The system of claim 247, further comprising means for communicating one or more of the responses.
  - 251. The system of claim 250, wherein the means for communicating one or more of the responses comprises at least one of audio response or visual response means.
  - 252. The system of claim 250, wherein the means for communicating one or more of the responses comprises text response means.
  - 253. The system of claim 250, wherein the means for communicating one or more of the responses comprises mechanical response means.
  - 254. The system of claim 250, further comprising means for identifying one or more inquiry anomalies in the speech input for at least one of the syntactic sequences of words.
  - 255. The system of claim 254, wherein the inquiry anomaly comprises an inconsistency between the conceptual representations and at least some of the reference data.
  - 256. The system of claim 255, further comprising ranking means for giving inquiry anomalies a scaled designation relating to the magnitude of the inquiry anomaly and ranking the inquiry anomalies according to the scaled designation.
  - 257. The system of claim 256, further comprising means to associate one or more inquiry anomaly indicators relating to the rank of the inquiry anomaly with the conceptual representations.
  - 258. The system of claim 257, wherein the communicated response corresponds to the conceptual representation with the lowest ranked inquiry anomaly indicator.
  - 259. The system of claim 258, further comprising means to formulate responses only from conceptual representations having the lowest ranked inquiry anomaly indicator.
  - 260. The system of claim 258, further comprising means for deriving one or more conceptual representations until a conceptual representation is derived that has an associated inquiry anomaly indicator of the lowest rank.
  - 261. The system of claim 254, wherein the inquiry anomaly comprises an inconsistency internally within the conceptual representation.
  - 262. The system of claim 261, further comprising ranking means for giving inquiry anomalies a scaled designation relating to the magnitude of the inquiry anomaly and ranking the inquiry anomalies according to the scaled designation.
  - 263. The system of claim 262, further comprising means to associate one or more inquiry anomaly indicators relating to the rank of the inquiry anomaly with the conceptual representations.
  - 264. The system of claim 263, wherein the communicated response corresponds to the conceptual representation with the lowest ranked inquiry anomaly indicator.
  - 265. The system of claim 264, further comprising means to formulate responses only from conceptual representations having the lowest ranked inquiry anomaly indicator.
  - 266. The system of claim 265, further comprising means for deriving one or more conceptual representations until a conceptual representation is derived that has an associated inquiry anomaly indicator of the lowest rank.
  - 267. The system of claim 246, further comprising means for deriving one or more responsive conceptual representations.
  - 268. The system of claim 267, wherein the means for formulating one or more responses to the speech input comprises means for formulating one or more responses to the speech input based on one or more of the responsive conceptual representations.
  - 269. The system of claim 246, wherein at least one of the syntactic sequences of words comprises a sentence.
  - 270. The system of claim 246, wherein at least one of the syntactic sequences of words comprises any syntactic organization.
  - 271. The system of claim 246, further comprising semantic rules associated with each candidate word and each associated part of speech, and each syntactic sequence of words and each associated part of speech, wherein further the semantic rules relate to conceptual relationships between at least two of the candidate words and syntactic sequences of words.
  - 272. The system of claim 271, wherein the means for deriving the conceptual representation further comprises means for applying the semantic rules to the syntactic sequence of words, the candidate words or any combination thereof.
  - 273. The system of claim 272, wherein the semantic rules comprise an interpreted language.
  - 274. The system of claim 273, wherein the semantic rules comprise a predicate builder scripting language.
  - 275. The system of claim 273, wherein the semantic rules comprise a compiled language.
  - 276. The system of claim 246, wherein the candidate words comprising the syntactic sequences of words are assigned a score or probability, and the syntactic sequence of words is assigned a score or probability based on the scores or probabilities of the candidate words.
  - 277. The system of claim 246, wherein each of the candidate words is constructed based on candidate phonemes, each candidate phoneme being assigned a score or probability, each candidate word being assigned a score or probability based on the candidate phonemes making up the candidate word, and the syntactic sequence of words being assigned a score or probability based on the scores or probabilities of the candidate words making up the syntactic sequences of words.
  - 278. The system of claim 246, wherein the means for identifying the syntactic sequences of words comprises means for deriving candidate words from the result of the application of the Hidden Markov Model (HMM) technique to the speech input, the candidate words used to identify the syntactic sequences of words.
  - 279. The system of claim 246, wherein the means for processing the speech input comprises means for deriving candidate phonemes from the result of the application of the Backus-Naur Form (BNF) technique to the speech input, the candidate phonemes being used to identify the list of candidate words.
  - 280. The system of claim 246, wherein the means for processing the speech input comprises means for processing a series of time-slices to identify candidate phonemes, at least some of the time-slices including alternative candidate phonemes, and wherein the candidate phonemes are used to identify a list of candidate words, the candidate words being used to identify the plurality of syntactic sequences of words.
  - 281. The system of claim 246, wherein the means for deriving the conceptual representation comprises means for applying the principles of Conceptual Dependency to the syntactic sequences of words.
  - 282. The system of claim 246, wherein the means for processing the speech input to identify a plurality of syntactic sequences of words comprises an inputting device for inputting an acoustic input of digitized speech;
    - a segmenter for segmenting said digitized acoustic input into a plurality of time-slices;
      
      an analysis device for analyzing each time-slice to identify one or more candidate phonemes based on a plurality of reference cluster sets, each cluster set representing reference phonemes for a cluster type, wherein the output of the analysis device comprises a phoneme stream of identified candidate phonemes based on the analysis, wherein at least some time-slices are represented by alternative candidate phonemes based on said analyzing step;
      
      a first permutation device for permuting the candidate phonemes to generate potential words represented by the speech input;
      
      a word generator for generating a list of candidate words based on the potential words;
      
      a second permutation device for permuting the candidate words to generate potential syntactic structures and for permuting at least two or more of the candidate words and potential syntactic structures while respecting word pronunciation boundaries of the candidate words and potential syntactic structures; and
      
      a syntactic sequence generator for generating syntactic sequences of words from the permuted candidate words and potential syntactic structures.
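
As a rough illustration of the time-slice analysis recited in claim 282, the sketch below compares each time-slice's feature vector against several reference cluster sets and keeps every reference phoneme that lies within a distance threshold, so that a single time-slice can carry alternative candidate phonemes. The two-dimensional feature vectors, the Euclidean distance, the `max_dist` threshold and all names are assumptions made for illustration only; the claims do not specify them.

```python
import math

# Hypothetical reference data: cluster type -> {phoneme: centroid vector}.
REFERENCE_CLUSTER_SETS = {
    "vowel": {"iy": (0.9, 0.1), "ih": (0.7, 0.2)},
    "stop": {"p": (0.1, 0.9), "b": (0.2, 0.8)},
}


def candidates_for_slice(features, cluster_sets, max_dist=0.25):
    """Return (phoneme, distance) pairs within max_dist, closest first."""
    found = []
    for references in cluster_sets.values():
        for phoneme, centroid in references.items():
            dist = math.dist(features, centroid)  # Euclidean distance (assumed metric)
            if dist <= max_dist:
                found.append((phoneme, dist))
    return sorted(found, key=lambda pair: pair[1])


def phoneme_stream(slices, cluster_sets, max_dist=0.25):
    """Build the phoneme stream: one list of candidate phonemes per time-slice."""
    return [
        [phoneme for phoneme, _ in candidates_for_slice(f, cluster_sets, max_dist)]
        for f in slices
    ]
```

Under these assumptions, a time-slice that falls between two reference phonemes yields both as alternatives, and a time-slice far from every reference yields an empty candidate list.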

283. A method of processing speech, comprising:
- processing a speech input to identify a plurality of syntactic sequences of words, the syntactic sequences of words comprising candidate words, the candidate words and the syntactic sequences of words having at least one associated part of speech;
  
  deriving one or more conceptual representations from at least one of the syntactic sequences of words;
  
  processing at least one of the conceptual representations of at least one of the syntactic sequences of words according to a database of reference conceptual representations; and
  
  formulating one or more responses to the speech input.
- - 284. The method of claim 283, wherein the means for formulating the response comprises means for processing the conceptual representation in relation to reference data.
  - 285. The method of claim 284, wherein the reference data comprises a database.
  - 286. The method of claim 284, wherein the reference data comprises a physical measurement.
  - 287. The method of claim 284, further comprising executing a command to communicate at least one of the responses.
  - 288. The method of claim 287, wherein the step of communicating the response comprises at least one of an audio response, a text response, a visual response or a mechanical response.
  - 289. The method of claim 287, further comprising identifying one or more inquiry anomalies in the speech input for at least one of the syntactic sequences of words.
  - 290. The method of claim 289, wherein the inquiry anomaly comprises an inconsistency between the conceptual representation and at least some of the reference data.
  - 291. The method of claim 290, wherein inquiry anomalies are given a scaled designation relating to the magnitude of the inquiry anomalies and ranked according to the scaled designation.
  - 292. The method of claim 291, further comprising associating one or more inquiry anomaly indicators relating to the rank of the inquiry anomaly with the conceptual representation.
  - 293. The method of claim 292, wherein the communicated response corresponds to the conceptual representation with the lowest ranked inquiry anomaly indicator.
  - 294. The method of claim 293, further comprising formulating responses only from the conceptual representations having the lowest ranked inquiry anomaly indicator.
  - 295. The method of claim 293, further comprising deriving one or more conceptual representations until a conceptual representation is derived that has an associated inquiry anomaly indicator of the lowest rank.
  - 296. The method of claim 288, wherein the inquiry anomaly comprises an inconsistency internally within the conceptual representation.
  - 297. The method of claim 296, wherein inquiry anomalies are given a scaled designation relating to the magnitude of the inquiry anomalies and ranked according to the scaled designation.
  - 298. The method of claim 297, further comprising associating one or more inquiry anomaly indicators relating to the rank of the inquiry anomaly with the conceptual representation.
  - 299. The method of claim 298, wherein the communicated response corresponds to the conceptual representation with the lowest ranked inquiry anomaly indicator.
  - 300. The method of claim 299, further comprising formulating responses only from the conceptual representations having the lowest ranked inquiry anomaly indicator.
  - 301. The method of claim 299, further comprising deriving one or more conceptual representations until a conceptual representation is derived that has an associated inquiry anomaly indicator of the lowest rank.
  - 302. The method of claim 283, wherein the step of processing at least one of the conceptual representations comprises comparing the derived conceptual representation to reference conceptual representations in the database.
  - 303. The method of claim 302, wherein the step of formulating one or more responses to the speech input comprises formulating one or more responses to the speech input based on a successful comparison of the conceptual representations to at least one reference conceptual representation in the database.
  - 304. The method of claim 283, wherein at least one of the syntactic sequences of words comprises a sentence.
  - 305. The method of claim 283, further comprising associating semantic rules with each candidate word and each associated part of speech, and each syntactic sequence of words and each associated part of speech, wherein further the semantic rules relate to conceptual relationships between at least two of the candidate words and syntactic sequences of words.
  - 306. The method of claim 305, wherein the step of deriving the conceptual representations further comprises applying the semantic rules to the syntactic sequence of words, the candidate words or any combination thereof.
  - 307. The method of claim 306, wherein the semantic rules comprise an interpreted language.
  - 308. The method of claim 307, wherein the semantic rules comprise a predicate builder scripting language.
  - 309. The method of claim 307, wherein the semantic rules comprise a compiled language.
  - 310. The method of claim 283, wherein the candidate words comprising the syntactic sequences of words are assigned a score or probability, and the syntactic sequence of words is assigned a score or probability based on the scores or probabilities of the candidate words.
  - 311. The method of claim 283, wherein each of the candidate words is constructed based on candidate phonemes, each candidate phoneme being assigned a score or probability, each candidate word being assigned a score or probability based on the candidate phonemes making up the candidate word, and the syntactic sequence of words being assigned a score or probability based on the scores or probabilities of the candidate words making up the syntactic sequences of words.
  - 312. The method of claim 283, wherein processing the speech input comprises deriving candidate words from the result of the application of the Hidden Markov Model (HMM) technique to the speech input, the candidate words used to identify the syntactic sequences of words.
  - 313. The method of claim 283, wherein processing the speech input comprises deriving candidate phonemes from the result of the application of the Backus-Naur (BNF) technique to the speech input, the candidate phonemes being used to identify the list of candidate words.
  - 314. The method of claim 283, wherein processing the speech input comprises processing a series of time-slices to identify candidate phonemes, at least some of the time-slices including alternative candidate phonemes, and wherein the candidate phonemes are used to identify a list of candidate words, the candidate words being used to identify the plurality of syntactic sequences of words.
  - 315. The method of claim 283, wherein the step of deriving the conceptual representation comprises applying the principles of Conceptual Dependency to the syntactic sequences of words.
  - 316. The method of claim 283, wherein the step of processing the speech input to identify a plurality of syntactic sequences of words comprises:
    - inputting an acoustic input of digitized speech;
      
      segmenting said digitized acoustic input into a plurality of time-slices;
      
      analyzing each time-slice to identify one or more candidate phonemes based on a plurality of reference cluster sets, each cluster set representing reference phonemes for a cluster type;
      
      outputting a phoneme stream of identified candidate phonemes based on the analysis, wherein at least some time-slices are represented by alternative candidate phonemes based on said analyzing step;
      
      permuting the candidate phonemes to generate potential words represented by the speech input;
      
      generating a list of candidate words based on the potential words;
      
      permuting the candidate words to generate potential syntactic structures while respecting word boundaries of the candidate words;
      
      permuting at least two or more of the candidate words and potential syntactic structures while respecting word pronunciation boundaries of the candidate words and potential syntactic structures; and
      
      generating syntactic sequences of words from the permuted candidate words and potential syntactic structures.
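
Claims 291 to 295 describe giving inquiry anomalies a scaled designation, ranking them, and responding from the conceptual representation with the lowest ranked inquiry anomaly indicator. A minimal sketch of that selection follows. The numeric magnitude scale, the rule that the indicator equals the largest magnitude, and the class and function names are all assumptions introduced here, not details taken from the claims.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptualRepresentation:
    label: str
    # Hypothetical scaled designations, one per detected inquiry anomaly.
    anomaly_magnitudes: list = field(default_factory=list)

    def anomaly_indicator(self):
        # Assumption: the indicator is the largest magnitude, or 0 if none.
        return max(self.anomaly_magnitudes, default=0)


def select_for_response(representations):
    """Return the representation with the lowest anomaly indicator."""
    return min(representations, key=lambda r: r.anomaly_indicator())
```

A representation with no anomalies receives indicator 0 in this sketch and is therefore always preferred over one that carries an anomaly.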

317. A system for processing speech, comprising:
- means for identifying a plurality of syntactic sequences of words corresponding to a speech input, the syntactic sequences of words comprising candidate words, the syntactic sequences of words and candidate words having at least one associated part of speech;
  
  means for deriving one or more conceptual representations from at least one of the syntactic sequences of words;
  
  means for processing at least one of the conceptual representations of the syntactic sequences of words according to a database of reference conceptual representations; and
  
  means for formulating one or more responses to the speech input based on one or more of the conceptual representations.
- - 318. The system of claim 317, wherein the means for formulating the response comprises means for processing the conceptual representation in relation to reference data.
  - 319. The system of claim 318, wherein the reference data comprises a database.
  - 320. The system of claim 318, wherein the reference data comprises a physical measurement.
  - 321. The system of claim 318, further comprising means for communicating one or more of the responses.
  - 322. The system of claim 321, wherein the means for communicating one or more of the responses comprises at least one of audio response or visual response means.
  - 323. The system of claim 321, wherein the means for communicating one or more of the responses comprises text response means.
  - 324. The system of claim 321, wherein the means for communicating one or more of the responses comprises mechanical response means.
  - 325. The system of claim 321, further comprising means for identifying one or more inquiry anomalies in the speech input for at least one of the syntactic sequences of words.
  - 326. The system of claim 325, wherein the inquiry anomaly comprises an inconsistency between the conceptual representations and at least some of the reference data.
  - 327. The system of claim 326, further comprising ranking means for giving inquiry anomalies a scaled designation relating to the magnitude of the inquiry anomalies and ranking the inquiry anomalies according to the scaled designation.
  - 328. The system of claim 327, further comprising means to associate one or more inquiry anomaly indicators relating to the rank of the inquiry anomaly with the conceptual representations.
  - 329. The system of claim 328, wherein the communicated response corresponds to the conceptual representation with the lowest ranked inquiry anomaly indicator.
  - 330. The system of claim 329, further comprising means to formulate responses only from conceptual representations having the lowest ranked inquiry anomaly indicator.
  - 331. The system of claim 330, further comprising means for deriving one or more conceptual representations until a conceptual representation is derived that has an associated inquiry anomaly indicator of the lowest rank.
  - 332. The system of claim 325, wherein the inquiry anomaly comprises an inconsistency internally within the conceptual representation.
  - 333. The system of claim 332, further comprising ranking means for giving inquiry anomalies a scaled designation relating to the magnitude of the inquiry anomalies and ranking the inquiry anomalies according to the scaled designation.
  - 334. The system of claim 333, further comprising means to associate one or more inquiry anomaly indicators relating to the rank of the inquiry anomaly with the conceptual representations.
  - 335. The system of claim 334, wherein the communicated response corresponds to the conceptual representation with the lowest ranked inquiry anomaly indicator.
  - 336. The system of claim 335, further comprising means to formulate responses only from conceptual representations having the lowest ranked inquiry anomaly indicator.
  - 337. The system of claim 336, further comprising means for deriving one or more conceptual representations until a conceptual representation is derived that has an associated inquiry anomaly indicator of the lowest rank.
  - 338. The system of claim 317, wherein the processing means of at least one of the conceptual representations comprises comparing the derived conceptual representation to reference conceptual representations in the database.
  - 339. The method of claim 338, wherein the means for formulating one or more responses to the speech input comprises means for formulating one or more responses to the speech input based on a successful comparison of the conceptual representations to at least one reference conceptual representation in the database.
  - 340. The system of claim 317, wherein at least one of the syntactic sequences of words comprises a sentence.
  - 341. The system of claim 317, wherein at least one of the syntactic sequences of words comprises any syntactic organization.
  - 342. The system of claim 317, further comprising semantic rules associated with each candidate word and each associated part of speech, and each syntactic sequence of words and each associated part of speech, wherein further the semantic rules relate to conceptual relationships between at least two of the candidate words and syntactic sequences of words.
  - 343. The system of claim 342, wherein the means for deriving the conceptual representation further comprises means for applying the semantic rules to the syntactic sequence of words, the candidate words that comprise the syntactic sequence of words or any combination thereof.
  - 344. The system of claim 343, wherein the semantic rules comprise an interpreted language.
  - 345. The system of claim 344, wherein the semantic rules comprise a predicate builder scripting language.
  - 346. The system of claim 344, wherein the semantic rules comprise a compiled language.
  - 347. The system of claim 317, wherein the candidate words comprising the syntactic sequences of words are assigned a score or probability, and the syntactic sequence of words is assigned a score or probability based on the scores or probabilities of the candidate words.
  - 348. The system of claim 317, wherein each of the candidate words is constructed based on candidate phonemes, each candidate phoneme being assigned a score or probability, each candidate word being assigned a score or probability based on the candidate phonemes making up the candidate word, and the syntactic sequence of words being assigned a score or probability based on the scores or probabilities of the candidate words making up the syntactic sequences of words.
  - 349. The system of claim 317, wherein the means for identifying the syntactic sequences of words comprises means for deriving candidate words from the result of the application of the Hidden Markov Model (HMM) technique to the speech input, the candidate words used to identify the syntactic sequences of words.
  - 350. The system of claim 317, wherein the means for processing the speech input comprises means for deriving candidate phonemes from the result of the application of the Backus-Naur (BNF) technique to the speech input, the candidate phonemes being used to identify the list of candidate words.
  - 351. The system of claim 317, wherein the means for processing the speech input comprises means for processing a series of time-slices to identify candidate phonemes, at least some of the time-slices including alternative candidate phonemes, and wherein the candidate phonemes are used to identify a list of candidate words, the candidate words being used to identify the plurality of syntactic sequences of words.
  - 352. The system of claim 317, wherein the means for deriving the conceptual representation comprises means for applying the principles of Conceptual Dependency to the syntactic sequences of words.
  - 353. The system of claim 317, wherein the means for processing the speech input to identify a plurality of syntactic sequences of words comprises:
    - an inputting device for inputting an acoustic input of digitized speech;
      
      a segmenter for segmenting said digitized acoustic input into a plurality of time-slices;
      
      an analysis device for analyzing each time-slice to identify one or more candidate phonemes based on a plurality of reference cluster sets, each cluster set representing reference phonemes for a cluster type, wherein the output of the analysis device comprises a phoneme stream of identified candidate phonemes based on the analysis, wherein at least some time-slices are represented by alternative candidate phonemes based on said analyzing step;
      
      a first permutation device for permuting the candidate phonemes to generate potential words represented by the speech input;
      
      a word generator for generating a list of candidate words based on the potential words;
      
      a second permutation device for permuting the candidate words to generate potential syntactic structures and for permuting at least two or more of the candidate words and potential syntactic structures while respecting word pronunciation boundaries of words from the permuted candidate words and potential syntactic structures; and
      
      a syntactic sequence generator for generating syntactic sequences of words from the permuted candidate words and potential syntactic structures.
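
Claims 347 and 348 recite scores or probabilities that propagate from candidate phonemes to candidate words and then to syntactic sequences of words, without fixing how they are combined. The sketch below assumes, purely for illustration, that each score is a probability and that scores combine by multiplication; the function names are hypothetical.

```python
import math


def word_score(phoneme_scores):
    """Score a candidate word from the scores of its candidate phonemes."""
    return math.prod(phoneme_scores)  # assumed combination rule: product


def sequence_score(words):
    """Score a syntactic sequence; `words` holds one phoneme-score list per word."""
    return math.prod(word_score(scores) for scores in words)
```

Other combination rules, such as summing log-probabilities or averaging, would fit the claim language equally well.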

354. A method for improving accuracy in dictation or transcription, comprising:
- inputting an acoustic input of digitized speech;
  
  segmenting said digitized acoustic input into a plurality of time-slices;
  
  analyzing each time-slice to identify one or more candidate phonemes based on a plurality of reference cluster sets, each cluster set representing reference phonemes for a cluster type;
  
  outputting a phoneme stream of identified candidate phonemes based on the analysis, wherein at least some time-slices are represented by alternative candidate phonemes based on said analyzing step;
  
  permuting the candidate phonemes to generate potential words represented by the speech input;
  
  generating a list of candidate words based on the potential words;
  
  permuting the candidate words to generate potential syntactic structures while respecting word boundaries of the candidate words;
  
  permuting at least two or more of the candidate words and potential syntactic structures while respecting word boundaries of the candidate words and potential syntactic structures;
  
  generating syntactic sequences of words from the permuted candidate words and potential syntactic structures; and
  
  communicating the syntactic sequences of words.
- - 355. The method of claim 354, wherein the step of communicating the syntactic sequences of words comprises displaying the syntactic sequences of words on a video display terminal.
  - 356. The method of claim 354, wherein the step of communicating the syntactic sequences of words comprises storing the syntactic sequences of words in a computer memory.
  - 357. The method of claim 354, wherein the step of communicating the syntactic sequences of words comprises outputting the syntactic sequences of words in at least one of human readable or audible form.
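
The step in claim 354 of permuting candidate phonemes to generate potential words can be pictured as a dictionary lookup over the phoneme stream. The sketch below assumes one phoneme per time-slice, which is a simplification, and uses a small hypothetical pronunciation dictionary named `LEXICON`. A word is kept when each phoneme of its pronunciation appears among the candidates of the corresponding time-slice.

```python
# Hypothetical pronunciation dictionary: word -> phoneme sequence.
LEXICON = {
    "bit": ("b", "ih", "t"),
    "pit": ("p", "ih", "t"),
    "beat": ("b", "iy", "t"),
    "it": ("ih", "t"),
}


def candidate_words(stream, lexicon):
    """Return sorted (word, start, end) triples; `end` is exclusive."""
    found = []
    for start in range(len(stream)):
        for word, phones in lexicon.items():
            end = start + len(phones)
            # Every phoneme must be a candidate of its own time-slice.
            if end <= len(stream) and all(
                phone in stream[start + i] for i, phone in enumerate(phones)
            ):
                found.append((word, start, end))
    return sorted(found)
```

Because the stream keeps alternative phonemes, a single stretch of input can yield several competing words, for example "bit", "pit" and "beat" over the same three time-slices.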

358. A method for improving accuracy in dictation or transcription, comprising:
- inputting an acoustic input of digitized speech;
  
  segmenting said digitized acoustic input into a plurality of time-slices;
  
  analyzing each time-slice to identify one or more candidate words derived from an N-best list of potential words from an application of the HMM technique;
  
  further identifying additional candidate words based on combinations of two or more consecutive N-best list potential words;
  
  permuting the candidate words to generate potential syntactic structures while respecting word boundaries of the candidate words;
  
  permuting at least two or more of the candidate words and potential syntactic structures while respecting word boundaries of the candidate words and potential syntactic structures;
  
  generating syntactic sequences of words from the permuted candidate words and potential syntactic structures; and
  
  communicating the syntactic sequences of words.
- - 359. The method of claim 358, wherein the step of communicating the syntactic sequences of words comprises displaying the syntactic sequences of words on a video display terminal.
  - 360. The method of claim 358, wherein the step of communicating the syntactic sequences of words comprises storing the syntactic sequences of words in a computer memory.
  - 361. The method of claim 358, wherein the step of communicating the syntactic sequences of words comprises outputting the syntactic sequences of words in at least one of human readable or audible form.
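
Claim 358 adds candidate words formed from combinations of two or more consecutive N-best list potential words. One possible reading is sketched below: words drawn from adjacent N-best lists are joined and kept when the joined form is a known word. Joining by string concatenation, the `vocabulary` set and the `max_span` limit are assumptions for illustration; an actual system might combine pronunciations instead.

```python
from itertools import product


def combined_candidates(nbest_lists, vocabulary, max_span=2):
    """Join words from consecutive N-best lists; keep joins found in vocabulary.

    Returns sorted (word, start, end) triples, where start and end index the
    N-best lists covered by the combined word (end exclusive).
    """
    found = set()
    for start in range(len(nbest_lists)):
        for span in range(2, max_span + 1):
            window = nbest_lists[start:start + span]
            if len(window) < span:
                break  # not enough lists left for this span
            for combo in product(*window):
                joined = "".join(combo)
                if joined in vocabulary:
                    found.add((joined, start, start + span))
    return sorted(found)
```

For instance, N-best entries "to" and "day" in adjacent positions would contribute the additional candidate "today".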

362. A method for generating punctuation in dictation or transcription, comprising:
- inputting an acoustic input of digitized speech;
  
  segmenting said digitized acoustic input into a plurality of time-slices;
  
  analyzing each time-slice to identify one or more candidate phonemes based on a plurality of reference cluster sets, each cluster set representing reference phonemes for a cluster type;
  
  outputting a phoneme stream of identified candidate phonemes based on the analysis, wherein at least some time-slices are represented by alternative candidate phonemes based on said analyzing step;
  
  permuting the candidate phonemes to generate potential words represented by the speech input;
  
  generating a list of candidate words based on the potential words;
  
  permuting the candidate words to generate potential syntactic structures while respecting word boundaries of the candidate words;
  
  permuting at least two or more of the candidate words and potential syntactic structures while respecting word boundaries of the candidate words and potential syntactic structures;
  
  generating syntactic sequences of words from the permuted candidate words and potential syntactic structures;
  
  generating punctuation based on the syntactic sequences of words; and
  
  communicating the syntactic sequences of words.
- - 363. The method of claim 362, wherein the step of communicating the syntactic sequences of words comprises displaying the syntactic sequences of words on a video display terminal.
  - 364. The method of claim 362, wherein the step of communicating the syntactic sequences of words comprises storing the syntactic sequences of words in a computer memory.
  - 365. The method of claim 362, wherein the step of communicating the syntactic sequences of words comprises outputting the syntactic sequences of words in at least one of human readable or audible form.
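
The requirement in claim 362 that candidate words be permuted while respecting word boundaries can be sketched as an enumeration in which each word must begin exactly where the previous word ended. The sketch covers only that boundary constraint; syntactic validation and punctuation generation are omitted. The `(word, start, end)` representation and the function name are assumptions for illustration.

```python
def boundary_respecting_sequences(candidates, total_slices):
    """Enumerate word sequences that tile slices 0..total_slices exactly.

    `candidates` holds (word, start, end) triples with end exclusive and
    end > start (assumed; a zero-length word would recurse forever).
    """
    by_start = {}
    for word, start, end in candidates:
        by_start.setdefault(start, []).append((word, end))

    def extend(position):
        if position == total_slices:
            yield []  # the whole input is covered
            return
        for word, end in by_start.get(position, []):
            for rest in extend(end):
                yield [word] + rest

    return list(extend(0))
```

Sequences that leave a gap or overlap between words are never produced, which keeps the number of permutations passed to later stages far below the unconstrained count.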

366. A method for generating punctuation in dictation or transcription, comprising:
- inputting an acoustic input of digitized speech;
  
  segmenting said digitized acoustic input into a plurality of time-slices;
  
  analyzing each time-slice to identify one or more candidate words derived from an N-best list of potential words from an application of the HMM technique;
  
  further identifying additional candidate words based on combinations of two or more consecutive N-best list potential words;
  
  permuting the candidate words to generate potential syntactic structures while respecting word boundaries of the candidate words;
  
  permuting at least two or more of the candidate words and potential syntactic structures while respecting word boundaries of the candidate words and potential syntactic structures;
  
  generating syntactic sequences of words from the permuted candidate words and potential syntactic structures;
  
  generating punctuation based on the syntactic sequences of words; and
  
  communicating the syntactic sequences of words.
- - 367. The method of claim 366, wherein the step of communicating the syntactic sequences of words comprises displaying the syntactic sequences of words on a video display terminal.
  - 368. The method of claim 366, wherein the step of communicating the syntactic sequences of words comprises storing the syntactic sequences of words in a computer memory.
  - 369. The method of claim 366, wherein the step of communicating the syntactic sequences of words comprises outputting the syntactic sequences of words in at least one of human readable or audible form.

370. A method for improving accuracy in dictation or transcription, comprising:
- inputting an acoustic input of digital speech;
  
  segmenting said digitized acoustic input into a plurality of time-slices;
  
  analyzing each time-slice to identify one or more candidate words based on the application of the HMM technique;
  
  permuting the candidate words to generate potential syntactic structures while respecting word boundaries of the candidate words;
  
  permuting at least two or more of the candidate words and potential syntactic structures while respecting word boundaries of the candidate words and potential syntactic structures;
  
  generating syntactic sequences of words from the permuted candidate words and potential syntactic structures;
  
  calculating a conceptual representation for each syntactic sequence of words; and
  
  communicating the syntactic sequence of words related to the first valid calculated conceptual representation.
- - 371. The method of claim 370, wherein the step of communicating the syntactic sequences of words comprises displaying the syntactic sequences of words on a video display terminal.
  - 372. The method of claim 370, wherein the step of communicating the syntactic sequences of words comprises storing the syntactic sequences of words in a computer memory.
  - 373. The method of claim 370, wherein the step of communicating the syntactic sequences of words comprises outputting the syntactic sequences of words in at least one of human readable or audible form.
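
Claim 370 communicates the syntactic sequence of words related to the first valid calculated conceptual representation. A minimal sketch of that control flow follows. It assumes that the caller supplies a hypothetical `conceptualize` function and that an invalid representation is signalled by `None`; neither detail comes from the claims.

```python
def first_valid_sequence(sequences, conceptualize):
    """Return (sequence, representation) for the first sequence whose
    conceptual representation is valid, or None if no sequence qualifies.

    Assumption: `conceptualize` returns None for an invalid representation.
    """
    for sequence in sequences:
        representation = conceptualize(sequence)
        if representation is not None:
            return sequence, representation
    return None
```

If the sequences are supplied in descending order of score, this stops at the best-scoring sequence that also makes conceptual sense, so lower-ranked sequences need not be analyzed.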

374. A method for generating punctuation in dictation or transcription, comprising:
- inputting an acoustic input of digital speech;
  
  segmenting said digitized acoustic input into a plurality of time-slices;
  
  analyzing each time-slice to identify one or more candidate words based on the application of the HMM technique;
  
  permuting the candidate words to generate potential syntactic structures while respecting word boundaries of the candidate words;
  
  permuting at least two or more of the candidate words and potential syntactic structures while respecting word boundaries of the candidate words and potential syntactic structures;
  
  generating syntactic sequences of words from the permuted candidate words and potential syntactic structures;
  
  calculating a conceptual representation for each syntactic sequence of words;
  
  generating punctuation based on the syntactic sequences of words; and
  
  communicating the syntactic sequence of words and punctuation related to the first valid calculated conceptual representation.
- - 375. The method of claim 374, wherein the step of communicating the syntactic sequences of words comprises displaying the syntactic sequences of words on a video display terminal.
  - 376. The method of claim 374, wherein the step of communicating the syntactic sequences of words comprises storing the syntactic sequences of words in a computer memory.
  - 377. The method of claim 374, wherein the step of communicating the syntactic sequences of words comprises outputting the syntactic sequences of words in at least one of human readable or audible form.

378. A method for improving accuracy in dictation or transcription, comprising:
- inputting an acoustic input of digitized speech;
  
  segmenting said digitized acoustic input into a plurality of time-slices;
  
  analyzing each time-slice to identify one or more candidate phonemes based on a plurality of reference cluster sets, each cluster set representing reference phonemes for a cluster type;
  
  outputting a phoneme stream of identified candidate phonemes based on the analysis, wherein at least some time-slices are represented by alternative candidate phonemes based on said analyzing step;
  
  permuting the candidate phonemes to generate potential words represented by the speech input;
  
  generating a list of candidate words based on the potential words;
  
  permuting the candidate words to generate potential syntactic structures while respecting word boundaries of the candidate words;
  
  permuting at least two or more of the candidate words and potential syntactic structures while respecting word boundaries of the candidate words and potential syntactic structures;
  
  generating syntactic sequences of words from the permuted candidate words and potential syntactic structures;
  
  calculating a conceptual representation for each of the syntactic sequences of words; and
  
  communicating the syntactic sequence of words related to the first valid conceptual representation.
- - 379. The method of claim 378, wherein the step of communicating the syntactic sequences of words comprises displaying the syntactic sequences of words on a video display terminal.
  - 380. The method of claim 378, wherein the step of communicating the syntactic sequences of words comprises storing the syntactic sequences of words in a computer memory.
  - 381. The method of claim 378, wherein the step of communicating the syntactic sequences of words comprises outputting the syntactic sequences of words in at least one of human readable or audible form.

382. A method for generating punctuation in dictation or transcription, comprising:
- inputting an acoustic input of digitized speech;
  
  segmenting said digitized acoustic input into a plurality of time-slices;
  
  analyzing each time-slice to identify one or more candidate phonemes based on a plurality of reference cluster sets, each cluster set representing reference phonemes for a cluster type;
  
  outputting a phoneme stream of identified candidate phonemes based on the analysis, wherein at least some time-slices are represented by alternative candidate phonemes based on said analyzing step;
  
  permuting the candidate phonemes to generate potential words represented by the speech input;
  
  generating a list of candidate words based on the potential words;
  
  permuting the candidate words to generate potential syntactic structures while respecting word boundaries of the candidate words;
  
  permuting at least two or more of the candidate words and potential syntactic structures while respecting word boundaries of the candidate words and potential syntactic structures;
  
  generating syntactic sequences of words from the permuted candidate words and potential syntactic structures;
  
  calculating a conceptual representation for each of the syntactic sequences of words;
  
  generating punctuation based on the syntactic sequences of words; and
  
  communicating the syntactic sequence of words and punctuation related to the first valid conceptual representation.
- View Dependent Claims (383, 384, 385)
- - 383. The method of claim 382, wherein the step of communicating the syntactic sequences of words comprises displaying the syntactic sequences of words on a video display terminal.
  - 384. The method of claim 382, wherein the step of communicating the syntactic sequences of words comprises storing the syntactic sequences of words in a computer memory.
  - 385. The method of claim 382, wherein the step of communicating the syntactic sequences of words comprises outputting the syntactic sequences of words in at least one of human readable or audible form.
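
The early steps of claim 382 (a phoneme stream in which some time-slices carry alternative candidate phonemes, permuted into potential words) can be illustrated with a short Python sketch. It rests on simplifying assumptions that are not in the claims: each time-slice is treated as exactly one phoneme, `LEXICON` is a hypothetical toy dictionary, and `max_len` is an invented bound that keeps the permutation from growing exponentially.

```python
from itertools import product

# Hypothetical toy lexicon: phoneme tuple -> word.
LEXICON = {
    ("h", "ay"): "hi",
    ("ay",): "I",
    ("dh", "eh", "r"): "there",
}


def candidate_words(stream, lexicon=LEXICON, max_len=4):
    """Return (start, end, word) for every lexicon word that some choice of
    alternative phonemes spells over consecutive time-slices."""
    found = []
    for start in range(len(stream)):
        last = min(len(stream), start + max_len)
        for end in range(start + 1, last + 1):
            # Try every combination of the alternatives in slices start..end-1.
            for phonemes in product(*stream[start:end]):
                word = lexicon.get(phonemes)
                if word is not None:
                    found.append((start, end, word))
    return found


# Each inner list holds the alternative candidate phonemes for one time-slice.
stream = [["h", "f"], ["ay"], ["dh", "d"], ["eh"], ["r"]]
print(candidate_words(stream))
# [(0, 2, 'hi'), (1, 2, 'I'), (2, 5, 'there')]
```

Keeping the start and end positions with each word is what lets a later stage respect word boundaries when it combines candidate words into sequences.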

386. A system for recognizing concepts in speech, comprising:
- a phoneme recognition unit for identifying candidate phonemes in a digitized input, wherein at least some of the candidate phonemes are alternative candidate phonemes;
  
  a phoneme stream analyzer for identifying a list of candidate words constructed from the candidate phonemes, wherein at least some of the candidate words are alternative candidate words corresponding to the same portion or an overlapping portion of the input;
  
  a word permutation unit for permuting the candidate words to create a plurality of potential syntactic structures, wherein at least one of the plurality of potential syntactic structures is selected as corresponding to the input, wherein further the word permutation unit is further adapted for syntactically validating the potential syntactic structures to render syntactically valid sequences of words;
  
  means for extracting conceptual representations of syntactically valid sequences of words;
  
  means for comparing the conceptual representations to reference data; and
  
  means for communicating one or more successful comparisons of the conceptual representations in relation to the reference data.
- View Dependent Claims (387, 388, 389, 390, 391)
- - 387. The system of claim 386, wherein the reference data comprises a database.
  - 388. The system of claim 386, wherein the reference data comprises a physical measurement.
  - 389. The system of claim 386, wherein the means for communicating one or more of the successful comparisons comprises at least one of audio response, visual response, text response or mechanical response means.
  - 390. The system of claim 386, wherein the input comprises an acoustic sample of digitized speech.
  - 391. The system of claim 386, wherein the input comprises electronic form of text.

392. A method for recognizing concepts in speech, comprising:
- identifying candidate phonemes in a digitized input, wherein at least some of the candidate phonemes are alternative candidate phonemes;
  
  identifying a list of candidate words constructed from the candidate phonemes, wherein at least some of the candidate words are alternative candidate words corresponding to the same portion or an overlapping portion of the input;
  
  permuting the candidate words to create a plurality of potential syntactic structures, wherein at least one of the plurality of potential syntactic structures is selected as corresponding to the input;
  
  syntactically validating the potential syntactic structures to render syntactically valid sequences of words;
  
  extracting conceptual representations of syntactically valid sequences of words;
  
  comparing the conceptual representations to reference data; and
  
  communicating one or more successful comparisons of the conceptual representations in relation to the reference data.
- View Dependent Claims (393, 394, 395, 396, 397)
- - 393. The method of claim 392, wherein the reference data comprises a database.
  - 394. The method of claim 392, wherein the reference data comprises a physical measurement.
  - 395. The method of claim 392, wherein communicating one or more of the successful comparisons comprises at least one of audio communication, visual communication, text communication or mechanical communication.
  - 396. The method of claim 392, further comprising inputting an input of digitized speech.
  - 397. The method of claim 392, further comprising inputting an input of electronic form of text.
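
The final steps of claim 392 (extracting conceptual representations, comparing them to reference data, and communicating successful comparisons) might look like the following Python sketch. Everything named here is hypothetical: `extract_concept` is a keyword-spotting stand-in for conceptual analysis, and `REFERENCE_DATA` plays the role of the reference data, here a single invented physical measurement in the spirit of claim 394.

```python
# Hypothetical reference data: concept -> measured value.
REFERENCE_DATA = {("ask", "temperature"): 21.5}


def extract_concept(sequence):
    """Toy concept extraction by keyword spotting; returns None if no concept."""
    words = set(sequence)
    if "temperature" in words and ("what" in words or "how" in words):
        return ("ask", "temperature")
    return None


def respond(valid_sequences, reference=REFERENCE_DATA):
    """Compare each sequence's concept to the reference data and collect a
    text response for every successful comparison."""
    responses = []
    for sequence in valid_sequences:
        concept = extract_concept(sequence)
        if concept is not None and concept in reference:
            responses.append(f"{concept[1]} is {reference[concept]}")
    return responses


# Two syntactically valid alternatives for the same utterance.
sequences = [
    ["what", "is", "the", "temperature"],
    ["watt", "is", "the", "temperature"],
]
print(respond(sequences))  # ['temperature is 21.5']
```

Only the sequence whose concept matches the reference data produces a response, so the misrecognized alternative is discarded without any acoustic rescoring.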

Current Assignee: Chemtron Research LLC (Intellectual Ventures LLC)
Original Assignee: Conceptual Speech LLC
Inventors: Roy, Philippe
Granted Patent: US 7,286,987 B2
US Class Current: 704/211
CPC Class Codes:
G10L 15/08   Speech classification or se...
G10L 15/1822   Parsing for meaning underst...
G10L 2015/025   Phonemes, fenemes or fenone...
