Multi-phoneme streamer and knowledge representation speech recognition system and method

US 7,286,987 B2
Filed: 06/30/2003
Issued: 10/23/2007
Est. Priority Date: 06/28/2002
Status: Active Grant

Abstract

A system and method related to a new approach to speech recognition that reacts to concepts conveyed through speech. In its fullest implementation, the system and method shifts the balance of power in speech recognition from straight sound recognition and statistical models to a more powerful and complete approach determining and addressing conveyed concepts. This is done by using a probabilistically unbiased multi-phoneme recognition process, followed by a phoneme stream analysis process that builds the list of candidate words derived from recognized phonemes, followed by a permutation analysis process that produces sequences of candidate words with high potential of being syntactically valid, and finally, by processing targeted syntactic sequences in a conceptual analysis process to generate the utterance's conceptual representation that can be used to produce an adequate response. The invention can be employed for a myriad of applications, such as improving accuracy or automatically generating punctuation for transcription and dictation, word or concept spotting in audio streams, concept spotting in electronic text, customer support, call routing and other command/response scenarios.
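
As an illustration of the phoneme stream analysis stage described in the abstract, the following is a minimal Python sketch, not the patented implementation. It assumes a phoneme stream is a list of time-slices, each holding alternative (phoneme, score) pairs, and it grows search paths only while they remain a partial pronunciation of some dictionary word. The names `PRONUNCIATIONS`, `STREAM`, and `candidate_words`, the ARPAbet-style phoneme symbols, and the choice of averaging phoneme scores are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical toy pronunciation dictionary: phoneme tuple -> word.
PRONUNCIATIONS: Dict[Tuple[str, ...], str] = {
    ("HH", "AY"): "hi",
    ("AY",): "I",
    ("HH", "AY", "T"): "height",
    ("T", "UW"): "two",
}

# Every non-empty prefix of a pronunciation counts as "partially valid".
PREFIXES = {p[:i] for p in PRONUNCIATIONS for i in range(1, len(p) + 1)}

# One time-slice: alternative (phoneme, score) pairs.
Slice = List[Tuple[str, float]]

# Hypothetical example stream with two alternatives in the first slice.
STREAM: List[Slice] = [
    [("HH", 0.9), ("AY", 0.4)],
    [("AY", 0.8)],
    [("T", 0.7)],
    [("UW", 0.6)],
]


def candidate_words(stream: List[Slice]) -> List[Tuple[str, int, int, float]]:
    """Return (word, start_slice, end_slice, mean_score) for every dictionary match."""
    found = []
    for start in range(len(stream)):
        # Each search path is (phonemes so far, scores so far).
        paths = [((), ())]
        for end in range(start, len(stream)):
            extended = []
            # Visit the alternatives of this slice in descending score order.
            for phoneme, score in sorted(stream[end], key=lambda x: -x[1]):
                for phones, scores in paths:
                    new_phones = phones + (phoneme,)
                    if new_phones not in PREFIXES:
                        continue  # drop paths with no partial valid pronunciation
                    new_scores = scores + (score,)
                    extended.append((new_phones, new_scores))
                    if new_phones in PRONUNCIATIONS:
                        mean = sum(new_scores) / len(new_scores)
                        found.append((PRONUNCIATIONS[new_phones], start, end, mean))
            paths = extended
            if not paths:
                break  # no surviving search path from this start slice
    return found


if __name__ == "__main__":
    for word, start, end, score in candidate_words(STREAM):
        print(word, start, end, round(score, 3))
```

Each result carries the first and last time-slice it covers, which is the kind of pronunciation boundary that a later permutation stage would need in order to avoid combining overlapping words.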

290 Claims

1. A method of processing speech, comprising:
- generating a list of candidate words for at least one set of phonemes, each candidate word having a pronunciation boundary, from a phoneme analysis of a received speech input;
  
  permuting at least one member of the list of candidate words for the at least one set of phonemes to generate a plurality of potential syntactic structures which are valid in accordance with a set of syntactic rules, while respecting pronunciation boundaries of the candidate words;
  
  generating a plurality of valid syntactic sequences of words from the permuted candidate words and potential syntactic structures;
  
  processing a speech input to identify a plurality of syntactic sequences of words, the syntactic sequences of words comprising the candidate words, the candidate words and the syntactic sequences of words each having at least one associated part of speech;
  
  deriving one or more conceptual representations from at least one of the syntactic sequences of words; and
  
  formulating one or more responses to the speech input based on at least one conceptual representation.
  - 2. The method of claim 1, wherein the step of formulating the response comprises processing the conceptual representation in relation to reference data.
  - 3. The method of claim 2, wherein the reference data comprises a database.
  - 4. The method of claim 2, wherein the reference data comprises a physical measurement.
  - 5. The method of claim 2, further comprising executing a command to communicate at least one of the responses.
  - 6. The method of claim 5, wherein the step of communicating the response comprises at least one of an audio response, a text response, a visual response or a mechanical response.
  - 7. The method of claim 6, further comprising identifying one or more inquiry anomalies in the speech input for at least one of the syntactic sequences of words.
  - 8. The method of claim 7, wherein the inquiry anomaly comprises an inconsistency between the conceptual representations and at least some of the reference data.
  - 9. The method of claim 8, wherein inquiry anomalies are given a scaled designation relating to the magnitude of the inquiry anomaly and ranked according to the scaled designation.
  - 10. The method of claim 8, further comprising associating one or more inquiry anomaly indicators relating to the rank of the inquiry anomaly with the conceptual representations.
  - 11. The method of claim 10, wherein the communicated response corresponds to the conceptual representation with the lowest ranked inquiry anomaly indicator.
  - 12. The method of claim 11, further comprising formulating responses only from the conceptual representations having the lowest ranked inquiry anomaly indicator.
  - 13. The method of claim 11, further comprising deriving one or more conceptual representations until a conceptual representation is derived that has an associated inquiry anomaly indicator of the lowest rank.
  - 14. The method of claim 7, wherein the inquiry anomaly comprises an inconsistency internally within the conceptual representation.
  - 15. The method of claim 14, wherein inquiry anomalies are given a scaled designation relating to the magnitude of the inquiry anomaly and ranked according to the scaled designation.
  - 16. The method of claim 15, further comprising associating one or more inquiry anomaly indicators relating to the rank of the inquiry anomaly with the conceptual representations.
  - 17. The method of claim 16, wherein the communicated response corresponds to the conceptual representation with the lowest ranked inquiry anomaly indicator.
  - 18. The method of claim 17, further comprising formulating responses only from the conceptual representations having the lowest ranked inquiry anomaly indicator.
  - 19. The method of claim 17, further comprising deriving one or more conceptual representations until a conceptual representation is derived that has an associated inquiry anomaly indicator of the lowest rank.
  - 20. The method of claim 1, wherein the step of deriving the conceptual representation comprises deriving one or more response conceptual representations.
  - 21. The method of claim 20, wherein the step of formulating one or more responses to the speech input comprises formulating one or more responses to the speech input based on one or more of the response conceptual representations.
  - 22. The method of claim 1, wherein at least one of the syntactic sequences of words comprises a sentence.
  - 23. The method of claim 1, wherein at least one of the syntactic sequences of words comprises any syntactic organization.
  - 24. The method of claim 1, further comprising associating semantic rules with each candidate word and each associated part of speech, and each syntactic sequence of words and each associated part of speech, wherein further the semantic rules relate to conceptual relationships between at least two of the candidate words and syntactic sequences of words.
  - 25. The method of claim 24, wherein the step of deriving the conceptual representation further comprises applying the semantic rules to the syntactic sequence of words, the candidate words or any combination thereof.
  - 26. The method of claim 25, wherein the semantic rules comprise an interpreted language.
  - 27. The method of claim 26, wherein the semantic rules comprise a predicate builder scripting language.
  - 28. The method of claim 27, wherein the semantic rules comprise a compiled language.
  - 29. The method of claim 1, wherein the candidate words comprising the syntactic sequences of words are assigned a score or probability, and the syntactic sequence of words is assigned a score or probability based on the scores or probabilities of the candidate words.
  - 30. The method of claim 1, wherein each of the candidate words is constructed based on candidate phonemes, each candidate phoneme being assigned a score or probability, each candidate word being assigned a score or probability based on the candidate phonemes making up the candidate word, and the syntactic sequence of words being assigned a score or probability based on the scores or probabilities of the candidate words making up the syntactic sequences of words.
  - 31. The method of claim 1, wherein processing the speech input comprises deriving candidate words from the result of the application of the Hidden Markov Model (HMM) technique to the speech input, the candidate words used to identify the syntactic sequences of words.
  - 32. The method of claim 1, wherein processing the speech input comprises deriving candidate phonemes from the result of the application of the Backus-Naur (BNF) technique to the speech input, the candidate phonemes being used to identify the list of candidate words.
  - 33. The method of claim 1, wherein the step of deriving the conceptual representation comprises applying the principles of Conceptual Dependency to the syntactic sequences of words.
  - 34. The method of claim 1, wherein the step of processing the speech input to identify a plurality of syntactic sequences of words comprises:
    - inputting an acoustic input of digitized speech;
      
      segmenting said digitized acoustic input into a plurality of time-slices; and
      
      analyzing each time-slice to identify one or more candidate phonemes based on a plurality of reference cluster sets, each cluster set representing reference phonemes for a cluster type.
  - 35. The method according to claim 1, further comprising:
    - segmenting the speech input into a plurality of time-slices;
      
      analyzing each time-slice to identify one or more candidate phonemes based on a plurality of reference cluster sets, each cluster set representing reference phonemes for a cluster type; and
      
      defining a phoneme stream of identified candidate phonemes based on the analysis, wherein at least some time-slices are represented by alternative candidate phonemes based on said analyzing step, wherein the defined phoneme stream is processed to identify at least one of the candidate words.
  - 36. The method of claim 35, wherein the reference cluster sets are specific to at least two of (a) a region or accent, (b) a gender, and (c) an age, age range, or child/adult distinction.
  - 37. The method of claim 35, wherein at least some of the identified candidate phonemes for a time-slice originate from different cluster sets.
  - 38. The method of claim 35, wherein the phoneme stream comprises candidate phonemes from different cluster sets, thereby enabling recognition that the acoustic input represents speech by more than one person.
  - 39. The method of claim 38, wherein the processing step is adapted to process speech input from both male and female speakers.
  - 40. The method of claim 38, wherein the recognition is that the speakers have different accents.
  - 41. The method of claim 35, wherein the segmented speech input comprises non-overlapping time-slices.
  - 42. The method of claim 35, wherein the segmented speech input comprises overlapping time-slices.
  - 43. The method of claim 35, wherein the segmented speech input comprises both overlapping and nonoverlapping time-slices.
  - 44. The method of claim 43, wherein a subsequent time-slice is selected to be overlapping or nonoverlapping based on the results of the analysis of the previous time-slice.
  - 45. The method of claim 35, wherein the phoneme stream is further processed for transcription or dictation.
  - 46. The method of claim 35, wherein the phoneme stream is further processed to provide a response to a query represented by the speech input.
  - 47. The method of claim 46, wherein the response is an acoustic response.
  - 48. The method of claim 46, wherein the response is a text-based response.
  - 49. The method of claim 46, wherein the response is a system response based on the interpreted content of the acoustic input.
  - 50. The method of claim 35, wherein the plurality of reference cluster sets correspond to more than one language, thereby enabling detection of the language of the acoustic input.
  - 51. The method according to claim 1, further comprising:
    - segmenting said speech input into a plurality of time-slices;
      
      analyzing each time-slice to identify a candidate phoneme based on a plurality of reference cluster sets, each cluster set representing reference phonemes for that cluster type; and
      
      defining a phoneme stream of identified candidate phonemes based on the analysis.
  - 52. The method of claim 51, wherein the reference cluster sets further include triphone variations of the reference phonemes used in order to identify a candidate phoneme based on a triphone pronunciation.
  - 53. The method of claim 51, wherein the step of analyzing comprises applying a neural network.
  - 54. The method of claim 51, wherein the step of analyzing comprises applying formant analysis.
  - 55. The method of claim 51, wherein the step of analyzing comprises applying a multivariate Gaussian classifier.
  - 56. The method of claim 51, wherein the candidate phonemes are identified based on application of a threshold.
  - 57. The method of claim 56, wherein the threshold is fixed.
  - 58. The method of claim 56, wherein the threshold is adaptive.
  - 59. The method of claim 56, wherein there is a threshold for each of the reference cluster sets, at least some of the thresholds being different.
  - 60. The method of claim 56, wherein there is a threshold for each reference phoneme for each cluster set, at least some of the reference phoneme thresholds for a given cluster set being different.
  - 61. The method of claim 56, wherein there is a threshold for each reference phoneme for each cluster set, at least some of the cluster sets having different thresholds for the same reference phoneme.
  - 62. The method of claim 51, further comprising the step of processing the phoneme stream to identify candidate words based on the candidate phonemes.
  - 63. The method of claim 62, wherein at least some of the candidate words are alternative candidate words representing candidate words from the same or an overlapping portion of the speech input.
  - 64. The method of claim 62, wherein the candidate words are scored based on the scores or probabilities of the candidate phonemes making up the candidate words.
  - 65. The method of claim 62, wherein scoring the candidate words comprises aggregating or averaging the scores or probabilities of the candidate phonemes used to construct the candidate words.
  - 66. The method of claim 65, further comprising ranking the candidate words based on the scores of the candidate words.
  - 67. The method of claim 62, wherein processing the phoneme stream to identify candidate words comprises generating search paths representing a permutation of candidate phonemes among the time-slices, each search path potentially representing at least a partial valid pronunciation of a word in a dictionary.
  - 68. The method of claim 67, wherein a search path is dropped or treated as invalid when the addition of a candidate phoneme from a further time-slice would result in no at least partial valid pronunciation of a word in a dictionary.
  - 69. The method of claim 67, wherein a search path is dropped or treated as invalid upon the addition of at least two non-matching candidate phonemes, a first non-matching candidate phoneme resulting in no correspondence to at least a partial valid pronunciation of a word in a dictionary, and a second non-matching candidate phoneme resulting in no correspondence to at least a partial valid pronunciation of a word in a dictionary when ignoring the first phoneme.
  - 70. The method of claim 62, wherein processing the phoneme stream to identify candidate words accounts for bridging wherein the speech comprises a phoneme which is effectively shared between two words.
  - 71. The method of claim 62, wherein processing the phoneme stream to identify candidate words accounts for bridging wherein the speech comprises adjacent phonemes having similar pronunciations.
  - 72. The method of claim 62, wherein processing the phoneme stream to identify candidate words based on the candidate phonemes is implemented by processing candidate phonemes from a time-slice in a descending order of probability or score, thereby providing candidate words that are naturally sorted according to a descending order of score for the candidate words.
  - 73. The method of claim 62, wherein processing the phoneme stream to identify candidate words comprises permuting the candidate phonemes from different points in the acoustic input to construct combinations of phonemes comprising potential words.
  - 74. The method of claim 73, wherein the permutation is between different time-slices having identified candidate phonemes.
  - 75. The method of claim 73, wherein potential words are processed according to a dictionary in order to identify the candidate words.
  - 76. The method of claim 75, wherein the dictionary comprises a plurality of words and pronunciations of words.
  - 77. The method of claim 62, wherein the candidate words correspond to at least a two-dimensional array of candidate words, a first dimension corresponding to time across the acoustic input, and a second dimension corresponding to alternative candidate words for the same or an overlapping interval of time across the acoustic input.
  - 78. The method of claim 62, wherein the candidate words are constructed using candidate phonemes originating from the same reference cluster set.
  - 79. The method of claim 62, wherein the candidate words are capable of being constructed using candidate phonemes originating from differing cluster sets.
  - 80. The method of claim 51, wherein said method is implemented in an application for transcription or dictation.
  - 81. The method of claim 51, wherein said method is implemented in an application for generating a response to a query represented by said acoustic input.
  - 82. The method according to claim 1, further comprising:
    - generating a phoneme stream by processing a digitized speech sample to identify candidate phonemes including at least some alternative candidate phonemes; and
      
      generating a list of candidate words for the phoneme stream based on the potential words.
  - 83. The method of claim 82, wherein the phoneme stream is stored for performing the permuting step at a later time.
  - 84. The method of claim 82, wherein at least one of the phoneme stream and the candidate words is stored for further processing.
  - 85. The method of claim 82, further comprising processing the potential words according to a dictionary to identify candidate words.
  - 86. The method of claim 85, wherein permuting the candidate phonemes comprises permuting candidate phonemes between different time-slices to create a search path, and wherein processing according to a dictionary comprises processing the permuted phonemes of the search path to determine correspondence to at least a partial valid pronunciation for a word in the dictionary.
  - 87. The method of claim 86, wherein a search path is expanded by permuting the search path to add a candidate phoneme from a further time-slice.
  - 88. The method of claim 87, wherein an expanded search path is terminated or dropped when permutation with the further candidate phoneme results in no correspondence to at least a partial valid pronunciation of a word from the dictionary.
  - 89. The method of claim 87, wherein a search path is terminated or dropped upon the permutation with at least two further consecutive non-matching candidate phonemes, the first non-matching candidate phoneme resulting in no at least partial valid pronunciation of a word in the dictionary, and the second non-matching candidate phoneme resulting in no at least partial valid pronunciation when ignoring the first non-matching candidate phoneme, thereby providing an error tolerant system.
  - 90. The method of claim 86, wherein a separate search path is created for each candidate phoneme in a time-slice.
  - 91. The method of claim 90, wherein the separate search paths are created in a descending order beginning with the candidate phoneme in the time-slice with the highest score or probability, thereby naturally sorting potential words based on scores or probabilities.
  - 92. The method of claim 91, wherein further permuting a search path to add a candidate phoneme from a further time-slice comprises selecting candidate phonemes from the further time-slice in a descending order beginning with the candidate phoneme with the highest score or probability.
  - 93. The method of claim 82, wherein at least some of the candidate words are alternative candidate words for the same portion or an overlapping portion of the digitized speech sample.
  - 94. The method of claim 93, wherein the identified candidate words correspond to an at least two-dimensional array of candidate words, a first dimension corresponding to time across the speech sample, and a second dimension corresponding to alternative candidate words for the same or overlapping portions of the speech sample.
  - 95. The method of claim 93, wherein the identified candidate words are scored according to probabilities or scores of candidate phonemes making up the candidate words, and wherein alternative candidate words are ranked according to the scores of the alternative candidate words.
  - 96. The method of claim 82, wherein generating the phoneme stream further comprises computing or identifying scores or probabilities for the candidate phonemes.
  - 97. The method of claim 96, further comprising the step of scoring the candidate words based on the scores or probabilities of the candidate phonemes making up the candidate words.
  - 98. The method of claim 82, wherein the candidate phonemes are permuted by processing candidate phonemes from each time-slice in a descending order of probability or score, thereby providing candidate words that are naturally sorted according to a descending order of score for the candidate words.
  - 99. The method of claim 82, wherein the phoneme stream is generated by deriving candidate words from an N-best list of potential words generated from the application of the Hidden Markov Model (HMM) technique to the speech sample, further deriving additional candidate words from combinations of two or more consecutive N-best list potential words, and deriving candidate phonemes from the candidate words.
  - 100. The method of claim 82, wherein the phoneme stream is generated by deriving candidate phonemes from the results generated by application of the Backus-Naur (BNF) technique to the speech sample.
  - 101. The method of claim 82, further comprising the step of permuting the candidate words to generate potential syntactic structures, the potential syntactic structures comprising sequences of words which are potentially syntactically valid.
  - 102. The method of claim 101, further comprising the step of permuting potential syntactic structures with at least one of (a) potential syntactic structures or (b) candidate words, to generate further potential syntactic structures.
  - 103. The method of claim 101, further comprising syntactically analyzing the potential syntactic structures to generate syntactically valid sequences of words.
  - 104. The method of claim 103, wherein the syntactic analysis is carried out to respect interjections, so that the presence of interjections does not result in invalidating an otherwise valid sequence of words.
  - 105. The system of claim 103, wherein the syntactic analysis is implemented as one of a bottom-up parsing process, a top-down parsing process, an Earley parsing process, a finite-state parsing process, and a CYK parsing process.
  - 106. The method of claim 103, wherein syntactically analyzing comprises applying syntactic transform scripts to the potential syntactic structures.
  - 107. The method of claim 101, further comprising identifying at least one of the syntactically valid sequences of words as a sentence, and deriving a conceptual representation of the at least one sentence.
  - 108. The method of claim 82, wherein the list of candidate words is further processed for transcription or dictation.
  - 109. The method of claim 82, wherein the list of candidate words is further processed to provide a response to a query represented by the speech sample.
  - 110. The method of claim 82, wherein the candidate phonemes are identified through pattern recognition applied to cluster sets of reference phonemes.
  - 111. The method of claim 82, wherein the candidate phonemes are identified through pattern recognition applied to cluster sets including reference triphones.
  - 112. The method according to claim 1, further comprising:
    - processing said speech input to identify a plurality of candidate phonemes;
      
      computing for each candidate phoneme a score or probability;
      
      aggregating at least some of said plurality of candidate phonemes into potential words; and
      
      processing the computed scores or probabilities of the candidate phonemes.
  - 113. The method of claim 112, wherein the acoustic input comprises a plurality of time-slices, the time-slices being processed to identify candidate phonemes, and wherein at least some of the time-slices are processed to identify multiple candidate phonemes which represent alternative candidate phonemes.
  - 114. The method of claim 113, wherein the identified candidate phonemes are organized as a phoneme stream representing the candidate phonemes which were capable of being detected for the plurality of time-slices.
  - 115. The method of claim 112, further comprising processing the candidate phonemes in a plurality of different combinations to generate a plurality of potential words, wherein at least some of the potential words are alternative potential words comprising potential words for the same or an overlapping portion of time in the speech, the potential words either comprising or being further processed to define the candidate words.
  - 116. The method of claim 115, wherein processing the computed scores or probabilities of the candidate phonemes comprises scoring the potential words based on the scores or probabilities of the candidate phonemes making up the potential words.
  - 117. The method of claim 116, wherein the scores of the alternative potential words are used to rank the alternative potential words.
  - 118. The method of claim 116, wherein the scores of the alternative potential words are evaluated to select the alternative potential word with the most favorable score.
  - 119. The method of claim 112, wherein processing the computed scores or probabilities of the candidate phonemes comprises using the computed scores or probabilities of the candidate phonemes to select the order in which candidate phonemes are aggregated into candidate words.
  - 120. The method of claim 112, wherein aggregating comprises permuting at least some of the candidate phonemes from different time-slices to generate possible combinations resulting in potential words.
  - 121. The method of claim 120, wherein processing the computed scores or probabilities of the candidate phonemes comprises using said scores or probabilities for purposes of ordering permutation of the candidate phonemes.
  - 122. The method of claim 120, wherein potential words are processed according to a dictionary in order to identify candidate words from the potential words.
  - 123. The method of claim 122, wherein the candidate words are identified without consideration of the scores or probabilities of the candidate phonemes making up the potential words.
  - 124. The method of claim 122, wherein the candidate words are identified by processing based on both the dictionary and the scores or probabilities of the candidate phonemes making up the potential words.
  - 125. The method of claim 112, wherein processing said acoustic input to identify candidate phonemes is based on a plurality of cluster sets having reference phonemes.
  - 126. The method of claim 112, wherein processing said acoustic input to identify candidate phonemes is based on a plurality of cluster sets having reference triphones.
  - 127. The method of claim 112, wherein the potential words are further processed for a transcription or dictation application.
  - 128. The method of claim 112, wherein the potential words are further processed for formulating a response to a query represented by the acoustic input.
  - 129. The method according to claim 1, wherein at least some of the candidate words are alternative candidate words corresponding to the same or an overlapping portion of the speech input.
  - 130. The method of claim 129, wherein permuting at least one member of the list of candidate words is carried out to give consideration to word pronunciation boundaries, thereby creating potential syntactic structures comprised of candidate words with beginning boundaries and end boundaries that do not conflict with the beginning boundaries and end boundaries of other candidate words' pronunciations.
  - 131. The method of claim 130, wherein the permutation is carried out only for combinations of candidate words without conflicting pronunciation boundaries.
  - 132. The method of claim 129, further comprising permuting the plurality of syntactic sequences with at least one of (a) potential syntactic structures or (b) candidate words, to generate further potential syntactic sequences.
  - 133. The method of claim 129, further comprising syntactically analyzing the syntactic structures to generate syntactically valid sequences of words.
  - 134. The method of claim 133, wherein the syntactic analysis is carried out to respect interjections so that the presence of an interjection does not invalidate an otherwise syntactically valid sequence of words.
  - 135. The method of claim 133, wherein the syntactic analysis is implemented as a bottom-up parsing process, top-down parsing process, Earley parsing process, finite-state parsing process, or CYK parsing process.
  - 136. The method of claim 133, wherein syntactically analyzing comprises applying syntactic transform scripts to the syntactic structures.
  - 137. The method of claim 129, wherein each of the candidate words is assigned a score or probability.
  - 138. The method of claim 129, wherein each of the syntactic sequences is assigned a score or probability.
  - 139. The method of claim 129, wherein each of the candidate words is assigned a score or probability, and further wherein each of the syntactic sequences is assigned a score or probability based on the scores or probabilities of the candidate words used to construct the respective syntactic sequence.
  - 140. The method of claim 129, wherein each candidate word is constructed from candidate phonemes, each candidate phoneme being assigned a score or probability, each candidate word being assigned a score or probability based on the scores or probabilities of the candidate phonemes making up the candidate word, and further wherein each of the syntactic sequences is assigned a score or probability.
  - 141. The method of claim 129, wherein processing the speech input comprises producing the candidate words from an N-best list of potential words produced by application of the Hidden Markov Model (HMM) technique to the speech sample and also from combinations of two or more consecutive N-best list potential words.
  - 142. The method of claim 129, wherein processing the speech comprises processing a series of time-slices to identify candidate phonemes, at least some of the time segments including alternative candidate phonemes.
  - 143. The method of claim 129, wherein the selected syntactic sequence is a syntactically valid sequence of words comprising a sentence.
  - 144. The method of claim 1, further comprising:
    - segmenting the speech input into a plurality of time-slices;
      
      analyzing each time-slice to identify one or more candidate triphones based on a plurality of reference cluster sets, each cluster set representing reference triphones for a cluster type.
  - 145. The method of claim 144, further comprising processing the identified candidate triphones according to a triphone-based dictionary to identify candidate words.
  - 146. The method according to claim 1, further comprising:
    - communicating the syntactic sequences of words.
  - 147. The method of claim 146, wherein the step of communicating the syntactic sequences of words comprises displaying the syntactic sequences of words on a display.
  - 148. The method of claim 146, wherein the step of communicating the syntactic sequences of words comprises storing the syntactic sequences of words in a computer memory.
  - 149. The method of claim 146, wherein the step of communicating the syntactic sequences of words comprises outputting the syntactic sequences of words in at least one of human readable or audible form.
  - 150. The method according to claim 1, the generating step further comprising:
    - segmenting the input into a plurality of time-slices;
      
      analyzing each time-slice to identify one or more candidate words derived from an N-best list of potential words from an application of the HMM technique; and
      
      further identifying additional candidate words based on combinations of two or more consecutive N-best list potential words.
  - 151. The method of claim 150, further comprising the step of communicating the syntactic sequences of words by displaying the syntactic sequences of words on a display.
  - 152. The method of claim 150, further comprising the step of communicating the syntactic sequences of words by storing the syntactic sequences of words in a computer memory.
  - 153. The method of claim 150, further comprising the step of communicating the syntactic sequences of words by outputting the syntactic sequences of words in at least one of human readable or audible form.
  - 154. The method according to claim 1, said processing step further comprising:
    - segmenting the speech input into a plurality of time-slices; and
      
      analyzing each time-slice to identify one or more candidate words based on the application of the HMM technique.
  - 155. The method of claim 154, further comprising the step of communicating the syntactic sequences of words by displaying the syntactic sequences of words on a display.
  - 156. The method of claim 154, further comprising the step of communicating the syntactic sequences of words by storing the syntactic sequences of words in a computer memory.
  - 157. The method of claim 154, further comprising the step of communicating the syntactic sequences of words by outputting the syntactic sequences of words in at least one of human readable or audible form.
  - 195. The method of claim 1, wherein the step of processing at least one of the conceptual representations comprises comparing the derived conceptual representation to reference conceptual representations in the database.
  - 196. The method of claim 195, wherein the step of formulating one or more responses to the speech input comprises formulating one or more responses to the speech input based on a successful comparison of the conceptual representation to at least one reference conceptual representation in the database.
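
Claims 129 to 131 and 139 describe combining alternative candidate words only where their pronunciation boundaries do not conflict, and scoring each resulting sequence from the scores of its words. The following is a minimal sketch of that idea, not the patented implementation. It assumes, for illustration only, that boundaries are time-slice indices, that consecutive words must be contiguous, and that a sequence score is the product of word scores. The names `CandidateWord` and `boundary_respecting_sequences` are hypothetical, and no syntactic validation is attempted.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Tuple


@dataclass(frozen=True)
class CandidateWord:
    text: str
    start: int    # first time-slice covered (inclusive); assumed representation
    end: int      # one past the last time-slice covered (exclusive)
    score: float  # hypothetical probability-like score in (0, 1]


def boundary_respecting_sequences(
    words: List[CandidateWord], total: int
) -> Iterator[Tuple[List[CandidateWord], float]]:
    """Yield each chain of words covering [0, total) without boundary conflicts.

    Each chain is paired with the product of its word scores (an assumed
    scoring rule; the claims only say the score is "based on" word scores).
    """
    by_start: Dict[int, List[CandidateWord]] = {}
    for w in words:
        by_start.setdefault(w.start, []).append(w)

    def extend(pos: int, chain: List[CandidateWord], score: float):
        if pos == total:
            yield list(chain), score
            return
        for w in by_start.get(pos, []):
            if w.end <= pos:  # ignore degenerate zero-length words
                continue
            chain.append(w)
            yield from extend(w.end, chain, score * w.score)
            chain.pop()

    yield from extend(0, [], 1.0)


# Alternative candidate words for the same or overlapping portions of input.
candidates = [
    CandidateWord("recognize", 0, 6, 0.5),
    CandidateWord("wreck", 0, 2, 0.5),
    CandidateWord("a", 2, 3, 0.5),
    CandidateWord("nice", 3, 6, 0.5),
    CandidateWord("speech", 6, 9, 0.5),
    CandidateWord("beach", 6, 9, 0.25),
]

sequences = {
    " ".join(w.text for w in chain): score
    for chain, score in boundary_respecting_sequences(candidates, total=9)
}
```

In this toy input, "recognize" and "wreck" compete for the same starting portion, so they never appear in the same sequence. A real system would pass the surviving sequences to a syntactic analysis stage before any conceptual analysis.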

158. A system for processing speech, comprising:
- a phoneme analyzer, receiving a speech input, generating a list of candidate words for at least one set of phonemes, each candidate word having a pronunciation boundary, the candidate words being permuted to generate a plurality of potential syntactic structures which are valid in accordance with a set of syntactic rules, the candidate words and plurality of potential syntactic structures each having an associated part of speech;
  
  means for identifying a plurality of syntactic sequences of words from the potential syntactic structures and candidate words;
  
  means for deriving one or more conceptual representations from at least one of the syntactic sequences of words; and
  
  means for formulating one or more responses to the speech input based on one or more of the conceptual representations.
- - 159. The system of claim 158, wherein the means for formulating the response comprises means for processing the conceptual representation in relation to reference data.
  - 160. The system of claim 159, wherein the reference data comprises a database.
  - 161. The system of claim 159, wherein the reference data comprises a physical measurement.
  - 162. The system of claim 159, further comprising means for communicating one or more of the responses.
  - 163. The system of claim 162, wherein the means for communicating one or more of the responses comprises at least one of audio response or visual response means.
  - 164. The system of claim 162, wherein the means for communicating one or more of the responses comprises text response means.
  - 165. The system of claim 162, wherein the means for communicating one or more of the responses comprises mechanical response means.
  - 166. The system of claim 162, further comprising means for identifying one or more inquiry anomalies in the speech input for at least one of the syntactic sequences of words.
  - 167. The system of claim 166, wherein the inquiry anomaly comprises an inconsistency between the conceptual representations and at least some of the reference data.
  - 168. The system of claim 167, further comprising ranking means for giving inquiry anomalies a scaled designation relating to the magnitude of the inquiry anomaly and ranking the inquiry anomalies according to the scaled designation.
  - 169. The system of claim 168, further comprising means to associate one or more inquiry anomaly indicators relating to the rank of the inquiry anomaly with the conceptual representations.
  - 170. The system of claim 169, wherein the communicated response corresponds to the conceptual representation with the lowest ranked inquiry anomaly indicator.
  - 171. The system of claim 170, further comprising means to formulate responses only from conceptual representations having the lowest ranked inquiry anomaly indicator.
  - 172. The system of claim 170, further comprising means for deriving one or more conceptual representations until a conceptual representation is derived that has an associated inquiry anomaly indicator of the lowest rank.
  - 173. The system of claim 166, wherein the inquiry anomaly comprises an inconsistency internally within the conceptual representation.
  - 174. The system of claim 173, further comprising ranking means for giving inquiry anomalies a scaled designation relating to the magnitude of the inquiry anomaly and ranking the inquiry anomalies according to the scaled designation.
  - 175. The system of claim 174, further comprising means to associate one or more inquiry anomaly indicators relating to the rank of the inquiry anomaly with the conceptual representations.
  - 176. The system of claim 175, wherein the communicated response corresponds to the conceptual representation with the lowest ranked inquiry anomaly indicator.
  - 177. The system of claim 176, further comprising means to formulate responses only from conceptual representations having the lowest ranked inquiry anomaly indicator.
  - 178. The system of claim 177, further comprising means for deriving one or more conceptual representations until a conceptual representation is derived that has an associated inquiry anomaly indicator of the lowest rank.
  - 179. The system of claim 158, further comprising means for deriving one or more responsive conceptual representations.
  - 180. The system of claim 179, wherein the means for formulating one or more responses to the speech input comprises means for formulating one or more responses to the speech input based on one or more of the responsive conceptual representations.
  - 181. The system of claim 158, wherein at least one of the syntactic sequences of words comprises a sentence.
  - 182. The system of claim 158, wherein at least one of the syntactic sequences of words comprises any syntactic organization.
  - 183. The system of claim 158, further comprising semantic rules associated with each candidate word and each associated part of speech, and each syntactic sequence of words and each associated part of speech, wherein further the semantic rules relate to conceptual relationships between at least two of the candidate words and syntactic sequences of words.
  - 184. The system of claim 183, wherein the means for deriving the conceptual representation further comprises means for applying the semantic rules to the syntactic sequence of words, the candidate words or any combination thereof.
  - 185. The system of claim 184, wherein the semantic rules comprise an interpreted language.
  - 186. The system of claim 185, wherein the semantic rules comprise a predicate builder scripting language.
  - 187. The system of claim 185, wherein the semantic rules comprise a compiled language.
  - 188. The system of claim 158, wherein the candidate words comprising the syntactic sequences of words are assigned a score or probability, and the syntactic sequence of words is assigned a score or probability based on the scores or probabilities of the candidate words.
  - 189. The system of claim 158, wherein each of the candidate words is constructed based on candidate phonemes, each candidate phoneme being assigned a score or probability, each candidate word being assigned a score or probability based on the candidate phonemes making up the candidate word, and the syntactic sequence of words being assigned a score or probability based on the scores or probabilities of the candidate words making up the syntactic sequences of words.
  - 190. The system of claim 158, wherein the means for identifying the syntactic sequences of words comprises means for deriving candidate words from the result of the application of the Hidden Markov Model (HMM) technique to the speech input, the candidate words used to identify the syntactic sequences of words.
  - 191. The system of claim 158, wherein the means for processing the speech input comprises means for deriving candidate phonemes from the result of the application of the Backus-Naur (BNF) technique to the speech input, the candidate phonemes being used to identify the list of candidate words.
  - 192. The system of claim 158, wherein the means for processing the speech input comprises means for processing a series of time-slices to identify candidate phonemes, at least some of the time-slices including alternative candidate phonemes, and wherein the candidate phonemes are used to identify a list of candidate words, the candidate words being used to identify the plurality of syntactic sequences of words.
  - 193. The system of claim 158, wherein the means for deriving the conceptual representation comprises means for applying the principles of Conceptual Dependency to the syntactic sequences of words.
  - 194. The system of claim 158, wherein the means for processing the speech input to identify a plurality of syntactic sequences of words comprises:
    - an inputting device for inputting an acoustic input of digitized speech;
      
      a segmenter for segmenting said digitized acoustic input into a plurality of time-slices; and
      
      an analysis device for analyzing each time-slice to identify one or more candidate phonemes based on a plurality of reference cluster sets, each cluster set representing reference phonemes for a cluster type, wherein the output of the analysis device comprises a phoneme stream of identified candidate phonemes based on the analysis.
  - 197. The system of claim 158, wherein the processing means of at least one of the conceptual representations comprises comparing the derived conceptual representation to reference conceptual representations in the database.
  - 198. The system of claim 197, wherein the means for formulating one or more responses to the speech input comprises means for formulating one or more responses to the speech input based on a successful comparison of the conceptual representations to at least one reference conceptual representation in the database.
  - 199. The system according to claim 158, further comprising:
    - a phoneme recognition processor for processing said speech input based on a plurality of reference cluster sets to generate a plurality of candidate phonemes; and
      
      wherein the phoneme recognition processor identifies a score or probability for each candidate phoneme.
  - 200. The system of claim 199, wherein at least some of the candidate phonemes originate from different reference cluster sets.
  - 201. The system of claim 199, wherein the phoneme recognition processor segments said speech input into time-slices in order to identify candidate phonemes.
  - 202. The system of claim 201, wherein the time-slices are overlapping.
  - 203. The system of claim 201, wherein the time-slices are nonoverlapping.
  - 204. The system of claim 201, wherein the time-slices include both overlapping and nonoverlapping time-slices.
  - 205. The system of claim 202 or 204, wherein the overlapping time segments overlap within the range of approximately 40% and 60%.
  - 206. The system of claim 204, wherein a subsequent time-slice is selected to be overlapping or nonoverlapping based on the phoneme recognition result of the previous time-slice.
  - 207. The system of claim 199, wherein the plurality of reference cluster sets comprise sets of reference phonemes for a single language.
  - 208. The system of claim 199, wherein the plurality of reference cluster sets comprise sets of reference phonemes for multiple languages, thereby allowing the system to detect the language spoken by the person inputting the speech.
  - 209. The system of claim 199, wherein the plurality of reference cluster sets comprise reference triphones, thereby enabling the system to recognize candidate phonemes according to the triphone variations in the pronunciations of candidate phonemes.
  - 210. The system of claim 209, wherein the phoneme recognition processor is adapted to generate a candidate phoneme by mapping a detected triphone to the corresponding phoneme.
  - 211. The system of claim 199, wherein the phoneme recognition unit is further adapted to output a phoneme stream of the candidate phonemes comprising or associated with the identified score or probability of each identified candidate phoneme.
  - 212. The system of claim 211, wherein the phoneme stream is stored for further processing.
  - 213. The system of claim 211, wherein said means for identifying a plurality of syntactic sequences of words comprises a phoneme stream analyzer to identify candidate words corresponding to the candidate phonemes.
  - 214. The system of claim 213, wherein the candidate words are stored for further processing.
  - 215. The system of claim 213, wherein the phoneme stream data is stored for further processing.
  - 216. The system of claim 213, wherein the candidate words and the phoneme stream data are stored for further processing.
  - 217. The system of claim 213, wherein the candidate words are based on potential words constructed according to permutations of candidate phonemes from different time-slices.
  - 218. The system of claim 217, wherein the candidate words are generated by creating search paths reflecting permuted candidate phonemes from different time-slices matching at least a partial valid pronunciation of a word in a dictionary.
  - 219. The system of claim 218, wherein a search path is terminated or dropped upon the permutation with a further candidate phoneme resulting in no at least partial valid pronunciation of a word in the dictionary.
  - 220. The system of claim 218, wherein a search path is terminated or dropped upon the permutation with at least two further consecutive non-matching candidate phonemes, the first non-matching candidate phoneme resulting in no at least partial valid pronunciation of a word in the dictionary, and the second non-matching candidate phoneme resulting in no at least partial valid pronunciation when ignoring the first non-matching candidate phoneme, thereby providing an error tolerant system.
  - 221. The system of claim 217, wherein the candidate words are identified by processing the potential words according to a dictionary.
  - 222. The system of claim 217, wherein the candidate words are scored based on the scores or probabilities of the candidate phonemes used to construct the candidate words.
  - 223. The system of claim 213, wherein the candidate words are constructed based on candidate phonemes originating from the same reference cluster set.
  - 224. The system of claim 199, wherein the phoneme recognition processor is further adapted to process said digitized acoustic input to detect or derive at least one parameter in addition to (a) candidate phonemes which are identified and (b) score or probabilities which are identified, wherein the at least one additional parameter is used by said means for identifying a plurality of syntactic sequences in analyzing the identified candidate phonemes to identify candidate words corresponding to the candidate phonemes.
  - 225. The system of claim 224, wherein the at least one additional parameter is derived through time domain processing.
  - 226. The system of claim 224, wherein the at least one additional parameter is derived through frequency domain processing.
  - 227. The system of claim 224, wherein the at least one additional parameter comprises pitch information, wherein the pitch information is used in conjunction with information contained in a dictionary to identify the candidate words.
  - 228. The system of claim 227, wherein the dictionary contains Chinese language words.
  - 229. The system of claim 224, wherein the acoustic input is segmented into time-slices, each time-slice being characterized by a pitch value.
  - 230. The system of claim 199, wherein said system is implemented in an application for transcription or dictation.
  - 231. The system of claim 199, wherein said system is implemented in an application for providing a response to a query represented by said speech input.
  - 232. The system according to claim 158, further comprising:
    - means for generating a phoneme stream by processing the speech input to identify candidate phonemes including at least some alternative candidate phonemes.
  - 233. The system of claim 232, wherein the phoneme analyzer generates the list of candidate words according to a dictionary to identify candidate words.
  - 234. The system of claim 233, wherein the candidate phonemes are permuted through a search path created by permuting candidate phonemes from different time-slices and comparing the permuted candidate phonemes to the dictionary to determine if the search path corresponds to at least a partial valid pronunciation of a word.
  - 235. The system of claim 234, wherein the comparison is carried out based on symbols or values representing the permuted candidate phonemes which are compared to symbols or values in the dictionary representing partial or whole valid pronunciations of a word.
  - 236. The system of claim 234, wherein based on a favorable result of the comparison, the search path is expanded to permute one or more candidate phonemes from additional time-slices.
  - 237. The system of claim 236, wherein an expanded search path is terminated when an additional phoneme results in the expanded search path not corresponding to any at least partial valid pronunciation of a word in the dictionary.
  - 238. The system of claim 236, wherein an expanded search path is terminated or dropped upon the permutation with at least two further consecutive non-matching candidate phonemes, the first non-matching candidate phoneme resulting in no at least partial valid pronunciation of a word in the dictionary, and the second non-matching candidate phoneme resulting in no at least partial valid pronunciation when ignoring the first non-matching candidate phoneme, thereby providing an error tolerant system.
  - 239. The system of claim 232, wherein said means for generating a phoneme stream comprises a processor executing the Hidden Markov Model (HMM) technique to produce candidate words from which candidate phonemes are derived.
  - 240. The system of claim 232, wherein said means for generating a phoneme stream comprises a processor executing the Backus-Naur (BNF) technique to produce results from which candidate phonemes are derived.
  - 241. The system of claim 232, wherein the phoneme stream comprises a plurality of time-slices, at least some of the time-slices including a plurality of alternative candidate phonemes, and each candidate phoneme having a score or probability.
  - 242. The system of claim 232, wherein at least some of the candidate words are alternative candidate words corresponding to the same portion or an overlapping portion of the speech sample.
  - 243. The system of claim 232, wherein the means for generating a phoneme stream provides a score or probability for each of the candidate phonemes.
  - 244. The system of claim 243, wherein the phoneme analyzer generates the list of candidate words based on the scores or probabilities of the candidate phonemes making up the candidate words.
  - 245. The system of claim 232, further comprising a memory for storing at least a two-dimensional array of candidate words, the first dimension related to time and the second dimension corresponding to alternative candidate words for the same or an overlapping time period.
  - 246. The system of claim 232, wherein the phoneme analyzer permutes potential syntactic structures with at least one of (i) potential syntactic structures or (ii) candidate words, to generate further potential syntactic structures.
  - 247. The system of claim 158, wherein the syntactic analysis is carried out to respect interjections so that the presence of an interjection does not invalidate an otherwise syntactically valid sequence of words.
  - 248. The system of claim 158, wherein the phoneme analyzer implements a syntactic analysis as one of a bottom-up parsing process, a top-down parsing process, an Earley parsing process, a finite-state parsing process, and a CYK parsing process.
  - 249. The system of claim 158, wherein the phoneme analyzer applies syntactic transform scripts to the potential syntactic structures.
  - 250. The system of claim 158, wherein at least some of the syntactically valid sequences of words comprise sentences.
  - 251. The system of claim 232, wherein the means for generating a phoneme stream identifies candidate phonemes by processing the speech sample based on cluster sets including reference phonemes.
  - 252. The system of claim 232, wherein the means for generating a phoneme stream identifies candidate phonemes by processing the speech sample based on cluster sets including reference triphones.
  - 253. The system according to claim 158, further comprising:
    - phoneme recognition means for identifying a plurality of candidate phonemes in said speech input and providing a score or probability for each candidate phoneme.
  - 254. The speech processing system of claim 253, wherein the phoneme analyzer is further adapted for scoring the potential words based on the scores or probabilities of the candidate phonemes making up the potential words.
  - 255. The speech processing system of claim 253, wherein said speech input comprises a wired or wireless telephone or other wireless communication equipment.
  - 256. The speech processing system of claim 253, wherein said speech input comprises a microphone operatively coupled to the Internet.
  - 257. The speech processing system of claim 253, wherein said speech input comprises a means for playback of pre-recorded audio.
  - 258. The speech processing system of claim 253, wherein said speech input is digitized by a digitizer located at the speaker's location, and the digitized speech input is communicated to said phoneme analyzer at a different location.
  - 259. The speech processing system of claim 258, wherein the digitizer is located in a personal computer or personal data assistant (PDA) device.
  - 260. The speech processing system of claim 258, wherein the speech input is received through a wireless transceiver, and said wireless transceiver comprises said digitizer.
  - 261. The speech processing system of claim 253, wherein said digitizing means comprises a digitizer remotely located from the speaker.
  - 262. The speech processing system of claim 253, wherein said phoneme recognition means is adapted to output a phoneme stream comprising said candidate phonemes and said scores or probabilities.
  - 263. The speech processing system of claim 253, wherein the phoneme analyzer is adapted to identify alternative potential words from the same portion or an overlapping portion of the speech input.
  - 264. The speech processing system of claim 263, wherein the alternative potential words are examined to select the potential word with the most favorable score based on the scores or probabilities of the candidate phonemes making up the alternative potential words.
  - 265. The speech processing system of claim 253, further comprising dictionary processing means for processing the potential words according to a dictionary to thereby identify candidate words from the potential words.
  - 266. The speech processing system of claim 253, wherein the syntactic structures are analyzed according to one of a bottom-up parsing process, a top-down parsing process, an Earley parsing process, a finite-state parsing process, and a CYK parsing process.
  - 267. The speech processing system of claim 253, wherein said phoneme analyzer is adapted to apply syntactic transform scripts to the potential syntactic structures to generate syntactically valid sequences of words.
  - 268. The speech processing system of claim 158, wherein at least some of the potential syntactic structures are scored based on the scores of the candidate phonemes making up the potential syntactic structures.
  - 269. The speech processing system of claim 268, wherein the scores of the potential syntactic structures are used in selecting at least one potential syntactic structure for further analysis.
  - 270. The speech processing system of claim 253, wherein the phoneme recognition means identifies the candidate phonemes based on reference cluster sets of reference phonemes.
  - 271. The speech processing system of claim 253, wherein the phoneme recognition means identifies the candidate phonemes based on reference cluster sets of reference triphones.
  - 272. The system of claim 158, further comprising:
    - a phoneme recognition unit for identifying candidate phonemes, wherein at least some of the candidate phonemes are alternative candidate phonemes; and
      
      a phoneme stream analyzer for identifying the candidate words constructed from the candidate phonemes, wherein at least some of the candidate words are alternative candidate words corresponding to the same portion or an overlapping portion of a speech input;
      
      wherein one of the plurality of potential syntactic sequences is selected as the conceptual representation corresponding to the speech input.
  - 273. The system of claim 272, wherein each of the candidate phonemes is assigned a score or probability.
  - 274. The system of claim 272, wherein the word permutation unit is further adapted for syntactically validating the potential syntactic sequences to render syntactically valid sequences of words.
  - 275. The system of claim 274, wherein the means for deriving selectively extracts conceptual representation of syntactically valid sequences of words.
  - 276. The system of claim 272, wherein the phoneme stream analyzer permutes the candidate phonemes in order to generate a list of potential words.
  - 277. The system of claim 276, wherein the list of potential words are selected as the list of candidate words.
  - 278. The system of claim 276, wherein the list of potential words are processed according to a dictionary to generate the list of candidate words.
  - 279. The system of claim 272, wherein the word permutation unit is further adapted for syntactically validating the potential syntactic structures to render syntactically valid sequences of words, further comprising:
    - means for comparing the conceptual representations to reference data, said means for responding being sensitive to one or more successful comparisons of the conceptual representations in relation to the reference data.
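
Claims 217 to 220 and 234 to 238 describe building candidate words by extending search paths across time-slices of alternative candidate phonemes, keeping a path only while it matches at least a partial valid pronunciation in a dictionary, and tolerating a single non-matching phoneme before dropping the path. The sketch below illustrates that behavior under simplifying assumptions that are not taken from the patent: each time-slice contributes at most one phoneme to a word, a skipped phoneme carries no score penalty, and a word score is the product of its phoneme scores. The function name `find_candidate_words` and the toy dictionary are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# One time-slice: alternative (phoneme, score) candidates (assumed layout).
Slice = List[Tuple[str, float]]


def find_candidate_words(
    stream: List[Slice], dictionary: Dict[str, Tuple[str, ...]]
) -> List[Tuple[str, int, int, float]]:
    """Return (word, start_slice, end_slice, score) for each match found."""
    # Every prefix of a pronunciation is an "at least partial valid pronunciation".
    prefixes: Set[Tuple[str, ...]] = set()
    words_by_pron: Dict[Tuple[str, ...], List[str]] = {}
    for word, pron in dictionary.items():
        words_by_pron.setdefault(pron, []).append(word)
        for i in range(1, len(pron) + 1):
            prefixes.add(pron[:i])

    found: List[Tuple[str, int, int, float]] = []

    def extend(start, pos, path, score, just_skipped):
        # Emit a word only when the path was completed by a matching phoneme.
        if not just_skipped:
            for word in words_by_pron.get(path, []):
                found.append((word, start, pos, score))
        if pos == len(stream):
            return
        mismatch = False
        for phoneme, p in stream[pos]:
            nxt = path + (phoneme,)
            if nxt in prefixes:
                extend(start, pos + 1, nxt, score * p, False)
            else:
                mismatch = True
        # Error tolerance: ignore one non-matching slice, but drop the path
        # if the previous slice was already ignored (two in a row).
        if mismatch and path and not just_skipped:
            extend(start, pos + 1, path, score, True)

    for start in range(len(stream)):
        extend(start, start, (), 1.0, False)
    return found


# Toy phoneme stream with a spurious "z" between "ae" and "t".
stream = [
    [("k", 0.5)],
    [("ae", 0.5), ("eh", 0.5)],
    [("z", 0.5)],
    [("t", 0.5)],
]
dictionary = {"cat": ("k", "ae", "t"), "at": ("ae", "t")}

results = find_candidate_words(stream, dictionary)
```

Here both "cat" and "at" are recovered despite the spurious phoneme, and they are alternative candidate words for overlapping portions of the input. The start and end slice indices play the role of pronunciation boundaries for any later word permutation step.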

280. A system for processing speech, comprising:
- an input for receiving a speech input;
  
  a processor, receiving a set of phonemes derived from a speech input, generating a set of candidate words having respective pronunciation boundaries from the set of phonemes, permuting the candidate words to produce a plurality of syntactically valid potential syntactic structures, the candidate words and plurality of potential syntactic structures each having an associated part of speech, identifying a plurality of syntactic sequences of words from the potential syntactic structures and candidate words, deriving at least one conceptual representation from at least one of the syntactic sequences of words, and formulating at least one response to the speech input based on one or more of the conceptual representations; and
  
  an output, for communicating a signal responsive to the at least one response.
- - 281. The system according to claim 280, wherein a database storing reference data is provided for deriving the at least one conceptual representation.

282. A method for processing speech, comprising:
- receiving a speech input;
  
  deriving a set of phonemes from the speech input;
  
  generating a set of candidate words having respective pronunciation boundaries from the set of phonemes;
  
  permuting the candidate words to produce a plurality of syntactically valid potential syntactic structures, the candidate words and plurality of potential syntactic structures each having an associated part of speech;
  
  identifying a plurality of syntactic sequences of words from the potential syntactic structures and candidate words;
  
  deriving at least one conceptual representation from at least one of the syntactic sequences of words;
  
  formulating at least one response to the speech input based on one or more of the conceptual representations; and
  
  communicating a signal responsive to the at least one response.
- View Dependent Claims (283)
- - 283. The method according to claim 282, further comprising the step of maintaining a database of reference data for deriving the at least one conceptual representation.

284. A method for processing speech, comprising:
- receiving an input comprising speech;
  
  identifying a list of candidate words constructed from a sequence of phonemes, wherein at least some of the candidate words are alternative candidate words corresponding to the same portion or an overlapping portion of the input, each respective candidate word having a pronunciation boundary and a respective part of speech;
  
  permuting the candidate words to create a plurality of potential syntactic structures, wherein at least some of the plurality of potential syntactic structures is selected as corresponding to the input and having a respective part or parts of speech;
  
  syntactically validating the potential syntactic structures to render syntactically valid sequences of words;
  
  generating a plurality of valid syntactic sequences of words from the permuted candidate words and potential syntactic structures;
  
  deriving conceptual representations of syntactically valid sequences of words; and
  
  formulating at least one response to the input based on the conceptual representations.
- View Dependent Claims (285, 286, 287, 288, 289, 290)
- - 285. The method of claim 284, wherein permuting the candidate words is carried out to give consideration to the respective pronunciation boundaries, thereby creating potential syntactic structures comprising candidate words with pronunciation boundaries that do not conflict with the boundaries of other candidate word pronunciations.
  - 286. The method according to claim 284, wherein permuting the candidate words is carried out only for combinations of candidate words without conflicting pronunciation boundaries.
  - 287. The method according to claim 284, further comprising permuting elements of potential syntactic structures to generate further potential syntactic structures.
  - 288. The method according to claim 284, further comprising syntactically analyzing the potential syntactic structures to generate syntactically valid sequences of words.
  - 289. The method according to claim 284, further comprising the step of identifying an inquiry anomaly represented in at least one of the valid syntactic sequences of words, wherein the inquiry anomaly comprises an inconsistency between at least one conceptual representation and at least some linguistic data determined to have a high probability of being represented in the communication stream.
  - 290. A computer readable medium storing a program executable on a programmable computer for causing the computer to execute a method in accordance with claim 284.
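Claims 284 to 286 combine candidate words only where their pronunciation boundaries do not conflict, then keep the sequences that are syntactically valid. A minimal sketch of that step follows, under stated assumptions: boundaries are modelled as half-open phoneme index ranges, a sequence must cover the whole input with no gaps, and the "syntactic rules" are reduced to a set of allowed part-of-speech patterns. The claims do not specify any of these details, and `Candidate`, `sequences`, `valid_sequences`, `EXAMPLE`, and `RULES` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Set, Tuple


@dataclass(frozen=True)
class Candidate:
    word: str
    start: int  # first phoneme index covered (inclusive)
    end: int    # one past the last phoneme index covered (exclusive)
    pos: str    # part of speech


def sequences(cands: List[Candidate], total: int) -> Iterator[Tuple[Candidate, ...]]:
    """Yield every chain of candidates that covers phonemes 0..total
    with no overlapping or skipped pronunciation boundaries."""
    by_start: Dict[int, List[Candidate]] = {}
    for c in cands:
        if c.end > c.start:  # ignore empty or inverted spans
            by_start.setdefault(c.start, []).append(c)

    def extend(at: int, prefix: List[Candidate]) -> Iterator[Tuple[Candidate, ...]]:
        if at == total:
            yield tuple(prefix)
            return
        # Only words starting exactly where the previous one ended can follow.
        for c in by_start.get(at, []):
            yield from extend(c.end, prefix + [c])

    yield from extend(0, [])


def valid_sequences(
    cands: List[Candidate], total: int, rules: Set[Tuple[str, ...]]
) -> List[Tuple[Candidate, ...]]:
    """Keep only the sequences whose part-of-speech pattern is allowed."""
    return [s for s in sequences(cands, total) if tuple(c.pos for c in s) in rules]


# Hypothetical candidates over a nine-phoneme input.
EXAMPLE = [
    Candidate("recognize", 0, 6, "verb"),
    Candidate("speech", 6, 9, "noun"),
    Candidate("wreck", 0, 3, "verb"),
    Candidate("reckon", 0, 4, "verb"),
    Candidate("a", 3, 4, "det"),
    Candidate("an", 3, 5, "det"),
    Candidate("nice", 4, 7, "adj"),
    Candidate("beach", 7, 9, "noun"),
]
RULES = {("verb", "noun"), ("verb", "det", "adj", "noun")}
```

With this data, "an" is dropped because no candidate starts where it ends, and the chain "reckon nice beach" respects the boundaries but is rejected by the toy part-of-speech rules.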

Current Assignee: Chemtron Research LLC (Intellectual Ventures LLC)
Original Assignee: Conceptual Speech LLC
Inventors: Roy, Philippe
Primary Examiner(s): Azad, Abul K.
Application Number: US10/610,080
Publication Number: US 20040215449A1
Time in Patent Office: 1,576 Days
Field of Search: None
US Class Current: 704/270
CPC Class Codes:
G10L 15/08   Speech classification or se...
G10L 15/1822   Parsing for meaning underst...
G10L 2015/025   Phonemes, fenemes or fenone...


75 Citations

290 Claims
