METHOD FOR USING PAUSES DETECTED IN SPEECH INPUT TO ASSIST IN INTERPRETING THE INPUT DURING CONVERSATIONAL INTERACTION FOR INFORMATION RETRIEVAL
Abstract
A method for using speech disfluencies detected in speech input to assist in interpreting the input is provided. The method includes providing access to a set of content items with metadata describing the content items, and receiving a speech input intended to identify a desired content item. The method further includes detecting a speech disfluency in the speech input and determining a measure of confidence of a user in a portion of the speech input following the speech disfluency. If the confidence measure is lower than a threshold value, the method includes determining an alternative query input based on replacing the portion of the speech input following the speech disfluency with another word or phrase. The method further includes selecting content items based on comparing the speech input, the alternative query input (when the confidence measure is low), and the metadata associated with the content items.
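The confidence-gated flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the confidence value is taken as an input (the source does not say how it is computed from the manner of utterance), and the names `ContentItem`, `build_queries`, `select_items`, the substitution table, and the sample items are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ContentItem:
    title: str
    metadata: set  # descriptive terms for the item (hypothetical representation)


def build_queries(before, after, confidence, threshold, substitutes):
    """Return the query inputs to compare against item metadata.

    `before` / `after` are the tokens uttered before and after the disfluency.
    `substitutes` maps an uncertain phrase to alternative phrases (assumed table).
    """
    queries = [before + after]  # the speech input as uttered
    if confidence <= threshold:  # confidence "does not exceed" the threshold
        for alt in substitutes.get(tuple(after), []):
            queries.append(before + list(alt))  # alternative query input
    return queries


def select_items(items, queries):
    """Keep items whose metadata contains every term of at least one query."""
    return [item for item in items
            if any(set(q) <= item.metadata for q in queries)]


# Hypothetical catalog and substitution table.
items = [
    ContentItem("Film A", {"thriller", "smith", "jones"}),
    ContentItem("Film B", {"thriller", "jones"}),
    ContentItem("Film C", {"comedy", "smith"}),
]
substitutes = {("smith",): [("jones",)]}

# Utterance: "thriller ... <disfluency> ... smith"
low = select_items(items, build_queries(["thriller"], ["smith"], 0.3, 0.5, substitutes))
high = select_items(items, build_queries(["thriller"], ["smith"], 0.9, 0.5, substitutes))

print([i.title for i in low])   # low confidence: original and alternative queries
print([i.title for i in high])  # high confidence: original query only
```

In this sketch a low confidence widens the result set by also matching the substituted phrase, while a high confidence leaves the uttered query unchanged.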
48 Claims
1-28. (canceled)

29. A method for using speech disfluencies detected in speech input to assist in interpreting the input, the method comprising:
providing access to a set of content items, each of the content items being associated with metadata that describes the corresponding content item;
receiving a speech input from a user, the input intended by the user to identify at least one desired content item;
detecting a speech disfluency in the speech input;
determining a measure of confidence of the user in a portion of the speech input following the speech disfluency based on a manner by which the user utters the portion of the speech input following the speech disfluency;
upon a condition in which the confidence measure does not exceed a threshold value, determining an alternative query input by automatically replacing the portion of the speech input following the speech disfluency with another word or phrase and selecting a subset of content items from the set of content items based on comparing the speech input, the alternative query input, and the metadata associated with the subset of content items;
upon a condition in which the confidence measure exceeds a threshold value, selecting the subset of content items from the set of content items based on comparing the speech input and the metadata associated with the subset of content items; and
presenting the subset of content items to the user.
Dependent claims: 30, 31, 32, 33, 34, 35

36. A method for using speech disfluencies detected in speech input to assist in interpreting the input, the method comprising:
providing access to a set of content items, each of the content items being associated with metadata that describes the corresponding content item;
receiving a speech input from a user, the input intended by the user to identify at least one desired content item;
detecting front-end clipping of a first portion of a first word in a beginning of the speech input based on an absence of a period of silence in the beginning of the speech input, wherein the first portion is not detected in the speech input, the front-end clipping resulting in incomplete detection of the first word;
in response to detecting the front-end clipping, identifying a second portion of the first word detected in the received speech input;
identifying a plurality of whole words having a suffix matching the second portion detected in the received speech input;
constructing a plurality of query inputs using the plurality of whole words;
selecting a subset of content items from the set of content items based on comparing the plurality of query inputs and the metadata associated with the subset of content items; and
presenting the subset of content items to the user.
Dependent claims: 37, 38

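The front-end clipping steps recited above can be illustrated with a short sketch. It rests on assumptions that are not in the source: the 100 ms minimum leading silence, the token-list representation of the recognized input, the sample vocabulary, and the function names are all hypothetical.

```python
def detect_front_end_clipping(leading_silence_ms, min_silence_ms=100):
    """Treat a missing period of silence at the start as a sign of clipping.

    The 100 ms default is an assumed value, not one given in the source.
    """
    return leading_silence_ms < min_silence_ms


def expand_clipped_word(fragment, vocabulary):
    """Return whole words that end with the detected fragment (suffix match)."""
    return sorted(w for w in vocabulary
                  if w.endswith(fragment) and len(w) > len(fragment))


def build_query_inputs(tokens, leading_silence_ms, vocabulary):
    """Build one query input per candidate whole word when clipping is detected."""
    if not tokens or not detect_front_end_clipping(leading_silence_ms):
        return [tokens]
    fragment, rest = tokens[0], tokens[1:]
    candidates = expand_clipped_word(fragment, vocabulary)
    # Fall back to the input as recognized if no whole word matches.
    return [[word] + rest for word in candidates] or [tokens]


vocabulary = {"movies", "shows", "documentaries", "series"}  # hypothetical

clipped = build_query_inputs(["ries", "about", "space"], 0, vocabulary)
unclipped = build_query_inputs(["ries", "about", "space"], 250, vocabulary)
print(clipped)
print(unclipped)
```

Each resulting query input would then be compared against the content item metadata, as in the selecting step of the claim.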
39. A system for using speech disfluencies detected in speech input to assist in interpreting the input, the system comprising control circuitry configured to:
provide access to a set of content items, each of the content items being associated with metadata that describes the corresponding content item;
receive a speech input from a user, the input intended by the user to identify at least one desired content item;
detect a speech disfluency in the speech input;
determine a measure of confidence of the user in a portion of the speech input following the speech disfluency based on a manner by which the user utters the portion of the speech input following the speech disfluency;
upon a condition in which the confidence measure does not exceed a threshold value, determine an alternative query input by automatically replacing the portion of the speech input following the speech disfluency with another word or phrase and select a subset of content items from the set of content items based on comparing the speech input, the alternative query input, and the metadata associated with the subset of content items;
upon a condition in which the confidence measure exceeds a threshold value, select the subset of content items from the set of content items based on comparing the speech input and the metadata associated with the subset of content items; and
present the subset of content items to the user.
Dependent claims: 40, 41, 42, 43, 44, 45

46. A system for using speech disfluencies detected in speech input to assist in interpreting the input, the system comprising control circuitry configured to:
provide access to a set of content items, each of the content items being associated with metadata that describes the corresponding content item;
receive a speech input from a user, the input intended by the user to identify at least one desired content item;
detect front-end clipping of a first portion of a first word in a beginning of the speech input based on an absence of a period of silence in the beginning of the speech input, wherein the first portion is not detected in the speech input, the front-end clipping resulting in incomplete detection of the first word;
in response to detecting the front-end clipping, identify a second portion of the first word detected in the received speech input;
identify a plurality of whole words having a suffix matching the second portion detected in the received speech input;
construct a plurality of query inputs using the plurality of whole words;
select a subset of content items from the set of content items based on comparing the plurality of query inputs and the metadata associated with the subset of content items; and
present the subset of content items to the user.
Dependent claims: 47, 48

Specification