Controlling offensive content in output

US 9,405,741 B1
Filed: 03/24/2014
Issued: 08/02/2016
Est. Priority Date: 03/24/2014
Status: Active Grant

Abstract

Features are disclosed for recognizing inappropriate content in an output. The offensive content may be generated as a result of a speech processing error. A system may identify the inappropriate elements of a generated output and select among different appropriate alternatives. The system may be adjusted based on certain user characteristics. The system may be localized based on language and cultural features. The system may modify the generated output based on characteristics such as the tolerance threshold of known persons in the proximity of the system. The tolerance threshold may further be used to personalize and modify available content. Models used by the system may be further trained using input from a user.
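
The flow the abstract describes (score a generated output, compare the score with a tolerance threshold tied to a user characteristic, and modify the output when the score is too high) can be illustrated with a minimal Python sketch. Everything named here is hypothetical: the word scores, context weights, thresholds, and the masking strategy are illustrative assumptions rather than values or methods taken from the patent, and the lookup tables merely stand in for a trained output filter model.

```python
# Hypothetical per-word offensiveness scores in [0, 1].
# A real output filter model would be trained, not hand-written.
BASE_SCORES = {"damn": 0.6, "hell": 0.4}

# Hypothetical context weights: the same word may be more or less
# acceptable depending on the context the utterance relates to.
CONTEXT_WEIGHT = {"music": 0.8, "general": 1.0}

# Hypothetical sensitivity thresholds keyed by a user characteristic.
THRESHOLDS = {"child": 0.2, "adult": 0.7}


def profanity_score(name: str, context: str) -> float:
    """Score a name as the highest context-weighted score of its words."""
    weight = CONTEXT_WEIGHT.get(context, 1.0)
    words = [w.strip(".,!?").lower() for w in name.split()]
    return max((BASE_SCORES.get(w, 0.0) * weight for w in words), default=0.0)


def mask_name(name: str, context: str, threshold: float) -> str:
    """Mask every word whose own score exceeds the threshold."""
    masked = []
    for word in name.split():
        if profanity_score(word, context) > threshold:
            masked.append(word[0] + "*" * (len(word) - 1))
        else:
            masked.append(word)
    return " ".join(masked)


def render_response(name: str, context: str, characteristic: str) -> str:
    """Build the response text, modifying the name only when needed."""
    # Assumption: unknown characteristics get the strictest threshold.
    threshold = THRESHOLDS.get(characteristic, min(THRESHOLDS.values()))
    if profanity_score(name, context) > threshold:
        name = mask_name(name, context, threshold)
    return f"Now playing {name}"
```

With these assumed tables, `render_response("Highway to Hell", "music", "child")` masks the last word, while the same request with the `"adult"` characteristic leaves the title unchanged. In a speech system the modified text would then be passed to text-to-speech synthesis; that step is outside this sketch.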

23 Claims

1. A system comprising:
- a computer-readable memory storing executable instructions; and
  
  one or more processors in communication with the computer-readable memory, wherein the one or more processors are programmed by the executable instructions to at least:
  
  obtain first audio input data regarding a first user utterance of a user;
  
  obtain a characteristic of the user;
  
  perform speech processing on the first audio input data to generate first speech processing results, the speech processing results including contextual information indicating a context to which the first utterance relates;
  
  determine a first response to the first user utterance using the first speech processing results, wherein the first response comprises a name of a content item;
  
  generate a profanity score for the name using an output filter model, the name, and the contextual information, the output filter model adapted to provide an output profanity score based upon an input word and a context in which the input word is used;
  
  identify a sensitivity threshold for users having the characteristic, the sensitivity threshold indicating an acceptable degree of offensiveness for users having the characteristic;
  
  determine the profanity score for the name exceeds the sensitivity threshold;
  
  generate first output audio data using the first response and text-to-speech synthesis, wherein (i) a portion of the first output audio data corresponding to the name of the content item is modified or (ii) the name of the content item is modified before generating the first output audio data; and
  
  transmit the first audio output to a user device.
- Dependent Claims (2, 3, 4)
  - 2. The system of claim 1, wherein the output filter model is a user-specific output filter model determined using one or more of information about the user, user preferences, or prior user interactions.
  - 3. The system of claim 1, wherein the characteristic of the user is one or more of an identity of the user, age, gender, sex, language, culture, or religion.
  - 4. The system of claim 1, wherein the processor is further configured to:
    - generate the output filter model using training data;
      
      obtain second audio input data regarding a second user utterance, wherein the second user utterance is based in part on the first response; and
      
      cause retraining of the output filter model based on the second audio input data regarding the second user utterance.
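
Claim 4 recites retraining the output filter model from a follow-up utterance but does not say how the retraining is performed. As one possible illustration only, the Python sketch below nudges per-word scores up or down according to a feedback label assumed to have been derived from the second utterance. The function name `retrain`, the labels `"too_offensive"` and `"over_filtered"`, and the fixed step size are all hypothetical.

```python
from typing import Dict


def retrain(scores: Dict[str, float], name: str, feedback: str,
            step: float = 0.1) -> Dict[str, float]:
    """Return an updated copy of the word scores after user feedback.

    Assumption: `feedback` is a label derived from the user's follow-up
    utterance, either "too_offensive" (the output should have been
    filtered more) or "over_filtered" (the output was modified needlessly).
    Any other label leaves the scores unchanged.
    """
    delta = {"too_offensive": step, "over_filtered": -step}.get(feedback, 0.0)
    updated = dict(scores)  # do not mutate the caller's model
    for word in name.lower().split():
        new_score = updated.get(word, 0.0) + delta
        updated[word] = min(1.0, max(0.0, new_score))  # clamp to [0, 1]
    return updated
```

This crude update adjusts every word in the name equally; a trained model would attribute the feedback more selectively.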

5. A computer-implemented method comprising:
- under control of one or more computing devices configured with specific computer-executable instructions,
  
  obtaining first input data regarding a first user utterance by a user;
  
  obtaining a characteristic of the user;
  
  performing speech processing on the first audio data to generate first speech processing results, the first speech processing results including contextual information indicating a context to which the first utterance relates;
  
  determining a first response using the first speech processing results, wherein the first response comprises a name of a content item;
  
  generating a profanity score for the name using an output filter model, the name, and the contextual information;
  
  identifying a sensitivity threshold for users having the characteristic;
  
  determining the profanity score for the name exceeds the sensitivity threshold;
  
  generating a first output using the first response, wherein (i) a portion of the first output corresponding to the name of the content item is modified or (ii) the name of the content item is modified before generating the first output; and
  
  transmitting the first output to a client device.
- Dependent Claims (6, 7, 8, 9, 10, 11, 12, 13, 14)
  - 6. The computer-implemented method of claim 5, wherein the first user utterance is obtained in response to detecting a keyword in the first input data.
  - 7. The computer-implemented method of claim 5, wherein determining the profanity score comprises determining that the name of the content item comprises one or more of graphic language, violence, or sexual content.
  - 8. The computer-implemented method of claim 5, wherein the output filter model comprises a list of words that trigger modification of an output.
  - 9. The computer-implemented method of claim 5, wherein the output filter model is a user-specific output filter model determined using one or more of information about the user, user preferences, or prior user interactions.
  - 10. The computer-implemented method of claim 5, wherein the characteristic of the user is one or more of an identity of the user, age, gender, sex, language, culture, or religion.
  - 11. The computer-implemented method of claim 5, wherein modifying the name of the content item comprises replacing at least a portion of the name of the content item with another word or phrase.
  - 12. The computer-implemented method of claim 5, wherein the sensitivity threshold is based on a lowest sensitivity threshold from one or more users in proximity of the client device.
  - 13. The computer-implemented method of claim 5, wherein the first output comprises one or more of text or audio data.
  - 14. The computer-implemented method of claim 5, wherein the method further comprises:
    - generating the output model using training data;
      
      obtaining second input data regarding a user action, wherein the user action is based in part on the first output; and
      
      cause retraining the output filter model based on the second input data regarding the user action.
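
Claim 12 bases the sensitivity threshold on the lowest threshold among users near the client device. A minimal Python sketch of that selection follows. The user identifiers, the threshold table, and the choice to treat unrecognized users as maximally sensitive are assumptions made for illustration.

```python
from typing import Iterable, Mapping


def effective_threshold(nearby_users: Iterable[str],
                        thresholds: Mapping[str, float],
                        default: float = 0.0) -> float:
    """Pick the lowest sensitivity threshold among the users present.

    Assumption: a user with no known threshold, or an empty room,
    falls back to `default`, the most restrictive setting.
    """
    return min((thresholds.get(user, default) for user in nearby_users),
               default=default)
```

For example, with thresholds of 0.7 for a parent and 0.2 for a child, the function returns 0.2 whenever both are present, so the output is filtered for the most sensitive listener.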

15. One or more non-transitory computer readable media comprising executable code that, when executed, cause one or more computing devices to perform a process comprising:
- obtaining first input data regarding a first user utterance;
  
  performing speech processing on the first input data to generate first speech processing results, the first speech processing results including contextual information indicating a context to which the first utterance relates;
  
  determining a first response to the first user utterance using the first speech processing results, wherein the first response comprises a name of a content item;
  
  generating a profanity score for the name using an output filter model, the name, and the contextual information;
  
  identifying a sensitivity threshold using the first input data;
  
  determining the profanity score for the name exceeds the sensitivity threshold; and
  
  generating a first output using the first response, wherein (i) a portion of the first output corresponding to the name of the content item is modified or (ii) the name of the content item is modified before generating the first output.
- Dependent Claims (16, 17, 18, 19, 20, 21, 22, 23)
  - 16. The one or more non-transitory computer readable media of claim 15, wherein determining the profanity score comprises determining that the name of the content item comprises one or more of graphic language, violence, or sexual content.
  - 17. The one or more non-transitory computer readable media of claim 15, wherein the output filter model comprises a list of words that trigger modification of an output.
  - 18. The one or more non-transitory computer readable media of claim 15, wherein the output filter model is a user-specific output filter model determined using one or more of information about the user, user preferences, or prior user interactions.
  - 19. The one or more non-transitory computer readable media of claim 15, wherein the process further comprises obtaining a characteristic for one or more users in proximity to a user device providing the first utterance, wherein identifying the sensitivity threshold is further based on the characteristic.
  - 20. The one or more non-transitory computer readable media of claim 15, wherein modifying the first output involves replacing at least a portion of the first output with a non-offensive word or phrase.
  - 21. The one or more non-transitory computer readable media of claim 15, wherein the sensitivity threshold is based on a lowest sensitivity threshold from one or more users in proximity to a user device providing the first utterance.
  - 22. The one or more non-transitory computer readable media of claim 15, wherein the generated first output comprises one or more of text or audio data.
  - 23. The one or more non-transitory computer readable media of claim 15, wherein the process further comprises:
    - generating the output model using training data;
      
      obtaining second input data regarding a user action, wherein the user action is based in part on the first output; and
      
      cause retraining the output filter model based on second input data regarding the user action.
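
Claims 17 and 20 describe a filter model that includes a list of trigger words and a modification that replaces part of the output with a non-offensive word or phrase. The Python sketch below shows one way such a list-driven replacement could look. The `REPLACEMENTS` table and the `soften` function are hypothetical, and whole-word, case-insensitive matching is an assumption rather than something the claims specify.

```python
import re

# Hypothetical trigger words mapped to non-offensive substitutes.
REPLACEMENTS = {"damn": "darn", "hell": "heck"}

# Match any trigger word as a whole word, ignoring case.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, REPLACEMENTS)) + r")\b",
    re.IGNORECASE,
)


def soften(text: str) -> str:
    """Replace each trigger word with its substitute."""
    def swap(match: re.Match) -> str:
        word = match.group(0)
        substitute = REPLACEMENTS[word.lower()]
        # Preserve a leading capital so titles still read naturally.
        return substitute.capitalize() if word[0].isupper() else substitute

    return _PATTERN.sub(swap, text)
```

Because matching is limited to whole words, a title containing "hello" is left untouched while "Hell" on its own is replaced.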


Current Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Original Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Inventors: Alix-Gaudreau, Roger; Mois, Remus Razvan; Kuklinski, Rafal; Murman, Derek Christopher; Schaaf, Thomas; Kshirsagar, Sumedha Arvind
Primary Examiner(s): Godbold, Douglas
Application Number: US14/223,648
Time in Patent Office: 862 Days
Field of Search: None
US Class Current: 1/1
CPC Class Codes:
G06F 40/20   Natural language analysis s...
G06F 40/253   Grammatical analysis; Style...
G10L 13/00   Speech synthesis; Text to s...
G10L 15/08   Speech classification or se...
G10L 15/22   Procedures used during a sp...
