Method and system for voice based media search

US 10,242,005 B2
Filed: 04/10/2018
Issued: 03/26/2019
Est. Priority Date: 10/31/2012
Status: Active Grant

Abstract

Voice-based input is used to operate a media device and/or to search for media content. Voice input is received by a media device via one or more audio input devices and is translated into a textual representation of the voice input. The textual representation of the voice input is used to search one or more cache mappings between input commands and one or more associated device actions and/or media content queries. One or more natural language processing techniques may be applied to the translated text and the resulting text may be transmitted as a query to a media search service. A media search service returns results comprising one or more content item listings and the results may be presented on a display to a user.
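
The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `CACHE`, `normalize`, `handle_transcript`, and `fake_search` are hypothetical, the cache key is simply the normalized text, and the media search service is stubbed as a local function.

```python
from typing import Callable, Dict, List

# Hypothetical cache mapping normalized input text to a device action name.
CACHE: Dict[str, str] = {
    "pause": "PAUSE_PLAYBACK",
    "go home": "SHOW_HOME_SCREEN",
}

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so equivalent phrasings share a key."""
    return " ".join(text.lower().split())

def handle_transcript(text: str, search_service: Callable[[str], List[str]]) -> dict:
    """Try the cache first; otherwise treat the text as a media content query."""
    key = normalize(text)
    if key in CACHE:
        return {"kind": "action", "action": CACHE[key]}
    return {"kind": "listings", "listings": search_service(key)}

def fake_search(query: str) -> List[str]:
    """Stand-in for a media search service (an assumption, not a real API)."""
    catalog = ["Star Trek", "Star Wars", "Nova"]
    return [title for title in catalog if query in title.lower()]
```

Here a cached phrase such as "go home" resolves to a device action without a search, while any other text is forwarded as a query and the returned listings would be shown to the user.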

20 Claims

1. A method comprising:
- receiving voice input data at a media device;
  
  determining whether the voice input corresponds to one of a plurality of commands;
  
  in response to determining that the voice input corresponds to one of the plurality of commands, executing the command;
  
  in response to determining that the voice input does not correspond to one of the plurality of commands, sending the voice input data to a speech-to-text service;
  
  receiving, by the media device from the speech-to-text service, a textual representation of at least a portion of the voice input data;
  
  generating a signature based on at least a portion of the textual representation;
  
  locating a particular data entry among a set of data entries by searching the set of data entries for a data entry matching the signature generated based on the at least a portion of the textual representation, each data entry of the set of data entries specifying a mapping between a given signature and one or more media device actions;
  
  updating the set of data entries by storing the mapping between the signature and the at least a portion of the textual representation;
  
  in response to locating the particular data entry among the set of data entries based on the generated signature, performing one or more particular media device actions associated with the particular data entry, the one or more particular media device actions including sending a media content query to a media search service;
  
  receiving, by the media device, one or more content item listings based on the media content query; and
  
  generating for display at least a portion of the one or more content item listings.
  - 2. The method of claim 1, wherein the textual representation of the at least a portion of the voice input data is a textual representation of an entire voice input data.
  - 3. The method of claim 1, wherein generating the signature based on the at least a portion of the textual representation, comprises generating the signature based on the entire textual representation.
  - 4. The method of claim 1, further comprising:
    - determining whether an unknown word exists in the at least a portion of the textual representation; and
      
      in response to determining that the unknown word exists in the at least a portion of the textual representation, visually distinguishing the unknown word from other words.
  - 5. The method of claim 1, further comprising:
    - generating for display the at least a portion of the textual representation;
      
      receiving a user selection of a portion of the at least a portion of the textual representation;
      
      receiving additional voice input data;
      
      transmitting the additional voice input data to the speech-to-text service;
      
      receiving from the speech-to-text service, an additional textual representation of the additional voice input data; and
      
      replacing the displayed portion of the at least a portion of the textual representation with the additional textual representation.
  - 6. The method of claim 1, wherein searching the set of data entries for the data entry matching the signature generated based on the at least a portion of the textual representation comprises determining that a data entry matches the signature generated based on a probability assigned to a data entry in a plurality of data entries.
  - 7. The method of claim 1, wherein receiving, by the media device, the one or more content item listings based on the media content query comprises:
    - determining a relevancy to the media content query of one or more content item listings based on a weighting of the one or more content item listings; and
      
      receiving the one or more content item listings based upon the determined relevancy.
  - 8. The method of claim 1, wherein receiving, by the media device, one or more content item listings based on the media content query comprises receiving the one or more content item listings based upon a relevancy of a search result in a plurality of search results to the media content query, wherein the relevancy is determined using search result filtering.
  - 9. The method of claim 1, wherein receiving, by the media device, the one or more content item listings based on the media content query comprises receiving the one or more content item listings from the media search service.
  - 10. The method of claim 1, further comprising:
    - determining whether an extraneous portion of the at least a portion of the textual representation exists; and
      
      in response to determining that the extraneous portion exists, removing the extraneous portion from the at least a portion of the textual representation.
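
The signature, lookup, and update steps of claim 1, together with the extraneous-portion removal of claim 10, might be sketched as below. Everything concrete here is an assumption for illustration: the claims do not say how a signature is computed, so a SHA-256 digest of the cleaned text is used, and `FILLER_WORDS`, `DataEntry`, and `EntryStore` are hypothetical names. Recording the matched text on the entry is only one possible reading of the "updating" step.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional

FILLER_WORDS = {"um", "uh", "please"}  # assumed examples of "extraneous" words

def strip_extraneous(text: str) -> str:
    """Drop filler words before matching (the word list is an assumption)."""
    return " ".join(w for w in text.lower().split() if w not in FILLER_WORDS)

def make_signature(text: str) -> str:
    """Assumed signature scheme: SHA-256 hex digest of the cleaned text."""
    return hashlib.sha256(strip_extraneous(text).encode("utf-8")).hexdigest()

@dataclass
class DataEntry:
    signature: str
    actions: List[str]
    text: Optional[str] = None  # filled in when a lookup hits this entry

@dataclass
class EntryStore:
    entries: Dict[str, DataEntry] = field(default_factory=dict)

    def add(self, text: str, actions: List[str]) -> None:
        """Register a mapping from the text's signature to device actions."""
        sig = make_signature(text)
        self.entries[sig] = DataEntry(sig, actions)

    def locate(self, text: str) -> Optional[DataEntry]:
        """Return the entry matching the text's signature, or None."""
        entry = self.entries.get(make_signature(text))
        if entry is not None:
            # One reading of the update step: store the text with the signature.
            entry.text = strip_extraneous(text)
        return entry
```

With an entry registered for "find comedies", the input "um please find comedies" produces the same signature after cleanup, so the lookup returns that entry and its actions (for example, sending a media content query) could then be performed.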

11. A system comprising:
- control circuitry configured to:
  
  receive voice input data at a media device;
  
  determine whether the voice input corresponds to one of a plurality of commands;
  
  in response to the determination that the voice input corresponds to one of the plurality of commands, execute the command;
  
  in response to the determination that the voice input does not correspond to one of the plurality of commands, send the voice input data to a speech-to-text service;
  
  receive, by the media device from the speech-to-text service, a textual representation of at least a portion of the voice input data;
  
  generate a signature based on at least a portion of the textual representation;
  
  locate a particular data entry among a set of data entries by searching the set of data entries for a data entry matching the signature generated based on the at least a portion of the textual representation, each data entry of the set of data entries specifying a mapping between a given signature and one or more media device actions;
  
  update the set of data entries by storing the mapping between the signature and the at least a portion of the textual representation;
  
  in response to the location of the particular data entry among the set of data entries based on the generated signature, perform one or more particular media device actions associated with the particular data entry, the one or more particular media device actions including sending a media content query to a media search service;
  
  receive, by the media device, one or more content item listings based on the media content query; and
  
  generate for display at least a portion of the one or more content item listings.
  - 12. The system of claim 11, wherein the textual representation of the at least a portion of the voice input data is a textual representation of an entire voice input data.
  - 13. The system of claim 11, wherein the control circuitry is further configured, when generating the signature based on the at least a portion of the textual representation, to generate the signature based on the entire textual representation.
  - 14. The system of claim 11, wherein the control circuitry is further configured to:
    - determine whether an unknown word exists in the textual representation; and
      
      in response to determining that the unknown word exists in the textual representation, visually distinguish the unknown word from other words.
  - 15. The system of claim 11, wherein the control circuitry is further configured to:
    - generate for display the at least a portion of the textual representation;
      
      receive a user selection of a portion of the at least a portion of the textual representation;
      
      receive additional voice input data;
      
      transmit the additional voice input data to the speech-to-text service;
      
      receive from the speech-to-text service, an additional textual representation of the additional voice input data; and
      
      replace the displayed portion of the at least a portion of the textual representation with the additional textual representation.
  - 16. The system of claim 11, wherein the control circuitry is further configured, when searching the set of data entries for the data entry matching the signature generated based on the at least a portion of the textual representation, to determine that a data entry matches the signature generated based on a probability assigned to a data entry in a plurality of data entries.
  - 17. The system of claim 11, wherein the control circuitry is further configured, when receiving the one or more content item listings based on the media content query, to:
    - determine a relevancy to the media content query of one or more content item listings based on a weighting of the one or more content item listings; and
      
      receive the one or more content item listings based upon the determined relevancy.
  - 18. The system of claim 11, wherein the control circuitry is further configured, when receiving the one or more content item listings based on the media content query, to receive the one or more content item listings based upon a relevancy of a search result in a plurality of search results to the media content query, wherein the relevancy is determined using search result filtering.
  - 19. The system of claim 11, wherein the control circuitry is further configured, when receiving the one or more content item listings based on the media content query, to receive the one or more content item listings from the media search service.
  - 20. The system of claim 11, wherein the control circuitry is further configured to:
    - determine whether an extraneous portion of the at least a portion of the textual representation exists; and
      
      in response to determining that the extraneous portion exists, remove the extraneous portion from the at least a portion of the textual representation.
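
The unknown-word marking of claims 4 and 14 and the select-and-replace correction of claims 5 and 15 could look something like the following. This is a hedged sketch only: `mark_unknown` and `replace_selection` are hypothetical helpers, square brackets stand in for whatever visual distinction a real display would use, and the vocabulary is a plain set supplied by the caller.

```python
from typing import Iterable, List, Set

def mark_unknown(words: Iterable[str], vocabulary: Set[str]) -> str:
    """Bracket words missing from the vocabulary so a UI could highlight them."""
    return " ".join(w if w.lower() in vocabulary else f"[{w}]" for w in words)

def replace_selection(words: List[str], start: int, end: int, correction: str) -> List[str]:
    """Replace words[start:end] with the words of a re-dictated correction."""
    return words[:start] + correction.split() + words[end:]
```

For instance, if the transcript came back as "watch star track", the user could select the last two words and dictate "star trek", and the displayed text would be rebuilt from the corrected word list.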

Current Assignee
TiVo Solutions Inc. (Adeia Inc.)
Original Assignee
TiVo Solutions Inc. (Adeia Inc.)
Inventors
Patel, Mukesh; Silverstein, Lu; Jandhyala, Srinivas
Primary Examiner(s)
Washburn, Daniel C
Assistant Examiner(s)
Nguyen, Timothy

Application Number

US15/949,754
Publication Number

US 20180232368A1
Time in Patent Office

350 Days
Field of Search

704/9
US Class Current
CPC Class Codes

G06F 16/40 of multimedia data, e.g. sl...

G06F 16/48 Retrieval characterised by ...
