Method and subsystem for searching media content within a content-search service system
First Claim
1. A concept-service component of a content-search-service system for searching a content item having an audio track, the concept-service component comprising:
- a hardware processor configured to:
receive, as input, a content ID and search query, wherein the content ID uniquely identifies the content item;
use the content ID to retrieve a category ID, ontology, vocabulary, and a transcript, wherein:
the category ID relates to a subject matter of the content item, and
the transcript includes a textual rendering of the audio track;
correct and linguistically normalize terms or phrases within the search query; and
use the linguistically normalized terms and phrases to process the transcript with a transcript scorer to assign a set of ontology-based scores, each ontology-based score of the set of ontology-based scores being associated with a different portion of the transcript, wherein the transcript-scorer:
prepares a list of term/ontology-metric pairs for each term or phrase in the linguistically normalized terms or phrases of the search query by:
identifying each entry in the ontology that includes the term or phrase paired with a second term; and
for each identified entry,
computing a co-occurrence metric as a combination of the co-occurrence values in the identified entry, and
adding an entry to the list that includes the second term and the computed co-occurrence metric; and
adding an entry to the list that includes the term and an identical-term co-occurrence metric; and
for each term or phrase in the transcript, associates a score with the term or phrase based on co-occurrence-metrics in the prepared lists of term/ontology-metric pairs; and
a memory coupled with the processor.
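The list-preparation step recited in claim 1 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `OntologyEntry` layout (two co-occurrence values per term pair), the averaging used as the "combination", and the constant `IDENTICAL_TERM_METRIC` are all assumptions, since the claim does not fix any of them.

```python
from typing import List, NamedTuple, Tuple


class OntologyEntry(NamedTuple):
    """Hypothetical ontology entry: a term pair with two co-occurrence values."""
    first: str
    second: str
    forward: float  # assumed: co-occurrence value for first -> second
    reverse: float  # assumed: co-occurrence value for second -> first


# Assumed constant; the claim only says an "identical-term co-occurrence metric".
IDENTICAL_TERM_METRIC = 1.0


def combine(entry: OntologyEntry) -> float:
    """Assumed combination of the entry's co-occurrence values: a plain average."""
    return (entry.forward + entry.reverse) / 2


def prepare_pairs(term: str, ontology: List[OntologyEntry]) -> List[Tuple[str, float]]:
    """Build the term/ontology-metric pair list for one normalized query term."""
    pairs = []
    for entry in ontology:
        # Identify entries that pair the query term with a second term.
        if entry.first == term and entry.second != term:
            pairs.append((entry.second, combine(entry)))
        elif entry.second == term and entry.first != term:
            pairs.append((entry.first, combine(entry)))
    # Finally, add the query term itself with the identical-term metric.
    pairs.append((term, IDENTICAL_TERM_METRIC))
    return pairs


ontology = [
    OntologyEntry("oil", "gas", 0.5, 0.25),
    OntologyEntry("price", "oil", 0.75, 0.25),
    OntologyEntry("gas", "price", 0.5, 0.5),  # does not involve "oil"; skipped
]
pairs = prepare_pairs("oil", ontology)
print(pairs)  # [('gas', 0.375), ('price', 0.5), ('oil', 1.0)]
```

One such list would be prepared for each term or phrase of the normalized search query.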
Abstract
Various embodiments of the present invention include concept-service components of content-search-service systems which employ ontologies and vocabularies prepared for particular categories of content at particular times in order to score transcripts prepared from content items to enable a search-service component of a content-search-service system to assign estimates of the relatedness of portions of a content item to search criteria in order to render search results to clients of the content-search-service system. The concept-service component processes a search request to generate lists of related terms, and then employs the lists of related terms to process transcripts in order to score transcripts based on information contained in the ontologies.
24 Claims
9. A method for searching for, and identifying, points in a transcribed media-content item related to a search query, the method comprising:
receiving, at a hardware processor, as input, a content ID and search query, wherein the content ID uniquely identifies a particular content item;
using the content ID to retrieve from a memory coupled with the hardware processor a category ID, ontology, vocabulary, and a transcript, wherein:
the category ID relates to a subject matter of the content item, and
the transcript includes a textual rendering of an audio track of the content item;
correcting and linguistically normalizing terms or phrases within the search query; and
using the linguistically normalized terms and phrases, at the hardware processor, to process the transcript to assign a set of ontology-based scores, each ontology-based score of the set of ontology-based scores being associated with a different portion of the transcript, wherein using the linguistically normalized terms and phrases to process the transcript to assign a set of ontology-based scores comprises:
preparing a list of term/ontology-metric pairs for each term or phrase in the linguistically normalized terms or phrases of the search query; and
for each term or phrase in the transcript, associating a score with the term or phrase based on co-occurrence-metrics in the prepared lists of term/ontology-metric pairs by:
identifying each entry in each list of term/ontology-metric pairs in which the ontology that includes the currently considered term or phrase;
when two or more entries are identified, adding the co-occurrence metrics of the identified entries together and computing a score from the sum;
when one entry is identified, using the co-occurrence metric in the identified entry as the score; and
associating the score with the currently considered term or phrase.
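The per-term scoring rule in claim 9 (one matching entry: use its metric; two or more: add the metrics and compute a score from the sum) might look like the sketch below. The function names are hypothetical, and two details are assumptions because the claim leaves them open: the score computed "from the sum" is taken here to be the sum capped at 1.0, and a transcript term with no matching entry is given a score of 0.0.

```python
from typing import List, Tuple

PairList = List[Tuple[str, float]]


def score_term(transcript_term: str, pair_lists: List[PairList]) -> float:
    """Score one transcript term against all prepared term/ontology-metric lists."""
    # Identify every entry, in every list, that contains the transcript term.
    metrics = [
        metric
        for pairs in pair_lists
        for term, metric in pairs
        if term == transcript_term
    ]
    if not metrics:
        return 0.0  # assumption: the claim does not cover the no-match case
    if len(metrics) == 1:
        return metrics[0]  # one entry identified: its metric is the score
    # Two or more entries: add the metrics, then derive a score from the sum.
    # Capping at 1.0 is an assumed choice of "score from the sum".
    return min(1.0, sum(metrics))


def score_transcript(terms: List[str], pair_lists: List[PairList]) -> List[Tuple[str, float]]:
    """Associate a score with each term of the transcript, in order."""
    return [(term, score_term(term, pair_lists)) for term in terms]


# One pair list per normalized query term (illustrative values only).
pair_lists = [
    [("gas", 0.375), ("price", 0.5), ("oil", 1.0)],
    [("gas", 0.25), ("price", 1.0)],
]
scored = score_transcript(["oil", "and", "gas"], pair_lists)
print(scored)  # [('oil', 1.0), ('and', 0.0), ('gas', 0.625)]
```

Here "gas" appears in both lists, so its metrics are added (0.375 + 0.25), while "oil" appears once and takes its single metric directly.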
17. A machine-readable storage media having a set of instructions for searching for, and identifying, points in a transcribed media-content item related to a search query, the instructions, when executed by at least one machine, cause the at least one machine to:
receive, as input, a content ID and search query, wherein the content ID uniquely identifies a particular content item;
use the content ID to retrieve a category ID, ontology, vocabulary, and a transcript, wherein:
the category ID relates to a subject matter of the content item, and
the transcript includes a textual rendering of an audio track of the content item;
correct and linguistically normalize terms or phrases within the search query; and
use the linguistically normalized terms and phrases to process the transcript to assign a set of ontology-based scores, each ontology-based score of the set of ontology-based scores being associated with a different portion of the transcript, wherein processing the transcript to assign a set of ontology-based scores comprises:
preparing a list of term/ontology-metric pairs for each term or phrase in the linguistically normalized terms or phrases of the search query, wherein preparing a list of term/ontology-metric pairs for each term or phrase in the linguistically normalized terms or phrases of the search query comprises:
identifying each entry in the ontology that includes the term or phrase paired with a second term; and
for each identified entry,
computing a co-occurrence metric as a combination of the co-occurrence values in the identified entry, and
adding an entry to the list that includes the second term and the computed co-occurrence metric; and
adding an entry to the list that includes the term and an identical-term co-occurrence metric; and
for each term or phrase in the transcript, associating a score with the term or phrase based on co-occurrence-metrics in the prepared lists of term/ontology-metric pairs.
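The first steps shared by the independent claims (retrieving a category ID, ontology, vocabulary, and transcript from a content ID, then correcting and normalizing the query) could be sketched as below. Everything concrete here is an assumption made for illustration: the in-memory stores `CONTENT_INDEX` and `CATEGORY_DATA`, keying the ontology and vocabulary by category, and the use of lowercasing plus fuzzy matching against the vocabulary as the correction step. The claims do not specify how correction or normalization is performed.

```python
import difflib
from typing import Dict, List, Tuple

# Hypothetical in-memory stores standing in for the memory the claims recite.
CONTENT_INDEX: Dict[str, dict] = {
    "video-123": {"category_id": "energy", "transcript": "oil and gas prices rose"},
}
CATEGORY_DATA: Dict[str, dict] = {
    "energy": {
        "ontology": [("oil", "gas", 0.5, 0.25)],
        "vocabulary": ["oil", "gas", "price"],
    },
}


def retrieve(content_id: str) -> Tuple[str, list, List[str], str]:
    """Use the content ID to look up category ID, ontology, vocabulary, transcript."""
    record = CONTENT_INDEX[content_id]
    category_id = record["category_id"]
    category = CATEGORY_DATA[category_id]
    return category_id, category["ontology"], category["vocabulary"], record["transcript"]


def normalize_query(query: str, vocabulary: List[str]) -> List[str]:
    """Assumed normalization: lowercase, strip punctuation, snap to vocabulary."""
    terms = []
    for raw in query.lower().split():
        word = raw.strip(".,;:!?\"'")
        if not word:
            continue
        # Correct the word to its closest vocabulary term, if one is close enough.
        match = difflib.get_close_matches(word, vocabulary, n=1, cutoff=0.8)
        terms.append(match[0] if match else word)
    return terms


category_id, ontology, vocabulary, transcript = retrieve("video-123")
normalized = normalize_query("Oill Prices!", vocabulary)
print(category_id, normalized)  # energy ['oil', 'price']
```

The normalized terms would then feed the list-preparation and transcript-scoring steps recited in the remainder of the claim.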
Specification