Identification of semantic units from within a search query
Abstract
A search engine for searching a corpus improves the relevancy of the results by classifying multiple terms in a search query as a single semantic unit. A semantic unit locator of the search engine generates a subset of documents that are generally relevant to the query based on the individual terms within the query. Combinations of search terms that define potential semantic units from the query are then evaluated against the subset of documents to determine which combinations of search terms should be classified as a semantic unit. The resultant semantic units are used to refine the results of the search.
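The first stage described in the abstract, building a subset of documents that are generally relevant based on the individual query terms, might be sketched as follows. This is only an illustration under stated assumptions: the whitespace tokenizer, the term-overlap ranking, the `top_k` cutoff, and the toy `corpus` are all hypothetical and are not taken from the patent.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def retrieve_by_terms(query, corpus, top_k=3):
    """Rank documents by how many distinct query terms they contain.

    Assumption: simple term overlap stands in for whatever relevance
    measure a real search engine would use.
    """
    terms = set(tokenize(query))
    scored = []
    for doc_id, text in corpus.items():
        overlap = len(terms & set(tokenize(text)))
        if overlap:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by doc_id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


# Hypothetical toy corpus, for illustration only.
corpus = {
    "d1": "cheap hotels in new york city",
    "d2": "new car dealers in york",
    "d3": "gardening tips for spring",
}
subset = retrieve_by_terms("new york hotels", corpus)
```

Note that `d2` is retrieved even though it is about the city of York rather than New York, which is the kind of result that treating "new york" as a single semantic unit is meant to push down or remove.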
19 Claims
1. A method comprising:
identifying, by one or more devices, documents relating to a query;
generating, by the one or more devices, a plurality of substrings from the query;
calculating, by the one or more devices and for a particular substring of the plurality of substrings, a value that corresponds to a comparison between the identified documents and the particular substring;
determining, by the one or more devices, that the calculated value for the particular substring satisfies a particular threshold associated with identifying compounds;
selecting, by the one or more devices and from two or more of the plurality of substrings, the particular substring as a semantic unit based on the calculated value for the particular substring satisfying the particular threshold; and
obtaining, by the one or more devices, a refined list of documents by refining the identified documents based on the semantic unit.
Dependent claims: 2, 3, 4, 5, 6
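The "generating a plurality of substrings from the query" step could look something like the sketch below. It assumes that candidate substrings are contiguous runs of two or more whitespace-separated words; the claim does not state this restriction, and the function name `candidate_substrings` and the `min_words` parameter are hypothetical.

```python
def candidate_substrings(query, min_words=2):
    """Return every contiguous run of at least `min_words` words in the query.

    Assumption: words are separated by whitespace and only multi-word runs
    are interesting as potential semantic units.
    """
    words = query.split()
    n = len(words)
    return [
        " ".join(words[i:j])
        for i in range(n)
        for j in range(i + min_words, n + 1)
    ]


candidates = candidate_substrings("new york hotels")
```

For a query of n words this produces n(n-1)/2 candidates when `min_words` is 2, which stays small for typical query lengths.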
7. A non-transitory computer-readable medium storing instructions, the instructions comprising:
one or more instructions that, when executed by at least one processor, cause the at least one processor to:
receive a search query;
identify documents based on the search query;
generate a plurality of substrings based on the search query;
calculate, for a particular substring of the plurality of substrings, a value based on the identified documents and the particular substring;
determine that the calculated value for the particular substring satisfies a particular threshold associated with identifying compounds;
select the particular substring, as a semantic unit, from the plurality of substrings based on the calculated value for the particular substring satisfying the particular threshold; and
refine the identified documents based on the semantic unit.
Dependent claims: 8, 9, 10, 11, 12, 13
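The final "refine the identified documents based on the semantic unit" step is not spelled out in the claim. One possible reading, sketched below purely as an assumption, is a stable re-ranking that moves documents containing the unit as an exact phrase ahead of the rest. The helper names `contains_phrase` and `refine` are hypothetical.

```python
def contains_phrase(text, phrase):
    """True if `phrase` occurs in `text` as a whole-word sequence."""
    # Normalize whitespace and pad with spaces so matches align to word edges.
    padded_text = " " + " ".join(text.lower().split()) + " "
    padded_phrase = " " + " ".join(phrase.lower().split()) + " "
    return padded_phrase in padded_text


def refine(documents, semantic_unit):
    """Stable partition: documents containing the unit as a phrase come first.

    Assumption: refinement means re-ranking; filtering out non-matching
    documents entirely would be another reasonable reading.
    """
    with_unit = [d for d in documents if contains_phrase(d, semantic_unit)]
    without_unit = [d for d in documents if not contains_phrase(d, semantic_unit)]
    return with_unit + without_unit


docs = [
    "new car dealers in york",
    "cheap hotels in new york city",
    "york new hotels",
]
refined = refine(docs, "new york")
```

Keeping the non-matching documents at the end, rather than dropping them, is a conservative choice that avoids returning an empty list when the phrase match is too strict.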
14. A system comprising:
a server, including a processor, to:
identify documents relating to a query;
generate a plurality of substrings from the query;
calculate, for a particular substring of the plurality of substrings, a value relating to one or more documents, of the identified documents, that contain the particular substring;
determine that the calculated value for the particular substring satisfies a particular threshold associated with identifying compounds;
select, for a semantic unit, the particular substring from the plurality of substrings based on the calculated value for the particular substring satisfying the particular threshold; and
obtain a refined list of documents by refining the identified documents based on the semantic unit.
Dependent claims: 15, 16, 17, 18, 19
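Claim 14 ties the calculated value to the identified documents that contain the substring. A minimal sketch of one such value, assuming it is the fraction of identified documents containing the substring as an exact phrase, is shown below. The 0.5 threshold, the function names, and the sample documents are hypothetical; the claim only requires that some threshold associated with identifying compounds be satisfied.

```python
def normalize(text):
    """Lowercase, collapse whitespace, and pad with spaces for word-aligned matching."""
    return " " + " ".join(text.lower().split()) + " "


def containing_fraction(substring, documents):
    """Fraction of documents that contain `substring` as a whole-word phrase."""
    if not documents:
        return 0.0
    needle = normalize(substring)
    hits = sum(1 for text in documents if needle in normalize(text))
    return hits / len(documents)


def select_semantic_units(candidates, documents, threshold=0.5):
    """Keep candidates whose containing fraction meets the (assumed) threshold."""
    return [c for c in candidates if containing_fraction(c, documents) >= threshold]


# Hypothetical set of already-identified documents.
identified = [
    "cheap hotels in new york city",
    "new york subway map",
    "york hotels and new car dealers",
]
units = select_semantic_units(
    ["new york", "york hotels", "new york hotels"], identified
)
```

Here "new york" appears as a phrase in two of the three documents and passes, while "york hotels" (one of three) and "new york hotels" (none) fall below the assumed threshold.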
Specification