Synthewiser (TM): Document-synthesizing search method
Abstract
“Synthewiser”™ is a search method and system that synthesizes a single non-template, text-based document that is organized by topic and integrates and consolidates information from multiple sources. This is accomplished by: having a user provide a search phrase; creating seed phrases; identifying seed locations in multiple sources; creating expanded text segments; grouping expanded text segments; consolidating content; and synthesizing a single document. Synthewiser has advantages over today's dominant search engine. Its results are organized by topic and are integrated across multiple sources.
13 Claims
1. A search method and system that produces a document synthesized from multiple sources, comprising:
having a user provide a search phrase;
creating seed phrases, wherein a seed phrase can be the search phrase and also can be a minor variation on the search phrase;
identifying seed locations in multiple sources, wherein seed locations are locations where a seed phrase appears;
creating expanded text segments, wherein an expanded text segment is created for each seed location and each expanded text segment contains a seed phrase;
grouping expanded text segments, wherein expanded text segments are grouped into sets based on content similarity; and
synthesizing a document, wherein this document has content from some, or all, of these sets of expanded text segments and wherein this content is organized by set.
Dependent claims: 2, 3, 4, 5.
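The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the naive singular/plural seed variation, the sentence-based expansion, and the 0.3 word-overlap threshold are all assumptions chosen only to make the flow concrete.

```python
import re

def seed_phrases(search_phrase):
    # Seed phrases: the search phrase plus one illustrative minor variation
    # (a naive singular/plural toggle; an assumption, not the patent's rule).
    p = search_phrase.strip().lower()
    variant = p[:-1] if p.endswith("s") else p + "s"
    return {p, variant}

def seed_locations(sources, seeds):
    # Return (source_name, start, end) for every place a seed phrase appears.
    # Longer seeds are tried first so "solar panels" wins over "solar panel".
    pattern = re.compile(
        "|".join(re.escape(s) for s in sorted(seeds, key=len, reverse=True)),
        re.IGNORECASE,
    )
    return [(name, m.start(), m.end())
            for name, text in sources.items()
            for m in pattern.finditer(text)]

def expand(text, start, end):
    # Expand outward from the seed to the enclosing sentence (., ! or ?).
    left = max(text.rfind(c, 0, start) for c in ".!?") + 1
    rights = [i for i in (text.find(c, end) for c in ".!?") if i != -1]
    right = min(rights) + 1 if rights else len(text)
    return text[left:right].strip()

def words(segment):
    return set(re.findall(r"[a-z']+", segment.lower()))

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def group_segments(segments, threshold=0.3):
    # Greedy grouping: a segment joins the first set whose first member
    # shares enough words with it; otherwise it starts a new set.
    sets = []
    for seg in segments:
        for group in sets:
            if jaccard(words(seg), words(group[0])) >= threshold:
                group.append(seg)
                break
        else:
            sets.append([seg])
    return sets

def synthesize(sets):
    # Organize the output by set, one plain-text heading per set.
    parts = []
    for i, group in enumerate(sets, 1):
        parts.append(f"Set {i}")
        parts.extend(group)
    return "\n".join(parts)

def search(search_phrase, sources):
    seeds = seed_phrases(search_phrase)
    segments = [expand(sources[name], s, e)
                for name, s, e in seed_locations(sources, seeds)]
    return synthesize(group_segments(segments))
```

Calling `search("solar panel", sources)` with a dictionary that maps hypothetical source names to their text returns one string in which the expanded text segments appear under per-set headings.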
6. A search method and system that produces a single document synthesized from multiple sources, comprising:
having a user provide a search phrase;
creating seed phrases, wherein a seed phrase can be the search phrase and also can be a minor variation on the search phrase;
identifying seed locations in multiple sources, wherein seed locations are locations where a seed phrase appears;
creating expanded text segments, wherein an expanded text segment is created for each seed location and each expanded text segment contains a seed phrase;
grouping expanded text segments, wherein expanded text segments are grouped into sets based on content similarity;
consolidating content, wherein sets with substantially redundant content are consolidated and wherein expanded text segments, or portions of expanded text segments, with substantially redundant content are consolidated; and
synthesizing a single document, wherein this single document has content from some, or all, of these sets of expanded text segments and wherein this content is organized by set.
Dependent claims: 7, 8, 9, 10, 11, 12.
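The consolidation step that claim 6 adds can be sketched as follows. This is an illustration under stated assumptions, not the claimed method itself: "substantially redundant" is approximated here by the percentage of shared words relative to the smaller word set, the 0.8 threshold is arbitrary, and the function names are hypothetical.

```python
import re

def words(text):
    return set(re.findall(r"[a-z']+", text.lower()))

def overlap(a, b):
    # Share of the smaller word set that also appears in the other text.
    wa, wb = words(a), words(b)
    smaller = min(len(wa), len(wb))
    return len(wa & wb) / smaller if smaller else 0.0

def consolidate_segments(segments, threshold=0.8):
    # Keep the longest segment of each cluster of redundant segments.
    kept = []
    for seg in sorted(segments, key=len, reverse=True):
        if all(overlap(seg, k) < threshold for k in kept):
            kept.append(seg)
    return kept

def consolidate_sets(sets, threshold=0.8):
    # First merge sets whose combined text is redundant, then remove
    # redundant segments inside each merged set.
    merged = []
    for group in sets:
        for target in merged:
            if overlap(" ".join(group), " ".join(target)) >= threshold:
                target.extend(group)
                break
        else:
            merged.append(list(group))
    return [consolidate_segments(g, threshold) for g in merged]
```

Keeping the longest segment is only one possible policy; merging the non-overlapping portions of redundant segments would also fit the claim language about consolidating "portions of expanded text segments".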
13. A search method and system that produces a single document synthesized from multiple sources, comprising:
having a user provide a search phrase;
creating seed phrases, wherein seed phrases include the search phrase and also include minor variations on the search phrase, and wherein one or more minor variations are selected from the group consisting of:
a phrase with words that are corrected or alternative spelling variations of the words in the search phrase;
a phrase with words that are grammatical variations (such as variation in tense, plurality, or voice) of the words comprising the search phrase;
a phrase with words that are the same as those comprising the search phrase, except for the addition or deletion of grammatical articles (such as “a” or “an” or “the”) or relatively-neutral modifiers (such as “very” or “especially”);
a phrase with words that are the same as those comprising the search phrase, but are in a different word order;
a phrase with words that are the same as those comprising the search phrase, except for case variation (such as upper vs. lower case) in one or more letters in the search phrase;
a phrase with the same words as those comprising the search phrase, but with variation in punctuation or word contraction; and
a phrase that is a phrase synonym for the search phrase, wherein a phrase synonym is defined as an alternative phrase that can be substituted for an original phrase in multiple sources without substantively changing meaning or creating a grammatical error in those sources;
identifying seed locations in multiple sources, wherein seed locations are locations where a seed phrase appears;
creating expanded text segments, wherein an expanded text segment is created for each seed location and each expanded text segment contains a seed phrase, and wherein a text segment is defined using one or more definitions selected from the group including:
(a) the expanded text segment includes characters spanning a first location, wherein this first location is a certain number of characters, words, sentences, or paragraphs backwards from the seed phrase, and a second location, wherein this second location is a certain number of characters, words, sentences, or paragraphs forwards from the seed phrase;
(b) the expanded text segment includes characters spanning a first location, wherein this first location expands backwards from the seed phrase until stop criteria based on the length or content of the characters in this backwards expansion are satisfied, and a second location, wherein this second location expands forwards from the seed phrase until stop criteria based on the length or content of the characters in the forwards expansion are satisfied; and
(c) the expanded text segment includes characters spanning a first location, wherein this first location expands backwards until one or more key characters or character strings are found, and a second location, wherein this second location expands forwards from the seed phrase until one or more key characters or character strings are found;
grouping expanded text segments, wherein expanded text segments are grouped into sets based on content similarity, and wherein this grouping is done based on one or more criteria selected from the group consisting of:
number of shared words, phrases, or minor variations on word phrases among expanded text segments;
frequencies of shared words, phrases, or minor variations on word phrases among expanded text segments;
percentage of shared words, phrases, or minor variations on word phrases among expanded text segments;
types of shared words, phrases, or minor variations on word phrases among expanded text segments;
order of shared words, phrases, or minor variations on word phrases among expanded text segments;
number of non-shared words, phrases, or minor variations on word phrases among expanded text segments;
frequencies of non-shared words, phrases, or minor variations on word phrases among expanded text segments;
percentage of non-shared words, phrases, or minor variations on word phrases among expanded text segments;
types of non-shared words, phrases, or minor variations on word phrases among expanded text segments;
order of non-shared words, phrases, or minor variations on word phrases among expanded text segments;
semantic analysis of content similarity among expanded text segments; and
Bayesian statistical analysis of content similarity among expanded text segments;
consolidating content, wherein sets with substantially redundant content are consolidated and wherein expanded text segments, or portions of expanded text segments, with substantially redundant content are consolidated; and
wherein identification of sets, expanded text segments, or portions of expanded text segments with substantially redundant content is based on one or more criteria selected from the group consisting of:
number of shared words, phrases, or minor variations on word phrases;
frequencies of shared words, phrases, or minor variations on word phrases;
percentage of shared words, phrases, or minor variations on word phrases;
types of shared words, phrases, or minor variations on word phrases;
order of shared words, phrases, or minor variations on word phrases;
number of non-shared words, phrases, or minor variations on word phrases;
frequencies of non-shared words, phrases, or minor variations on word phrases;
percentage of non-shared words, phrases, or minor variations on word phrases;
types of non-shared words, phrases, or minor variations on word phrases;
order of non-shared words, phrases, or minor variations on word phrases;
semantic analysis of content similarity; and
Bayesian statistical analysis of content similarity; and
synthesizing a single document, wherein some, or all, of the post-consolidation sets of expanded text segments are selected for inclusion in the document and wherein the post-consolidation expanded text segments for those selected sets are grouped by set and included in the document.
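Two of the options enumerated in claim 13 lend themselves to a short sketch: generating minor variations of the search phrase (case, article deletion, word order) and defining an expanded text segment by a fixed word count (definition a) or by key character strings (definition c). The helper names, the default window sizes, and the choice of ". " and newline as key strings are assumptions made for illustration only.

```python
import itertools

ARTICLES = {"a", "an", "the"}

def minor_variations(search_phrase):
    tokens = search_phrase.split()
    variants = {search_phrase}
    # Case variation.
    variants.update({search_phrase.lower(), search_phrase.upper(),
                     search_phrase.title()})
    # Deletion of grammatical articles.
    without_articles = [t for t in tokens if t.lower() not in ARTICLES]
    if without_articles:
        variants.add(" ".join(without_articles))
    # Different word order (short phrases only; permutations grow factorially).
    if len(tokens) <= 4:
        variants.update(" ".join(p) for p in itertools.permutations(tokens))
    return variants

def expand_by_words(text, start, end, n_back=5, n_forward=5):
    # Definition (a): a fixed number of words before and after the seed.
    # Whitespace between words is normalized to single spaces.
    before = text[:start].split()
    after = text[end:].split()
    back = before[max(0, len(before) - n_back):]
    return " ".join(back + [text[start:end]] + after[:n_forward])

def expand_to_keys(text, start, end, keys=(". ", "\n")):
    # Definition (c): expand in each direction until a key string is found.
    lefts = [i + len(k) for k in keys
             if (i := text.rfind(k, 0, start)) != -1]
    rights = [i for i in (text.find(k, end) for k in keys) if i != -1]
    left = max(lefts, default=0)
    right = min(rights, default=len(text))
    return text[left:right]
```

Here `start` and `end` are the character offsets of a seed location in the source text, for example the span of a regular-expression match.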
Specification