Specialized language identification
Abstract
Examples herein disclose multiple engines, each of which produces output representative of a summary of a document. The examples apply a weighting mechanism to the output specific to each engine to obtain a value corresponding to that output. The examples identify specialized language if the value corresponding to that output reaches at least a particular threshold.
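As a rough illustration of the weighting and threshold steps described in the abstract, the sketch below scores terms across three engine summaries. The weights, the threshold, the term-level scoring, and the names ENGINE_WEIGHTS, score_terms, and specialized_terms are all assumptions made for illustration; the source does not define a particular weighting mechanism.

```python
from collections import Counter

# Hypothetical integer weights per engine type (not taken from the source).
ENGINE_WEIGHTS = {"extractive": 2, "abstractive": 1, "frequency": 3}


def score_terms(outputs, weights=ENGINE_WEIGHTS):
    """Sum, for each term, the weight of every engine whose summary contains it."""
    scores = Counter()
    for engine, summary in outputs.items():
        # Each engine contributes at most once per term.
        for term in set(summary.lower().split()):
            scores[term] += weights[engine]
    return scores


def specialized_terms(outputs, threshold):
    """Return the terms whose weighted value reaches at least the threshold."""
    scores = score_terms(outputs)
    return sorted(term for term, value in scores.items() if value >= threshold)


# Made-up engine summaries of a single document.
outputs = {
    "extractive": "the stent reduces restenosis",
    "abstractive": "stent lowers restenosis risk",
    "frequency": "stent restenosis angioplasty",
}
print(specialized_terms(outputs, threshold=5))  # ['restenosis', 'stent']
```

Here a term counts as specialized only when enough weighted engines agree on it; a real system would likely use a richer value than simple presence in a summary.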
10 Claims
1. A system comprising:
multiple engines that are each to produce output representative of a summary of the document, wherein each one of the multiple engines applies a different type of engine selected from a group of engines comprising an extractive type of engine, an abstractive type of engine, and a frequency type of engine, wherein the output from each of the multiple engines varies between the multiple engines in accordance with a respective type of engine;
a composite engine to generate a filtered set of content in a single output to reduce a size of the output produced by the multiple engines, wherein the filtered set of content comprises different combinations of the output from the multiple engines that have different densities of specialized word usage;
an identification engine to:
apply a weighting mechanism to the different combinations of the output in the filtered set of content;
obtain a value corresponding to the different combinations of the output in the filtered set of content;
identify specialized language from the different combinations of the output in the filtered set of content, wherein the value corresponding to the different combinations of the output in the filtered set of content reaching at least a particular threshold indicates specialized language within that output; and
index the document based on the specialized language that is identified to identify other documents salient to the document based on the specialized language.
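The composite engine of claim 1 can be pictured with a minimal sketch, assuming that a "combination" is the set of terms shared by a group of engines and that "density of specialized word usage" is the share of those terms missing from a common-word list. Both readings, the COMMON_WORDS list, and the function name filtered_combinations are hypothetical; the claim does not fix any of them.

```python
from itertools import combinations

# Hypothetical list of everyday words; anything else is treated as specialized.
COMMON_WORDS = frozenset({"the", "a", "of", "and", "risk", "reduces", "lowers"})


def filtered_combinations(outputs, common_words=COMMON_WORDS):
    """Map each group of engines to (shared terms, density of uncommon terms)."""
    token_sets = {name: set(text.lower().split()) for name, text in outputs.items()}
    result = {}
    for size in range(2, len(token_sets) + 1):
        for combo in combinations(sorted(token_sets), size):
            # Keeping only terms shared by the group shrinks the combined output.
            shared = set.intersection(*(token_sets[name] for name in combo))
            if not shared:
                continue
            uncommon = shared - common_words
            result[combo] = (sorted(shared), len(uncommon) / len(shared))
    return result


# Made-up engine summaries of a single document.
summaries = {
    "extractive": "the stent reduces restenosis risk",
    "abstractive": "stent lowers restenosis risk",
    "frequency": "stent restenosis angioplasty",
}
for combo, (terms, density) in filtered_combinations(summaries).items():
    print(combo, terms, round(density, 2))
```

Each entry pairs a group of engines with a reduced term set and a density figure, which is the kind of input a later weighting step could consume.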
3. A method comprising:
receiving an output from multiple engines, wherein each engine of the multiple engines is to produce an output representative of a summary of a document specific to that engine based on a different type of engine selected from a group of engines comprising an extractive type of engine, an abstractive type of engine, and a frequency type of engine;
generating a filtered set of content in a single output to reduce a size of the output produced by the multiple engines, wherein the filtered set of content comprises different combinations of the output from the multiple engines that have different densities of specialized word usage;
applying a weighting mechanism to the different combinations of the output in the filtered set of content to obtain a value corresponding to each one of the different combinations of the output in the filtered set of content;
identifying jargon when the value corresponding to each one of the different combinations of the output in the filtered set of content reaches at least a particular threshold; and
indexing the document based on the jargon that is identified to identify other documents salient to the document based on the jargon.
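The indexing step of claim 3 could be sketched as an inverted index keyed on jargon terms. This is only one plausible reading of "indexing the document based on the jargon"; the class JargonIndex, its methods, and the ranking by number of shared terms are assumptions for illustration.

```python
from collections import defaultdict


class JargonIndex:
    """Hypothetical inverted index from jargon term to document ids."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, jargon_terms):
        """Record that doc_id contains each of the identified jargon terms."""
        for term in jargon_terms:
            self._postings[term.lower()].add(doc_id)

    def related(self, doc_id):
        """Rank other documents by how many jargon terms they share with doc_id."""
        counts = defaultdict(int)
        for docs in self._postings.values():
            if doc_id in docs:
                for other in docs:
                    if other != doc_id:
                        counts[other] += 1
        # Most shared terms first; ties broken by document id.
        return sorted(counts, key=lambda d: (-counts[d], d))


index = JargonIndex()
index.add("doc-a", ["stent", "restenosis"])
index.add("doc-b", ["stent"])
index.add("doc-c", ["restenosis", "stent"])
index.add("doc-d", ["tort"])
print(index.related("doc-a"))  # ['doc-c', 'doc-b']
```

Documents that share no jargon with the query document, such as doc-d here, never appear in the result.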
7. A non-transitory machine-readable storage medium comprising instructions that when executed by a processing resource cause a computing device to:
receive an output from multiple engines, wherein each engine of the multiple engines is to produce an output representative of a summary of a document specific to that engine based on a different type of engine selected from a group of engines comprising an extractive type of engine, an abstractive type of engine, and a frequency type of engine;
generate a filtered set of content in a single output to reduce a size of the output produced by the multiple engines, wherein the filtered set of content comprises different combinations of the output from the multiple engines that have different densities of specialized word usage;
apply a weighting mechanism to the different combinations of the output in the filtered set of content to obtain a value corresponding to each one of the different combinations of the output in the filtered set of content;
compare the values of the output from the multiple engines;
identify jargon based on the comparison of values; and
index the document based on the jargon that is identified to identify other documents salient to the document based on the jargon.
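Claim 7 identifies jargon by comparing values rather than by a fixed threshold. One hedged way to picture that comparison is to keep the combinations whose value exceeds the mean of all values. The mean-based rule, the function name jargon_by_comparison, and the sample values are hypothetical; the claim does not state how the comparison is performed.

```python
def jargon_by_comparison(values):
    """Return labels of combinations whose value exceeds the mean of all values.

    `values` maps a combination label to its weighted value (assumed numeric).
    """
    if not values:
        return []
    mean = sum(values.values()) / len(values)
    return sorted(label for label, value in values.items() if value > mean)


# Made-up weighted values for three combinations of engine output.
values = {
    "extractive+frequency": 6,
    "extractive+abstractive": 3,
    "abstractive+frequency": 3,
}
print(jargon_by_comparison(values))  # ['extractive+frequency']
```

A relative rule like this adapts to each document, whereas the threshold of claim 3 stays fixed across documents.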
Specification