Segmenting and interpreting a document, and relocating document fragments to corresponding sections
Abstract
A system, comprising an input device configured to receive a first item and a second item, and a processor communicably coupled to the input device and configured to determine that the first item is a fragment matching a lexicon, and place the fragment in a section of a document, the section selected based on the matching lexicon, wherein the processor is configured to perform the determination and the placement after it receives the first item but before it receives the second item.
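The ordering in the abstract (match and place the first item before the second item arrives) can be illustrated with a generator that fully handles one item before pulling the next. This is a minimal sketch, not the patented implementation: the lexicon contents, the section names, and the helpers `match_lexicon` and `place_items` are hypothetical, and a "match" is assumed to mean that the item shares at least one word with a lexicon.

```python
from collections import defaultdict

# Hypothetical lexicons: section name -> terms associated with that section.
LEXICONS = {
    "medications": {"aspirin", "ibuprofen", "dosage"},
    "allergies": {"penicillin", "pollen", "rash"},
}


def match_lexicon(item, lexicons):
    """Return the first section whose lexicon shares a word with `item`, else None."""
    words = set(item.lower().split())
    for section, terms in lexicons.items():
        if words & terms:
            return section
    return None


def place_items(items, lexicons):
    """Place each item as it arrives; the next item is not read until this one is placed."""
    document = defaultdict(list)
    for item in items:
        section = match_lexicon(item, lexicons)
        if section is not None:
            document[section].append(item)
        # Yield a snapshot so the caller sees the document state after each placement.
        yield item, section, {name: list(frags) for name, frags in document.items()}


for item, section, snapshot in place_items(["aspirin daily", "rash on arm"], LEXICONS):
    print(item, "->", section, snapshot)
```

Because `place_items` is a generator, the snapshot for the first item is produced before the input iterable is asked for the second item.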
14 Claims
1. A system, comprising:
an input device configured to receive a first item and a second item; and
a processor communicably coupled to the input device and configured to:
determine that the first item is a fragment matching a lexicon;
place the fragment in a first section of a document, the first section selected based on the matching lexicon;
determine a section type for each fragment of multiple fragments in the first section;
determine a first quantity of first fragments of the multiple fragments and a second quantity of second fragments of the multiple fragments, wherein the first fragments correspond to a first section type of the first section and the second fragments correspond to a second section type of a second section of the document;
determine that the first quantity of the first fragments exceeds the second quantity of the second fragments by a predetermined quantity; and
based on exceeding the predetermined quantity, re-locate the second fragments to the second section in the document or reclassify the second fragments to correspond to the first section type.
Dependent claims: 2, 3, 4, 5, 6, 7
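The counting test at the end of claim 1 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the `Fragment` class and `rebalance` function are hypothetical, section names are assumed to double as section types, each fragment is assumed to already carry a type from some upstream classifier, and "exceeds by a predetermined quantity" is read as `first_count - second_count >= margin`.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Fragment:
    text: str
    section_type: str  # assumed to be assigned by an upstream classifier


def rebalance(document, first, second, margin, relocate=True):
    """Apply the count test to section `first`; return True if the test fired.

    With relocate=True the minority fragments move to section `second`;
    otherwise they stay in place and are relabeled as type `first`.
    """
    fragments = document[first]
    counts = Counter(f.section_type for f in fragments)
    # Assumed reading of "exceeds ... by a predetermined quantity".
    if counts[first] - counts[second] < margin:
        return False
    strays = [f for f in fragments if f.section_type == second]
    if relocate:
        document[first] = [f for f in fragments if f.section_type != second]
        document.setdefault(second, []).extend(strays)
    else:
        for f in strays:
            f.section_type = first
    return True


doc = {
    "history": [
        Fragment("prior surgery", "history"),
        Fragment("family history of asthma", "history"),
        Fragment("smoker for ten years", "history"),
        Fragment("follow up in two weeks", "plan"),
    ],
    "plan": [],
}
rebalance(doc, "history", "plan", margin=2)
print([f.text for f in doc["plan"]])
```

The `relocate` flag covers the two alternatives the claim allows: moving the second fragments to the second section, or keeping them and reclassifying them to the first section type.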
8. A computer program product comprising a non-transitory computer-readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to:
receive a stream of words;
dynamically match each of the words to one or more lexicons;
dynamically categorize each of the words into one or more sections of a document based on the matching one or more lexicons;
store the document to a hardware storage device;
determine a section type for each fragment of multiple fragments in a first section of the one or more sections;
determine a first quantity of first fragments of the multiple fragments and a second quantity of second fragments of the multiple fragments, wherein the first fragments correspond to a first section type of the first section and the second fragments correspond to a second section type of a second section of the one or more sections of the document;
determine that the first quantity of the first fragments exceeds the second quantity of the second fragments by a predetermined quantity; and
based on exceeding the predetermined quantity, re-locate the second fragments to the second section in the document or reclassify the second fragments to correspond to the first section type.
Dependent claims: 9, 10, 11, 12
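The first four steps of claim 8 (receive a word stream, match each word against one or more lexicons, categorize it into the matching sections, store the document) might look like the sketch below. The function names `categorize_stream` and `store`, the lexicon contents, and the JSON file format are assumptions made for illustration; the claim does not specify any of them.

```python
import json
from pathlib import Path


def categorize_stream(words, lexicons):
    """Route each incoming word to every section whose lexicon contains it."""
    document = {section: [] for section in lexicons}
    for word in words:  # words are handled one at a time, in arrival order
        token = word.lower()
        for section, terms in lexicons.items():
            if token in terms:
                # A word matching several lexicons lands in several sections.
                document[section].append(word)
    return document


def store(document, path):
    """Persist the document as JSON (the storage format is an assumption)."""
    Path(path).write_text(json.dumps(document, indent=2), encoding="utf-8")


lexicons = {
    "symptoms": {"fever", "rash"},
    "allergies": {"rash", "pollen"},
}
doc = categorize_stream(["Fever", "rash", "cough"], lexicons)
print(doc)
```

Words that match no lexicon (here `"cough"`) are simply dropped in this sketch; a fuller version would need a policy for unmatched input.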
13. A computer program product comprising a non-transitory computer-readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to:
receive a first item and a second item;
determine that the first item is a fragment matching a lexicon;
place the fragment in a first section of a document, the first section selected based on the matching lexicon;
segment the document into multiple sections, wherein each of the multiple sections corresponds to a respective section type of multiple section types;
segment items in a first section of multiple sections of the document into multiple fragments, wherein the first section corresponds to a first section type;
determine a section type of each of the multiple fragments in the first section;
determine whether the multiple fragments include fragments that correspond to different section types and that are interspersed among each other in even proportions; and
based on the multiple fragments in the first section including fragments that correspond to different section types and that are interspersed among each other in even proportions:
determine that the fragments that correspond to different section types and that are interspersed among each other in even proportions do not belong in the first section;
generate a new section corresponding to a section type that corresponds to a section type that is different than the multiple section types; and
re-locate the fragments that correspond to different section types and that are interspersed among each other in even proportions to the new section.
Dependent claims: 14
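Claim 13 does not define "interspersed among each other in even proportions", so the sketch below picks one possible reading and should be taken as an assumption throughout: proportions are "even" when the per-type counts differ by at most `tolerance`, and fragments are "interspersed" when there are more contiguous runs than distinct types. The helper names, the `"mixed"` section name, and the choice to move every fragment of the mixed section are likewise hypothetical.

```python
from collections import Counter
from itertools import groupby


def evenly_interspersed(types, tolerance=1):
    """True if several types occur in near-equal counts and alternate in sequence."""
    counts = Counter(types)
    if len(counts) < 2:
        return False
    even = max(counts.values()) - min(counts.values()) <= tolerance
    runs = sum(1 for _ in groupby(types))  # number of contiguous same-type runs
    return even and runs > len(counts)


def split_mixed_section(document, section, classify, new_section="mixed"):
    """Move an evenly mixed section's fragments to a newly created section.

    Returns the new section's name, or None if the section is left unchanged.
    """
    fragments = document[section]
    if not evenly_interspersed([classify(f) for f in fragments]):
        return None
    # Pick a name that differs from every existing section.
    name, suffix = new_section, 1
    while name in document:
        suffix += 1
        name = f"{new_section}_{suffix}"
    document[name] = fragments
    document[section] = []
    return name


# Toy classifier: the section type is the prefix before the colon.
classify = lambda fragment: fragment.split(":")[0]
doc = {"history": ["history:x", "plan:y", "history:z", "plan:w"], "plan": []}
print(split_mixed_section(doc, "history", classify), doc)
```

A section whose fragments are grouped by type (for example two "history" fragments followed by two "plan" fragments) is left alone under this reading, since its types are not interleaved.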
Specification