System and method for processing document
Abstract
A method and computing system are provided for identifying a plurality of indicators of hierarchy within a document. A hierarchical level associated with each of the plurality of indicators may be determined. One or more portions of the document may be associated with a respective hierarchical level associated with each of the plurality of indicators.
12 Claims
1. A computer-implemented method comprising:
parsing, on a processor, text from at least a portion of a document, wherein the document does not include meta-data indicating a hierarchical structure within the document;
identifying, on the processor, a plurality of hierarchical indicators from the text parsed from the at least a portion of the document, wherein each hierarchical indicator includes one or more portions, the one or more portions including a prefix, a stem, or a suffix;
analyzing, on the processor, the one or more portions of each of the plurality of hierarchical indicators to determine an alphanumerical value associated with each of the plurality of hierarchical indicators;
analyzing, on the processor, the one or more portions of each of the plurality of hierarchical indicators to determine an alphanumerical numbering style associated with each of the plurality of hierarchical indicators;
determining, on the processor, a hierarchical level for each of the plurality of hierarchical indicators based upon, at least in part, one or more of the alphanumerical value and the alphanumerical numbering style associated with each of the plurality of hierarchical indicators, wherein the hierarchical level indicates a respective position within the hierarchical structure of the document and is determined for a respective hierarchical indicator prior to determining the hierarchical level associated with a following hierarchical indicator, wherein determining a hierarchical level associated with each of the plurality of hierarchical indicators further includes;
determining whether a current hierarchical indicator follows a preceding hierarchical indicator based upon, at least in part, one or more of the alphanumerical value and the alphanumerical numbering style associated with the current hierarchical indicator and the preceding hierarchical indicator, and
determining an alternative interpretation of one or more of the determined alphanumerical value and the determined alphanumerical numbering style associated with the preceding hierarchical indicator in response to determining that the current hierarchical indicator does not follow the preceding hierarchical indicator;
associating, on the processor, one or more portions of the document with the respective determined hierarchical level associated with each of the plurality of hierarchical indicators to determine a hierarchical structure for the document, wherein associating the one or more portions of the document with the respective hierarchical level includes associating meta-data with the one or more portions of the document, wherein the meta-data indicates the respective hierarchical level.
Dependent claims: 2, 3, 4
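The identifying and analyzing steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the regular expression, the style names, and the helpers `parse_indicator` and `interpretations` are hypothetical, and the sketch handles only single-part indicators such as "(a)", "iv." or "Section 1." (not multi-part ones such as "1.1"). It splits an indicator into a prefix, a stem, and a suffix, then lists every (numbering style, value) reading of the stem, so that an ambiguous stem such as "i" yields both a roman and an alphabetic reading.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

class Indicator(NamedTuple):
    prefix: str
    stem: str
    suffix: str

# Hypothetical pattern: optional word or "(" prefix, a stem, optional "." or ")" suffix.
INDICATOR_RE = re.compile(
    r"^\s*(?P<prefix>(?:Section|Article)\s+|\()?"
    r"(?P<stem>\d+|[ivxlcdm]+|[IVXLCDM]+|[a-z]|[A-Z])"
    r"(?P<suffix>[.)])?(?=\s|$)"
)

ROMAN = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"), (100, "c"),
         (90, "xc"), (50, "l"), (40, "xl"), (10, "x"), (9, "ix"),
         (5, "v"), (4, "iv"), (1, "i")]

def parse_indicator(line: str) -> Optional[Indicator]:
    """Split a leading indicator into prefix, stem, and suffix, or return None."""
    m = INDICATOR_RE.match(line)
    if m is None:
        return None
    prefix = (m.group("prefix") or "").strip()
    suffix = m.group("suffix") or ""
    if not prefix and not suffix:
        return None  # a bare word or letter is not treated as an indicator
    return Indicator(prefix, m.group("stem"), suffix)

def to_roman(n: int) -> str:
    out = []
    for value, symbol in ROMAN:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)

def roman_value(stem: str) -> Optional[int]:
    """Value of a well-formed roman numeral, else None."""
    s = stem.lower()
    total, pos = 0, 0
    for value, symbol in ROMAN:
        while s.startswith(symbol, pos):
            total += value
            pos += len(symbol)
    # The round trip rejects malformed numerals such as "iiii" or "vx".
    if pos == len(s) and total > 0 and to_roman(total) == s:
        return total
    return None

def interpretations(stem: str) -> List[Tuple[str, int]]:
    """All (numbering style, value) readings of a stem."""
    if stem.isdigit():
        return [("decimal", int(stem))]
    case = "lower" if stem.islower() else "upper"
    readings = []
    value = roman_value(stem)
    if value is not None:
        readings.append((f"{case}-roman", value))
    if len(stem) == 1:
        readings.append((f"{case}-alpha", ord(stem.lower()) - ord("a") + 1))
    return readings
```

Returning every reading, rather than committing to one, is a design choice of this sketch: it leaves room for a later step to pick an alternative interpretation when the sequence does not fit.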
5. A computer program product residing on a non-transitory computer readable medium having a plurality of instructions stored thereon, which, when executed by a processor cause the processor to perform operations comprising:
parsing text from at least a portion of a document, wherein the document does not include meta-data indicating a hierarchical structure within the document;
identifying a plurality of hierarchical indicators from the text parsed from the at least a portion of the document, wherein each hierarchical indicator includes one or more portions, the one or more portions including a prefix, a stem, or a suffix;
analyzing the one or more portions of each of the plurality of hierarchical indicators to determine an alphanumerical value associated with each of the plurality of hierarchical indicators;
analyzing the one or more portions of each of the plurality of hierarchical indicators to determine an alphanumerical numbering style associated with each of the plurality of hierarchical indicators;
determining a hierarchical level for each of the plurality of hierarchical indicators based upon, at least in part, one or more of the alphanumerical value and the alphanumerical numbering style associated with each of the plurality of hierarchical indicators, wherein the hierarchical level indicates a respective position within the hierarchical structure of the document and is determined for a respective hierarchical indicator prior to determining the hierarchical level associated with a following hierarchical indicator, wherein determining a hierarchical level associated with each of the plurality of hierarchical indicators further includes;
determining whether a current hierarchical indicator follows a preceding hierarchical indicator based upon, at least in part, one or more of the alphanumerical value and the alphanumerical numbering style associated with the current hierarchical indicator and the preceding hierarchical indicator, and
determining an alternative interpretation of one or more of the determined alphanumerical value and the determined alphanumerical numbering style associated with the preceding hierarchical indicator in response to determining that the current hierarchical indicator does not follow the preceding hierarchical indicator; and
associating one or more portions of the document with the respective hierarchical level associated with each of the plurality of hierarchical indicators to determine a hierarchical structure for the document, wherein associating the one or more portions of the document with the respective hierarchical level includes associating meta-data with the one or more portions of the document, wherein the meta-data indicates the respective hierarchical level.
Dependent claims: 6, 7, 8
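The "follows" test and the fallback to an alternative interpretation of the preceding indicator can be sketched as below. This is an illustration under stated assumptions, not the claimed method itself: an interpretation is modeled as a hypothetical `(style, value)` pair, "follows" is simplified to "same style, value one greater", and `reconcile` is an invented helper name.

```python
from typing import List, Optional, Tuple

Interp = Tuple[str, int]  # (numbering style, value); a hypothetical representation

def follows(current: Interp, preceding: Interp) -> bool:
    """Simplified rule: same numbering style and the next value in sequence."""
    return current[0] == preceding[0] and current[1] == preceding[1] + 1

def reconcile(
    chosen: Interp,
    preceding_options: List[Interp],
    current_options: List[Interp],
) -> Tuple[Interp, Optional[Interp]]:
    """Keep the chosen reading of the preceding indicator if the current
    indicator follows it; otherwise try the alternative readings.

    Returns (reading of preceding, reading of current), with None for the
    current reading when no combination forms a sequence.
    """
    candidates = [chosen] + [p for p in preceding_options if p != chosen]
    for prev in candidates:
        for cur in current_options:
            if follows(cur, prev):
                return prev, cur
    return chosen, None

# "i." was first read as roman numeral 1, but the next indicator is "j.",
# which only has an alphabetic reading (10). The roman reading does not fit,
# so the alphabetic reading of "i" (9) is adopted instead.
i_options = [("lower-roman", 1), ("lower-alpha", 9)]
j_options = [("lower-alpha", 10)]
print(reconcile(("lower-roman", 1), i_options, j_options))
```

A fuller treatment would also need to recognize a current indicator that opens a new sub-level rather than continuing the sequence; that case is deliberately left out here.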
9. A computing system comprising one or more processors configured to:
parse text from at least a portion of a document, wherein the document does not include meta-data indicating a hierarchical structure within the document;
identify a plurality of hierarchical indicators from the text parsed from the at least a portion of the document, wherein each hierarchical indicator includes one or more portions, the one or more portions including a prefix, a stem, or a suffix;
analyze the one or more portions of each of the plurality of hierarchical indicators to determine an alphanumerical value associated with each of the plurality of hierarchical indicators;
analyze the one or more portions of each of the plurality of hierarchical indicators to determine an alphanumerical numbering style associated with each of the plurality of hierarchical indicators;
determine a hierarchical level for each of the plurality of hierarchical indicators based upon, at least in part, one or more of the alphanumerical value and the alphanumerical numbering style associated with each of the plurality of hierarchical indicators, wherein the hierarchical level indicates a respective position within the hierarchical structure of the document and is determined for a respective hierarchical indicator prior to determining the hierarchical level associated with a following hierarchical indicator, wherein determining a hierarchical level associated with each of the plurality of hierarchical indicators further includes;
determining whether a current hierarchical indicator follows a preceding hierarchical indicator based upon, at least in part, one or more of the alphanumerical value and the alphanumerical numbering style associated with the current hierarchical indicator and the preceding hierarchical indicator, and
determining an alternative interpretation of one or more of the determined alphanumerical value and the determined alphanumerical numbering style associated with the preceding hierarchical indicator in response to determining that the current hierarchical indicator does not follow the preceding hierarchical indicator; and
associate one or more portions of the document with the respective hierarchical level associated with each of the plurality of hierarchical indicators to determine a hierarchical structure for the document, wherein associating the one or more portions of the document with the respective hierarchical level includes associating meta-data with the one or more portions of the document, wherein the meta-data indicates the respective hierarchical level.
Dependent claims: 10, 11, 12
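The level-determination and meta-data association steps can be sketched as a single pass over already-interpreted indicators. The stack-based rule below is an assumption made for illustration (the claims do not prescribe a data structure), and `assign_levels` and the `"level"` key are hypothetical names. Each indicator's level is fixed before the next one is examined, which mirrors the ordering the claims recite.

```python
from typing import Dict, List, Tuple

def assign_levels(items: List[Tuple[str, int, str]]) -> List[Dict[str, object]]:
    """Tag each (style, value, text) item with a hierarchical level.

    Assumed rule: an indicator that continues an open sequence (same style,
    next value) returns to that sequence's level and closes deeper levels;
    any other indicator opens a new, deeper level.
    """
    stack: List[Tuple[str, int]] = []  # (style, last value) per open level, outermost first
    tagged: List[Dict[str, object]] = []
    for style, value, text in items:
        # Search from the innermost open level outward for a sequence to continue.
        for depth in range(len(stack) - 1, -1, -1):
            open_style, last = stack[depth]
            if open_style == style and value == last + 1:
                del stack[depth + 1:]          # close any deeper levels
                stack[depth] = (style, value)  # advance this level's sequence
                break
        else:
            stack.append((style, value))       # no match: start a deeper level
        # Associate meta-data (here, a dict entry) with the document portion.
        tagged.append({"text": text, "level": len(stack)})
    return tagged

outline = [
    ("decimal", 1, "Definitions"),
    ("lower-alpha", 1, "Terms"),
    ("lower-alpha", 2, "Usage"),
    ("decimal", 2, "Scope"),
    ("lower-alpha", 1, "General"),
]
for entry in assign_levels(outline):
    print(entry["level"], entry["text"])
```

For the sample outline the levels come out as 1, 2, 2, 1, 2: the two decimal headings sit at the top level and the lettered items nest beneath them.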
Specification