Structuring unstructured machine-generated content
First Claim
1. A system comprising:
a processing device; and
a memory operatively coupled to the processing device and storing instructions that, when executed by the processing device, cause the system to perform operations comprising:
receiving unstructured content;
processing the unstructured content to identify a first content segment;
identifying, within the first content segment, one or more parameters;
classifying the one or more parameters;
within the first content segment, substituting the one or more parameters for one or more type qualifiers determined to correspond to the one or more parameters, thereby generating a content segment skeleton, the content segment skeleton comprising a hashed representation that reflects an arrangement of the first content segment; and
based on the classifying of the one or more parameters, extracting the one or more parameters from the first content segment into a structured content element in a structured content format, wherein the structured content element includes the hashed representation of the content segment skeleton.
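The operations recited above can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent text: the two regex-based classifiers (`IP`, `INT`), the `<TYPE>` placeholder syntax for type qualifiers, SHA-256 as the hash, and the dictionary layout of the structured content element. The sketch also stores the plain skeleton next to its hash, which the claim does not require.

```python
import hashlib
import re

# Hypothetical parameter classifiers: (type qualifier, pattern).
# Order matters: earlier patterns win when both could match at one position.
PARAMETER_PATTERNS = [
    ("IP", r"\b\d{1,3}(?:\.\d{1,3}){3}\b"),
    ("INT", r"\b\d+\b"),
]
_COMBINED = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in PARAMETER_PATTERNS)
)


def structure_segment(segment: str) -> dict:
    """Turn one content segment (e.g. a log line) into a structured element."""
    parameters = []

    def substitute(match: re.Match) -> str:
        qualifier = match.lastgroup  # name of the classifier that matched
        parameters.append({"type": qualifier, "value": match.group()})
        return f"<{qualifier}>"  # the type qualifier replaces the parameter

    # Skeleton: the segment with every parameter replaced by its type qualifier.
    skeleton = _COMBINED.sub(substitute, segment)
    # Hashed representation of the skeleton's arrangement.
    skeleton_hash = hashlib.sha256(skeleton.encode("utf-8")).hexdigest()
    return {
        "skeleton": skeleton,
        "skeleton_hash": skeleton_hash,
        "parameters": parameters,
    }


first = structure_segment("Connection from 10.0.0.5 port 22 failed")
second = structure_segment("Connection from 192.168.1.9 port 8080 failed")
```

In this sketch, segments that differ only in their parameter values share one skeleton and therefore one hash, so the hash can serve as a grouping key for segments with the same arrangement.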
Abstract
Systems and methods are disclosed for structuring unstructured machine-generated content. In one implementation, unstructured content is received and processed to identify a first content segment. Parameter(s) within the first content segment are identified and classified. A content segment skeleton that reflects an arrangement of the first content segment is generated. Based on the classifying of the parameter(s), the parameter(s) are extracted from the first content segment into a structured content element in a structured content format, with the structured content element including a representation of the content segment skeleton. Based on the structured format, a query adapter is generated. Queries are executed via the query adapter and the structured format.
22 Citations
19 Claims
1. A system comprising:
a processing device; and
a memory operatively coupled to the processing device and storing instructions that, when executed by the processing device, cause the system to perform operations comprising:
receiving unstructured content;
processing the unstructured content to identify a first content segment;
identifying, within the first content segment, one or more parameters;
classifying the one or more parameters;
within the first content segment, substituting the one or more parameters for one or more type qualifiers determined to correspond to the one or more parameters, thereby generating a content segment skeleton, the content segment skeleton comprising a hashed representation that reflects an arrangement of the first content segment; and
based on the classifying of the one or more parameters, extracting the one or more parameters from the first content segment into a structured content element in a structured content format, wherein the structured content element includes the hashed representation of the content segment skeleton.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
15. A method comprising:
receiving unstructured content;
processing the unstructured content to identify a first content segment;
identifying, within the first content segment, one or more parameters;
classifying the one or more parameters;
within the first content segment, substituting the one or more parameters for one or more type qualifiers determined to correspond to the one or more parameters, thereby generating a content segment skeleton, the content segment skeleton comprising a hashed representation that reflects an arrangement of one or more type qualifiers within the first content segment;
based on the classifying, extracting the one or more parameters from the first content segment into a structured content element in a structured content format;
generating, based on the structured format, a query adapter;
receiving, via the query adapter, a search query;
identifying, with respect to the search query, a plurality of query results;
combining one or more of the identified query results to identify a structured content element that contains elements of the search query;
processing one or more parameters of the identified structured content element with respect to a content segment skeleton included in the structured content element to reconstruct the content segment from the unstructured content based on which the identified structured content element was generated; and
querying the reconstructed content segment with respect to the search query.
Dependent claims: 16, 17
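The reconstruction step of claim 15 can be sketched as the inverse of skeleton generation. This is an illustration under stated assumptions, not the patented implementation: the structured content element is assumed to carry a plain-text skeleton with hypothetical `<TYPE>` placeholders, plus its parameters in their original order. The function name `reconstruct_segment` and the element layout are likewise hypothetical.

```python
import re

_PLACEHOLDER = re.compile(r"<([A-Z]+)>")  # assumed type-qualifier syntax


def reconstruct_segment(skeleton: str, parameters: list) -> str:
    """Rebuild the original content segment from a skeleton and its parameters."""
    values = iter(parameters)

    def restore(match: re.Match) -> str:
        parameter = next(values, None)
        # Each placeholder must line up with a parameter of the same type.
        if parameter is None or parameter["type"] != match.group(1):
            raise ValueError("parameters do not match the skeleton")
        return parameter["value"]

    return _PLACEHOLDER.sub(restore, skeleton)


element = {
    "skeleton": "Connection from <IP> port <INT> failed",
    "parameters": [
        {"type": "IP", "value": "10.0.0.5"},
        {"type": "INT", "value": "22"},
    ],
}
segment = reconstruct_segment(element["skeleton"], element["parameters"])
# The search query can then be checked against the reconstructed text.
matches_query = "port 22" in segment
```

A limitation of this sketch is that a segment whose literal text already contains something shaped like `<IP>` would be reconstructed incorrectly. A real system would need an escaping scheme or positional offsets.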
18. A non-transitory computer readable medium having instructions stored thereon that, when executed by a processing device, cause the processing device to perform operations comprising:
receiving unstructured content;
processing the unstructured content to identify a first content segment;
identifying, within the first content segment, one or more parameters;
classifying the one or more parameters;
within the first content segment, substituting the one or more parameters for one or more type qualifiers determined to correspond to the one or more parameters, thereby generating a content segment skeleton, the content segment skeleton comprising a hashed representation of an arrangement of the one or more type qualifiers within the first content segment;
based on the classifying, extracting the one or more parameters from the first content segment into a structured content element in a structured content format, wherein the structured content element includes the hashed representation of the content segment skeleton;
generating, based on the structured format, a query adapter;
receiving, via the query adapter, a free text search query comprising a first query element and a second query element;
querying a structured content repository with respect to the first query element to generate a first query result;
querying the structured content repository with respect to the second query element to generate a second query result;
combining the first query result and the second query result to identify a structured content element within the structured content repository that contains the first query element and the second query element;
processing one or more parameters of the identified structured content element with respect to a content segment skeleton included in the structured content element to reconstruct a content segment from the unstructured content based on which the identified structured content element was generated; and
querying the reconstructed content segment with respect to the free text search query.
Dependent claims: 19
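The per-element querying and combining steps of claim 18 can be sketched as a set intersection. The assumptions are loud ones: the repository is modeled as an in-memory dictionary keyed by a hypothetical element ID, a query element is taken to be a whitespace-separated token, and "contains" is taken to mean a match against either the skeleton text or a parameter value. None of these choices come from the patent text.

```python
def query_repository(repository: dict, term: str) -> set:
    """IDs of structured elements whose skeleton or parameters contain the term."""
    return {
        element_id
        for element_id, element in repository.items()
        if term in element["skeleton"]
        or any(term == p["value"] for p in element["parameters"])
    }


def free_text_search(repository: dict, query: str) -> list:
    """Query once per query element, then combine the results by intersection."""
    per_term_results = [query_repository(repository, t) for t in query.split()]
    if not per_term_results:
        return []
    return sorted(set.intersection(*per_term_results))


repository = {
    1: {
        "skeleton": "Connection from <IP> port <INT> failed",
        "parameters": [
            {"type": "IP", "value": "10.0.0.5"},
            {"type": "INT", "value": "22"},
        ],
    },
    2: {
        "skeleton": "Connection from <IP> port <INT> accepted",
        "parameters": [
            {"type": "IP", "value": "10.0.0.5"},
            {"type": "INT", "value": "80"},
        ],
    },
}
hits = free_text_search(repository, "10.0.0.5 failed")
```

The intersection only establishes that every query element occurs somewhere in a candidate. It does not establish that the elements appear in the queried order or adjacency, which is one plausible reason the claim goes on to reconstruct the segment and query it against the full free text.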
Specification