SYSTEMS AND METHODS FOR EXTRACTING ATTRIBUTES FROM TEXT CONTENT
Abstract
Systems and methods for extracting attributes from text content are described. Example embodiments may include a computer implemented method for extracting attributes from text data, wherein the text data is obtained from at least one information source. As described, the implementation may include: receiving, from a user, an address for the at least one information source and an attribute name; creating a tagged information file by associating a part of speech tag to text data obtained from the at least one information source; identifying a location of the attribute name in the tagged information file using an approximate text matching technique; and determining at least one attribute descriptor from the tagged information file, wherein the tagged information file is parsed based on a part of speech tag associated with the attribute name to determine a conclusion of the attribute descriptor.
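The tagging and approximate-matching steps summarized above can be illustrated with a minimal Python sketch. It is not the patented implementation: the toy lexicon, the function names `tag_text` and `locate_attribute`, the 0.8 similarity threshold, and the use of `difflib.SequenceMatcher` as the approximate text matching technique are all assumptions made for illustration. Retrieving text from the user-supplied address is omitted, and a real system would presumably use a trained part of speech tagger.

```python
from difflib import SequenceMatcher

# Hypothetical toy lexicon standing in for a real part of speech tagger.
TOY_LEXICON = {
    "the": "DT", "camera": "NN", "has": "VBZ", "a": "DT",
    "12": "CD", "megapixel": "NN", "resolution": "NN", "sensor": "NN",
}

def tag_text(text):
    """Build a stand-in 'tagged information file': a list of (token, tag) pairs."""
    tokens = text.lower().replace(".", " ").split()
    # Unknown tokens default to "NN" (an assumption of this sketch).
    return [(tok, TOY_LEXICON.get(tok, "NN")) for tok in tokens]

def locate_attribute(tagged, attribute_name, threshold=0.8):
    """Return the index of the token most similar to attribute_name, or None."""
    best_index, best_score = None, threshold
    for i, (token, _tag) in enumerate(tagged):
        score = SequenceMatcher(None, token, attribute_name.lower()).ratio()
        if score >= best_score:
            best_index, best_score = i, score
    return best_index

# The source text contains a misspelling ("resolutoin") that exact matching would miss.
tagged = tag_text("The camera has a 12 megapixel resolutoin sensor.")
position = locate_attribute(tagged, "resolution")
```

Here `position` points at the misspelled token, which shows why an approximate rather than exact match may be useful when the information source is noisy.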
17 Claims
1. A computer implemented method for extracting one or more attributes from text data wherein the text data is obtained from at least one information source, the method comprising:
receiving, from a user, an address for the at least one information source, and an attribute name;
creating a tagged information file by associating a part of speech tag to text data obtained from the at least one information source;
identifying a location of the attribute name in the tagged information file using an approximate text matching technique; and
determining at least one attribute descriptor from the tagged information file that precedes the attribute and at least one attribute descriptor that succeeds the attribute, wherein the tagged information file is parsed based on a set of associated part of speech tags to determine a conclusion of the at least one attribute descriptor.
Dependent claims: 2, 3, 4, 5, 6

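The final step of claim 1, determining descriptors that precede and succeed the attribute and using part of speech tags to decide where each descriptor concludes, might be sketched as follows. This is an illustrative reading only: the claim does not enumerate which tags belong to the "set of associated part of speech tags", so the Penn Treebank style set `DESCRIPTOR_TAGS`, the function name `descriptors_around`, and the hand-tagged example sentence are assumptions.

```python
# Assumed set of tags that may belong to a descriptor (Penn Treebank style).
DESCRIPTOR_TAGS = {"JJ", "CD", "NN", "NNS", "RB"}

def descriptors_around(tagged, position, allowed=DESCRIPTOR_TAGS):
    """Collect descriptor tokens before and after tagged[position].

    Scanning in each direction stops at the first token whose tag is not in
    `allowed`; in this sketch that tag marks the conclusion of the descriptor.
    """
    start = position
    while start > 0 and tagged[start - 1][1] in allowed:
        start -= 1
    end = position + 1
    while end < len(tagged) and tagged[end][1] in allowed:
        end += 1
    preceding = [tok for tok, _ in tagged[start:position]]
    succeeding = [tok for tok, _ in tagged[position + 1:end]]
    return preceding, succeeding

# Hand-tagged example; index 7 is the attribute name "battery".
tagged = [
    ("The", "DT"), ("phone", "NN"), ("has", "VBZ"), ("a", "DT"),
    ("removable", "JJ"), ("3000", "CD"), ("mAh", "NN"), ("battery", "NN"),
    ("pack", "NN"), ("with", "IN"), ("fast", "JJ"), ("charging", "NN"),
    (".", "."),
]
preceding, succeeding = descriptors_around(tagged, 7)
```

In this example the determiner "a" ends the preceding descriptor and the preposition "with" ends the succeeding one, so the sketch returns "removable 3000 mAh" before the attribute and "pack" after it.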
7. A system for extracting one or more attributes from text data, wherein the text data is obtained from at least one information source, the system comprising:
a user interface for receiving, from a user, an address for the at least one information source, and an attribute name;
a tag generating module for creating a tagged information file by associating part of speech tags to the text data obtained from the at least one information source;
an identifying module for identifying a location of the attribute name in the tagged information file using an approximate text matching technique; and
a processing module for determining at least one attribute descriptor from the tagged information file that precedes the attribute name and at least one attribute descriptor that succeeds the attribute name, wherein the tagged information file is parsed based on a set of associated part of speech tags to determine a conclusion of the at least one attribute descriptor.
Dependent claims: 8, 9, 10, 11, 12

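One way to picture how the modules recited in claim 7 could fit together is a small container that calls each module in turn. The sketch below is an assumption about structure, not the claimed system: the class name `AttributeExtractionSystem`, the `fetch_text` callable, and the stub lambdas are hypothetical, the example address is a placeholder, and the stub identifier uses exact matching purely to keep the wiring example short (the claim calls for an approximate text matching technique).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Tagged = List[Tuple[str, str]]  # (token, part of speech tag) pairs

@dataclass
class AttributeExtractionSystem:
    fetch_text: Callable[[str], str]                    # obtains text from the address
    tag_generator: Callable[[str], Tagged]              # tag generating module
    identifier: Callable[[Tagged, str], Optional[int]]  # identifying module
    processor: Callable[[Tagged, int], Tuple[List[str], List[str]]]  # processing module

    def extract(self, address: str, attribute_name: str):
        """Run the modules in the order in which the claim lists them."""
        tagged = self.tag_generator(self.fetch_text(address))
        position = self.identifier(tagged, attribute_name)
        if position is None:
            return None  # attribute name not found in the tagged file
        return self.processor(tagged, position)

# Stub modules, for wiring only; each would be replaced by a real component.
system = AttributeExtractionSystem(
    fetch_text=lambda address: "large screen size",
    tag_generator=lambda text: list(zip(text.split(), ["JJ", "NN", "NN"])),
    identifier=lambda tagged, name: next(
        (i for i, (tok, _) in enumerate(tagged) if tok == name), None),
    processor=lambda tagged, i: (
        [t for t, _ in tagged[:i]], [t for t, _ in tagged[i + 1:]]),
)
result = system.extract("http://example.com/product", "screen")
```

Passing the modules in as callables keeps the user-facing inputs (address and attribute name) separate from the tagging, matching, and parsing logic, so each module can be swapped independently.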
13. A computer program product comprising a plurality of instructions stored on a non transitory computer readable medium, and comprising program code for extracting one or more attributes from text data, the instruction comprising:
program code adapted for receiving, from a user, an address for the at least one information source, and an attribute name;
program code adapted for creating a tagged information file by associating part of speech tags to the text data obtained from the at least one information source;
program code adapted for identifying location of the attribute name in the tagged information file using approximate text matching techniques; and
program code adapted for determining at least one attribute descriptor from the tagged information file that precedes the attribute name and at least one attribute descriptor that succeeds the attribute name, wherein the tagged information file is parsed based on a set of associated part of speech tags to determine a conclusion of the at least one attribute descriptor.
Dependent claims: 14, 15, 16, 17

Specification