Techniques for manipulating unstructured data using synonyms and alternate spellings prior to recasting as structured data
Abstract
Unstructured data is manipulated so that the unstructured data is placed in a form that is more compatible with a structured data environment. The manipulation includes editing the unstructured data in preparation for integration into a structured data environment. Specifically, one or more editing programs edit unstructured text using a synonym list and/or an alternate spellings list. Once unstructured text is ready for processing, the unstructured text is examined a word and/or a phrase at a time to determine if there is a match with words or phrases in the synonym list or the alternate spelling list. If a match is found, the synonym or alternate spelling is either replaced in the unstructured document or added to the unstructured document. The unstructured document is then ready for further editing and manipulation in preparation for entry into the structured environment.
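The matching step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `SYNONYMS` entries, the `apply_list` function, and its `mode` parameter are all hypothetical names, and a single regular-expression pass stands in for the word-by-word and phrase-by-phrase examination the abstract describes.

```python
import re

# Hypothetical list: each synonym or alternate spelling maps to a preferred term.
SYNONYMS = {
    "colour": "color",
    "automobile": "car",
    "heart attack": "myocardial infarction",
}


def apply_list(text, mapping, mode="replace"):
    """Scan text for listed words or phrases and replace or annotate each match.

    mode="replace" swaps the matched text for the preferred term.
    mode="add" keeps the matched text and appends the preferred term after it.
    """
    # Try longer entries first so a phrase wins over any word it contains.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b",
        re.IGNORECASE,
    )

    def substitute(match):
        found = match.group(0)
        preferred = mapping[found.lower()]
        if mode == "replace":
            return preferred
        return f"{found} ({preferred})"

    return pattern.sub(substitute, text)


replaced = apply_list("The automobile had a nice colour.", SYNONYMS)
annotated = apply_list("The automobile had a nice colour.", SYNONYMS, mode="add")
```

The two modes correspond to the abstract's two outcomes: the listed term either replaces the matched text or is added alongside it, so that later structured processing sees one consistent term.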
58 Citations
25 Claims
1. A method of processing data comprising:
accessing unstructured data, wherein the unstructured data comprises a plurality of words;
accessing a list of words or phrases comprising synonyms or alternate spellings; and
cross-checking the unstructured data against the list to determine if a word or phrase in the unstructured data appears in the list.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
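The cross-checking step of claim 1 only determines whether listed words or phrases occur in the text; it does not modify anything. One way to sketch that check, assuming a simple tokenizer and a cap on phrase length (the `cross_check` function and its `max_words` parameter are illustrative names, not taken from the patent):

```python
import re


def cross_check(text, listed, max_words=3):
    """Return (word_index, entry) pairs for each listed word or phrase found in text."""
    # Lowercase and split into simple word tokens (an assumed tokenization).
    words = re.findall(r"[a-z0-9']+", text.lower())
    entries = {entry.lower() for entry in listed}
    hits = []
    for i in range(len(words)):
        # Test every window of 1..max_words words starting at position i.
        for n in range(1, max_words + 1):
            if i + n > len(words):
                break
            candidate = " ".join(words[i:i + n])
            if candidate in entries:
                hits.append((i, candidate))
    return hits


hits = cross_check(
    "Patient reported a heart attack and grey skin",
    ["heart attack", "grey", "gray"],
)
```

Returning the word position with each hit leaves the decision about what to do with a match to a later step, which is how the method claims separate detection from modification.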
10. A method of processing data comprising:
reading unstructured data, wherein the unstructured data comprises a plurality of words or phrases;
accessing a list comprising a plurality of first words or phrases, wherein each of the first words or phrases has an associated one or more second words or phrases;
comparing the words or phrases from the unstructured data against the words or phrases in the list; and
modifying one or more words or phrases in the unstructured data with a word or phrase from the list if a match is found.
Dependent claims: 11, 12, 13, 14, 15, 16, 17
18. A computer-readable medium containing instructions for controlling a computer system to perform a method of processing user inputs comprising:
reading unstructured data, wherein the unstructured data comprises a plurality of words or phrases;
accessing a list comprising a plurality of first words or phrases, wherein each of the first words or phrases has an associated one or more second words or phrases;
comparing the words or phrases from the unstructured data against the words or phrases in the list; and
modifying one or more words or phrases in the unstructured data with a word or phrase from the list if a match is found.
Dependent claims: 19, 20, 21, 22, 23, 24, 25
Specification