METHOD AND SYSTEM FOR TEXT INTERPRETATION AND NORMALIZATION
Abstract
A system and method for text interpretation and normalization is presented. The method for text interpretation and normalization may include receiving a reference data entry that includes one or more strings of text and one or more associated numeric codes, creating a plurality of tokens from the one or more strings of text, each token being tied to an associated numeric code, formatting the plurality of tokens with operations codes (opcodes) that provide additional information about the tokens, retrieving configuration data including the plurality of tokens, the opcodes, and numeric codes associated with the tokens, selecting one inbound, non-reference string for interpretation, comparing tokens from the configuration data to the non-reference string to determine the best matching token, and applying, using the processor, the numeric code associated with the best matching token to the non-reference string in order to normalize the non-reference string.
21 Citations
20 Claims
1. A method for text interpretation and normalization, comprising:
receiving, using a processor and a memory, a reference data entry that includes one or more strings of text and one or more associated numeric codes, wherein each associated numeric code is associated with one or more strings of text;
creating, using the processor, one or more tokens from the one or more strings of text, wherein each token is tied to an associated numeric code; and
formatting, using the processor, the one or more tokens with an operations code (opcode) that provides additional information about the token, wherein the one or more tokens may be used to interpret non-reference data and associate the non-reference data to one of the one or more associated numeric codes.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
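The receiving, creating, and formatting steps of claim 1 can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and does not come from the claim text: the `Token` type, the `create_tokens` function, the shape of the reference data entry (a mapping from numeric code to strings), the word-level tokenization, and in particular the two opcode values `EXACT` and `WORD`, whose meanings the claim does not define.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    text: str    # lowercased token text taken from a reference string
    opcode: str  # hypothetical opcode: extra information about the token
    code: int    # numeric code the token is tied to


def create_tokens(reference_entry):
    """Build tokens from a reference data entry.

    reference_entry is assumed to map each numeric code to the
    strings of text associated with it.
    """
    tokens = []
    for code, strings in reference_entry.items():
        for string in strings:
            for word in string.lower().split():
                # Assumed opcode rule: very short tokens (abbreviations)
                # must equal the whole input, longer ones may match a word.
                opcode = "EXACT" if len(word) <= 2 else "WORD"
                tokens.append(Token(word, opcode, code))
    return tokens


# Hypothetical reference data entry: numeric code -> strings of text.
reference_entry = {101: ["Male", "M"], 102: ["Female", "F"]}
tokens = create_tokens(reference_entry)
for token in tokens:
    print(token)
```

In this sketch each token carries its numeric code and opcode with it, so a later interpretation step needs only the token list and not the original reference strings.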
9. A method for text interpretation and normalization, comprising:
retrieving, using a processor and a memory, configuration data including a plurality of tokens, operation codes (opcodes) that provide additional information about the tokens, and numeric codes associated with the tokens;
selecting, using the processor, one inbound, non-reference string for interpretation;
comparing, using the processor, tokens from the configuration data to the non-reference string to determine the best matching token; and
applying, using the processor, the numeric code associated with the best matching token to the non-reference string in order to normalize the non-reference string.
Dependent claims: 10, 11, 12, 13, 14
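The retrieving, selecting, comparing, and applying steps of claim 9 can be sketched as follows. The claim does not say how the "best matching token" is determined, so the scoring rule used here (the longest matching token wins) is an assumption, as are the function name `normalize`, the tuple layout of the configuration data, and the hypothetical opcodes `EXACT` (token must equal the whole input) and `WORD` (token must appear as a word of the input).

```python
def normalize(inbound, config):
    """Return the numeric code of the best matching token, or None.

    config is assumed to be a list of (token, opcode, code) tuples.
    """
    text = inbound.strip().lower()
    words = text.split()
    best_code, best_score = None, 0
    for token, opcode, code in config:
        if opcode == "EXACT":
            matched = text == token
        elif opcode == "WORD":
            matched = token in words
        else:
            matched = False  # unknown opcodes never match in this sketch
        # Assumed scoring rule: a longer matching token is a better match.
        score = len(token) if matched else 0
        if score > best_score:
            best_code, best_score = code, score
    return best_code


# Hypothetical configuration data: (token, opcode, numeric code).
config = [
    ("male", "WORD", 101),
    ("female", "WORD", 102),
    ("m", "EXACT", 101),
    ("f", "EXACT", 102),
]

for inbound in ["  Female ", "M", "patient is male", "unknown"]:
    print(repr(inbound), "->", normalize(inbound, config))
```

Matching against the word list rather than the raw string keeps `male` from matching inside `female`. An input with no matching token returns `None`, leaving it to the caller to decide how to handle strings that cannot be normalized.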
15. A method for text interpretation and normalization, comprising:
receiving, using a processor and a memory, a reference data entry that includes one or more strings of text and one or more associated numeric codes, wherein each associated numeric code is associated with one or more strings of text;
creating, using the processor, a plurality of tokens from the one or more strings of text, wherein each token is tied to an associated numeric code;
formatting, using the processor, the plurality tokens with operations codes (opcodes) that provides additional information about the tokens, wherein the plurality of tokens may be used to interpret non-reference data and associate the non-reference data to one of the one or more associated numeric codes;
retrieving, using the processor, configuration data including the plurality of tokens, the opcodes, and numeric codes associated with the tokens;
selecting, using the processor, one inbound, non-reference string for interpretation;
comparing, using the processor, tokens from the configuration data to the non-reference string to determine the best matching token; and
applying, using the processor, the numeric code associated with the best matching token to the non-reference string in order to normalize the non-reference string.
Dependent claims: 16, 17, 18, 19, 20
15-1. A system for text interpretation and normalization, comprising:
a computer including a processor and memory, wherein the memory includes a computer program stored therein that includes instructions that are executed by the processor for creating tokens by;
receiving a reference data entry that includes one or more strings of text and one or more associated numeric codes, wherein each associated numeric code is associated with one or more strings of text;
creating a plurality of tokens from the one or more strings of text, wherein each token is tied to an associated numeric code; and
formatting the plurality tokens with operations codes (opcodes) that provides additional information about the tokens, wherein the plurality of tokens may be used to interpret non-reference data and associate the non-reference data to one of the one or more associated numeric codes.
Specification