Structured dictionary
Abstract
A dictionary data structure is described. The data structure is made up of first, second, and third tables. The first table is comprised of entries each representing a natural language term, each entry of the first table containing a term ID identifying its term. The second table is comprised of entries each representing a definition, each entry of the second table containing a definition ID identifying its definition. The third table is comprised of entries each representing correspondence between a term and a definition defining the term, each entry of the third table containing a term ID identifying the defined term and a definition ID identifying the defining definition. The contents of the data structure are usable to identify any definitions corresponding to a term.
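The three-table arrangement described above can be sketched as a small relational schema. This is only an illustrative sketch, not the patented implementation: the table names (`term`, `definition`, `term_definition`), the column names, the helper functions, and the sample rows are all assumptions introduced here, and the `text` columns are included only so the lookup returns something readable.

```python
import sqlite3


def build_dictionary() -> sqlite3.Connection:
    """Create an in-memory dictionary with hypothetical table and column names."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- First table: one entry per natural language term.
        CREATE TABLE term (
            term_id INTEGER PRIMARY KEY,
            text    TEXT NOT NULL
        );
        -- Second table: one entry per definition.
        CREATE TABLE definition (
            definition_id INTEGER PRIMARY KEY,
            text          TEXT NOT NULL
        );
        -- Third table: correspondence between a term and a definition.
        CREATE TABLE term_definition (
            term_id       INTEGER NOT NULL REFERENCES term(term_id),
            definition_id INTEGER NOT NULL REFERENCES definition(definition_id),
            PRIMARY KEY (term_id, definition_id)
        );
        """
    )
    # Sample rows invented for illustration only.
    conn.executemany(
        "INSERT INTO term VALUES (?, ?)",
        [(1, "control"), (2, "audit")],
    )
    conn.executemany(
        "INSERT INTO definition VALUES (?, ?)",
        [
            (10, "a safeguard that reduces risk"),
            (11, "to exercise authority over something"),
            (12, "a formal examination of records"),
        ],
    )
    conn.executemany(
        "INSERT INTO term_definition VALUES (?, ?)",
        [(1, 10), (1, 11), (2, 12)],
    )
    return conn


def definitions_for(conn: sqlite3.Connection, term_text: str) -> list[str]:
    """Return every definition linked to a term through the third table."""
    rows = conn.execute(
        """
        SELECT d.text
        FROM term AS t
        JOIN term_definition AS td ON td.term_id = t.term_id
        JOIN definition AS d ON d.definition_id = td.definition_id
        WHERE t.text = ?
        ORDER BY d.definition_id
        """,
        (term_text,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    connection = build_dictionary()
    for definition_text in definitions_for(connection, "control"):
        print(definition_text)
```

Keeping the term-to-definition correspondence in its own table, rather than as a column on either side, is what lets one term carry several definitions and one definition be shared by several terms.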
35 Claims
1. A computer-readable storage medium that is not a signal, the computer-readable storage medium for storing data for access by a program being executed on a data processing system, comprising:
a dictionary data structure stored in the computer-readable storage medium, the dictionary data structure including information used by the program and comprising:
a first table comprised of entries each representing a natural language term, each entry of the first table containing a term ID identifying its term;
a second table comprised of entries each representing a definition, each entry of the second table containing a definition ID identifying its definition; and
a third table comprised of entries each representing correspondence between a term and a definition defining the term, each entry of the third table containing a term ID identifying the defined term and a definition ID identifying the defining definition, such that the contents of the data structure are usable to identify any definitions corresponding to a term.
2. The computer-readable storage medium of claim 1 wherein each entry of the first table further contains a textual representation of the entry's term.
3. The computer-readable storage medium of claim 1 wherein each entry of the second table further contains a textual representation of the entry's definition.
4. The computer-readable storage medium of claim 1, the data structure further comprising a fourth table comprised of entries each representing a different part of speech, each entry of the fourth table containing a word type ID identifying its word type,
each entry of the second table further containing a word type ID identifying a word type to which its definition corresponds.
5. The computer-readable storage medium of claim 4 wherein a distinguished entity of the fourth table contains a word type ID indicating a word type corresponding to a particular part of speech.
6. The computer-readable storage medium of claim 4 wherein a distinguished entity of the fourth table contains a word type ID indicating a named entity word type.
7. The computer-readable storage medium of claim 4 wherein a distinguished entity of the fourth table contains a word type ID indicating a title word type.
8. The computer-readable storage medium of claim 4 wherein a distinguished entity of the fourth table contains a word type ID indicating a proper name word type.
9. The computer-readable storage medium of claim 4 wherein a distinguished entity of the fourth table contains a word type ID indicating a country word type.
10. The computer-readable storage medium of claim 4 wherein a distinguished entity of the fourth table contains a word type ID indicating an organization word type.
11. The computer-readable storage medium of claim 4 wherein a distinguished entity of the fourth table contains a word type ID indicating a record word type.
12. The computer-readable storage medium of claim 4 wherein a distinguished entity of the fourth table contains a word type ID indicating a product name word type.
13. The computer-readable storage medium of claim 4 wherein a distinguished entity of the fourth table contains a word type ID indicating a service name word type.
14. The computer-readable storage medium of claim 4 wherein each entry of the fourth table further contains a textual representation of the entry's word type.
15. The computer-readable storage medium of claim 1, the data structure further comprising a fourth table comprised of entries each representing a word form, each entry of the fourth table containing a term ID identifying a term for which the entry's word form is an alternate form.
16. The computer-readable storage medium of claim 15 wherein each entry of the fourth table further contains a textual representation of the entry's word form.
17. The computer-readable storage medium of claim 1, the data structure further comprising a fourth table comprised of entries each representing a correspondence between two terms, each entry of the fourth table containing two term IDs identifying the two terms and an indication either that the two terms are synonyms or that the two terms are antonyms.
18. The computer-readable storage medium of claim 1, the data structure further comprising:
a fourth table comprised of entries each representing an acronym, each entry of the fourth table containing an acronym ID identifying its acronym; and
a fifth table comprised of entries each representing correspondence between an acronym and a term expanding the acronym, each entry of the fifth table containing an acronym ID identifying the acronym and a term ID identifying the term expanding the acronym.
19. The computer-readable storage medium of claim 18 wherein each entry of the fourth table further contains a textual representation of the entry's acronym.
20. The computer-readable storage medium of claim 1, the data structure further comprising a fourth table comprised of entries each representing a correspondence between two terms, each entry of the fourth table containing a first term ID identifying a child term, a second term ID identifying a parent term, and an indication of a relationship type that exists between the identified child term and the identified parent term.
21. The computer-readable storage medium of claim 20 wherein a distinguished entry of the fourth table contains an indication that the identified child term is part of the identified parent term.
22. The computer-readable storage medium of claim 20 wherein a distinguished entry of the fourth table contains an indication that the identified child term is a type of the identified parent term.
23. The computer-readable storage medium of claim 20 wherein a distinguished entry of the fourth table contains an indication that the identified child term is created by the identified parent term.
24. The computer-readable storage medium of claim 20 wherein a distinguished entry of the fourth table contains an indication that the identified child term is enforced by the identified parent term.
25. The computer-readable storage medium of claim 20 wherein a distinguished entry of the fourth table contains an indication that the identified child term references the identified parent term.
26. The computer-readable storage medium of claim 20 wherein a distinguished entry of the fourth table contains information identifying a source from which the relationship represented by the distinguished entry of the fourth table was derived.
27. The computer-readable storage medium of claim 1 wherein a sentence relates to a control derived from an authority document, and wherein the data structure comprises a fourth table comprised of entries each representing a distinct portion of the sentence, each entry of the fourth table containing a definition ID identifying a definition defining a portion of the sentence represented by the entry of the fourth table.
28. The computer-readable storage medium of claim 1 wherein the first table comprises a first entry representing a first natural language term and a second entry representing a second natural language term,
the second natural language term being a non-standard form of the first natural language term, the first natural language term being preferred for usage over the second natural language term, the second entry including a harmonized-to field specifying the term ID identifying the first natural language term.
29. The computer-readable storage medium of claim 28 wherein the first table further comprises a third entry representing a third natural language term, the third natural language term being a non-standard form of the first natural language term, the first natural language term being preferred for usage over the third natural language term,
the third entry including a harmonized-to field specifying the term ID identifying the first natural language term.
30. The computer-readable storage medium of claim 1, the data structure further comprising:
a fourth table indicating, for each of at least a portion of the definitions represented by entries of the second table, for each of one or more groups of one or more natural language corpuses, a number of occurrences in the group of one or more natural language corpuses of the term whose correspondence to the definition is represented by an entry of the third table that have been mapped to the definition.
31. The computer-readable storage medium of claim 1, the data structure further comprising:
a fourth table indicating, for each of at least a portion of the terms represented by entries of the first table, for each of one or more groups of one or more natural language corpuses, a number of occurrences in the group of one or more natural language corpuses of the term.
32. The computer-readable storage medium of claim 1, wherein the use by the program includes the program using at least some of the information to automatically identify at least one definition corresponding to the term.
33. The computer-readable storage medium of claim 1, wherein the use by the program includes identifying relationships between terms, wherein the identified relationships are used, by the program, in mapping portions of a document to harmonized controls.
34. A method comprising:
executing a program that processes phrases by implementing at least one language engine comprising one or more of:
a named entity engine, a parts of speech tagger, a natural language processing engine, or any combination thereof; and
accessing a dictionary data structure comprising:
a first table comprised of entries each representing a natural language term, each entry of the first table containing a term ID identifying its term;
a second table comprised of entries each representing a definition, each entry of the second table containing a definition ID identifying its definition;
a third table comprised of entries each representing correspondence between a term and a definition defining the term, each entry of the third table containing a term ID identifying the defined term and a definition ID identifying the defining definition; and
a fourth table comprised of entries each representing a correspondence between terms, each entry of the fourth table containing one or more first term IDs identifying child terms and a second term ID identifying a parent term;
wherein the processing of a particular phrase through the implementation of the at least one language engine is controlled in part based on:
identifying, using term IDs in entries in the first, second, and third tables, multiple definitions for the particular phrase in the third table; or
identifying, based on the first and fourth tables, a hierarchy between terms, including the particular phrase.
35. A computing system comprising:
one or more processors;
a first memory storing a dictionary data structure comprising:
a first table comprised of entries each representing a natural language term, each entry of the first table containing a term ID identifying its term;
a second table comprised of entries each representing a definition, each entry of the second table containing a definition ID identifying its definition;
a third table comprised of entries each representing correspondence between a term and a definition defining the term, each entry of the third table containing a term ID identifying the defined term and a definition ID identifying the defining definition; and
a fourth table comprised of entries each representing a correspondence between terms, each entry of the fourth table containing one or more first term IDs identifying child terms and a second term ID identifying a parent term; and
a second memory storing instructions that, when executed by the one or more processors, cause the computing system to process phrases by implementing at least one language engine comprising one or more of:
a named entity engine, a parts of speech tagger, a natural language processing engine, or any combination thereof,
wherein the processing of a particular phrase through the implementation of the at least one language engine is controlled in part based on:
an identification, using term IDs in entries in the first, second, and third tables, of multiple definitions for the particular phrase in the third table; or
an identification, based on the term IDs in entries in the first table and the parent-child relationships defined in entries in the fourth table, of a hierarchy between terms that includes the particular phrase.
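Several of the claims above add a table of child-to-parent term relationships and a harmonized-to field that points a non-standard term at its preferred form. The following is a rough sketch of how a program might walk those links to recover a hierarchy for a term. The class names, field names, relationship labels, and sample terms are all hypothetical and chosen only for illustration; the claims do not prescribe any of them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Term:
    term_id: int
    text: str
    # Term ID of the preferred form, if this entry is a non-standard form.
    harmonized_to: Optional[int] = None


@dataclass(frozen=True)
class TermRelationship:
    child_term_id: int
    parent_term_id: int
    relationship_type: str  # hypothetical labels such as "type of" or "part of"


# Sample data invented for illustration only.
TERMS = {
    t.term_id: t
    for t in [
        Term(1, "asset"),
        Term(2, "information asset"),
        Term(3, "database"),
        Term(4, "data base", harmonized_to=3),
    ]
}

RELATIONSHIPS = [
    TermRelationship(2, 1, "type of"),
    TermRelationship(3, 2, "type of"),
]


def preferred_term(term_id: int) -> Term:
    """Follow harmonized-to links until a term with no such link is reached."""
    seen = set()
    while TERMS[term_id].harmonized_to is not None and term_id not in seen:
        seen.add(term_id)  # guard against accidental cycles in the data
        term_id = TERMS[term_id].harmonized_to
    return TERMS[term_id]


def hierarchy(term_id: int) -> list[str]:
    """Return term texts from the given term up through its parents.

    Simplification: only the first recorded parent is followed at each level.
    """
    current = preferred_term(term_id).term_id
    chain = [TERMS[current].text]
    visited = {current}
    while True:
        parents = [
            r.parent_term_id for r in RELATIONSHIPS if r.child_term_id == current
        ]
        if not parents or parents[0] in visited:
            return chain
        current = parents[0]
        visited.add(current)
        chain.append(TERMS[current].text)


if __name__ == "__main__":
    print(" -> ".join(hierarchy(4)))
```

Resolving the harmonized-to link before walking the parents means a non-standard spelling inherits the hierarchy of its preferred form without needing relationship rows of its own.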
Specification