Systems and Methods for Semantic Information Retrieval
First Claim
1. A method comprising:
parsing, by a computer-based system for extracting semantic tags from text, a body of text by determining a language and structure of the body of text;
tokenizing, by the computer-based system, the body of text by splitting the body of text into individual tokens;
generating, by the computer-based system, a tagged body of text by assigning each individual token a part-of-speech tag indicating a grammatical role of the individual token, wherein the grammatical role includes one of a noun, a pronoun, a verb, an adverb, an adjective, a conjunction, a preposition, an article, an auxiliary verb, an infinitive, an interjection, modal verb, an object, a participle, a phrase, and a predicate;
splitting, by the computer-based system, the tagged body of text into grammatical chunks;
identifying, by the computer-based system, named entities within the body of text;
resolving, by the computer-based system, the individual tokens having a pronoun grammatical role with corresponding noun phrases;
deciding, by the computer-based system, a context and purpose of the body of text, and translating semantic concepts of the body of text into one or more semantic tags;
identifying, by the computer-based system, one or more communication topics and presuppositions of the body of text, wherein the identifying the one or more communication topics and presuppositions comprises analysis of prior communications within the body of text to facilitate the tokenizing the body of text; and
generating, by the computer-based system, a list of the one or more semantic tags.
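The tokenizing, tagging, chunking, and tag-list steps recited above can be illustrated with a minimal sketch. It is not the claimed implementation: the lexicon, the default-to-noun rule, the noun-phrase grouping heuristic, and every function name below are assumptions chosen for illustration, and a practical system would rely on a trained tagger and chunker.

```python
import re

# Hypothetical toy lexicon (an assumption for this sketch); a real system
# would use a trained part-of-speech tagger instead.
LEXICON = {
    "the": "article", "a": "article", "an": "article",
    "it": "pronoun", "she": "pronoun", "he": "pronoun",
    "sent": "verb", "arrived": "verb", "is": "verb",
    "quarterly": "adjective", "late": "adverb",
    "and": "conjunction", "to": "preposition", "on": "preposition",
}

def tokenize(text):
    """Split text into word tokens and single punctuation marks."""
    return re.findall(r"[A-Za-z0-9']+|[^\sA-Za-z0-9']", text)

def tag(tokens):
    """Assign each token a part-of-speech tag; unknown words default to noun."""
    tagged = []
    for tok in tokens:
        if not tok[0].isalnum():
            tagged.append((tok, "punct"))
        else:
            tagged.append((tok, LEXICON.get(tok.lower(), "noun")))
    return tagged

def chunk(tagged):
    """Group runs of article/adjective/noun tokens into noun-phrase chunks."""
    chunks, current = [], []
    for tok, pos in tagged:
        if pos in ("article", "adjective", "noun"):
            current.append((tok, pos))
        elif current:
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks

def semantic_tags(chunks):
    """Turn each noun-phrase chunk into a lowercase tag, dropping articles."""
    tags = []
    for c in chunks:
        words = [tok.lower() for tok, pos in c if pos != "article"]
        candidate = " ".join(words)
        if candidate and candidate not in tags:
            tags.append(candidate)
    return tags

text = "Alice sent the quarterly report to Bob."
tags = semantic_tags(chunk(tag(tokenize(text))))
print(tags)  # ['alice', 'quarterly report', 'bob']
```

The sketch omits language detection, named-entity identification, and pronoun resolution; those steps would sit between `chunk` and `semantic_tags` in a fuller pipeline.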
Abstract
A semantic tagging method may add context to a sentence in order to increase search efficiency. Regardless of an author's writing style, translating semantic concepts into tags may increase search efficiency. Automatic semantic tagging of documents may allow semantic search and reasoning. Text for semantic tagging may include an email, a website chat room, an internet forum, or a text message. Additional texts may include an aggregation of the general consensus on an emailed topic across multiple emails, whether in the same email chain or in separate emails. To increase search efficiency, the analysis of prior communications within the body of text may comprise analyzing structured contextual information to facilitate homophora resolution. The structured contextual information may include at least one of a sender email address, one or more recipient email addresses, a subject field, a message date and time stamp, and an attachment title.
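As an illustration of the structured contextual information listed above, the following sketch collects the sender address, recipient addresses, subject field, date and time stamp, and attachment titles from a raw email using Python's standard `email` package. The sample message, the `structured_context` function name, and the dictionary keys are assumptions made for this example; the abstract does not specify how the fields are obtained or stored.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# Hypothetical sample message, invented for this sketch.
RAW = """\
From: Alice <alice@example.com>
To: Bob <bob@example.com>, carol@example.com
Subject: Q3 budget review
Date: Mon, 02 Mar 2015 09:30:00 +0000
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XX"

--XX
Content-Type: text/plain

Please see the attached.
--XX
Content-Type: application/pdf
Content-Disposition: attachment; filename="q3-budget.pdf"

(binary omitted)
--XX--
"""

def structured_context(raw):
    """Collect header fields and attachment titles from a raw email string."""
    msg = message_from_string(raw)
    # getaddresses returns (display name, address) pairs; keep the address.
    sender = getaddresses([msg.get("From", "")])[0][1]
    recipients = [addr for _, addr in getaddresses(msg.get_all("To", []))]
    # walk() visits every MIME part; get_filename() is None for non-attachments.
    attachments = [p.get_filename() for p in msg.walk() if p.get_filename()]
    return {
        "sender": sender,
        "recipients": recipients,
        "subject": msg.get("Subject", ""),
        "timestamp": parsedate_to_datetime(msg["Date"]),
        "attachments": attachments,
    }

ctx = structured_context(RAW)
print(ctx["sender"], ctx["recipients"], ctx["attachments"])
```

A downstream resolver could consult such a record when a message refers to something outside its own text, for example matching "the attached" against the attachment title.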
143 Citations
20 Claims
1. A method comprising:
parsing, by a computer-based system for extracting semantic tags from text, a body of text by determining a language and structure of the body of text;
tokenizing, by the computer-based system, the body of text by splitting the body of text into individual tokens;
generating, by the computer-based system, a tagged body of text by assigning each individual token a part-of-speech tag indicating a grammatical role of the individual token, wherein the grammatical role includes one of a noun, a pronoun, a verb, an adverb, an adjective, a conjunction, a preposition, an article, an auxiliary verb, an infinitive, an interjection, modal verb, an object, a participle, a phrase, and a predicate;
splitting, by the computer-based system, the tagged body of text into grammatical chunks;
identifying, by the computer-based system, named entities within the body of text;
resolving, by the computer-based system, the individual tokens having a pronoun grammatical role with corresponding noun phrases;
deciding, by the computer-based system, a context and purpose of the body of text, and translating semantic concepts of the body of text into one or more semantic tags;
identifying, by the computer-based system, one or more communication topics and presuppositions of the body of text, wherein the identifying the one or more communication topics and presuppositions comprises analysis of prior communications within the body of text to facilitate the tokenizing the body of text; and
generating, by the computer-based system, a list of the one or more semantic tags.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
17. An article of manufacture including a non-transitory, tangible computer readable storage medium having instructions stored thereon that, in response to execution by a computer-based system for extracting semantic tags from text, cause the computer-based system to perform operations comprising:
parsing, by the computer-based system, a body of text by determining a language and structure of the body of text;
tokenizing, by the computer-based system, the body of text by splitting the body of text into individual tokens;
generating, by the computer-based system, a tagged body of text by assigning each individual token a part-of-speech tag indicating a grammatical role of the individual token, wherein the grammatical role includes one of a noun, a pronoun, a verb, an adverb, and an adjective;
splitting, by the computer-based system, the tagged body of text into grammatical chunks;
identifying, by the computer-based system, named entities within the body of text;
resolving, by the computer-based system, the individual tokens having a pronoun grammatical role with corresponding noun phrases;
deciding, by the computer-based system, a context and purpose of the body of text, and translating semantic concepts of the body of text into one or more semantic tags;
identifying, by the computer-based system, one or more communication topics and presuppositions of the body of text, wherein the identifying the one or more communication topics and presuppositions comprises analysis of prior communications within the body of text to facilitate the tokenizing the body of text; and
generating, by the computer-based system, a list of the one or more semantic tags.
18. A system comprising:
a tangible, non-transitory memory communicating with a processor for extracting semantic tags from text, the tangible, non-transitory memory having instructions stored thereon that, in response to execution by the processor, cause the processor to perform operations comprising:
parsing, by the processor, a body of text by determining a language and structure of the body of text;
tokenizing, by the processor, the body of text by splitting the body of text into individual tokens;
generating, by the processor, a tagged body of text by assigning each individual token a part-of-speech tag indicating a grammatical role of the individual token, wherein the grammatical role includes one of a noun, a pronoun, a verb, an adverb, and an adjective;
splitting, by the processor, the tagged body of text into grammatical chunks;
identifying, by the processor, named entities within the body of text;
resolving, by the processor, the individual tokens having a pronoun grammatical role with corresponding noun phrases;
deciding, by the processor, a context and purpose of the body of text, and translating semantic concepts of the body of text into one or more semantic tags;
identifying, by the processor, one or more communication topics and presuppositions of the body of text, wherein the identifying the one or more communication topics and presuppositions comprises analysis of prior communications within the body of text to facilitate the tokenizing the body of text; and
generating, by the processor, a list of the one or more semantic tags.
Dependent claims: 19, 20
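The resolving step recited in the independent claims, which pairs pronoun tokens with corresponding noun phrases, can be sketched with a deliberately naive heuristic: each pronoun is replaced by the nearest preceding noun phrase. The claims do not state which resolution strategy is used, so this heuristic, the `(text, kind)` chunk representation, and the `resolve_pronouns` name are all assumptions made for illustration.

```python
def resolve_pronouns(chunks):
    """Replace each pronoun chunk with the nearest preceding noun phrase.

    `chunks` is a list of (text, kind) pairs, where kind is "NP" for a noun
    phrase, "PRON" for a pronoun, and "OTHER" for anything else. This
    representation is hypothetical and exists only for this sketch.
    """
    resolved = []
    last_noun_phrase = None
    for text, kind in chunks:
        if kind == "NP":
            last_noun_phrase = text
            resolved.append(text)
        elif kind == "PRON" and last_noun_phrase is not None:
            # Naive rule: assume the pronoun refers to the most recent NP.
            resolved.append(last_noun_phrase)
        else:
            # Non-pronouns, and pronouns with no antecedent, pass through.
            resolved.append(text)
    return resolved

chunks = [
    ("the report", "NP"),
    ("arrived", "OTHER"),
    ("and", "OTHER"),
    ("it", "PRON"),
    ("was late", "OTHER"),
]
sentence = " ".join(resolve_pronouns(chunks))
print(sentence)  # the report arrived and the report was late
```

A nearest-antecedent rule ignores gender, number, and syntactic constraints, so it serves only to show where resolution fits between chunking and tag generation.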
Specification