Method and apparatus for providing proper or partial proper name recognition
First Claim
1. A method of proper name recognition for interpretation of lexical user input and by a system configured to respond to the lexical user input, the method comprising:
classifying each word of a word string containing a plurality of words with a tag indicating one of a proper named entity category and a non-named entity category, wherein consecutive words that are tagged as belonging to the same proper named entity category are grouped to form partial proper names; and
correcting the tag of a word of the word string;
wherein the correcting includes:
forming a pool of candidate full names selected from a full name database by selecting those full names that best match a given partial proper name, each of at least a subset of the candidate full names including a respective group of words;
changing the tag of a boundary word of the given partial proper name to indicate a non-named entity category if the boundary word does not exist in the pool of candidate full names;
examining a neighboring word of the given partial proper name; and
assigning the neighboring word of the given partial proper name to the named entity category if the neighboring word occurs in the candidate full name with the same ordering.
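The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the tag labels (`"O"`, `"NAME"`), the function names, and the reading of "best match" as "shares the most words" are all assumptions made for illustration, and the claim recites a single boundary change where the sketch loops.

```python
from typing import List, Sequence, Tuple

NON_NE = "O"  # hypothetical label for the non-named entity category


def group_partial_names(tags: Sequence[str]) -> List[Tuple[int, int]]:
    """Return (start, end) spans, end exclusive, of consecutive words sharing one named entity tag."""
    spans, start = [], None
    for i, tag in enumerate(list(tags) + [None]):  # None is a sentinel that closes the last span
        if start is not None and tag != tags[start]:
            spans.append((start, i))
            start = None
        if start is None and tag is not None and tag != NON_NE:
            start = i
    return spans


def candidate_pool(partial: Sequence[str], full_names: Sequence[List[str]]) -> List[List[str]]:
    """Assumed reading of 'best match': keep the full names sharing the most words with the partial name."""
    scores = [(len(set(partial) & set(name)), name) for name in full_names]
    best = max((s for s, _ in scores), default=0)
    return [name for s, name in scores if s == best and s > 0]


def in_same_order(seq: Sequence[str], name: Sequence[str]) -> bool:
    """True if the words of seq all occur in name, in the same relative order."""
    it = iter(name)
    return all(word in it for word in seq)


def correct_span(words, tags, span, full_names) -> List[str]:
    """Shrink and then expand one partial proper name against the candidate pool."""
    start, end = span
    tags = list(tags)
    category = tags[start]
    pool = candidate_pool(words[start:end], full_names)
    if not pool:  # assumption: leave the tags alone when the database offers no candidate
        return tags
    pool_words = {w for name in pool for w in name}
    # Boundary words that do not exist in the pool become non-named entity words.
    while start < end and words[start] not in pool_words:
        tags[start] = NON_NE
        start += 1
    while end > start and words[end - 1] not in pool_words:
        end -= 1
        tags[end] = NON_NE
    # Neighboring words that continue a candidate full name in order join the name.
    while end < len(words) and any(in_same_order(words[start:end + 1], n) for n in pool):
        tags[end] = category
        end += 1
    while start > 0 and any(in_same_order(words[start - 1:end], n) for n in pool):
        start -= 1
        tags[start] = category
    return tags


# Hypothetical example: "the" is wrongly tagged and "palace" is missed by the classifier.
words = ["find", "the", "golden", "dragon", "palace", "please"]
tags = ["O", "NAME", "NAME", "NAME", "O", "O"]
database = [["golden", "dragon", "palace"], ["golden", "gate", "grill"]]
span = group_partial_names(tags)[0]
corrected = correct_span(words, tags, span, database)
print(corrected)  # ['O', 'O', 'NAME', 'NAME', 'NAME', 'O']
```

In this example the left boundary word "the" is absent from the pool and is retagged, while the right neighbor "palace" follows "golden dragon" in a candidate full name and is absorbed into the name.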
Abstract
A method of proper name recognition includes classifying each word of a word string with a tag indicating a proper name entity category or a non-named entity category, and correcting the tag of a boundary word of the word string.
20 Claims
9. A method of proper name recognition for interpretation of lexical user input and by a system configured to respond to the lexical user input, comprising:
classifying each word of a word string containing a plurality of words with a tag indicating one of a proper named entity category and a non-named entity category, wherein consecutive words that are tagged as belonging to the same proper named entity category are grouped to form partial proper names; and
correcting the tag of a word of the word string;
wherein the correcting includes:
forming a pool of candidate full names that each has a maximum number of words occurring in a given partial proper name, wherein an order of matched words of the candidate full names is the same as that of the partial proper name, and wherein the pool of candidate full names are selected from a full name database;
removing a candidate full name from the pool if, when one or more inner words in the given partial proper name is removed, the resulting partial proper name is a substring in the candidate full name;
changing the tag of a boundary word of the partial proper name to the non-named entity category tag if the boundary word does not exist in the full name pool;
examining a neighboring word of the partial proper name; and
assigning the neighboring word to the partial proper name if the neighboring word occurs in the candidate full name and with the same ordering.
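The pool formation and inner-word filter recited in claim 9 might be sketched as follows. This is one possible reading, not the patent's implementation: counting ordered matches with a longest common subsequence, treating "substring" as a contiguous run of words, and all function names are assumptions.

```python
from itertools import combinations
from typing import List, Sequence


def lcs_len(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence: the number of words matched in the same order."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b):
            cur.append(prev[j] + 1 if x == y else max(prev[j + 1], cur[j]))
        prev = cur
    return prev[-1]


def is_contiguous(seq: List[str], name: List[str]) -> bool:
    """True if seq appears in name as an unbroken run of words."""
    n = len(seq)
    return n > 0 and any(name[i:i + n] == seq for i in range(len(name) - n + 1))


def matches_after_dropping_inner(partial: List[str], name: List[str]) -> bool:
    """True if removing one or more inner words of partial yields a run of words found in name."""
    inner = range(1, len(partial) - 1)  # boundary words are never dropped
    for k in range(1, len(inner) + 1):
        for drop in combinations(inner, k):  # exponential in the number of inner words
            reduced = [w for i, w in enumerate(partial) if i not in drop]
            if is_contiguous(reduced, name):
                return True
    return False


def ordered_pool(partial: List[str], full_names: Sequence[List[str]]) -> List[List[str]]:
    """Keep names with the maximum ordered match count, then apply the inner-word filter."""
    scores = [(lcs_len(partial, name), name) for name in full_names]
    best = max((s for s, _ in scores), default=0)
    pool = [name for s, name in scores if s == best and s > 0]
    return [name for name in pool if not matches_after_dropping_inner(partial, name)]


# Hypothetical example: both names match two words in order, but the first
# only matches once the inner word "big" is dropped from the partial name.
partial = ["golden", "big", "palace"]
names = [["golden", "palace", "hotel"], ["golden", "big", "wok"]]
pool = ordered_pool(partial, names)
print(pool)  # [['golden', 'big', 'wok']]
```

The filter discards candidates that agree with the partial name only when its interior is altered, which boundary correction alone could never repair.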
11. A system for proper name recognition, comprising:
a database including full proper names; and
a processor configured to:
perform a baseline named entity classifier function to assign a named entity tag to each word of a word string containing a plurality of words, wherein consecutive words that have the same named entity tag are grouped to form partial proper names; and
perform a correction function to correct the named entity tag of a word of the word string;
wherein, for the correction function, the processor is configured to:
form from the database a pool of candidate full names by selecting those full names that best match a given partial proper name, each of at least a subset of the candidate full names including a respective group of words;
change the tag of a boundary word of the given partial proper name to indicate a non-named entity category if the boundary word does not exist in the pool of candidate full names;
examine a neighboring word of the given partial proper name; and
assign the neighboring word of the given partial proper name to the named entity category if the neighboring word occurs in the candidate full name with the same ordering.
16. A processing arrangement to perform proper name recognition, comprising:
an input arrangement to receive a word string containing a plurality of words;
a memory storing:
a first set of instructions to assign a named entity tag to each word of the word string, wherein consecutive words that have the same named entity tag are grouped to form partial proper names; and
a second set of instructions to correct the named entity tag of a boundary word of the word string;
a central processing unit to execute the first and second set of instructions; and
an output arrangement to output a result of the executed instructions;
wherein the correcting includes:
forming a pool of candidate full names selected from a full name database by selecting those full names that best match a given partial proper name, each of at least a subset of the candidate full names including a respective group of words;
changing the tag of a boundary word of the given partial proper name to indicate a non-named entity category if the boundary word does not exist in the pool of candidate full names;
examining a neighboring word of the given partial proper name; and
assigning the neighboring word of the given partial proper name to the named entity category if the neighboring word occurs in the candidate full name with the same ordering.
17. A storage medium having a set of instructions executable by a processor to perform the following:
classifying each word of a word string containing a plurality of words with a tag indicating one of a proper name entity category and a non-named entity category, wherein consecutive words that are tagged as belonging to the same proper named entity category are grouped to form partial proper names; and
correcting the tag of a boundary word of the word string;
wherein the correcting includes:
forming a pool of candidate full names selected from a full name database by selecting those full names that best match a given partial proper name, each of at least a subset of the candidate full names including a respective group of words;
changing the tag of a boundary word of the given partial proper name to indicate a non-named entity category if the boundary word does not exist in the pool of candidate full names;
examining a neighboring word of the given partial proper name; and
assigning the neighboring word of the given partial proper name to the named entity category if the neighboring word occurs in the candidate full name with the same ordering.
18. A storage medium having a set of instructions executable by a processor to perform the following:
classifying each word of a word string containing a plurality of words with a tag indicating one of a proper name entity category and a non-named entity category, wherein consecutive words that are tagged as belonging to the same proper named entity category are grouped to form partial proper names; and
correcting the tag of a boundary word of the word string;
wherein the correcting includes:
forming a pool of candidate full names that each has a maximum number of words occurring in a given partial proper name, wherein an order of matched words of the candidate full names is the same as that of the partial proper name, and wherein the pool of candidate full names are selected from a full name database;
removing a candidate full name from the pool if, when one or more inner words in the given partial proper name is removed, the resulting partial proper name is a substring in the candidate full name;
removing a candidate full name from the pool if matching part of speech tags of the candidate full name are all non-content words, unless all of the part of speech tags in the candidate full name are non-content words;
changing a tag of a boundary word of the partial proper name to the non-named entity category tag if the boundary word does not exist in the full name pool;
examining a neighboring word of the partial proper name; and
assigning the neighboring word to the partial proper name if the neighboring word occurs in the candidate full name and with the same ordering.
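The part-of-speech filter that claim 18 adds can be sketched as below. The set of non-content tags (a few Penn Treebank labels), the `(word, tag)` representation of candidates, and the function name are assumptions for illustration; the claim does not specify a tag set.

```python
from typing import List, Sequence, Tuple

# Assumed set of non-content (function word) tags, using Penn Treebank labels.
NON_CONTENT_POS = {"DT", "IN", "CC", "TO", "PRP"}

Candidate = List[Tuple[str, str]]  # a full name as (word, part-of-speech tag) pairs


def filter_by_pos(partial: Sequence[str], pool: Sequence[Candidate]) -> List[Candidate]:
    """Drop candidates matched only on non-content words, unless the whole name is non-content."""
    partial_words = set(partial)
    kept = []
    for cand in pool:
        matched = [pos for word, pos in cand if word in partial_words]
        matched_all_non_content = bool(matched) and all(p in NON_CONTENT_POS for p in matched)
        name_all_non_content = all(p in NON_CONTENT_POS for _, p in cand)
        if matched_all_non_content and not name_all_non_content:
            continue  # the overlap carries no content, so the match is treated as spurious
        kept.append(cand)
    return kept


# Hypothetical example: the first candidate overlaps only on "of" and "the".
partial = ["of", "the", "dragon"]
pool = [
    [("house", "NN"), ("of", "IN"), ("the", "DT"), ("rising", "VBG"), ("sun", "NN")],
    [("lair", "NN"), ("of", "IN"), ("the", "DT"), ("dragon", "NN")],
    [("of", "IN"), ("the", "DT")],
]
kept = filter_by_pos(partial, pool)
print([" ".join(word for word, _ in cand) for cand in kept])
# ['lair of the dragon', 'of the']
```

The third candidate survives because every word in it is a non-content word, which is the exception the claim carves out.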
19. A method of proper name recognition for interpretation of lexical user input and by a system configured to respond to the lexical user input, comprising:
classifying each word of a word string containing a plurality of words with a tag indicating one of a proper named entity category and a non-named entity category; and
correcting the tag of a word of the word string;
wherein the correcting includes:
forming a pool of candidate full names selected from a full name database by selecting those full names that best match a sequence of consecutive words of the word string, each of the consecutive words having been tagged as belonging to the same proper named entity category, each of at least a subset of the candidate full names including a respective group of words;
determining which of the candidate full names is a best candidate; and
linking the sequence of words to the determined best candidate.
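Claim 19 does not say how the best candidate is determined. As a purely illustrative sketch, the following scores each candidate with the similarity ratio from Python's standard `difflib.SequenceMatcher`, which is a swapped-in heuristic and not a method taken from the patent; the function name and the returned record layout are likewise hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional, Sequence, Tuple


def link_to_best(words: Sequence[str], span: Tuple[int, int],
                 pool: Sequence[List[str]]) -> Optional[Dict[str, object]]:
    """Pick the candidate most similar to the tagged word sequence and link the two."""
    start, end = span
    mention = list(words[start:end])
    if not pool:
        return None  # nothing to link to
    # Assumed scoring: word-level similarity ratio; on ties, max() keeps the earliest candidate.
    best = max(pool, key=lambda name: SequenceMatcher(None, mention, name).ratio())
    return {"span": span, "mention": " ".join(mention), "full_name": " ".join(best)}


# Hypothetical example.
words = ["book", "golden", "dragon", "tonight"]
pool = [["golden", "gate", "grill"], ["golden", "dragon", "palace"]]
link = link_to_best(words, (1, 3), pool)
print(link["full_name"])  # golden dragon palace
```

Any other ranking, such as a classifier score or usage frequency, could replace the similarity ratio without changing the linking step.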
20. A method of proper name recognition for interpreting a user's speech input by a speech recognition system, the method comprising:
classifying each word of a word string containing a plurality of words with a tag indicating whether the respective word is of a proper named entity category, wherein a subset of the word string includes consecutive words that have all been tagged during the classifying as being of the same proper named entity category and includes boundaries formed by edge words of the subset that are not surrounded by words tagged as being of the same proper named entity category; and
performing a boundary correction procedure that includes:
determining whether the boundaries are correct by forming a pool of candidate full names selected from a full name database by selecting those full names that best match the subset and comparing the subset to the pool of candidate full names, each of at least a subset of the candidate full names including a respective group of words; and
if it is determined that the boundaries are incorrect, correcting the boundaries by at least one of:
(a) retagging at least one of the edge words; and
(b) retagging at least one word that is external to the subset and immediately adjacent one of the edge words.
Specification