Apparatus for classifying or disambiguating data
Abstract
A computing system has a data storage device (4, 5, 6) for storing a database consisting of a classified vocabulary of terms. A processor (1) of the apparatus is arranged to associate each term with one of a number of different categories of data and to associate all terms falling within the same category with a common code identifying a collocation of terms that exemplify that category so that terms in different categories are associated with different codes and can be disambiguated. The processor (1) is arranged to write, directly or indirectly, a classified vocabulary consisting of the terms together with the associated code onto a computer-readable storage medium (RDD2) or to supply an electrical signal via, for example, a MODEM (10) or a LAN/WAN (11). The database may be used in classification of documents, spelling checking of documents and refining of keyword search results.
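The classified vocabulary described in the abstract can be pictured as a mapping from terms to category codes. The following is a minimal sketch only: the codes `FIN` and `GEO`, the example terms, and the function names are hypothetical and do not come from the patent, and it assumes an ambiguous term may carry more than one code.

```python
import json
from collections import defaultdict

# Hypothetical category codes and exemplifying terms (not taken from the patent).
CATEGORIES = {
    "FIN": ["bank", "loan", "interest", "deposit"],
    "GEO": ["bank", "river", "erosion", "shore"],
}


def build_classified_vocabulary(categories):
    """Map each term to the sorted list of category codes it is filed under."""
    vocab = defaultdict(set)
    for code, terms in categories.items():
        for term in terms:
            vocab[term.lower()].add(code)
    return {term: sorted(codes) for term, codes in vocab.items()}


def serialize_vocabulary(vocab):
    """Produce a text form that could be written to a storage medium or sent over a link."""
    return json.dumps(vocab, sort_keys=True)


vocabulary = build_classified_vocabulary(CATEGORIES)
payload = serialize_vocabulary(vocabulary)
```

In this sketch, a term that appears under two codes (here `bank`) is exactly the kind of term that needs disambiguating, while a term with a single code is unambiguous.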
52 Citations
24 Claims
1. An apparatus for classifying a search query having at least one search query term used in conducting a subject matter search using a search engine, the apparatus comprising:
an accessor operable to access a store storing a plurality of different collocations of terms with the terms in each collocation exemplifying a particular subject matter category to facilitate disambiguation between different meanings of a same term;
a disambiguator operable to disambiguate different meanings of the at least one search query term by comparing the at least one search query term with the terms of the plurality of different collocations of terms to determine, on the basis of the relationship between the at least one search query term and the terms of the collocations, at least one subject matter category with which the at least one search query term is associated; and
a supplier operable to supply a signal including signal data representing the at least one search query term and the determined at least one subject matter category.
Dependent claims: 2, 3, 4
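The accessor, disambiguator, and supplier of claim 1 can be illustrated with a small sketch. It assumes that "the relationship" between query terms and collocation terms is a simple overlap count, which is only one possible reading; the category names, collocation contents, and the function `classify_query` are hypothetical.

```python
# Hypothetical store of collocations: category name -> exemplifying terms.
COLLOCATIONS = {
    "finance": {"bank", "loan", "interest", "account"},
    "rivers": {"bank", "river", "water", "erosion"},
}


def classify_query(query, collocations):
    """Return the query terms together with the best-matching categories.

    Each category is scored by how many query terms appear in its collocation.
    All categories tied for the highest non-zero score are returned.
    """
    terms = set(query.lower().split())
    scores = {cat: len(terms & coll) for cat, coll in collocations.items()}
    best = max(scores.values(), default=0)
    categories = sorted(c for c, s in scores.items() if s == best) if best > 0 else []
    # The returned dict stands in for the "signal data" of the claim.
    return {"query_terms": sorted(terms), "categories": categories}


specific = classify_query("river bank erosion", COLLOCATIONS)
ambiguous = classify_query("bank", COLLOCATIONS)
unknown = classify_query("guitar", COLLOCATIONS)
```

With these assumed collocations, a query of `bank` alone stays ambiguous between both categories, while additional query terms such as `river` and `erosion` tip the score toward one category.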
5. A computer processing apparatus for refining the results of a subject matter search carried out by a search engine using at least one keyword, the apparatus comprising:
a database accessor operable to access a database having a database structure providing a plurality of different subject matter categories, the database comprising a classified vocabulary consisting of terms in all of the different subject matter categories with each term being classified in accordance with the subject matter category structure of the database and the database also comprising a plurality of different collocations of terms with the terms in each collocation exemplifying a particular subject matter category to facilitate disambiguation between different meanings of the same term;
a receiver operable to receive computer-readable form documents forming the results of the subject matter search;
a disambiguator operable to compare the at least one keyword used to carry out the search with terms of at least one of the classified vocabulary and to disambiguate different meanings of the at least one keyword by comparing the at least one keyword with the terms of the plurality of different collocations of terms to determine, on the basis of the comparison, at least one category with which the at least one keyword is associated; and
a supplier operable to supply the user with information relating the search results to the determined at least one category.
Dependent claims: 6, 7, 8
9. A computer processing apparatus for refining the results of a subject matter search carried out by a search engine using at least one keyword, the apparatus comprising:
a database accessor operable to access a database having a database structure providing a plurality of different subject matter categories, the database comprising a classified vocabulary consisting of terms in all of the different subject matter categories with each term being classified in accordance with the subject matter category structure of the database and the database also comprising a plurality of collocations each collocation being associated with a specific different one of the subject matter categories and each collocation consisting of a plurality of terms exemplifying the associated category;
a receiver operable to receive computer-readable form documents forming the results of the subject matter search;
a classified vocabulary comparer operable to compare the at least one keyword used to carry out the search with the classified vocabulary to determine each category with which the keyword is associated;
an adviser operable to advise a user of the different categories with which the at least one keyword is associated;
a user-operable selector operable to enable a user to select one of said different categories;
a collocation accessor operable to access the collocation associated with the selected category;
a disambiguator operable to disambiguate different meanings of terms by comparing the terms used in the search result documents with the terms in the accessed collocation of terms to determine, on the basis of the extent to which each search result contains terms from the collocation of terms associated with the category, which of the search results is related to the selected category; and
a supplier operable to supply the user with information relating the search results to the selected category.
Dependent claims: 10
11. A method of classifying a search query having at least one search query term used in conducting a subject matter search using a search engine, the method comprising:
accessing a store storing a plurality of different collocations of terms with the terms in each collocation exemplifying a particular different subject matter category for facilitating disambiguation between different meanings of the same term;
disambiguating different meanings of the at least one search query term by comparing the at least one search query term with the plurality of different collocations of terms to determine, on the basis of the relationship between the at least one search query term and the terms of the collocations, at least one subject matter category with which the at least one search query term is associated; and
supplying a signal comprising signal data representing the at least one search query term and the determined at least one subject matter category.
Dependent claims: 12, 13, 14, 15
16. A method of refining search results of a subject matter search carried out by a search engine using at least one keyword, the method comprising the steps of:
comparing the at least one keyword used to carry out the search with terms of a classified vocabulary consisting of terms in all of a plurality of different subject matter categories and stored in a database that also comprises a plurality of collocations each collocation being associated with a specific different one of the subject matter categories and each collocation consisting of a plurality of terms exemplifying the associated category to determine each category with which the keyword is associated;
advising a user of the different categories with which the at least one keyword is associated;
determining a selection by the user of one of said different categories;
accessing the collocation associated with the selected category;
disambiguating different meanings of terms used in the search results by comparing the terms used in the search results with the terms in the accessed collocation to determine, on the basis of the extent to which each search result contains terms from the collocation of terms associated with the category, which of the search results is related to the selected category; and
supplying the user with information relating the search results to the selected category.
17. A processor readable medium storing processor readable instructions, said instructions causing a processor to:
access a store storing a plurality of different collocations of terms with the terms in each collocation exemplifying a particular different subject matter category for facilitating disambiguation between different meanings of a target term;
disambiguate different meanings of the target term by comparing the target term with the terms of the plurality of different collocations of terms to determine, on the basis of the relationship between the target term and the terms of the collocations, at least one subject matter category with which the target term is associated; and
supply a signal comprising signal data representing the target term and the determined at least one subject matter category.
18. Apparatus for classifying a search query comprising at least one search query term used in conducting a subject matter search using a search engine, the apparatus comprising:
means for accessing a store storing a plurality of different collocations of terms with the terms in each collocation exemplifying a particular different subject matter category to facilitate disambiguation between different meanings of the same term;
means for disambiguating different meanings of the at least one search query term by comparing the at least one search query term with the plurality of different collocations of terms to determine, on the basis of the relationship between the at least one search query term and the terms of the collocations, at least one subject matter category with which the at least one search query term is associated; and
means for supplying a signal comprising signal data representing the at least one search query term and the determined at least one subject matter category.
19. An apparatus for classifying an item of data that comprises or is associated with text comprising terms, the apparatus comprising:
an accessor operable to access a store storing a plurality of different collocations of terms with the terms in each collocation exemplifying a particular different subject matter category to facilitate disambiguation between different meanings of the same term; and
a disambiguator operable to disambiguate different meanings of terms used in the text by comparing terms of the text with terms of the plurality of different collocations to determine, on the basis of the relationship between terms of the text and terms of the collocations, at least one subject matter category for the item of data to enable that item of data to be associated with other items of data in that at least one subject matter category.
Dependent claims: 20
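Claim 19 applies the same comparison to whole items of data, so that items falling in the same category can be associated with one another. The sketch below assumes, as an illustration only, that the relationship is measured by counting occurrences of collocation terms in the text; the collocations, item identifiers, and function names are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical collocations: category name -> exemplifying terms.
COLLOCATIONS = {
    "finance": {"bank", "loan", "interest", "account"},
    "rivers": {"bank", "river", "water", "erosion"},
}


def classify_text(text, collocations):
    """Return the categories whose collocation terms occur most often in the text."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {
        cat: sum(counts[term] for term in coll)
        for cat, coll in collocations.items()
    }
    best = max(scores.values(), default=0)
    if best == 0:
        return []  # no collocation term occurs, so the item stays unclassified
    return sorted(cat for cat, score in scores.items() if score == best)


def group_items(items, collocations):
    """Group item identifiers by determined category."""
    groups = defaultdict(list)
    for item_id, text in items.items():
        for cat in classify_text(text, collocations):
            groups[cat].append(item_id)
    return dict(groups)


items = {
    "doc1": "The bank approved the loan and set the loan interest.",
    "doc2": "Erosion wore away the river bank.",
    "doc3": "A recipe for bread.",
}
groups = group_items(items, COLLOCATIONS)
```

Here the shared term `bank` contributes to both categories, and the surrounding terms (`loan`, `interest` versus `river`, `erosion`) decide which category each item is grouped under.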
21. A computer processing apparatus for classifying an item of data that comprises or is associated with text comprising terms, the computer processing apparatus comprising:
a database accessor operable to access a database having a database structure providing a plurality of different subject matter categories, the database comprising a classified vocabulary consisting of terms in all of the different subject matter categories with each classified vocabulary term being classified in accordance with the subject matter category structure of the database and the database also comprising a plurality of different collocations of terms with the collocation terms in each collocation exemplifying a particular different subject matter category to facilitate disambiguation between different meanings of the same term;
a receiver operable to receive items of data;
a disambiguator operable to disambiguate different meanings of the same term by comparing terms of the text data with at least one of the classified vocabulary terms and the collocation terms to determine, on the basis of the extent to which the terms of the text correspond to the at least one of the classified vocabulary terms and the collocation terms, at least one category with which the data item is associated; and
a supplier operable to supply the user with information relating the data item to the determined at least one category.
Dependent claims: 22
23. A computer processing apparatus for classifying items of data that each comprise or is associated with text comprising terms, the computer processing apparatus comprising:
a database accessor operable to access a database having a database structure providing a plurality of different subject matter categories, the database comprising a classified vocabulary consisting of terms in all of the different subject matter categories with each classified vocabulary term being classified in accordance with the subject matter category structure of the database and the database also comprising a plurality of collocations each collocation being associated with a specific different one of the subject matter categories and each collocation consisting of a plurality of terms exemplifying the associated category;
a receiver operable to receive items of data;
a classified vocabulary comparer operable to compare terms of the text for received items of data with classified vocabulary terms of the classified vocabulary to determine each category with which each item of data is associated;
an adviser operable to advise a user of the different categories with which the items of data are associated;
a user-operable selector operable to enable a user to select one of said different categories;
a collocation accessor operable to access the collocation associated with the selected category;
a disambiguator operable to disambiguate between different meanings of terms by comparing the terms of the items of data with the collocation terms in the accessed collocation to determine, on the basis of the extent to which the data terms of each item of data contains collocation terms from the collocation of terms associated with the selected category, which of the items of data is related to the selected category; and
a supplier operable to supply information regarding the relationship between the items of data and the selected category.
Dependent claims: 24
Specification