Analyzing data files
Abstract
Data files (205) are categorised in order to facilitate the searching for information. The analysis is performed in order to identify items which may be considered as having high value without actually being directly specified. Occurrences of unspecified candidate items are identified (207) in contexts for a preferred specified category. Occurrences of unspecified candidate items are identified (209) in non-preferred contexts. The preferred occurrences are processed (211) with the non-preferred occurrences for each candidate item in order to select candidate items as being high value items. In the preferred embodiment, data relating to companies is identified without specific company names being defined.
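The flow summarised in the abstract (identify candidates in preferred contexts, identify them in non-preferred contexts, combine the two per candidate, then select) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the regular-expression rules, the names `PREFERRED_RULES`, `NON_PREFERRED_RULES`, `count_occurrences` and `select_high_value_items`, the choice of subtraction as the way preferred and non-preferred occurrences are processed together, and the `threshold` parameter.

```python
import re
from collections import Counter

# Hypothetical "first rules": contexts suggesting the candidate is a company.
PREFERRED_RULES = [
    re.compile(r"\b([A-Z][a-z]+) (?:Inc|Ltd|Corp)\b"),
    re.compile(r"\bshares in ([A-Z][a-z]+)\b"),
]

# Hypothetical "second rules": contexts suggesting a person or place instead.
NON_PREFERRED_RULES = [
    re.compile(r"\b(?:Mr|Mrs|Dr) ([A-Z][a-z]+)\b"),
    re.compile(r"\bborn in ([A-Z][a-z]+)\b"),
]


def count_occurrences(text, rules):
    """Count how often each candidate item is captured by any of the rules."""
    counts = Counter()
    for rule in rules:
        for match in rule.finditer(text):
            counts[match.group(1)] += 1
    return counts


def select_high_value_items(text, threshold=1):
    """Select candidates whose preferred count exceeds the non-preferred count.

    Subtraction and the threshold are assumptions; the source only says the
    two sets of occurrences are processed together for each candidate.
    """
    preferred = count_occurrences(text, PREFERRED_RULES)
    non_preferred = count_occurrences(text, NON_PREFERRED_RULES)
    scores = {c: preferred[c] - non_preferred[c] for c in preferred}
    return sorted(c for c, s in scores.items() if s >= threshold)


sample = (
    "Acme Inc reported profits. Analysts bought shares in Acme. "
    "Mr Baker said Baker Ltd was sold. Dr Baker was born in Baker."
)
selected = select_high_value_items(sample)
print(selected)  # ['Acme']
```

In this sample no company name is listed in advance. "Acme" is selected purely from its contexts, while "Baker" is rejected because its non-preferred occurrences outnumber its single preferred one.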
42 Claims
1. Apparatus arranged to receive data files from data sources and to categorize said data files to facilitate searching in response to user-requests, wherein said data files contain unspecified high value items whose characteristic rather than content may be of interest and high value to a user, said apparatus comprising:
identifying means for identifying occurrences within a received data file of unspecified candidate items in preferred contexts based on first rules likely to identify a preferred specified category, and for identifying occurrences within said received data file of said unspecified candidate items in non-preferred contexts based on second rules likely to identify a non-preferred specified category; and
processing means for processing said preferred occurrences with said non-preferred occurrences for each of said unspecified candidate items and to select one of said unspecified candidate items as a high value item whose characteristic rather than content may be of interest and high value to a user.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A method of analyzing data files containing representations of a natural language to identify unspecified high value items whose characteristic rather than content may be of interest and high value to a user, said method comprising:
identifying occurrences of unspecified candidate items within a data file in preferred contexts based on first rules likely to identify a preferred specified category;
identifying occurrences of unspecified candidate items within said data file in non-preferred contexts based on second rules likely to identify a non-preferred specified category;
processing said preferred occurrences with said non-preferred occurrences for each one of said unspecified candidate items; and
selecting one of said unspecified candidate items as a high value item whose characteristic rather than content may be of interest and high value to a user.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
23. A computer system programmed to execute stored instructions such that in response to said stored instructions said system is configured to:
identify occurrences within a data file of unspecified candidate items in preferred contexts based on first rules likely to identify a preferred specified category;
identify occurrences within said data file of unspecified candidate items in non-preferred contexts based on second rules likely to identify a non-preferred specified category;
process said preferred occurrences with said non-preferred occurrences for each one of said unspecified candidate items; and
select one of said unspecified candidate items as a high value item whose characteristic rather than content may be of interest and high value to a user.
Dependent claims: 24, 25, 26, 27, 28, 29, 30, 31
32. A computer-readable medium having computer-readable instructions executable by a computer such that, when executing said instructions, the computer will perform the steps of:
identifying occurrences within a data file of unspecified candidate items in preferred contexts based on first rules likely to identify a preferred specified category;
identifying occurrences within said data file of unspecified candidate items in non-preferred contexts based on second rules likely to identify a non-preferred specified category;
processing said preferred occurrences with said non-preferred occurrences for each one of said unspecified candidate items; and
selecting one of said unspecified candidate items as a high value item whose characteristic rather than content may be of interest and high value to a user.
Dependent claims: 33, 34, 35, 36, 37, 38, 39, 40, 41, 42
Specification