System and method for generating personalized user profiles and for utilizing the generated user profiles to perform adaptive internet searches
Abstract
A system and method for automatically generating personalized user profiles and for utilizing the generated profiles to perform adaptive Internet or computer data searches is provided. In accordance with the present invention, particular linguistic patterns and their frequency of recurrence are extracted from personal texts provided by the users of the system of the present invention and stored in a user profile data file such that the user profile data file is representative of the user's overall linguistic patterns and the frequencies of recurrence thereof. All documents in a remote computer system, such as the Internet, are likewise analyzed and their linguistic patterns and pattern frequencies are also extracted and stored in corresponding document profiles. When a search for particular data is initiated by the user, linguistic patterns are also extracted from a search string provided by the user into a search profile. The user profile is then cross matched with the search profile and the document profiles to determine whether any linguistic patterns match in all three profiles and to determine the magnitude of the match based on summation of respective frequencies of recurrence of the matching patterns. The documents with document profiles having the highest matching magnitudes are presented to the user as not only matching the subject of the search string, but also as corresponding to the user's cultural, educational, and social backgrounds as well as the user's psychological profile.
348 Citations
62 Claims
-
1. A data processing method for enabling a user utilizing a local computer system having a local data storage system to locate desired data from a plurality of data items stored in a remote data storage system in a remote computer system, the remote computer system being linked to the local computer system by a telecommunication link, the method comprising the steps of:
-
(a) extracting, by one of the local computer system and the remote computer system, a user profile from user linguistic data previously provided by the user, said user data profile being representative of a first linguistic pattern of the said user linguistic data;
(b) constructing, by the remote computer system, a plurality of data item profiles, each plural data item profile corresponding to a different one of each plural data item stored in the remote data storage system, each of said plural data item profiles being representative of a second linguistic pattern of a corresponding plural data item, each said plural second linguistic pattern being substantially unique to each corresponding plural data item;
(c) providing, by the user to the local computer system, search request data representative of the user's expressed desire to locate data substantially pertaining to said search request data;
(d) extracting, by one of the local computer system and the remote computer system, a search request profile from said search request data, said search request profile being representative of a third linguistic pattern of said search request data;
(e) determining, by one of the local computer system and the remote computer system, a first similarity factor representative of a first correlation between said search request profile and said user profile by comparing said search request profile to said user profile;
(f) determining, by one of the local computer system and the remote computer system, a plurality of second similarity factors, each said plural second similarity factor being representative of a second correlation between said search request profile and a different one of said plural data item profiles, by comparing said search request profile to each of said plural data item profiles;
(g) calculating, by one of the local computer system and the remote computer system, a final match factor for each of said plural data item profiles, by adding said first similarity factor to at least one of said plural second similarity factors in accordance with at least one intersection between said first correlation and said second correlation;
(h) selecting, by one of the local computer system and the remote computer system, one of said plural data items corresponding to a plural data item profile having a highest final match factor; and
(i) retrieving, by one of the local computer system and the remote computer system from the remote data storage system, said selected data item for display to the user, such that the user is presented with a data item having linguistic characteristics that substantially correspond to linguistic characteristics of the linguistic data generated by the user, whereby the linguistic characteristics of the data item correspond to the user's social, cultural, educational, economic background as well as to the user's psychological profile.
2. The method of claim 1, further comprising the step of:
(j) prior to said step (a), automatically adding, by one of the local computer system and the remote computer system, textual data generated by the user during utilization of the local computer system to said user linguistic data.
-
-
3. The method of claim 1, wherein said user linguistic data comprises at least one of:
- personal textual data generated by the user and favorite textual data generated by a source other than the user and that the user has adopted as being favorite.
-
4. The method of claim 1, wherein said user linguistic data comprises at least one text item, each said at least one text item comprising at least one sentence.
-
5. The method of claim 3, further comprising the step of:
(k) prior to said step (a), selecting, by the user, at least one of said personal textual data and said favorite textual data, from textual data stored in one of the local data storage system and the remote data storage system.
-
6. The method of claim 1, further comprising the step of:
-
(l) prior to said step (a), determining, by one of the local computer system and the remote computer system, whether an existing user data profile is stored in one of the local data storage system and the remote data storage system, and:
1) when an existing user data profile is stored in one of the local data storage system and the remote data storage system, retrieving said existing user data profile and proceeding to said step (b); and
2) when an existing user data profile is not stored in one of the local data storage system and the remote data storage system, proceeding to said step (a).
-
-
7. The method of claim 4, wherein said step (a) comprises the steps of:
-
(m) generating, by one of the local computer system and the remote computer system, a user data profile;
(n) retrieving, by one of the local computer system and the remote computer system, a text item from said user linguistic data;
(o) separating, by one of the local computer system and the remote computer system, said text item into at least one sentence;
(p) extracting, from each of said at least one sentence, by one of the local computer system and the remote computer system, at least one segment representative of a linguistic pattern of each sentence of said at least one sentence;
(q) adding, by one of the local computer system and the remote computer system, at least one segment extracted at said step (p) to said user data profile;
(r) repeating, by one of the local computer system and the remote computer system, said steps (n) to (q) for each text item of said at least one text item in said user linguistic data;
(s) generating at least one user segment group, by one of the local computer system and the remote computer system, by grouping together identical segments of said at least one segment;
(t) determining a user segment count, by one of the local computer system and the remote computer system, for each user segment group of said at least one user segment group, each said user segment count being representative of a number of identical segments in the corresponding user segment group of said at least one user segment group, and linking each said user segment count to the corresponding user segment group of said at least one user segment group;
(u) sorting the user segment groups of said at least one user segment group, by one of the local computer system and the remote computer system, in a descending order of user segment counts starting from a user segment group having a highest user segment count, and recording said user segment groups and corresponding user segment counts in said user data profile; and
(v) storing, by one of the local computer system and the remote computer system, said user data profile, representative of said first linguistic pattern, in at least one of the local data storage system and the remote data storage system.
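As an illustration only (not part of the claims), steps (s) through (u) of claim 7 can be sketched in Python. The sketch assumes each segment is represented as a tuple of words; the name `build_profile` and the sample segments are hypothetical.

```python
from collections import Counter

def build_profile(segments):
    """Group identical segments, count each group, and sort groups by count."""
    counts = Counter(segments)  # one entry per group of identical segments
    # most_common() returns (segment, count) pairs from highest to lowest count
    return counts.most_common()

# Hypothetical segments extracted from a user's texts
segments = [
    ("dog", "chased", "red"),
    ("cat", "slept", "lazy"),
    ("dog", "chased", "red"),
]
profile = build_profile(segments)
print(profile)
```

Here the segment that occurs twice is listed first, followed by the segment that occurs once.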
-
-
8. The method of claim 7, wherein said step (o) comprises the steps of:
-
(w) determining a word count by sequentially counting words of said text item;
(x) when an end of sentence mark is reached before said word count reaches a predefined word limit, storing said counted words as a sentence, restarting said word count, and repeating said step (w) starting after a last word of said stored sentence; and
(y) when said word count reaches said predefined word limit, storing said counted words as a sentence, restarting said word count, and repeating said step (w) starting after a last word of said stored sentence.
-
-
9. The method of claim 8, wherein said end of sentence mark comprises one of:
- a period, an exclamation mark, and a question mark.
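The sentence separation of claims 8 and 9 might look roughly like the following Python sketch, offered as an illustration rather than the claimed implementation. The default `word_limit` of 20 is an assumed value (the claims do not give one), and keeping trailing words that never reach an end mark or the limit as a final sentence is likewise an assumption.

```python
END_MARKS = (".", "!", "?")  # end-of-sentence marks listed in claim 9

def split_sentences(text, word_limit=20):
    """Cut text into sentences at an end mark or at word_limit words, whichever comes first."""
    sentences, current = [], []
    for word in text.split():
        current.append(word)
        if word.endswith(END_MARKS) or len(current) >= word_limit:
            sentences.append(" ".join(current))
            current = []  # restart the word count
    if current:  # assumption: leftover words form a final sentence
        sentences.append(" ".join(current))
    return sentences

parts = split_sentences("It rained. We stayed in and read old books all day", word_limit=4)
print(parts)
```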
-
10. The method of claim 7, wherein said step (p) comprises the steps, performed for each sentence of said at least one sentence, of:
-
(z) identifying and tagging each word in a sentence as one of a predetermined plurality of different parts of speech; and
(aa) arranging a predetermined number of said tagged words in a predetermined order of said predetermined plural different parts of speech to compose at least one segment for each possible combination of said predetermined number of said tagged words arranged in said predetermined order, said at least one segment being representative of a linguistic pattern of said sentence.
-
-
11. The method of claim 10, further comprising the step of:
(bb) after said step (z), determining whether each word may serve as an additional part of speech, and when a word may serve as an additional part of speech, adding an additional tag to said word to identify said word as said additional part of speech.
-
12. The method of claim 10, wherein said predetermined plurality of different parts of speech comprises at least one of:
- noun, pronoun, verb, adverb, adjective, gerund, preposition, conjunction and interjection.
-
13. The method of claim 10, wherein said predetermined plurality of different parts of speech comprises a noun, a verb and an adjective, wherein said predetermined number is three, and wherein said predetermined order is noun, verb, adjective.
-
14. The method of claim 10, wherein said step (aa) further comprises the step of:
(cc) when one of said predetermined plural different parts of speech is missing from said sentence, inserting a blank mark into said segment instead of said missing predetermined part of speech.
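A minimal Python sketch of the segment composition in claims 10, 13 and 14 follows, for illustration only. The toy `LEXICON` stands in for a real part-of-speech tagger, and the `"_"` blank mark, the punctuation stripping and the function name are all assumptions not taken from the claims.

```python
from itertools import product

ORDER = ("noun", "verb", "adjective")  # predetermined order from claim 13
BLANK = "_"  # hypothetical blank mark for a missing part of speech

# Hypothetical toy lexicon; a real system would use a part-of-speech tagger.
LEXICON = {
    "dog": {"noun"},
    "cat": {"noun"},
    "chased": {"verb"},
    "quick": {"adjective"},
}

def extract_segments(sentence):
    """Return every (noun, verb, adjective) combination found in the sentence."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    slots = []
    for pos in ORDER:
        candidates = [w for w in words if pos in LEXICON.get(w, ())]
        slots.append(candidates or [BLANK])  # blank mark when the part of speech is missing
    return list(product(*slots))  # one segment per possible combination

print(extract_segments("The quick dog chased the cat."))
print(extract_segments("Dog chased cat"))
```

The second call has no adjective, so each of its segments carries the blank mark in the third slot.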
-
15. The method of claim 7, wherein said step (v) further comprises the step of:
(dd) encrypting said user data profile such that said encrypted user data profile may only be utilized when an authorization is received from the user.
-
16. The method of claim 7, wherein said step (u) further comprises the step of:
(ee) recording, in said user data profile, only a first predetermined portion of said at least one user segment groups having highest user segment counts.
-
17. The method of claim 16, wherein said first predetermined portion comprises one of:
- 5,000 user segment groups and a top five percent of said at least one user segment groups.
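For illustration, the truncation in claims 16 and 17 could be sketched as below, assuming the profile is already a list of (segment, count) pairs sorted in descending order. The function name, the parameter names and the choice to round the percentage up are assumptions.

```python
import math

def truncate_profile(sorted_groups, max_groups=5000, top_percent=None):
    """Keep only the highest-count groups of a profile sorted in descending order."""
    if top_percent is not None:
        # assumption: round up so that a non-empty profile keeps at least one group
        keep = math.ceil(len(sorted_groups) * top_percent / 100)
    else:
        keep = max_groups
    return sorted_groups[:keep]

# Hypothetical profile of 40 groups with counts 100, 99, ..., 61
groups = [(f"seg{i}", 100 - i) for i in range(40)]
top = truncate_profile(groups, top_percent=5)
print(top)
```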
-
18. The method of claim 7, wherein said step (b) further comprises a step of:
(ff) for each plural data item, generating a data item record comprising at least one text item from the data item, each said at least one text item comprising at least one sentence.
-
19. The method of claim 18, wherein one of said at least one text items is a primary text item, and wherein said primary text item comprises at least one hyperlink to at least one additional text item, such that when said at least one hyperlink is activated, said at least one additional text item is thereby retrieved, further comprising the step of:
(gg) retrieving, by the remote computer system, said at least one additional text item into said data item record.
-
20. The method of claim 18, wherein said step (b) comprises the steps, performed for each plural data item, of:
-
(hh) generating, by the remote computer system, a data item profile, said data item profile comprising a data item address representative of a location of said data item in the remote data storage system, such that said data item may be retrieved by providing said data item address to said remote computer system;
(ii) retrieving, by the remote computer system, a text item from said data item record;
(jj) separating, by the remote computer system, said text item into at least one sentence;
(kk) extracting, from each of said at least one sentence, by the remote computer system, at least one segment representative of a linguistic pattern of each sentence of said at least one sentence;
(ll) adding, by the remote computer system, at least one segment extracted at said step (kk) to said data item profile;
(mm) repeating, by the remote computer system, said steps (ii) to (ll) for each text item of said at least one text item in said data item record;
(nn) generating at least one data segment group, by the remote computer system, by grouping together identical segments of said at least one segment;
(oo) determining a data item segment count, by the remote computer system, for each data segment group of said at least one data segment group, each said data item segment count being representative of a number of identical segments in the corresponding data segment group of said at least one data segment group, and linking each said data item segment count to the corresponding data segment group of said at least one data segment group;
(pp) sorting the data segment groups of said at least one data segment group, by the remote computer system, in a descending order of data item segment counts starting from a data segment group having a highest data item segment count, and recording said data segment groups and corresponding data item segment counts in said data item profile; and
(qq) storing, by the remote computer system, said data item profile, representative of one of said plural second linguistic patterns, in the remote data storage system.
-
-
21. The method of claim 20, wherein said step (jj) comprises the steps of:
-
(rr) determining a word count by sequentially counting words of said text item;
(ss) when an end of sentence mark is reached before said word count reaches a predefined word limit, storing said counted words as a sentence, restarting said word count, and repeating said step (rr) starting after a last word of said stored sentence; and
(tt) when said word count reaches said predefined word limit, storing said counted words as a sentence, restarting said word count, and repeating said step (rr) starting after a last word of said stored sentence.
-
-
22. The method of claim 21, wherein said end of sentence mark comprises one of:
- a period, an exclamation mark, and a question mark.
-
23. The method of claim 20, wherein said step (kk) comprises the steps, performed for each sentence of said at least one sentence, of:
-
(uu) identifying and tagging each word in a sentence as one of said predetermined plurality of different parts of speech; and
(vv) arranging a predetermined number of said tagged words in a predetermined order of said predetermined plural different parts of speech to compose at least one segment for each possible combination of said predetermined number of said tagged words arranged in said predetermined order, said at least one segment being representative of a linguistic pattern of said sentence.
-
-
24. The method of claim 23, further comprising the step of:
(ww) after said step (uu), determining whether each word may serve as an additional part of speech, and when a word may serve as an additional part of speech, adding an additional tag to said word to identify said word as said additional part of speech.
-
25. The method of claim 23, wherein said predetermined plurality of different parts of speech comprises at least one of:
- noun, pronoun, verb, adverb, adjective, gerund, preposition, conjunction and interjection.
-
26. The method of claim 23, wherein said predetermined plurality of different parts of speech comprises a noun, a verb and an adjective, wherein said predetermined number is three, and wherein said predetermined order is noun, verb, adjective.
-
27. The method of claim 23, wherein said step (vv) further comprises the step of:
(xx) when one of said predetermined plural different parts of speech is missing from said sentence, inserting a blank mark into said segment instead of said missing predetermined part of speech.
-
28. The method of claim 20, wherein said step (pp) further comprises the step of:
(yy) recording, in said data item profile, only a second predetermined portion of said at least one data segment groups having highest data item segment counts.
-
29. The method of claim 28, wherein said second predetermined portion comprises one of:
- 5,000 data segment groups and a top five percent of said at least one data segment groups.
-
30. The method of claim 20, wherein said step (d) comprises the steps of:
-
(zz) generating, by one of the local computer system and the remote computer system, a search profile;
(aaa) separating, by one of the local computer system and the remote computer system, said search request data into at least one sentence;
(bbb) extracting, from each of said at least one sentence, by one of the local computer system and the remote computer system, at least one search segment representative of a linguistic pattern of each sentence of said at least one sentence; and
(ccc) adding, by one of the local computer system and the remote computer system, at least one search segment extracted at said step (bbb) to said search profile, said search profile being representative of said third linguistic pattern of said search request data.
-
-
31. The method of claim 30, wherein said step (aaa) comprises the steps of:
-
(ddd) determining a word count by sequentially counting words of said search request data;
(eee) when an end of sentence mark is reached before said word count reaches a predefined word limit, storing said counted words as a sentence, restarting said word count, and repeating said step (ddd) starting after a last word of said stored sentence; and
(fff) when said word count reaches said predefined word limit, storing said counted words as a sentence, restarting said word count, and repeating said step (ddd) starting after a last word of said stored sentence.
-
-
32. The method of claim 31, wherein said end of sentence mark comprises one of:
- a period, an exclamation mark, and a question mark.
-
33. The method of claim 30, wherein said step (bbb) comprises the steps, performed for each sentence of said at least one sentence, of:
-
(ggg) identifying and tagging each word in a sentence as one of said predetermined plurality of different parts of speech; and
(hhh) arranging a predetermined number of said tagged words in a predetermined order of said predetermined plural different parts of speech to compose at least one segment for each possible combination of said predetermined number of said tagged words arranged in said predetermined order, said at least one segment being representative of a linguistic pattern of said sentence.
-
-
34. The method of claim 33, further comprising the step of:
(iii) after said step (ggg), determining whether each word may serve as an additional part of speech, and when a word may serve as an additional part of speech, adding an additional tag to said word to identify said word as said additional part of speech.
-
35. The method of claim 33, wherein said predetermined plurality of different parts of speech comprises at least one of:
- noun, pronoun, verb, adverb, adjective, gerund, preposition, conjunction and interjection.
-
36. The method of claim 33, wherein said predetermined plurality of different parts of speech comprises a noun, a verb and an adjective, wherein said predetermined number is three, and wherein said predetermined order is noun, verb, adjective.
-
37. The method of claim 33, wherein said step (hhh) further comprises the step of:
(jjj) when one of said predetermined plural different parts of speech is missing from said sentence, inserting a blank mark into said segment instead of said missing predetermined part of speech.
-
38. The method of claim 33, further comprising the steps of:
-
(kkk) determining, by one of the local computer system and the remote computer system, at least one synonym for each word in each segment;
(lll) composing, by one of the local computer system and the remote computer system, a plurality of alternate search segments for each segment utilizing said synonyms, wherein said alternate search segments are composed in accordance with said predetermined order of said predetermined plural different parts of speech; and
(mmm) recording, by one of the local computer system and the remote computer system, said plural alternate search segments in said search profile.
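The synonym expansion of claim 38 can be illustrated with a short Python sketch. The `SYNONYMS` table is a hypothetical stand-in for a thesaurus lookup, and segments are assumed to be tuples of words already arranged in the predetermined order.

```python
from itertools import product

# Hypothetical synonym table; a real system would consult a thesaurus.
SYNONYMS = {
    "dog": ["hound"],
    "chased": ["pursued"],
    "quick": ["fast", "rapid"],
}

def alternate_segments(segment):
    """Compose alternate segments (tuples) by substituting synonyms slot by slot."""
    options = [[word] + SYNONYMS.get(word, []) for word in segment]
    # product() keeps the slot order; the original segment itself is excluded
    return [alt for alt in product(*options) if alt != segment]

segment = ("dog", "chased", "quick")
alternates = alternate_segments(segment)
print(len(alternates))
```

With two, two and three options per slot there are twelve combinations, one of which is the original segment, so eleven alternates would be recorded in the search profile.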
-
-
39. The method of claim 30, wherein said step (e) comprises the steps of:
-
(nnn) retrieving, by one of the local computer system and the remote computer system, said user data profile from one of the local data storage system and the remote data storage system; and
(ooo) comparing, by one of the local computer system and the remote computer system, said at least one user segment group to said at least one search segment, and recording said user segment counts of each user segment group of said at least one user segment group that matches a corresponding search segment of said at least one search segment, said user segment counts being representative of said first similarity factor.
-
-
40. The method of claim 39, wherein said step (f) comprises the steps of:
-
(ppp) for each plural data item, retrieving, by one of the local computer system and the remote computer system, a corresponding data item profile from the remote data storage system; and
(qqq) for each plural data item profile, comparing, by one of the local computer system and the remote computer system, said at least one data segment group to said at least one search segment, and recording said data segment counts of each data segment group of said at least one data segment group that matches a corresponding search segment of said at least one search segment, said data segment counts being representative of said plural second similarity factor.
-
-
41. The method of claim 40, wherein said step (g) comprises the steps of:
-
(rrr) for each said plural data item profile, determining at least one match value, by one of the local computer system and the remote computer system, by first identifying a data segment group in the plural data item profile that matches both a corresponding search segment and a corresponding user segment group and then adding said user segment count of said corresponding user segment group to said data segment count of said identified data segment group, wherein when no matches are identified, said at least one match value is set to null; and
(sss) for each said plural data item profile, determining a final match factor, by one of the local computer system and the remote computer system, by adding together all said at least one match values determined for said plural data item profile at said step (rrr).
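Steps (rrr) and (sss) of claim 41 amount to a three-way match followed by a sum, which the following Python sketch illustrates. It assumes each profile is a dictionary mapping a segment to its count, and it uses `None` to stand in for the "null" value; both choices, and all names, are assumptions.

```python
def final_match_factor(search_segments, user_profile, item_profile):
    """Sum user and data item counts over segments present in all three profiles.

    Returns None (standing in for "null") when no segment matches.
    """
    match_values = [
        user_profile[seg] + item_profile[seg]  # user count plus data item count
        for seg in set(search_segments)
        if seg in user_profile and seg in item_profile
    ]
    return sum(match_values) if match_values else None

# Hypothetical segments and counts
A = ("dog", "chased", "quick")
B = ("cat", "slept", "lazy")
C = ("bird", "sang", "loud")

search = [A, B]
user = {A: 4, C: 9}
item_1 = {A: 3, B: 7}
item_2 = {B: 2}

print(final_match_factor(search, user, item_1))
print(final_match_factor(search, user, item_2))
```

Only segment `A` appears in the search profile, the user profile and `item_1`, so the first result is 4 + 3 = 7; `item_2` shares no segment with the user profile, so the second result is `None`.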
-
-
42. The method of claim 40, wherein said step (ppp) comprises the steps of:
-
(ttt) applying, by the remote computer system, said search request data to a conventional data search engine, implemented in the remote computer system, to return a list of at least one data item address of at least one preliminary matching data item that potentially corresponds to said search request data; and
(uuu) retrieving from the remote storage system, by one of the local computer system and the remote computer system, at least one data item profile corresponding to said at least one preliminary matching data item in said list.
-
-
43. The method of claim 1, wherein said step (h) comprises the steps of:
-
(vvv) selecting, by one of the local computer system and the remote computer system, a portion of said plural data items corresponding to a predetermined number of plural data item profiles having highest final match factors; and
wherein said step (i) comprises the step of:
(www) retrieving, by one of the local computer system and the remote computer system from the remote data storage system, said selected data items for display to the user, such that the user is presented with a group of data items having linguistic characteristics that substantially correspond to linguistic characteristics of the linguistic data generated by the user, whereby the linguistic characteristics of the data items correspond to the user's social, cultural, educational, economic background as well as to the user's psychological profile.
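As a hedged illustration of step (vvv), the selection of a predetermined number of top-scoring data items could be sketched as follows. The mapping from data item address to final match factor, the default of ten results and the treatment of `None` as "no match" are assumptions.

```python
import heapq

def select_top_items(final_match_factors, n=10):
    """Return the addresses of the n data items with the highest final match factors."""
    # assumption: None marks an item with no matching segments, so it is skipped
    scored = {addr: f for addr, f in final_match_factors.items() if f is not None}
    return heapq.nlargest(n, scored, key=scored.get)

# Hypothetical addresses and final match factors
factors = {"a": 7, "b": None, "c": 12, "d": 3}
print(select_top_items(factors, n=2))
```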
-
-
61. The method of claim 1, wherein the remote computer system comprises a plurality of computer systems connected to the Internet and the World Wide Web.
-
44. A data processing method for enabling a user, utilizing a computer system having a data storage system, to locate desired data from a plurality of data items stored in the data storage system, the method comprising the steps of:
-
(a) extracting, by the local computer system, a user profile from user linguistic data previously provided by the user, said user data profile being representative of a first linguistic pattern of the said user linguistic data;
(b) constructing, by the computer system, a plurality of data item profiles, each plural data item profile corresponding to a different one of each plural data item stored in the data storage system, each of said plural data item profiles being representative of a second linguistic pattern of a corresponding plural data item, each said plural second linguistic pattern being substantially unique to each corresponding plural data item;
(c) providing, by the user to the computer system, search request data representative of the user's expressed desire to locate data substantially pertaining to said search request data;
(d) extracting, by the computer system, a search request profile from said search request data, said search request profile being representative of a third linguistic pattern of said search request data;
(e) determining, by the computer system, a first similarity factor representative of a first correlation between said search request profile and said user profile by comparing said search request profile to said user profile;
(f) determining, by the computer system, a plurality of second similarity factors, each said plural second similarity factor being representative of a second correlation between said search request profile and a different one of said plural data item profiles, by comparing said search request profile to each of said plural data item profiles;
(g) calculating, by the computer system, a final match factor for each of said plural data item profiles, by adding said first similarity factor to at least one of said plural second similarity factors in accordance with at least one intersection between said first correlation and said second correlation;
(h) selecting, by the computer system, one of said plural data items corresponding to a plural data item profile having a highest final match factor; and
(i) retrieving, by the computer system from the data storage system, said selected data item for display to the user, such that the user is presented with a data item having linguistic characteristics that substantially correspond to linguistic characteristics of the linguistic data generated by the user, whereby the linguistic characteristics of the data item correspond to the user's social, cultural, educational, economic background as well as to the user's psychological profile.
-
-
45. A data processing method for generating a user data profile representative of a user's social, cultural, educational, economic background and of the user's psychological profile, the method being implemented in a computer system having a storage system, comprising the steps of:
-
(a) retrieving, by the computer system, user linguistic data previously provided by the user, said user linguistic data comprising at least one text item, each said at least one text item comprising at least one sentence;
(b) generating, by the computer system, an empty user data profile;
(c) retrieving, by the computer system, a text item from said user linguistic data;
(d) separating, by the computer system, said text item into at least one sentence;
(e) extracting, from each of said at least one sentence, by the computer system, at least one segment representative of a linguistic pattern of each sentence of said at least one sentence;
(f) adding, by the computer system, at least one segment extracted at said step (e) to said user data profile;
(g) repeating, by the computer system, said steps (c) to (f) for each text item of said at least one text item in said user linguistic data;
(h) generating at least one user segment group, by the computer system, by grouping together identical segments of said at least one segment;
(i) determining a user segment count, by the computer system, for each user segment group of said at least one user segment group, each said user segment count being representative of a number of identical segments in the corresponding user segment group of said at least one user segment group, and linking each said user segment count to the corresponding user segment group of said at least one user segment group;
(j) sorting the user segment groups of said at least one user segment group, by the computer system, in a descending order of user segment counts starting from a user segment group having a highest user segment count, and recording said user segment groups and corresponding user segment counts in said user data profile; and
(k) storing, by the computer system, said user data profile, representative of an overall linguistic pattern of the user, in the data storage system, said overall linguistic pattern substantially corresponding to the user's social, cultural, educational, economic background and to the user's psychological profile.
46. The method of claim 45, further comprising the step of:
(l) prior to said step (a), automatically adding, by the computer system, textual data generated by the user during utilization of the computer system to said user linguistic data.
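Steps (b) through (j) of claim 45 amount to collecting segments from every sentence of every text item, grouping identical segments, counting each group, and sorting the groups by count. A minimal Python sketch follows. It assumes segments are represented as tuples of words and that sentence splitting and segment extraction are supplied by the caller; `build_user_profile`, `split_sentences`, and `extract_segments` are hypothetical names, not terms from the claims.

```python
from collections import Counter
from typing import Callable, Iterable

Segment = tuple[str, ...]


def build_user_profile(
    text_items: Iterable[str],
    split_sentences: Callable[[str], list],
    extract_segments: Callable[[object], list[Segment]],
) -> list[tuple[Segment, int]]:
    """Return (segment group, count) pairs sorted by descending count."""
    segments: list[Segment] = []                    # step (b): empty profile
    for text in text_items:                         # steps (c) and (g)
        for sentence in split_sentences(text):      # step (d)
            # Steps (e) and (f): extract segments and add them to the profile.
            segments.extend(extract_segments(sentence))
    # Steps (h) and (i): group identical segments and count each group.
    counts = Counter(segments)
    # Step (j): descending order of count, highest first.
    return counts.most_common()
```

Step (k), storing the profile in the data storage system, is left out because the claim does not specify a storage format.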
-
-
47. The method of claim 45, wherein said user linguistic data comprises at least one of:
- personal textual data generated by the user and favorite textual data generated by a source other than the user and that the user has adopted as being favorite.
-
48. The method of claim 47, further comprising the step of:
(m) prior to said step (a), selecting, by the user, at least one of said personal textual data and said favorite textual data, from textual data stored in the data storage system.
-
49. The method of claim 45, wherein said step (d) comprises the step of:
-
(n) determining a word count by sequentially counting words of said text item;
(o) when an end of sentence mark is reached before said word count reaches a predefined word limit, storing said counted words as a sentence, restarting said word count, and repeating said step (n) starting after a last word of said stored sentence; and
(p) when said word count reaches said predefined word limit, storing said counted words as a sentence, restarting said word count, and repeating said step (n) starting after a last word of said stored sentence.
-
-
50. The method of claim 49, wherein said end of sentence mark comprises one of:
- a period, an exclamation mark, and a question mark.
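The sentence separation of claims 49 and 50 can be sketched as a single pass over the words of a text item: a sentence is closed either when a word carries an end of sentence mark or when the word count reaches the limit. In the Python sketch below, the default limit of 30 words is an assumed value (the claims say only "predefined word limit"), whitespace tokenization is an assumption, and keeping trailing words that lack an end mark is also an assumption. The name `split_sentences` is hypothetical.

```python
END_MARKS = (".", "!", "?")  # end of sentence marks listed in claim 50


def split_sentences(text: str, word_limit: int = 30) -> list[list[str]]:
    """Split a text item into sentences, each a list of words."""
    sentences: list[list[str]] = []
    current: list[str] = []
    for word in text.split():          # step (n): count words sequentially
        current.append(word)
        # Step (o): end mark reached before the limit.
        # Step (p): the word count reached the limit.
        if word.endswith(END_MARKS) or len(current) == word_limit:
            sentences.append(current)
            current = []               # restart the word count
    if current:
        # Assumption: leftover words without an end mark form a final sentence.
        sentences.append(current)
    return sentences
```

For example, with `word_limit=3` the text "One two three four five. Six seven!" yields three sentences: the first closed by the word limit, the other two by end marks.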
-
51. The method of claim 45, wherein said step (e) comprises the steps, performed for each sentence of said at least one sentence, of:
-
(q) identifying and tagging each word in a sentence as one of a predetermined plurality of different parts of speech; and
(r) arranging a predetermined number of said tagged words in a predetermined order of said predetermined plural different parts of speech to compose at least one segment for each possible combination of said predetermined number of said tagged words arranged in said predetermined order, said at least one segment being representative of a linguistic pattern of said sentence.
-
-
52. The method of claim 51, further comprising the step of:
(s) after said step (q), determining whether each word may serve as an additional part of speech, and when a word may serve as an additional part of speech, adding an additional tag to said word to identify said word as said additional part of speech.
-
53. The method of claim 51, wherein said predetermined plurality of different parts of speech comprises at least one of:
- noun, pronoun, verb, adverb, adjective, gerund, preposition, conjunction and interjection.
-
54. The method of claim 51, wherein said predetermined plurality of different parts of speech comprises a noun, a verb and an adjective, wherein said predetermined number is three, and wherein said predetermined order is noun, verb, adjective.
-
55. The method of claim 51, wherein said step (r) further comprises the step of:
(t) when one of said predetermined plural different parts of speech is missing from said sentence, inserting a blank mark into said segment instead of said missing predetermined part of speech.
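Claims 51, 54, and 55 together describe composing one segment for each possible combination of tagged words arranged in the order noun, verb, adjective, with a blank mark standing in for a missing part of speech. The Python sketch below illustrates that combination step. The claims do not say how words are tagged, so the tiny `LEXICON` lookup is a hypothetical stand-in for a real part-of-speech tagger, and the `"_"` blank mark and the name `extract_segments` are likewise assumptions.

```python
from itertools import product

ORDER = ("noun", "verb", "adjective")   # predetermined order from claim 54
BLANK = "_"                             # assumed blank mark for claim 55

# Hypothetical toy lexicon; a real system would use a part-of-speech tagger.
LEXICON = {
    "dog": {"noun"},
    "cat": {"noun"},
    "run": {"verb"},
    "runs": {"verb"},
    "quick": {"adjective"},
}


def extract_segments(words: list[str]) -> list[tuple[str, ...]]:
    """Compose (noun, verb, adjective) segments for one sentence."""
    # Step (q): tag each word; unknown words receive no tag.
    normalized = [w.lower().strip(".!?") for w in words]
    slots = []
    for pos in ORDER:
        candidates = [w for w in normalized if pos in LEXICON.get(w, set())]
        # Step (t): a missing part of speech is replaced by a blank mark.
        slots.append(candidates or [BLANK])
    # Step (r): one segment per possible combination, in the fixed order.
    return list(product(*slots))
```

Because each lexicon entry is a set of tags, a word that can serve as more than one part of speech, as in claim 52, could be modeled by listing several tags for it.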
-
56. The method of claim 45, wherein said step (k) further comprises the step of:
(u) encrypting said user data profile such that said encrypted user data profile may only be utilized when an authorization is received from the user.
-
57. The method of claim 45, wherein said step (j) further comprises the step of:
(v) recording, in said user data profile, only a first predetermined portion of said at least one user segment groups having highest user segment counts.
-
58. The method of claim 57, wherein said first predetermined portion comprises one of:
- 5,000 user segment groups and a top five percent of said at least one user segment groups.
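The truncation in claims 57 and 58 keeps only the leading part of the already sorted list of user segment groups. A minimal Python sketch of both options follows. It assumes the input is a list sorted in descending order of count, and that a fractional result for the percentage option is rounded up (the claims do not address rounding). The names `keep_top_count` and `keep_top_percent` are hypothetical.

```python
def keep_top_count(sorted_groups: list, limit: int = 5000) -> list:
    """Keep at most `limit` groups with the highest counts."""
    return sorted_groups[:limit]


def keep_top_percent(sorted_groups: list, percent: int = 5) -> list:
    """Keep the top `percent` percent of groups, rounding up (assumption)."""
    # Integer ceiling division avoids floating-point rounding surprises.
    keep = (len(sorted_groups) * percent + 99) // 100
    return sorted_groups[:keep]
```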
-
59. A data processing system, comprising a local computer system having a local data storage system, and a remote computer system having a remote data storage, the remote computer system being linked to the local computer system by a telecommunication link, for enabling a user of the local computer system to locate desired data from a plurality of data items stored in the remote data storage system, the data processing system comprising:
-
first extracting means, in one of the local computer system and the remote computer system, for extracting a user profile from user linguistic data previously provided by the user, said user data profile being representative of a first linguistic pattern of said user linguistic data;
first control means, in one of the local computer system and the remote computer system, for constructing a plurality of data item profiles, each plural data item profile corresponding to a different one of each plural data item stored in the remote data storage system, each of said plural data item profiles being representative of a second linguistic pattern of a corresponding plural data item, each said plural second linguistic pattern being substantially unique to each corresponding plural data item;
first input means, in the local computer system, for acquiring search request data from the user, said search request data being representative of the user's expressed desire to locate data in the remote storage system substantially pertaining to said search request data;
second extracting means, in one of the local computer system and the remote computer system, connected to said first input means, for extracting a search request profile from said acquired search request data, said search request profile being representative of a third linguistic pattern of said search request data;
second control means, in one of the local computer system and the remote computer system, connected to said first extracting means and said second extracting means, for determining a first similarity factor representative of a first correlation between said search request profile and said user profile by comparing said search request profile to said user profile;
third control means, in one of the local computer system and the remote computer system, connected to said first control means and said second extracting means, for determining a plurality of second similarity factors, each said plural second similarity factor being representative of a second correlation between said search request profile and a different one of said plural data item profiles, by comparing said search request profile to each of said plural data item profiles;
fourth control means, in one of the local computer system and the remote computer system, connected to said second and said third control means, for calculating a final match factor for each of said plural data item profiles, by adding said first similarity factor to at least one of said plural second similarity factors in accordance with at least one intersection between said first correlation and said second correlation;
first selection means, in one of the local computer system and the remote computer system, connected to said fourth control means, for selecting one of said plural data items corresponding to a plural data item profile having a highest final match factor; and
first retrieving means, in one of the local computer system and the remote computer system, connected to said first selection means, for retrieving, from the remote data storage system, said selected data item for display to the user, such that the user is presented with a data item having linguistic characteristics that substantially correspond to linguistic characteristics of the linguistic data generated by the user, whereby the linguistic characteristics of the data item correspond to the user's social, cultural, educational, economic background as well as to the user's psychological profile.
-
-
60. A data processing system, comprising a computer system having a data storage system, for enabling a user of the computer system to locate desired data from a plurality of data items stored in the data storage system, the data processing system comprising:
-
first extracting means for extracting a user profile from user linguistic data previously provided by the user, said user data profile being representative of a first linguistic pattern of said user linguistic data;
first control means for constructing a plurality of data item profiles, each plural data item profile corresponding to a different one of each plural data item stored in the data storage system, each of said plural data item profiles being representative of a second linguistic pattern of a corresponding plural data item, each said plural second linguistic pattern being substantially unique to each corresponding plural data item;
first input means for acquiring search request data from the user, said search request data being representative of the user's expressed desire to locate data in the storage system substantially pertaining to said search request data;
second extracting means, connected to said first input means, for extracting a search request profile from said acquired search request data, said search request profile being representative of a third linguistic pattern of said search request data;
second control means, connected to said first extracting means and said second extracting means, for determining a first similarity factor representative of a first correlation between said search request profile and said user profile by comparing said search request profile to said user profile;
third control means, connected to said first control means and said second extracting means, for determining a plurality of second similarity factors, each said plural second similarity factor being representative of a second correlation between said search request profile and a different one of said plural data item profiles, by comparing said search request profile to each of said plural data item profiles;
fourth control means, connected to said second and said third control means, for calculating a final match factor for each of said plural data item profiles, by adding said first similarity factor to at least one of said plural second similarity factors in accordance with at least one intersection between said first correlation and said second correlation;
first selection means, connected to said fourth control means, for selecting one of said plural data items corresponding to a plural data item profile having a highest final match factor; and
first retrieving means, connected to said first selection means, for retrieving, from the data storage system, said selected data item for display to the user, such that the user is presented with a data item having linguistic characteristics that substantially correspond to linguistic characteristics of the linguistic data generated by the user, whereby the linguistic characteristics of the data item correspond to the user's social, cultural, educational, economic background as well as to the user's psychological profile.
-
Specification