Identification of users for advertising using data with missing values
First Claim
1. A method for identifying users for advertising purposes, said method comprising:
identifying N users and M attributes, wherein N≧4 and M≧2;
identifying a first network of first web sites of the Internet accessed by N users, said access to the first web sites provided to the N users by a first at least one Internet Service Provider (ISP);
receiving, by a processor of a computer system from the first at least one ISP, first data comprising content of the first web sites and time data pertaining to when each user of the N users accessed the first web sites;
said processor analyzing the first data, said analyzing comprising determining first attribute values comprising a first value of each attribute of the N attributes for each user, wherein the first value of each attribute is indicative of a level of interest in each attribute by each user, said analyzing based on an amount of time spent by each user at each website of the first web sites in relation to the content of each website of the first web sites, wherein Vn,m,1 denotes the determined first attribute value for attribute m of user n for n=1, 2, . . . , N and m=1, 2, . . . , M;
identifying a second network of second web sites of the Internet accessed by the plurality users, said access to the second web sites provided to the N users by a second at least one ISP;
said processor determining, from questionnaires completed by the N users, second attribute values comprising a second value of each attribute of the plurality of attributes for each user, wherein the second value is indicative of a level of interest in each attribute by each user, wherein attribute values of the second attribute values have been indicated on the questionnaire by the users of the N users for each attribute to which a keyword relevant to the second web sites has been mapped, and wherein second data pertains to said access of the second web sites by the N users, and wherein Vn,m,2 denotes the determined second attribute value for attribute m of user n for n=1, 2, . . . , N and m=1, 2, . . . , M;
said processor determining third attribute values that comprise a third value of each attribute of the plurality of attributes for each user, by combining the first attribute values for each user with the second attribute values for each user;
said processor processing the third attribute values, comprising determining from the third attribute values an identification of a subset of the N users to whom advertising of a product or service may be directed;
communicating the identification of the subset of the N users to a provider of the product or service;
wherein for a function F(Vn,m,k) of Vn,m,k for n=1, 2, . . . N and m=1, 2, . . . , M and k=1, 2, said determining third attribute values Vn,m,3 comprises computing Vn,m,3 according to Vn,m,3=Wm,1*F(Vn,m,1)+Wm,2*F(Vn,m,2) such that Wm,k is a weight that acts as a multiplier on F(Vn,m,k) for k=1, 2;
wherein said determining Vn,m,k for k=1, k=2, or k=3 comprises determining Vn,m,k for all user-attribute pairs (n,m) of user n and attribute m for n=1, 2, . . . N and m=1, 2, . . . , M except for a single user-attribute pair (n1,m1) such that n1 is 1, 2, . . . , or N and m1 is 1, 2, . . . , or M, followed by performing the steps of;
selecting attribute m2 of the M attributes subject to m2≠m1, said selecting m2 comprising determining that Vn,m1,k is linearly correlated with Vn,m2,k for a user class consisting of NUC users, wherein the NUC users initially consist of the N users minus user n1 such that NUC=N−1;
after said selecting attribute m2, performing a linear regression to determine a regression equation expressing Vn,m1,k as a linear function of Vn,m2,k for the user class; and
computing Vn1,m1,1 via the regression equation.
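The closing steps of claim 1 (select a correlated attribute m2, fit a regression over the user class, compute the missing value) can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `impute_missing`, the use of the Pearson coefficient, the "strongest correlation wins" selection rule, and the `min_abs_r` threshold are all assumptions made for illustration, since the claim only requires that the two attributes be linearly correlated.

```python
import numpy as np


def impute_missing(values, n1, m1, min_abs_r=0.8):
    """Fill the single missing entry values[n1, m1] by correlation and regression.

    values: N x M array of attribute values for one k; values[n1, m1] is NaN.
    min_abs_r: hypothetical cut-off for "linearly correlated" (an assumption).
    """
    values = np.asarray(values, dtype=float)
    n_users, n_attrs = values.shape
    # User class: initially the N users minus user n1, so NUC = N - 1.
    user_class = [n for n in range(n_users) if n != n1]
    y = values[user_class, m1]

    # Select m2 != m1 as the attribute most strongly correlated with m1.
    best_m2, best_r = None, 0.0
    for m in range(n_attrs):
        if m == m1:
            continue
        r = np.corrcoef(values[user_class, m], y)[0, 1]
        if abs(r) > abs(best_r):
            best_m2, best_r = m, r
    if best_m2 is None or abs(best_r) < min_abs_r:
        raise ValueError("no attribute is sufficiently correlated with m1")

    # Regression equation: V[n, m1] = slope * V[n, m2] + intercept over the class.
    slope, intercept = np.polyfit(values[user_class, best_m2], y, 1)
    filled = values.copy()
    filled[n1, m1] = slope * values[n1, best_m2] + intercept
    return filled, best_m2
```

With N≧4 the user class always holds at least three users, so the one-variable fit has more data points than parameters.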
Abstract
A method and system for identifying users for advertising. Users, attributes, and first web sites that the users access through Internet Service Providers (ISPs) are identified. First data, including the content of the first web sites and the time users spent on them, are received from the ISPs and analyzed to determine first attribute values indicative of user interest. Second data received from the ISPs include the content of second web sites and the time users spent on them. Second attribute values, derived from questionnaires completed by the users, indicate each user's interest in each attribute. Third attribute values are determined by combining the first and second attribute values. The third attribute values are processed to identify users to whom a product or service may be advertised. The identified users are communicated to a service provider or product provider. The first, second, or third attribute values may have missing values, which are determined by correlation and linear regression.
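The combining step summarized in the abstract is stated in claim 1 as Vn,m,3=Wm,1*F(Vn,m,1)+Wm,2*F(Vn,m,2). The Python sketch below shows one way that formula and the later user-selection step could look. The names `combine` and `select_users`, the identity default for F, and the threshold rule for choosing the subset are assumptions for illustration; the source fixes neither F nor the selection criterion.

```python
import numpy as np


def combine(v1, v2, w1, w2, f=lambda v: v):
    """Compute V3[n, m] = W[m, 1] * F(V1[n, m]) + W[m, 2] * F(V2[n, m]).

    v1, v2: N x M arrays of first and second attribute values.
    w1, w2: length-M weights, one per attribute, for k=1 and k=2.
    f: the function F; the identity default is an assumption.
    """
    v1 = np.asarray(v1, dtype=float)
    v2 = np.asarray(v2, dtype=float)
    # The weights broadcast across users, so each attribute m gets its own W[m, k].
    return np.asarray(w1) * f(v1) + np.asarray(w2) * f(v2)


def select_users(v3, attribute, threshold):
    """Return indices of users whose combined value meets a threshold (hypothetical rule)."""
    return [n for n, value in enumerate(v3[:, attribute]) if value >= threshold]
```

Because the weights are indexed by attribute, a questionnaire answer can count for more than browsing time on one attribute and for less on another.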
20 Claims
18. A computer system comprising a processor and a computer readable memory unit coupled to the processor, said memory unit containing instructions that when executed by the processor perform a method for identifying users for advertising purposes, said method comprising:
identifying N users and M attributes, wherein N≧4 and M≧2;
identifying a first network of first web sites of the Internet accessed by N users, said access to the first web sites provided to the N users by a first at least one Internet Service Provider (ISP);
receiving, from the first at least one ISP, first data comprising content of the first web sites and time data pertaining to when each user of the N users accessed the first web sites;
analyzing the first data, said analyzing comprising determining first attribute values comprising a first value of each attribute of the N attributes for each user, wherein the first value of each attribute is indicative of a level of interest in each attribute by each user, said analyzing based on an amount of time spent by each user at each website of the first web sites in relation to the content of each website of the first web sites, wherein Vn,m,1 denotes the determined first attribute value for attribute m of user n for n=1, 2, . . . , N and m=1, 2, . . . , M;
identifying a second network of second web sites of the Internet accessed by the plurality users, said access to the second web sites provided to the N users by a second at least one ISP;
determining, from questionnaires completed by the N users, second attribute values comprising a second value of each attribute of the plurality of attributes for each user, wherein the second value is indicative of a level of interest in each attribute by each user, wherein attribute values of the second attribute values have been indicated on the questionnaire by the users of the N users for each attribute to which a keyword relevant to the second web sites has been mapped, and wherein second data pertains to said access of the second web sites by the N users, and wherein Vn,m,2 denotes the determined second attribute value for attribute m of user n for n=1, 2, . . . , N and m=1, 2, . . . , M;
determining third attribute values that comprise a third value of each attribute of the plurality of attributes for each user, by combining the first attribute values for each user with the second attribute values for each user;
processing the third attribute values, comprising determining from the third attribute values an identification of a subset of the N users to whom advertising of a product or service may be directed;
communicating the identification of the subset of the N users to a provider of the product or service;
wherein for a function F(Vn,m,k) of Vn,m,k for n=1, 2, . . . N and m=1, 2, . . . , M and k=1, 2, said determining third attribute values Vn,m,3 comprises computing Vn,m,3 according to Vn,m,3=Wm,1*F(Vn,m,1)+Wm,2*F(Vn,m,2) such that Wm,k is a weight that acts as a multiplier on F(Vn,m,k) for k=1, 2;
wherein said determining Vn,m,k for k=1, k=2, or k=3 comprises determining Vn,m,k for all user-attribute pairs (n,m) of user n and attribute m for n=1, 2, . . . N and m=1, 2, . . . , M except for a single user-attribute pair (n1,m1) such that n1 is 1, 2, . . . , or N and m1 is 1, 2, . . . , or M, followed by performing the steps of;
selecting attribute m2 of the M attributes subject to m2≠m1, said selecting m2 comprising determining that Vn,m1,k is linearly correlated with Vn,m2,k for a user class consisting of NUC users, wherein the NUC users initially consist of the N users minus user n1 such that NUC=N−1;
after said selecting attribute m2, performing a linear regression to determine a regression equation expressing Vn,m1,k as a linear function of Vn,m2,k for the user class; and
computing Vn1,m1,1 via the regression equation.
19. A computer program product, comprising a computer readable storage device having a computer readable program code embodied therein, said computer readable program code containing instructions that when executed by a processor of a computer system perform a method for identifying users for advertising purposes, said method comprising:
identifying N users and M attributes, wherein N≧4 and M≧2;
identifying a first network of first web sites of the Internet accessed by N users, said access to the first web sites provided to the N users by a first at least one Internet Service Provider (ISP);
receiving, from the first at least one ISP, first data comprising content of the first web sites and time data pertaining to when each user of the N users accessed the first web sites;
analyzing the first data, said analyzing comprising determining first attribute values comprising a first value of each attribute of the N attributes for each user, wherein the first value of each attribute is indicative of a level of interest in each attribute by each user, said analyzing based on an amount of time spent by each user at each website of the first web sites in relation to the content of each website of the first web sites, wherein Vn,m,1 denotes the determined first attribute value for attribute m of user n for n=1, 2, . . . , N and m=1, 2, . . . , M;
identifying a second network of second web sites of the Internet accessed by the plurality users, said access to the second web sites provided to the N users by a second at least one ISP;
determining, from questionnaires completed by the N users, second attribute values comprising a second value of each attribute of the plurality of attributes for each user, wherein the second value is indicative of a level of interest in each attribute by each user, wherein attribute values of the second attribute values have been indicated on the questionnaire by the users of the N users for each attribute to which a keyword relevant to the second web sites has been mapped, and wherein second data pertains to said access of the second web sites by the N users, and wherein Vn,m,2 denotes the determined second attribute value for attribute m of user n for n=1, 2, . . . , N and m=1, 2, . . . , M;
determining third attribute values that comprise a third value of each attribute of the plurality of attributes for each user, by combining the first attribute values for each user with the second attribute values for each user;
processing the third attribute values, comprising determining from the third attribute values an identification of a subset of the N users to whom advertising of a product or service may be directed;
communicating the identification of the subset of the N users to a provider of the product or service;
wherein for a function F(Vn,m,k) of Vn,m,k for n=1, 2, . . . N and m=1, 2, . . . , M and k=1, 2, said determining third attribute values Vn,m,3 comprises computing Vn,m,3 according to Vn,m,3=Wm,1*F(Vn,m,1)+Wm,2*F(Vn,m,2) such that Wm,k is a weight that acts as a multiplier on F(Vn,m,k) for k=1, 2;
wherein said determining Vn,m,k for k=1, k=2, or k=3 comprises determining Vn,m,k for all user-attribute pairs (n,m) of user n and attribute m for n=1, 2, . . . N and m=1, 2, . . . , M except for a single user-attribute pair (n1,m1) such that n1 is 1, 2, . . . , or N and m1 is 1, 2, . . . , or M, followed by performing the steps of;
selecting attribute m2 of the M attributes subject to m2≠m1, said selecting m2 comprising determining that Vn,m1,k is linearly correlated with Vn,m2,k for a user class consisting of NUC users, wherein the NUC users initially consist of the N users minus user n1 such that NUC=N−1;
after said selecting attribute m2, performing a linear regression to determine a regression equation expressing Vn,m1,k as a linear function of Vn,m2,k for the user class; and
computing Vn1,m1,1 via the regression equation.
20. A process for supporting computer infrastructure, said process comprising providing at least one support service for at least one of creating, integrating, hosting, maintaining, and deploying computer-readable code in a computing system that comprises a processor and a computer readable storage device storing the code, wherein the code in combination with the computing system is configured to perform, via execution of the code by the processor, a method for identifying users for advertising purposes, said method comprising:
identifying N users and M attributes, wherein N≧4 and M≧2;
identifying a first network of first web sites of the Internet accessed by N users, said access to the first web sites provided to the N users by a first at least one Internet Service Provider (ISP);
receiving, by said processor from the first at least one ISP, first data comprising content of the first web sites and time data pertaining to when each user of the N users accessed the first web sites;
said processor analyzing the first data, said analyzing comprising determining first attribute values comprising a first value of each attribute of the N attributes for each user, wherein the first value of each attribute is indicative of a level of interest in each attribute by each user, said analyzing based on an amount of time spent by each user at each website of the first web sites in relation to the content of each website of the first web sites, wherein Vn,m,1 denotes the determined first attribute value for attribute m of user n for n=1, 2, . . . , N and m=1, 2, . . . , M;
identifying a second network of second web sites of the Internet accessed by the plurality users, said access to the second web sites provided to the N users by a second at least one ISP;
said processor determining, from questionnaires completed by the N users, second attribute values comprising a second value of each attribute of the plurality of attributes for each user, wherein the second value is indicative of a level of interest in each attribute by each user, wherein attribute values of the second attribute values have been indicated on the questionnaire by the users of the N users for each attribute to which a keyword relevant to the second web sites has been mapped, and wherein second data pertains to said access of the second web sites by the N users, and wherein Vn,m,2 denotes the determined second attribute value for attribute m of user n for n=1, 2, . . . , N and m=1, 2, . . . , M;
said processor determining third attribute values that comprise a third value of each attribute of the plurality of attributes for each user, by combining the first attribute values for each user with the second attribute values for each user;
said processor processing the third attribute values, comprising determining from the third attribute values an identification of a subset of the N users to whom advertising of a product or service may be directed;
communicating the identification of the subset of the N users to a provider of the product or service;
wherein for a function F(Vn,m,k) of Vn,m,k for n=1, 2, . . . N and m=1, 2, . . . , M and k=1, 2, said determining third attribute values Vn,m,3 comprises computing Vn,m,3 according to Vn,m,3=Wm,1*F(Vn,m,1)+Wm,2*F(Vn,m,2) such that Wm,k is a weight that acts as a multiplier on F(Vn,m,k) for k=1, 2;
wherein said determining Vn,m,k for k=1, k=2, or k=3 comprises determining Vn,m,k for all user-attribute pairs (n,m) of user n and attribute m for n=1, 2, . . . N and m=1, 2, . . . , M except for a single user-attribute pair (n1,m1) such that n1 is 1, 2, . . . , or N and m1 is 1, 2, . . . , or M, followed by performing the steps of;
selecting attribute m2 of the M attributes subject to m2≠m1, said selecting m2 comprising determining that Vn,m1,k is linearly correlated with Vn,m2,k for a user class consisting of NUC users, wherein the NUC users initially consist of the N users minus user n1 such that NUC=N−1;
after said selecting attribute m2, performing a linear regression to determine a regression equation expressing Vn,m1,k as a linear function of Vn,m2,k for the user class; and
computing Vn1,m1,1 via the regression equation.
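Each independent claim derives the first attribute values from the amount of time a user spends at each web site in relation to that site's content. The claims do not give a formula for this, so the Python sketch below is only one plausible reading: it assumes each site's content has already been mapped to a set of attributes, and it scores a user's interest in an attribute as the share of that user's total time spent on sites related to it. The function name `first_attribute_values`, the record layout, and the example site and attribute names are hypothetical.

```python
from collections import defaultdict


def first_attribute_values(visits, site_attributes):
    """Estimate per-user interest in each attribute from time spent on sites.

    visits: iterable of (user, site, seconds) records.
    site_attributes: maps a site to the attributes its content relates to
        (assumed to have been derived from the site content beforehand).
    Returns {user: {attribute: share of that user's total time}}.
    """
    time_per_attr = defaultdict(lambda: defaultdict(float))
    total_time = defaultdict(float)
    for user, site, seconds in visits:
        total_time[user] += seconds
        # Credit the full visit time to every attribute the site relates to.
        for attribute in site_attributes.get(site, ()):
            time_per_attr[user][attribute] += seconds
    return {
        user: {a: t / total_time[user] for a, t in attrs.items()}
        for user, attrs in time_per_attr.items()
        if total_time[user] > 0
    }


# Example with made-up data: user "u1" spends 30 s and 90 s on two sites.
visits = [("u1", "a.example", 30), ("u1", "b.example", 90), ("u2", "a.example", 60)]
site_attributes = {"a.example": ["sports"], "b.example": ["travel", "sports"]}
scores = first_attribute_values(visits, site_attributes)
```

Normalizing by each user's total time keeps heavy and light browsers on the same 0 to 1 scale; an absolute-time score would be an equally valid reading of the claim language.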
Specification