Automatic browsing path generation to present image areas with high attention value as a function of space and time
Abstract
Systems and methods for automatic generation of a browsing path across image content to present areas with high attention value are described. In particular, an image is modeled via multiple visual attentions to create a respective set of attention objects for each modeled attention. The attention objects and their respective attributes are analyzed to generate a browsing path to select ones of the attention objects. The browsing path is generated to optimize the rate of information gain from the attention objects as a function of information unit cost in terms of time constraints associated with multiple image browsing modes.
32 Claims
-
1. A method comprising:
-
modeling an image with respect to multiple visual attentions to generate a respective set of attention objects (AOs) for each attention of the visual attentions;
analyzing the attention objects and corresponding attributes to optimize a rate of information gain as a function of information unit cost in terms of time associated with multiple image browsing modes; and
responsive to analyzing the attention objects, generating a browsing path to select ones of the attention objects, wherein generating the browsing path further comprises creating the browsing path in view of a perusing image-browsing mode as follows:
  splitting one or more large AOs of the AOs into smaller AOs;
  combining AOs in close proximity to one another into one or more attention groups;
  arranging the attention groups in decreasing order based on respective attention values;
  for each attention group of the attention groups:
    selecting the attention group as a starting point;
    for each path of all possible paths from the starting point:
      calculating a total browsing time and an information fidelity; and
      if the information fidelity is smaller than a browsing time threshold, discarding the path;
  selecting a non-discarded path having a smallest browsing time as the browsing path, the browsing path connecting each of the attention groups.
Dependent claims: 2 through 11.
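The combining and arranging steps of claim 1 can be illustrated with a minimal sketch. The claim does not say how proximity is measured or how a group's attention value is derived, so the `AO` fields, the `max_dist` threshold, and the use of the summed member attention values as the group value are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class AO:
    """Hypothetical attention object: region centre and attention value."""
    x: float
    y: float
    av: float


def group_by_proximity(aos, max_dist):
    """Merge AOs into groups; an AO joins (and fuses) every group that has
    a member within max_dist of it (single-link grouping, an assumption)."""
    groups = []
    for ao in aos:
        near, far = [], []
        for g in groups:
            close = any(hypot(ao.x - o.x, ao.y - o.y) <= max_dist for o in g)
            (near if close else far).append(g)
        merged = [ao] + [o for g in near for o in g]
        groups = far + [merged]
    return groups


def order_groups(groups):
    """Arrange groups in decreasing order of attention value, taking the
    group value to be the sum of its members' values (an assumption)."""
    return sorted(groups, key=lambda g: sum(o.av for o in g), reverse=True)


aos = [AO(0, 0, 0.2), AO(5, 0, 0.1), AO(100, 100, 0.5), AO(3, 4, 0.1)]
ordered = order_groups(group_by_proximity(aos, max_dist=10))
```

The splitting of large AOs is omitted here because the claim gives no size criterion for it.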
-
-
12. A computer-readable medium comprising computer-program instructions executable by a processor for:
-
modeling an image with respect to multiple visual attentions to generate a respective set of attention objects (AOs) for each attention of the visual attentions, the AOs representing respective regions of the image;
analyzing the attention objects and corresponding attributes in view of a model for human browsing behavior, the model comprising fixation and shifting states, in the fixation state an interesting region of the regions is exploited for information, in the shifting state one region of the regions is replaced with another region of the regions as a function of image view manipulation operations comprising scrolling or tabbing operations, and wherein the corresponding attributes for each attention object of the AOs comprise a minimal perceptible time (MPT) for display of subject matter associated with the attention object; and
responsive to the analyzing, optimizing a rate of information gain in terms of space as a function of information unit cost in terms of time associated with the model for human browsing behavior to generate a browsing path to select ones of the attention objects;
wherein the computer-program instructions for optimizing further comprise instructions for determining the rate of information gain R as follows:
$$R = \frac{G}{T_B + T_W}$$
wherein G is a total net amount of objectively valuable information gained as determined via information fidelity determinations, $T_B$ is a total amount of time spent on shifting between subsequent fixation areas (AOs), and $T_W$ represents an exploiting cost, which is a total duration of the MPTs used while in a fixation state.
Dependent claims: 13 through 21.
wherein Pi represents an ith path segment, SPi represents a starting point of Pi, EPi corresponds to an ending point of Pi, SRi is a starting resolution of Pi, ERi is an ending resolution of Pi, and Ti is a time cost for scrolling from SPi to EPi.
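As a rough illustration of how the quantities named in claim 12 and the path-segment tuple fit together, the sketch below assumes the rate takes the form R = G / (TB + TW), which is how the definitions of G, TB and TW read. The `PathSegment` class, its field names, and `information_gain_rate` are hypothetical names, not part of the claims.

```python
from dataclasses import dataclass


@dataclass
class PathSegment:
    """One path segment Pi, with hypothetical field names."""
    sp: tuple   # SPi: starting point (x, y)
    ep: tuple   # EPi: ending point (x, y)
    sr: float   # SRi: starting resolution
    er: float   # ERi: ending resolution
    t: float    # Ti: time cost of scrolling from SPi to EPi, in seconds


def information_gain_rate(gain, segments, mpts):
    """Return R = G / (TB + TW) under the assumed form of the rate.

    gain     -- G, total information gained (e.g. summed fidelity)
    segments -- path segments; their Ti values sum to the shifting time TB
    mpts     -- minimal perceptible times spent in fixation; they sum to TW
    """
    t_b = sum(seg.t for seg in segments)
    t_w = sum(mpts)
    total = t_b + t_w
    if total <= 0:
        raise ValueError("total browsing time must be positive")
    return gain / total


segments = [
    PathSegment((0, 0), (10, 0), 1.0, 1.0, 1.0),
    PathSegment((10, 0), (10, 20), 1.0, 0.5, 2.0),
]
rate = information_gain_rate(0.8, segments, mpts=[0.5, 0.5, 1.0])
```

With 3 s of shifting and 2 s of fixation, a gain of 0.8 yields a rate of 0.16 per second under these assumptions.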
-
-
16. A computer-readable medium as recited in claim 12, wherein the computer-program instructions for generating the browsing path further comprise instructions for calculating an information fidelity for each AO, the information fidelity being a function of an attention value (AV) and the minimal perceptible time (MPT) for display of subject matter associated with the AO.
-
17. A computer-readable medium as recited in claim 12, wherein each AO is a respective information block, and wherein the computer-program instructions further comprise instructions for:
-
representing the image (I) as a set of M×N evenly distributed information blocks $I_{ij}$ as follows:
$$I = \{I_{ij}\} = \{(AV_{ij}, r_{ij})\}, \quad 1 \le i \le M,\ 1 \le j \le N,\ r_{ij} \in (0, 1),$$
wherein (i, j) corresponds to a location at which the information block $I_{ij}$ is modeled according to a visual attention, $AV_{ij}$ is a visual attention value of $I_{ij}$, $r_{ij}$ is the spatial scale of $I_{ij}$, representing the minimal spatial resolution to keep $I_{ij}$ perceptible; and
generating the browsing path further by calculating an information fidelity ($f_{RSVP}$) for each AO as follows for respective ones of the information blocks $I_{ij}$; wherein $I_{RSVP}(t)$ is a subset of the information blocks and varies with time and which varies with space.
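The block representation in claim 17 maps naturally onto a small data structure. The sketch below builds the set of blocks from two M×N tables and, as one possible reading of "minimal spatial resolution to keep Iij perceptible", reports which blocks remain perceptible at a given display scale. `InfoBlock`, `make_image_model`, `perceptible_blocks`, and the comparison against a `display_scale` are assumptions; the fidelity formula itself is not reproduced in this text, so it is not computed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InfoBlock:
    av: float  # AVij: visual attention value of the block
    r: float   # rij in (0, 1): minimal spatial scale that keeps it perceptible


def make_image_model(av_rows, r_rows):
    """Build I = {Iij} as a dict keyed by 1-based (i, j) from two M x N tables."""
    if len(av_rows) != len(r_rows):
        raise ValueError("attention and scale tables must have the same shape")
    blocks = {}
    for i, (avs, rs) in enumerate(zip(av_rows, r_rows), start=1):
        if len(avs) != len(rs):
            raise ValueError("attention and scale tables must have the same shape")
        for j, (av, r) in enumerate(zip(avs, rs), start=1):
            if not 0.0 < r < 1.0:
                raise ValueError("rij must lie in the open interval (0, 1)")
            blocks[(i, j)] = InfoBlock(av, r)
    return blocks


def perceptible_blocks(blocks, display_scale):
    """Blocks whose minimal scale rij does not exceed the display scale
    (the comparison rule is an assumption, not taken from the claim)."""
    return {ij for ij, b in blocks.items() if b.r <= display_scale}


image = make_image_model(
    av_rows=[[0.1, 0.4], [0.3, 0.2]],
    r_rows=[[0.2, 0.5], [0.8, 0.3]],
)
visible = perceptible_blocks(image, display_scale=0.5)
```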
-
-
18. A computer-readable medium as recited in claim 12, wherein the computer-program instructions further comprise instructions for optimizing the rate of information gain R either by maximizing information fidelity or by minimizing time cost.
-
19. A computer-readable medium as recited in claim 12, wherein the computer-program instructions further comprise instructions for optimizing the rate of information gain R in the shifting mode as follows:
-
$$\max_{P} f_P(I) \quad \text{subject to} \quad T_P \le \lambda_T,$$
wherein $T_P$ represents a total amount of time spent for fixation and shifting in the browsing path P, $\lambda_T$ is a threshold of maximal time cost, and I represents the image.
-
-
20. A computer-readable medium as recited in claim 12, wherein the computer-program instructions further comprise instructions for optimizing the rate of information gain R in the shifting mode as follows:
-
$$\min_{P} T_P \quad \text{subject to} \quad f_P(I) \ge \lambda_{AV},$$
wherein $T_P$ represents a total amount of time spent for fixation and shifting in the browsing path P, and $\lambda_{AV}$ represents a minimal attention value or information percentage for attainment.
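Read together with claim 18's "minimizing time cost", claim 20 describes a constrained search: among paths whose information fidelity reaches the minimum λAV, prefer the one with the smallest total time Tp. A brute-force sketch follows. The `groups` layout, straight-line shifting at a constant `speed`, fidelity measured as the visited share of total attention value, and treating a path as an ordered visit to any subset of groups are all assumptions made for illustration.

```python
from itertools import permutations
from math import hypot


def path_time(path, groups, speed):
    """Tp: fixation time (sum of MPTs) plus shifting time between stops."""
    dwell = sum(groups[name][3] for name in path)
    shift = sum(
        hypot(groups[a][0] - groups[b][0], groups[a][1] - groups[b][1]) / speed
        for a, b in zip(path, path[1:])
    )
    return dwell + shift


def path_fidelity(path, groups):
    """Share of the total attention value covered by the path (assumed measure)."""
    total = sum(g[2] for g in groups.values())
    return sum(groups[name][2] for name in path) / total


def min_time_path(groups, lambda_av, speed=100.0):
    """Return (time, path) with the smallest Tp among paths whose fidelity
    is at least lambda_av, or None if no path qualifies."""
    best = None
    for k in range(1, len(groups) + 1):
        for path in permutations(groups, k):
            if path_fidelity(path, groups) < lambda_av:
                continue  # not enough information: discard
            t = path_time(path, groups, speed)
            if best is None or t < best[0]:
                best = (t, path)
    return best


# name -> (x, y, attention value, MPT in seconds); illustrative numbers only
groups = {"a": (0, 0, 0.5, 1.0), "b": (100, 0, 0.3, 1.0), "c": (0, 300, 0.2, 1.0)}
result = min_time_path(groups, lambda_av=0.75)
```

Exhaustive enumeration grows factorially with the number of groups, so this only suits the handful of attention groups a single image typically yields.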
-
-
21. A computer-readable medium as recited in claim 12, further comprising:
-
detecting user intervention during automatic playback of the browsing path;
responsive to detecting the user intervention:
  recording all AOs Sr of the AOs that have not been browsed;
  identifying all AOs Sm of the AOs browsed during the user intervention;
  regenerating the browsing path based on Sr - Sm; and
responsive to regenerating the browsing path and determining that there is at least a lull in user intervention, automatically navigating the browsing path.
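The bookkeeping in claim 21 amounts to a set difference. In the sketch below, Sr is the part of the planned path not yet shown by automatic playback, Sm is what the user viewed manually, and the regenerated path covers Sr minus Sm. Keeping the remaining AOs in their original planned order is a simplifying assumption; a fuller implementation would rerun path generation on the remaining set. The function name and arguments are hypothetical.

```python
def regenerate_path(planned_path, browsed_auto, browsed_by_user):
    """Return the AOs still to be shown after a user intervention.

    planned_path    -- ordered AO identifiers of the current browsing path
    browsed_auto    -- identifiers already shown by automatic playback
    browsed_by_user -- identifiers viewed during the intervention (Sm)
    """
    shown = set(browsed_auto)
    s_r = [ao for ao in planned_path if ao not in shown]  # Sr: not yet browsed
    s_m = set(browsed_by_user)                            # Sm: browsed manually
    return [ao for ao in s_r if ao not in s_m]            # Sr - Sm, order kept


remaining = regenerate_path([1, 2, 3, 4, 5], browsed_auto=[1, 2], browsed_by_user=[4])
```

Here playback had shown AOs 1 and 2 and the user looked at AO 4, so only AOs 3 and 5 remain for the resumed path.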
-
-
22. A computing device comprising a processor coupled to a memory, the memory comprising computer-program instructions executable by the processor for:
-
modeling an image with respect to multiple visual attentions to generate a respective set of attention objects (AOs) for each attention of the visual attentions, the AOs representing respective regions of the image;
analyzing the attention objects and corresponding attributes in view of a model for human browsing behavior, the model comprising fixation and shifting states, in the fixation state an interesting region of the regions is exploited for information, in the shifting state one region of the regions is replaced with another region of the regions as a function of image view manipulation operations comprising scrolling or tabbing operations, and wherein the corresponding attributes for each attention object of the AOs comprise a minimal perceptible time (MPT) for display of subject matter associated with the attention object; and
responsive to the analyzing, optimizing a rate of information gain in terms of space as a function of information unit cost in terms of time associated with the model for human browsing behavior to generate a browsing path to select ones of the attention objects;
wherein the computer-program instructions for optimizing further comprise instructions for determining the rate of information gain R as follows:
$$R = \frac{G}{T_B + T_W}$$
wherein G is a total net amount of objectively valuable information gained as determined via information fidelity determinations, $T_B$ is a total amount of time spent on shifting between subsequent fixation areas (AOs), and $T_W$ represents an exploiting cost, which is a total duration of the MPTs used while in a fixation state.
Dependent claims: 23 through 31.
wherein Pi represents an ith path segment, SPi represents a starting point of Pi, EPi corresponds to an ending point of Pi, SRi is a starting resolution of Pi, ERi is an ending resolution of Pi, and Ti is a time cost for scrolling from SPi to EPi.
-
-
26. A computing device as recited in claim 22, wherein the computer-program instructions for generating the browsing path further comprise instructions for calculating an information fidelity for each AO, the information fidelity being a function of an attention value (AV) and the minimal perceptible time (MPT) for display of subject matter associated with the AO.
-
27. A computing device as recited in claim 22, wherein each AO is a respective information block, and wherein the computer-program instructions further comprise instructions for:
-
representing the image (I) as a set of M×N evenly distributed information blocks $I_{ij}$ as follows:
$$I = \{I_{ij}\} = \{(AV_{ij}, r_{ij})\}, \quad 1 \le i \le M,\ 1 \le j \le N,\ r_{ij} \in (0, 1),$$
wherein (i, j) corresponds to a location at which the information block $I_{ij}$ is modeled according to a visual attention, $AV_{ij}$ is a visual attention value of $I_{ij}$, $r_{ij}$ is the spatial scale of $I_{ij}$, representing the minimal spatial resolution to keep $I_{ij}$ perceptible; and
generating the browsing path further by calculating an information fidelity ($f_{RSVP}$) for each AO as follows for respective ones of the information blocks $I_{ij}$; wherein $I_{RSVP}(t)$ is a subset of the information blocks and varies with time and which varies with space.
-
-
28. A computing device as recited in claim 22, wherein the computer-program instructions further comprise instructions for optimizing the rate of information gain R either by maximizing information fidelity or by minimizing time cost.
-
29. A computing device as recited in claim 22, wherein the computer-program instructions further comprise instructions for optimizing the rate of information gain R in the shifting mode as follows:
-
$$\max_{P} f_P(I) \quad \text{subject to} \quad T_P \le \lambda_T,$$
wherein $T_P$ represents a total amount of time spent for fixation and shifting in the browsing path P, $\lambda_T$ is a threshold of maximal time cost, and I represents the image.
-
-
30. A computing device as recited in claim 22, wherein the computer-program instructions further comprise instructions for optimizing the rate of information gain R in the shifting mode as follows:
-
$$\min_{P} T_P \quad \text{subject to} \quad f_P(I) \ge \lambda_{AV},$$
wherein $T_P$ represents a total amount of time spent for fixation and shifting in the browsing path P, and $\lambda_{AV}$ represents a minimal attention value or information percentage for attainment.
-
-
31. A computing device as recited in claim 22, further comprising:
-
detecting user intervention during automatic playback of the browsing path;
responsive to detecting the user intervention:
  recording all AOs Sr of the AOs that have not been browsed;
  identifying all AOs Sm of the AOs browsed during the user intervention;
  regenerating the browsing path based on Sr - Sm; and
responsive to regenerating the browsing path and determining that there is at least a lull in user intervention, automatically navigating the browsing path.
-
-
32. A method comprising:
-
modeling an image with respect to multiple visual attentions to generate a respective set of attention objects (AOs) for each attention of the visual attentions;
analyzing the attention objects and corresponding attributes to optimize a rate of information gain as a function of information unit cost in terms of time associated with multiple image browsing modes; and
responsive to analyzing the attention objects, generating a browsing path to select ones of the attention objects, the browsing path being a trade off of time for space or space for time;
wherein generating the browsing path further comprises creating the browsing path in view of a skimming image-browsing mode as follows:
  splitting one or more large AOs of the AOs into smaller AOs;
  combining AOs in close proximity to one another into one or more attention groups;
  arranging the attention groups in decreasing order based on respective attention values;
  for each attention group of the attention groups:
    selecting the attention group as a starting point;
    calculating a total browsing time and information fidelity for each path of all possible paths from the starting point; and
    if the total browsing time is greater than a browsing time threshold, discarding the path;
  selecting a non-discarded path having a largest information fidelity as the browsing path, the browsing path connecting each of the attention groups.
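The skimming-mode loop of claim 32 can be sketched as an exhaustive search: try each attention group as a starting point, enumerate paths from it, discard any path whose total browsing time exceeds the threshold, and keep the survivor with the largest information fidelity. The tuple layout of `groups`, straight-line shifting at a constant `speed`, fidelity as the visited share of total attention value, and allowing a path to stop before visiting every group are assumptions made so that the time threshold has something to prune.

```python
from itertools import permutations
from math import dist


def skim_path(groups, lambda_t, speed=100.0):
    """Return (fidelity, time, path) for the best path within the time
    budget lambda_t, or None if even a single fixation exceeds it.

    groups -- list of (x, y, attention value, MPT seconds), assumed to be
              already arranged in decreasing order of attention value
    """
    total_av = sum(g[2] for g in groups)
    best = None
    for start in range(len(groups)):              # each group as starting point
        rest = [i for i in range(len(groups)) if i != start]
        for k in range(len(rest) + 1):
            for tail in permutations(rest, k):    # every path from that start
                path = (start, *tail)
                dwell = sum(groups[i][3] for i in path)
                shift = sum(
                    dist(groups[a][:2], groups[b][:2]) / speed
                    for a, b in zip(path, path[1:])
                )
                time = dwell + shift
                if time > lambda_t:
                    continue                      # over budget: discard
                fidelity = sum(groups[i][2] for i in path) / total_av
                if best is None or fidelity > best[0]:
                    best = (fidelity, time, path)
    return best


# Illustrative numbers only.
groups = [(0, 0, 0.5, 1.0), (100, 0, 0.3, 1.0), (0, 300, 0.2, 1.0)]
best = skim_path(groups, lambda_t=3.5)
```

Ties in fidelity are resolved in favor of the first path found, which, with groups sorted by attention value, favors paths that start from the most salient group.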
-
Specification