Method and system for interactive ground-truthing of document images

US 20030152277A1
Filed: 06/13/2002
Published: 08/14/2003
Est. Priority Date: 02/13/2002
Status: Active Grant


Abstract

A method and a system analyze a document image to establish a searchable data structure that characterizes the ground-truthed contents of the document represented by the image. The document image is segmented into a set of image objects, and the image objects are linked with fields that store metadata. The image objects identified by segmentation are grouped into subsets according to characteristics suggesting that they may share common ground-truthed metadata. Grouping allows the image objects to be indexed, which facilitates the ground-truthing process. In some embodiments, the index of representative image objects is presented to the user in table form. A database of image objects with ground-truthed metadata is formed. Interactive tools and processes facilitate ground-truthing based on paired image objects and metadata.
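
The pairing of image objects, subsets, and shared metadata fields described in the abstract can be pictured with a small data model. The following is a minimal sketch under stated assumptions, not the patent's implementation: the names `ImageObject`, `Subset`, `GroundTruthIndex`, `ground_truth`, and `metadata_for` are hypothetical, and metadata is reduced to a single text string.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ImageObject:
    """One segmented region of the document image (hypothetical model)."""
    object_id: int
    bbox: Tuple[int, int, int, int]  # (x, y, width, height) in page pixels


@dataclass
class Subset:
    """A group of image objects that share one metadata field."""
    reference: ImageObject            # representative shown to the user
    members: List[ImageObject] = field(default_factory=list)
    metadata: Optional[str] = None    # None until a user ground-truths it


class GroundTruthIndex:
    """Index of subsets; one user entry labels every member of a subset."""

    def __init__(self) -> None:
        self.subsets: Dict[int, Subset] = {}

    def add_subset(self, subset_id: int, subset: Subset) -> None:
        self.subsets[subset_id] = subset

    def ground_truth(self, subset_id: int, text: str) -> None:
        # Populating the linked field labels all members at once.
        self.subsets[subset_id].metadata = text

    def metadata_for(self, object_id: int) -> Optional[str]:
        # Look up the common metadata through subset membership.
        for subset in self.subsets.values():
            if any(m.object_id == object_id for m in subset.members):
                return subset.metadata
        return None
```

In this sketch a single call to `ground_truth` is enough to label every occurrence in a subset, which is the labor saving the grouping step is meant to provide.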

93 Citations

138 Claims

1. A method for analyzing a document image, comprising:
- segmenting the document image to identify a set of image objects within the document image;
  
  processing the set to group image objects within the set into a plurality of subsets, the subsets including one or more image objects;
  
  linking reference image objects to corresponding subsets in the plurality of subsets;
  
  creating machine readable data structures pairing the reference image objects with linked metadata fields, whereby image objects in the corresponding subsets are linked to common metadata in the linked metadata fields; and
  
  presenting the reference image objects to a user, and accepting input from the user, to interactively populate the linked metadata fields with ground-truthed metadata, the metadata including searchable characteristics of the image objects in the corresponding subsets.
- Dependent claims 2-22:
  - 2. The method of claim 1, including generating a searchable data structure to represent said document image, the searchable data structure comprising said metadata linked to said set of image objects, and said set of image objects.
  - 3. The method of claim 1, wherein the segmenting includes:
    - presenting at least a portion of the document image with graphical constructs showing boundaries of the identified image objects in the set to a user, and accepting input from the user, to interactively adjust the boundaries to form a new set of identified image objects.
  - 4. The method of claim 1, wherein the searchable characteristics include a computer readable representation of a word within the image object.
  - 5. The method of claim 1, including accepting input from a plurality of users, to interactively populate the linked metadata fields with ground-truthed metadata.
  - 6. The method of claim 1, wherein said presenting the reference image objects to the user includes ordering the reference image objects in said presentation.
  - 7. The method of claim 6, wherein said ordering is based on shapes of the reference image objects.
  - 8. The method of claim 6, wherein said ordering is based on the metadata linked to the reference image objects.
  - 9. The method of claim 1, wherein said presenting the reference image objects to the user includes one or more reference image objects with ground-truthed metadata in the linked metadata fields.
  - 10. The method of claim 1, wherein presenting the reference image objects to a user, and accepting input from the user, to interactively populate the linked metadata fields with ground-truthed metadata, includes accepting audio input and translating the audio input using speech recognition tools.
  - 11. The method of claim 1, wherein presenting the reference image objects to a user, and accepting input from the user, to interactively populate the linked metadata fields with ground-truthed metadata, includes accepting input to change the ground-truthed metadata.
  - 12. The method of claim 1, wherein the processing groups image objects in the set according to characteristics suggesting that the image objects in a particular subset may have common ground-truthed metadata.
  - 13. The method of claim 1, wherein the processing groups image objects in the set according to characteristics suggesting that a particular subset consists of image objects having similar shapes.
  - 14. The method of claim 1, wherein the processing groups image objects in the set according to characteristics suggesting that a particular subset consists of image objects having similar shapes according to an adjustable parameter.
  - 15. The method of claim 1, wherein the presenting includes displaying a table having a set of entries, entries in the table corresponding to the subsets of image objects within the set, the entries including the representative image objects for the respective subsets, and fields for the common ground-truthed metadata.
  - 16. The method of claim 1, including displaying instances of image objects within a selected subset, and accepting user input to interactively remove an image object from the selected subset.
  - 17. The method of claim 1, including displaying instances of image objects within a selected subset, and accepting user input to interactively move an image object from the selected subset into another subset.
  - 18. The method of claim 1, wherein the reference image object consists of an image object from the corresponding subset.
  - 19. The method of claim 1, wherein the reference image object consists of an image object constructed in response to two or more image objects from the corresponding subset.
  - 20. The method of claim 1, wherein the reference image object consists of an image object constructed in response to two or more image objects from the set of image objects.
  - 21. The method of claim 1, wherein the document image comprises a machine readable file including a bit mapped representation of a document.
  - 22. The method of claim 1, wherein the document image comprises a plurality of machine readable files including respective bit mapped representations of documents.
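
Claims 13, 14, and 19 refer to grouping by shape similarity under an adjustable parameter and to a reference image object constructed from several members. The claims do not name a similarity measure, so the sketch below makes its own assumptions: image objects are equal-sized binary bitmaps, similarity is the fraction of differing pixels, grouping is a greedy single pass, and the composite reference is a per-pixel majority vote. The names `shape_distance`, `group_by_shape`, and `composite_reference` are hypothetical.

```python
from typing import List, Sequence

Bitmap = Sequence[Sequence[int]]  # rows of 0/1 pixels, all the same size


def shape_distance(a: Bitmap, b: Bitmap) -> float:
    """Fraction of pixels that differ between two equal-sized bitmaps."""
    total = sum(len(row) for row in a)
    differing = sum(
        pa != pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b)
    )
    return differing / total


def group_by_shape(objects: List[Bitmap], threshold: float) -> List[List[int]]:
    """Greedy grouping: join the first subset whose seed is close enough."""
    subsets: List[List[int]] = []  # each subset holds indices into `objects`
    for i, obj in enumerate(objects):
        for subset in subsets:
            seed = objects[subset[0]]
            if shape_distance(obj, seed) <= threshold:
                subset.append(i)
                break
        else:
            subsets.append([i])  # no match: start a new subset
    return subsets


def composite_reference(objects: List[Bitmap], subset: List[int]) -> List[List[int]]:
    """Build a reference bitmap by per-pixel majority vote (ties become 0)."""
    first = objects[subset[0]]
    return [
        [
            1 if 2 * sum(objects[i][r][c] for i in subset) > len(subset) else 0
            for c in range(len(first[r]))
        ]
        for r in range(len(first))
    ]
```

Here `threshold` plays the role of the adjustable parameter: raising it merges more objects into each subset, which means fewer entries to label but a higher chance that a subset mixes different words.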

23. A method for analyzing a document image, comprising:
- segmenting the document image to identify a set of image objects within the document image;
  
  creating machine readable data structures pairing the identified image objects in the set with linked metadata fields; and
  
  presenting representations of the identified image objects to a user, and accepting audio input translated with speech recognition tools to interactively populate the linked metadata fields with ground-truthed metadata, the metadata including searchable characteristics of the image objects to which the respective metadata fields are linked.
- Dependent claims 24-35:
  - 24. The method of claim 23, including generating a searchable data structure to represent said document image, the searchable data structure comprising said metadata linked to said set of image objects, and said set of image objects.
  - 25. The method of claim 23, wherein the segmenting includes:
    - presenting at least a portion of the document image with graphical constructs showing boundaries of the identified image objects in the set to a user, and accepting input from the user, to interactively adjust the boundaries to form a new set of identified image objects.
  - 26. The method of claim 23, wherein the document image comprises a machine readable file including a bit mapped representation of a document.
  - 27. The method of claim 23, wherein the document image comprises a plurality of machine readable files including respective bit mapped representations of documents.
  - 28. The method of claim 23, wherein presenting representations of the image objects includes presenting the representations in reading order with respect to the document image.
  - 29. The method of claim 23, wherein presenting representations of the image objects includes presenting the representations in an index grouping similar image objects.
  - 30. The method of claim 23, including processing said set of image objects to find candidate image objects in response to text derived from the audio input translated with speech recognition tools, and populating the linked metadata fields of the candidate image objects with the text.
  - 31. The method of claim 23, wherein said presenting representations of the identified image objects to the user includes ordering the representations of the identified image objects in said presentation.
  - 32. The method of claim 31, wherein said ordering is based on shapes of the identified image objects.
  - 33. The method of claim 31, wherein said ordering is based on the metadata linked to the identified image objects.
  - 34. The method of claim 23, wherein said presenting representations of the identified image objects to the user includes one or more identified image objects with ground-truthed metadata in the linked metadata fields.
  - 35. The method of claim 23, wherein the presenting representations of the identified image objects includes presenting the representations in a reading order for the document image.
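
Claims 28 and 35 call for presenting the image objects in reading order. The claims do not say how that order is computed; the sketch below assumes a single-column, left-to-right, top-to-bottom layout and bounding boxes given as `(x, y, width, height)` with the origin at the top-left. The function name `reading_order` and the line-banding heuristic are illustrative assumptions.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), origin at top-left


def reading_order(boxes: List[Box]) -> List[Box]:
    """Order boxes top-to-bottom by text line, then left-to-right."""
    lines: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b[1]):  # scan by top edge
        _, y, _, h = box
        center = y + h / 2
        if lines:
            _, line_y, _, line_h = lines[-1][0]  # first box defines the band
            if line_y <= center <= line_y + line_h:
                lines[-1].append(box)
                continue
        lines.append([box])  # vertical center falls outside: new text line
    return [b for line in lines for b in sorted(line, key=lambda b: b[0])]
```

A multi-column or right-to-left document would need a different ordering rule; this heuristic only handles the simplest layout.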

36. A method for analyzing a document image, comprising:
- segmenting the document image to identify a set of image objects within the document image;
  
  applying text recognition tools to produce proposed text for the set of image objects;
  
  processing the set to group image objects within the set into a plurality of subsets, the subsets including one or more image objects;
  
  linking reference image objects to corresponding subsets in the plurality of subsets;
  
  creating machine readable data structures pairing the reference image objects with linked metadata fields, whereby image objects in the corresponding subsets are linked to common metadata in the linked metadata fields, and populating the linked metadata fields based on the proposed text; and
  
  presenting the reference image objects to a user, and accepting input from the user, to interactively populate the linked metadata fields with ground-truthed metadata, the metadata including searchable characteristics of the image objects in the corresponding subsets, including accepting input to verify and to edit the proposed text to establish the ground-truthed metadata.
- Dependent claims 37-61:
  - 37. The method of claim 36, including generating a searchable data structure to represent said document image, the searchable data structure comprising said metadata linked to said set of image objects, and said set of image objects.
  - 38. The method of claim 36, wherein the segmenting includes:
    - presenting at least a portion of the document image with graphical constructs showing boundaries of the identified image objects in the set to a user, and accepting input from the user, to interactively adjust the boundaries to form a new set of identified image objects.
  - 39. The method of claim 36, wherein the searchable characteristics include a computer readable representation of a word within the image object.
  - 40. The method of claim 36, wherein presenting the reference image objects to a user, and accepting input from the user, to interactively populate the linked metadata fields with ground-truthed metadata, includes accepting audio input and translating the audio input using speech recognition tools.
  - 41. The method of claim 36, wherein presenting the reference image objects to a user, and accepting input from the user, to interactively populate the linked metadata fields with ground-truthed metadata, includes accepting input to change the ground-truthed metadata.
  - 42. The method of claim 36, wherein the processing groups image objects in the set according to characteristics suggesting that the image objects in a particular subset may have common ground-truthed metadata.
  - 43. The method of claim 36, wherein the processing groups image objects in the set according to characteristics suggesting that a particular subset consists of image objects having similar shapes.
  - 44. The method of claim 36, wherein the processing groups image objects in the set according to characteristics suggesting that a particular subset consists of image objects having similar shapes according to an adjustable parameter.
  - 45. The method of claim 36, wherein the presenting the index includes displaying a table having a set of entries, entries in the table corresponding to the subsets of image objects within the set, the entries including the representative image objects for the respective subsets, and fields for the common ground-truthed metadata.
  - 46. The method of claim 36, including displaying instances of image objects within a selected subset, and accepting user input to interactively remove an image object from the selected subset.
  - 47. The method of claim 36, including displaying instances of image objects within a selected subset, and accepting user input to interactively move an image object from the selected subset into another subset.
  - 48. The method of claim 36, wherein the reference image object consists of an image object from the corresponding subset.
  - 49. The method of claim 36, wherein the reference image object consists of an image object constructed in response to two or more image objects from the corresponding subset.
  - 50. The method of claim 36, wherein the reference image object consists of an image object constructed in response to two or more image objects from the corresponding set of image objects.
  - 51. The method of claim 36, wherein the document image comprises a machine readable file including a bit mapped representation of a document.
  - 52. The method of claim 36, wherein the document image comprises a plurality of machine readable files including respective bit mapped representations of documents.
  - 53. The method of claim 36, wherein the applying text recognition tools includes applying text recognition tools to the image objects individually.
  - 54. The method of claim 36, wherein the applying text recognition tools includes applying text recognition tools to the reference image objects.
  - 55. The method of claim 36, wherein the applying text recognition tools includes applying contextual text recognition tools.
  - 56. The method of claim 36, wherein the populating the linked metadata fields includes selecting proposed text for the linked metadata field based upon the proposed text for members of the corresponding subset.
  - 57. The method of claim 36, including accepting input from a plurality of users, to interactively populate the linked metadata fields with ground-truthed metadata.
  - 58. The method of claim 36, wherein said presenting the reference image objects to the user includes ordering the reference image objects in said presentation.
  - 59. The method of claim 58, wherein said ordering is based on shapes of the reference image objects.
  - 60. The method of claim 58, wherein said ordering is based on the metadata linked to the reference image objects.
  - 61. The method of claim 36, wherein said presenting the reference image objects to the user includes one or more reference image objects with ground-truthed metadata in the linked metadata fields.
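
Claim 56 selects the proposed text for a subset from the proposed text of its members, and claim 36 lets the user verify or edit that proposal. One plausible selection rule, assumed here and not stated in the claims, is a majority vote over the recognizer outputs. The function names `propose_subset_text` and `ground_truth_text` are hypothetical, and the recognizer itself is out of scope: its outputs are passed in as plain strings.

```python
from collections import Counter
from typing import List, Optional


def propose_subset_text(member_texts: List[Optional[str]]) -> Optional[str]:
    """Pick the most common recognizer output among subset members."""
    votes = Counter(t for t in member_texts if t)  # ignore empty results
    if not votes:
        return None
    # most_common orders equal counts by first appearance
    return votes.most_common(1)[0][0]


def ground_truth_text(proposed: Optional[str], user_edit: Optional[str]) -> Optional[str]:
    """A non-empty user edit replaces the proposal; otherwise it is kept."""
    return user_edit if user_edit else proposed
```

Voting across a subset can mask an occasional recognition error on one member, but the user's confirmation remains the step that establishes the ground truth.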

62. A method for analyzing a document image, comprising:
- providing a database of representative image objects with linked metadata fields storing metadata, the metadata including searchable characteristics of image objects matching the representative image objects;
  
  segmenting the document image to identify a set of image objects within the document image;
  
  processing the set to match image objects in the set with representative image objects in the database, and to link matching image objects in the set with particular representative image objects in the database; and
  
  displaying instances of image objects in the set that are linked with a particular representative image object in the database, and accepting user input to interactively undo the link of selected image objects with the particular representative image object.
- Dependent claims 63-69:
  - 63. The method of claim 62, including generating a searchable data structure to represent said document image, the searchable data structure comprising said metadata linked to said set of image objects, and said set of image objects.
  - 64. The method of claim 62, including accepting user input to interactively change the link of a selected image object with the particular representative image object to a link with another representative image object in the database.
  - 65. The method of claim 62, wherein the segmenting includes:
    - presenting at least a portion of the document image with graphical constructs showing boundaries of the identified image objects in the set to a user, and accepting input from the user, to interactively adjust the boundaries to form a new set of identified image objects.
  - 66. The method of claim 62, wherein the searchable characteristics include a computer readable representation of a word within the image object.
  - 67. The method of claim 62, including creating machine readable data structures pairing particular image objects in the set, not linked to representative image objects, with linked metadata fields;
    - and presenting representations of the particular image objects to a user, and accepting input to interactively populate the linked metadata fields with ground-truthed metadata, the metadata including searchable characteristics of the image objects to which the respective metadata fields are linked.
  - 68. The method of claim 62, including creating machine readable data structures pairing a selected image object in the set with a linked metadata field;
    - and establishing an entry in the database for the selected image object.
  - 69. The method of claim 62, wherein said accepting user input includes accepting user input from a plurality of users.
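
Claims 62 and 64 describe matching new image objects against a database of representative image objects, then letting the user undo or change a link. The sketch below is one way to picture this, under assumptions that are not in the claims: each object is summarized by a small integer feature tuple, matching uses a sum of absolute differences with a `max_distance` cutoff, and the class and method names (`RepresentativeDatabase`, `match`, `undo_link`, `relink`) are hypothetical.

```python
from typing import Dict, Optional, Tuple

Feature = Tuple[int, ...]  # hypothetical shape descriptor, e.g. (width, height, ink)


class RepresentativeDatabase:
    """Representative image objects with linked metadata (illustrative)."""

    def __init__(self, max_distance: int) -> None:
        self.max_distance = max_distance
        self.entries: Dict[str, Tuple[Feature, str]] = {}  # rep id -> (feature, metadata)
        self.links: Dict[int, str] = {}                     # object id -> rep id

    def add_entry(self, rep_id: str, feature: Feature, metadata: str) -> None:
        self.entries[rep_id] = (feature, metadata)

    def match(self, object_id: int, feature: Feature) -> Optional[str]:
        """Link the object to the nearest representative within max_distance."""
        best_id, best_dist = None, None
        for rep_id, (rep_feature, _) in self.entries.items():
            dist = sum(abs(a - b) for a, b in zip(feature, rep_feature))
            if best_dist is None or dist < best_dist:
                best_id, best_dist = rep_id, dist
        if best_id is not None and best_dist <= self.max_distance:
            self.links[object_id] = best_id
            return best_id
        return None  # no representative is close enough; object stays unlinked

    def undo_link(self, object_id: int) -> None:
        """User rejects an automatic link."""
        self.links.pop(object_id, None)

    def relink(self, object_id: int, rep_id: str) -> None:
        """User moves the object to a different representative."""
        if rep_id not in self.entries:
            raise KeyError(rep_id)
        self.links[object_id] = rep_id

    def metadata_for(self, object_id: int) -> Optional[str]:
        rep_id = self.links.get(object_id)
        return self.entries[rep_id][1] if rep_id is not None else None
```

Objects that `match` leaves unlinked correspond to the case in claim 67, where the user supplies metadata directly.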

70. An apparatus, comprising:
- a data processing system including a user input device, a display, one of memory, or access to memory, storing a document image, and resources for processing the document image, the resources including logic to:
  
  segment the document image to identify a set of image objects within the document image;
  
  process the set to group image objects within the set into a plurality of subsets, the subsets including one or more image objects;
  
  link reference image objects to corresponding subsets in the plurality of subsets;
  
  store data structures pairing the reference image objects with linked metadata fields, whereby image objects in the corresponding subsets are linked to common metadata in the linked metadata fields; and
  
  present the reference image objects to a user on the display, and accept input from the user via the user input device, to interactively populate the linked metadata fields with ground-truthed metadata, the metadata including searchable characteristics of the image objects in the corresponding subsets.
- Dependent claims 71-91:
  - 71. The apparatus of claim 70, the data processing resources including logic to generate a searchable data structure to represent said document image, the searchable data structure comprising said metadata linked to said set of image objects, and said set of image objects.
  - 72. The apparatus of claim 70, wherein the logic to segment includes logic to:
    - present at least a portion of the document image on the display with graphical constructs showing boundaries of the identified image objects in the set to a user, and accept input from the user, to interactively adjust the boundaries to form a new set of identified image objects.
  - 73. The apparatus of claim 70, wherein the searchable characteristics include a computer readable representation of a word within the image object.
  - 74. The apparatus of claim 70, including resources to accept input via a communication medium from a plurality of users, to interactively populate the linked metadata fields with ground-truthed metadata.
  - 75. The apparatus of claim 70, wherein said logic to present the reference image objects to the user includes logic to order the reference image objects in said presentation.
  - 76. The apparatus of claim 70, wherein said logic to present the reference image objects to the user includes logic to order the reference image objects in said presentation based on shapes of the reference image objects.
  - 77. The apparatus of claim 70, wherein said logic to present the reference image objects to the user includes logic to order the reference image objects in said presentation based on the metadata linked to the reference image objects.
  - 78. The apparatus of claim 70, wherein said logic to present the reference image objects to the user presents one or more reference image objects with ground-truthed metadata in the linked metadata fields.
  - 79. The apparatus of claim 70, wherein said user input device includes resources accepting audio input and translating the audio input using speech recognition tools.
  - 80. The apparatus of claim 70, including resources accepting input to change the ground-truthed metadata.
  - 81. The apparatus of claim 70, wherein the logic to process includes logic that groups image objects in the set according to characteristics suggesting that the image objects in a particular subset may have common ground-truthed metadata.
  - 82. The apparatus of claim 70, wherein the logic to process includes logic that groups image objects in the set according to characteristics suggesting that a particular subset consists of image objects having similar shapes.
  - 83. The apparatus of claim 70, wherein the logic to process includes logic that groups image objects in the set according to characteristics suggesting that a particular subset consists of image objects having similar shapes according to an adjustable parameter.
  - 84. The apparatus of claim 70, including logic to display a table having a set of entries, entries in the table corresponding to the subsets of image objects within the set, the entries including the representative image objects for the respective subsets, and fields for the common ground-truthed metadata.
  - 85. The apparatus of claim 70, including logic to display instances of image objects within a selected subset, and accept user input to interactively remove an image object from the selected subset.
  - 86. The apparatus of claim 70, including logic to display instances of image objects within a selected subset, and accept user input to interactively move an image object from the selected subset into another subset.
  - 87. The apparatus of claim 70, wherein the reference image object consists of an image object from the corresponding subset.
  - 88. The apparatus of claim 70, wherein the reference image object consists of an image object constructed in response to two or more image objects from the corresponding subset.
  - 89. The apparatus of claim 70, wherein the reference image object consists of an image object constructed in response to two or more image objects from the set of image objects.
  - 90. The apparatus of claim 70, wherein the document image comprises a machine readable file including a bit mapped representation of a document.
  - 91. The apparatus of claim 70, wherein the document image comprises a plurality of machine readable files including respective bit mapped representations of documents.
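
Claim 71 recites a searchable data structure comprising the metadata linked to the image objects. A common way to make such labels searchable, assumed here and not specified in the claims, is an inverted index from each ground-truthed word to the locations where it occurs. The names `build_search_index` and `search`, and the `(page, bbox)` location format, are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Location = Tuple[int, Tuple[int, int, int, int]]  # (page number, bbox)


def build_search_index(labeled: List[Tuple[str, Location]]) -> Dict[str, List[Location]]:
    """Map each ground-truthed word to every place it occurs in the images."""
    index: Dict[str, List[Location]] = defaultdict(list)
    for word, location in labeled:
        index[word.lower()].append(location)  # case-insensitive lookup key
    return dict(index)


def search(index: Dict[str, List[Location]], query: str) -> List[Location]:
    """Return the image locations whose ground-truthed text equals the query."""
    return index.get(query.lower(), [])
```

Because each hit carries a page number and bounding box, a search result can point back into the original bit-mapped image instead of into a separate text transcription.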

92. An apparatus for analyzing a document image, comprising:
- a data processing system including a user input device, a display, one of memory, or access to memory, storing a document image, and resources for processing the document image, the resources including logic to:
  
  segment the document image to identify a set of image objects within the document image;
  
  create and store machine readable data structures pairing the identified image objects in the set with linked metadata fields; and
  
  present representations of the identified image objects to a user, and accept audio input translated with speech recognition tools to interactively populate the linked metadata fields with ground-truthed metadata, the metadata including searchable characteristics of the image objects to which the respective metadata fields are linked.
- Dependent claims 93-104:
  - 93. The apparatus of claim 92, including logic to generate a searchable data structure to represent said document image, the searchable data structure comprising said metadata linked to said set of image objects, and said set of image objects.
  - 94. The apparatus of claim 92, wherein the logic to segment presents at least a portion of the document image with graphical constructs showing boundaries of the identified image objects in the set to a user, and accepts input from the user, to interactively adjust the boundaries to form a new set of identified image objects.
  - 95. The apparatus of claim 92, wherein the document image comprises a machine readable file including a bit mapped representation of a document.
  - 96. The apparatus of claim 92, wherein the document image comprises a plurality of machine readable files including respective bit mapped representations of documents.
  - 97. The apparatus of claim 92, wherein logic to present representations of the image objects presents the representations in reading order with respect to the document image.
  - 98. The apparatus of claim 92, wherein logic to present representations of the image objects presents the representations in an index grouping similar image objects.
  - 99. The apparatus of claim 92, including logic to process said set of image objects to find candidate image objects in response to text derived from the audio input translated with speech recognition tools, and to populate the linked metadata fields of the candidate image objects with the text.
  - 100. The apparatus of claim 92, wherein said logic to present representations of the identified image objects to the user orders the representations of the identified image objects in said presentation.
  - 101. The apparatus of claim 100, wherein said ordering is based on shapes of the identified image objects.
  - 102. The apparatus of claim 100, wherein said ordering is based on the metadata linked to the identified image objects.
  - 103. The apparatus of claim 92, wherein said logic to present representations of the identified image objects to the user presents one or more identified image objects with ground-truthed metadata in the linked metadata fields.
  - 104. The apparatus of claim 92, wherein the logic to present representations of the identified image objects presents the representations in a reading order for the document image.

105. An apparatus, comprising:
- a data processing system including a user input device, a display, one of memory, or access to memory, storing a document image, and resources for processing the document image, the resources including logic to:
  
  segment the document image to identify a set of image objects within the document image;
  
  apply text recognition tools to produce proposed text for the set of image objects;
  
  process the set to group image objects within the set into a plurality of subsets, the subsets including one or more image objects;
  
  link reference image objects to corresponding subsets in the plurality of subsets;
  
  create and store machine readable data structures pairing the reference image objects with linked metadata fields, whereby image objects in the corresponding subsets are linked to common metadata in the linked metadata fields, and populating the linked metadata fields based on the proposed text; and
  
  present the reference image objects to a user, and accept input from the user, to interactively populate the linked metadata fields with ground-truthed metadata, the metadata including searchable characteristics of the image objects in the corresponding subsets, including logic to accept input to verify and to edit the proposed text to establish the ground-truthed metadata.
- Dependent claims 106-130:
  - 106. The apparatus of claim 105, including logic to generate a searchable data structure to represent said document image, the searchable data structure comprising said metadata linked to said set of image objects, and said set of image objects.
  - 107. The apparatus of claim 105, wherein the logic to segment includes logic that:
    - presents at least a portion of the document image with graphical constructs showing boundaries of the identified image objects in the set to a user, and accepts input from the user, to interactively adjust the boundaries to form a new set of identified image objects.
  - 108. The apparatus of claim 105, wherein the searchable characteristics include a computer readable representation of a word within the image object.
  - 109. The apparatus of claim 105, wherein logic to present the reference image objects to a user, and accept input from the user, to interactively populate the linked metadata fields with ground-truthed metadata, accepts audio input and translates the audio input using speech recognition tools.
  - 110. The apparatus of claim 105, wherein the logic to present the reference image objects to a user, and accept input from the user, to interactively populate the linked metadata fields with ground-truthed metadata, accepts input to change the ground-truthed metadata.
  - 111. The apparatus of claim 105, wherein the logic to process groups image objects in the set according to characteristics suggesting that the image objects in a particular subset may have common ground-truthed metadata.
  - 112. The apparatus of claim 105, wherein the logic to process groups image objects in the set according to characteristics suggesting that a particular subset consists of image objects having similar shapes.
  - 113. The apparatus of claim 105, wherein the logic to process groups image objects in the set according to characteristics suggesting that a particular subset consists of image objects having similar shapes according to an adjustable parameter.
  - 114. The apparatus of claim 105, wherein the logic to present the index displays a table having a set of entries, entries in the table corresponding to the subsets of image objects within the set, the entries including the representative image objects for the respective subsets, and fields for the common ground-truthed metadata.
  - 115. The apparatus of claim 105, including logic to display instances of image objects within a selected subset, and accept user input to interactively remove an image object from the selected subset.
  - 116. The apparatus of claim 105, including logic to display instances of image objects within a selected subset, and accept user input to interactively move an image object from the selected subset into another subset.
  - 117. The apparatus of claim 105, wherein the reference image object consists of an image object from the corresponding subset.
  - 118. The apparatus of claim 105, wherein the reference image object consists of an image object constructed in response to two or more image objects from the corresponding subset.
  - 119. The apparatus of claim 105, wherein the reference image object consists of an image object constructed in response to two or more image objects from the corresponding set of image objects.
  - 120. The apparatus of claim 105, wherein the document image comprises a machine readable file including a bit mapped representation of a document.
  - 121. The apparatus of claim 105, wherein the document image comprises a plurality of machine readable files including respective bit mapped representations of documents.
  - 122. The apparatus of claim 105, wherein the logic to apply text recognition tools applies text recognition tools to the image objects individually.
  - 123. The apparatus of claim 105, wherein the logic to apply text recognition tools applies text recognition tools to the reference image objects.
  - 124. The apparatus of claim 105, wherein the logic to apply text recognition tools applies contextual text recognition tools.
  - 125. The apparatus of claim 105, wherein the logic to populate the linked metadata fields selects proposed text for the linked metadata field based upon the proposed text for members of the corresponding subset.
  - 126. The apparatus of claim 105, including logic to accept input from a plurality of users, to interactively populate the linked metadata fields with ground-truthed metadata.
  - 127. The apparatus of claim 105, wherein said logic to present the reference image objects to the user orders the reference image objects in said presentation.
  - 128. The apparatus of claim 127, wherein said ordering is based on shapes of the reference image objects.
  - 129. The apparatus of claim 127, wherein said ordering is based on the metadata linked to the reference image objects.
  - 130. The apparatus of claim 105, wherein said logic to present the reference image objects to the user includes one or more reference image objects with ground-truthed metadata in the linked metadata fields.
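
As an illustration of the grouping described in claims 112, 113 and 125, the following is a minimal sketch, not the patented implementation. It assumes image objects are small binary bitmaps of equal, non-zero size, measures shape similarity as the fraction of differing pixels, treats `threshold` as the adjustable parameter, and picks the proposed text for a subset by majority vote over its members. All names (`ImageObject`, `Subset`, `shape_distance`, `group_by_shape`, `propose_metadata`) are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ImageObject:
    """A segmented image object (hypothetical format): a binary bitmap
    stored as a tuple of equal-length rows, plus an optional OCR guess."""
    pixels: tuple
    proposed_text: str = ""


@dataclass
class Subset:
    """A group of similar image objects that share one metadata field."""
    reference: ImageObject
    members: list = field(default_factory=list)
    metadata: str = ""


def shape_distance(a, b):
    """Fraction of differing pixels; 1.0 when the bitmap sizes differ."""
    if len(a.pixels) != len(b.pixels) or len(a.pixels[0]) != len(b.pixels[0]):
        return 1.0
    total = len(a.pixels) * len(a.pixels[0])
    diff = sum(pa != pb
               for row_a, row_b in zip(a.pixels, b.pixels)
               for pa, pb in zip(row_a, row_b))
    return diff / total


def group_by_shape(objects, threshold):
    """Greedy grouping: each object joins the first subset whose reference
    is within `threshold`, otherwise it starts a new subset as its reference."""
    subsets = []
    for obj in objects:
        for subset in subsets:
            if shape_distance(obj, subset.reference) <= threshold:
                subset.members.append(obj)
                break
        else:
            subsets.append(Subset(reference=obj, members=[obj]))
    return subsets


def propose_metadata(subset):
    """Most common non-empty proposed text among the subset's members."""
    votes = Counter(m.proposed_text for m in subset.members if m.proposed_text)
    return votes.most_common(1)[0][0] if votes else ""
```

Raising `threshold` merges more objects into each subset, which leaves fewer reference image objects to present to the user for ground-truthing, at the cost of more wrongly grouped members to remove or move afterwards.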

131. An apparatus for analyzing a document image, comprising:
- a data processing system including a user input device, a display, one of memory, or access to memory, storing a document image, and resources for processing the document image, the resources including logic to:
  
  access a database of representative image objects with linked metadata fields storing metadata, the metadata including searchable characteristics of image objects matching the representative image objects;
  
  segment the document image to identify a set of image objects within the document image;
  
  process the set to match image objects in the set with representative image objects in the database, and to link matching image objects in the set with particular representative image objects in the database; and
  
  display instances of image objects in the set that are linked with a particular representative image object in the database, and accept user input to interactively undo the link of selected image objects with the particular representative image object.
- View Dependent Claims (132, 133, 134, 135, 136, 137, 138)
- - 132. The apparatus of claim 131, including logic to generate and store a searchable data structure to represent said document image, the searchable data structure comprising said metadata linked to said set of image objects, and said set of image objects.
  - 133. The apparatus of claim 131, including logic to accept user input to interactively change the link of a selected image object with the particular representative image object to a link with another representative image object in the database.
  - 134. The apparatus of claim 131, wherein the logic to segment includes logic that:
    - presents at least a portion of the document image with graphical constructs showing boundaries of the identified image objects in the set to a user, and accepts input from the user, to interactively adjust the boundaries to form a new set of identified image objects.
  - 135. The apparatus of claim 131, wherein the searchable characteristics include a computer readable representation of a word within the image object.
  - 136. The apparatus of claim 131, including logic to create machine readable data structures pairing particular image objects in the set, not linked to representative image objects, with linked metadata fields;
    - and present representations of the particular image objects to a user, and accept input to interactively populate the linked metadata fields with ground-truthed metadata, the metadata including searchable characteristics of the image objects to which the respective metadata fields are linked.
  - 137. The apparatus of claim 131, including logic to create machine readable data structures pairing a selected image object in the set with a linked metadata field;
    - and establish an entry in the database for the selected image object.
  - 138. The apparatus of claim 131, wherein said logic to accept user input accepts user input from a plurality of users.
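
The link, undo and re-link operations of claims 131 and 133 can be pictured with a small data structure. The sketch below is illustrative only and makes several assumptions: each segmented image object is reduced to an id and a shape key, the database is a list of representatives, and the matching test is supplied by the caller. The names `Representative`, `LinkTable`, `link_matches`, `instances_of`, `undo` and `relink` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Representative:
    """A database entry: a representative shape key and its metadata."""
    key: str
    metadata: str


@dataclass
class LinkTable:
    """Maps image-object ids to the key of the linked representative."""
    links: dict = field(default_factory=dict)

    def link_matches(self, objects, database, match):
        """Link each object to the first representative that `match` accepts.
        Objects with no match stay unlinked."""
        for obj_id, shape in objects.items():
            for rep in database:
                if match(shape, rep.key):
                    self.links[obj_id] = rep.key
                    break

    def instances_of(self, rep_key):
        """Ids of all objects currently linked to one representative."""
        return sorted(o for o, k in self.links.items() if k == rep_key)

    def undo(self, obj_id):
        """Remove the link of a selected object (no-op if it is unlinked)."""
        self.links.pop(obj_id, None)

    def relink(self, obj_id, rep_key):
        """Change the link of a selected object to another representative."""
        self.links[obj_id] = rep_key
```

In this sketch, an interface would display `instances_of(key)` for a chosen representative and call `undo` or `relink` on whichever instances the user rejects. Objects left unlinked are the candidates for manual ground-truthing under claim 136.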

Current Assignee: Convey Inc. (Project44, Inc.)
Original Assignee: Convey Inc. (Project44, Inc.)
Inventors: Howie, Cameron Telfer; Hall, Floyd Steven Jr.
Granted Patent: US 6,768,816 B2
US Class Current: 382/229
CPC Class Codes:
- G06V 10/987 with the intervention of an...
- G06V 30/414 Extracting the geometrical ...