Vision-based document segmentation
Abstract
Vision-based document segmentation identifies one or more portions of semantic content of a document. The one or more portions are identified by identifying a plurality of visual blocks in the document, and detecting one or more separators between the visual blocks of the plurality of visual blocks. A content structure for the document is constructed based at least in part on the plurality of visual blocks and the one or more separators, and the content structure identifies the one or more portions of semantic content of the document. The content structure obtained using the vision-based document segmentation can optionally be used during document retrieval.
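The abstract names three steps (identify visual blocks, detect separators, construct a content structure) without fixing how each is carried out. The following Python sketch shows one way the steps could fit together. It rests on assumptions that are not in the source: blocks are modelled as vertical extents with a label, separators are the empty gaps between vertically adjacent blocks, and a gap of at least `min_gap` pixels splits two portions of content. The names `Block`, `Separator`, `detect_separators`, `construct_content_structure`, and `min_gap` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """A visual block, modelled here as a labelled vertical extent (assumption)."""
    name: str
    top: int
    bottom: int


@dataclass(frozen=True)
class Separator:
    """An empty horizontal band between two vertically adjacent blocks (assumption)."""
    top: int
    bottom: int

    @property
    def height(self) -> int:
        return self.bottom - self.top


def detect_separators(blocks):
    """Return the gaps between consecutive blocks, ordered top to bottom."""
    ordered = sorted(blocks, key=lambda b: b.top)
    separators = []
    for upper, lower in zip(ordered, ordered[1:]):
        if lower.top > upper.bottom:  # there is empty space between the two blocks
            separators.append(Separator(upper.bottom, lower.top))
    return separators


def construct_content_structure(blocks, separators, min_gap):
    """Group blocks into portions, starting a new portion at each tall separator."""
    ordered = sorted(blocks, key=lambda b: b.top)
    strong = [s for s in separators if s.height >= min_gap]
    portions = [[ordered[0]]] if ordered else []
    for prev, cur in zip(ordered, ordered[1:]):
        # A strong separator lying between prev and cur marks a content boundary.
        split = any(s.top >= prev.bottom and s.bottom <= cur.top for s in strong)
        if split:
            portions.append([cur])
        else:
            portions[-1].append(cur)
    return portions


page = [
    Block("header", 0, 50),
    Block("nav", 55, 80),
    Block("body", 120, 400),
    Block("footer", 450, 480),
]
structure = construct_content_structure(page, detect_separators(page), min_gap=30)
print([[b.name for b in portion] for portion in structure])
# [['header', 'nav'], ['body'], ['footer']]
```

The flat list of portions is the simplest possible content structure; a hierarchical structure could be built by applying the same grouping recursively inside each portion.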
57 Citations
75 Claims
1. A method of identifying one or more portions of a document, the method comprising:
identifying a plurality of visual blocks in the document;
detecting one or more separators between the visual blocks of the plurality of visual blocks; and
constructing, based at least in part on the plurality of visual blocks and the one or more separators, a content structure for the document, wherein the content structure identifies the different visual blocks as different portions of semantic content of the document.
Dependent claims: 2-30.
31. One or more computer readable media having stored thereon a plurality of instructions that, when executed by one or more processors of a device, causes the one or more processors to:
identify visual blocks in a document;
detect visual separators between the visual blocks; and
construct, based at least in part on the visual blocks and the visual separators, a content structure for the document that identifies regions of the document that represent semantic content of the document.
Dependent claims: 32-35.
36. A method of searching a plurality of documents, the method comprising:
receiving query criteria corresponding to a query;
accessing a plurality of blocks corresponding to the plurality of documents, wherein different blocks of the plurality of blocks correspond to different documents of the plurality of documents, wherein the plurality of blocks have been obtained by visually segmenting each of the plurality of documents;
generating rankings for one or more of the plurality of blocks based at least in part on how well the blocks match the query criteria;
generating rankings for one or more of the plurality of documents, wherein the ranking of each of the plurality of documents is based at least in part on the rankings of the multiple blocks corresponding to the document; and
returning an indication of at least one of the one or more ranked documents.
Dependent claims: 37-42.
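As an illustration of the block-then-document ranking recited in claim 36, the Python sketch below scores each block against the query and derives a document's rank from its blocks. The claim does not specify a scoring function or how block rankings are combined, so both choices here are assumptions: blocks are scored by the fraction of their words that are query terms, and a document takes the score of its best block. The function names `block_score` and `rank_documents` are hypothetical.

```python
from collections import defaultdict


def block_score(block_text, query_terms):
    """Fraction of the block's words that are query terms (assumed measure)."""
    words = block_text.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in query_terms)
    return hits / len(words)


def rank_documents(blocks, query):
    """Rank documents by their best-matching block.

    blocks: iterable of (doc_id, block_text) pairs, one pair per visual block.
    Returns a list of (doc_id, score) pairs, best first.
    """
    terms = set(query.lower().split())
    best = defaultdict(float)
    for doc_id, text in blocks:
        # Assumption: a document is as relevant as its most relevant block.
        best[doc_id] = max(best[doc_id], block_score(text, terms))
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)


segmented = [
    ("a", "cheap flights to paris"),
    ("a", "copyright notice"),
    ("b", "paris"),
    ("c", "weather report"),
]
print(rank_documents(segmented, "paris flights"))
# [('b', 1.0), ('a', 0.5), ('c', 0.0)]
```

Taking the maximum keeps a long page with one highly relevant block from being diluted by its navigation or footer blocks; a sum or weighted average would be an equally valid reading of "based at least in part on the rankings of the multiple blocks".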
43. One or more computer readable media having stored thereon a plurality of instructions that, when executed by one or more processors of a device, causes the one or more processors to:
receive a query including one or more search terms;
rank a plurality of blocks based on how well the plurality of blocks matches the one or more search terms, wherein each of the plurality of blocks is part of one document of a plurality of documents, and wherein each of the plurality of blocks is obtained by visual segmentation of one of the plurality of documents;
for each of the plurality of documents, rank the document based at least in part on the rankings of the blocks that are part of the document; and
return, in response to the query, an indication of the rankings of one or more of the plurality of documents.
Dependent claims: 44-47.
48. A method of searching a plurality of web pages, the method comprising:
receiving a request to search the plurality of web pages;
generating a first set of rankings for a subset of the plurality of web pages based on the request;
generating a second set of rankings for the subset of web pages by visually segmenting each web page in the subset of web pages; and
obtaining, based at least in part on the second set of rankings, a final set of rankings for the subset of web pages.
Dependent claims: 49-53.
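Claim 48 requires only that the final rankings depend at least in part on the second (segmentation-based) rankings; it does not say how the two sets are merged. One possible reading, shown here as a Python sketch, is a linear blend of per-page scores. The function `combine_rankings`, the weight `alpha`, and the use of numeric scores rather than ordinal positions are all assumptions made for illustration.

```python
def combine_rankings(first, second, alpha=0.5):
    """Blend two score maps into a final ordering of pages.

    first:  dict mapping page -> score from the initial request-based ranking.
    second: dict mapping page -> score from the segmentation-based ranking.
    alpha:  weight on the first ranking (hypothetical tuning parameter).
    Returns the pages sorted by blended score, best first.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    final = {
        page: alpha * first[page] + (1.0 - alpha) * second.get(page, 0.0)
        for page in first
    }
    return sorted(final, key=final.get, reverse=True)


first = {"p1": 0.9, "p2": 0.8}
second = {"p1": 0.1, "p2": 0.7}
print(combine_rankings(first, second))             # ['p2', 'p1']
print(combine_rankings(first, second, alpha=1.0))  # ['p1', 'p2']
```

With `alpha=1.0` the second ranking is ignored, which would fall outside the claim; any value below 1.0 lets the segmentation-based scores influence the final order.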
54. One or more computer readable media having stored thereon a plurality of instructions that, when executed by one or more processors of a device, causes the one or more processors to:
generate first rankings for a plurality of documents based on how well the plurality of documents match search criteria;
generate second rankings for the plurality of documents by visually segmenting each of the plurality of documents; and
generate final rankings for the plurality of documents based at least in part on the second rankings.
Dependent claims: 55-59.
60. A method of searching a plurality of documents, the method comprising:
receiving a request to search the plurality of documents, wherein the request includes query criteria;
identifying a subset of the plurality of documents based on the query criteria;
identifying, for each of the subset of documents, a plurality of blocks by visually segmenting the document;
expanding, based on the content of the plurality of blocks, the query criteria; and
identifying a second subset of the plurality of documents based on the expanded query criteria.
Dependent claims: 61-64.
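The query-expansion step of claim 60 can be sketched in Python as follows. The claim says only that the query criteria are expanded based on the content of the blocks, so the specifics below are assumptions: blocks are scored by how many query-term occurrences they contain, the top-scoring blocks are kept, and the most frequent non-query, non-stopword terms in those blocks become expansion terms. `expand_query`, `STOPWORDS`, `top_blocks`, and `num_terms` are hypothetical names.

```python
from collections import Counter

# A deliberately tiny stopword list, for illustration only.
STOPWORDS = frozenset({"the", "a", "an", "of", "to", "and", "in", "for", "is"})


def expand_query(query, blocks, top_blocks=2, num_terms=2):
    """Return the query terms followed by expansion terms drawn from the best blocks.

    query:  the original query string.
    blocks: list of block texts obtained by visually segmenting the first
            subset of documents.
    """
    terms = query.lower().split()
    term_set = set(terms)

    def score(text):
        # Assumed relevance measure: number of query-term occurrences in the block.
        return sum(1 for w in text.lower().split() if w in term_set)

    best = sorted(blocks, key=score, reverse=True)[:top_blocks]
    counts = Counter(
        w
        for text in best
        for w in text.lower().split()
        if w not in term_set and w not in STOPWORDS
    )
    expansion = [w for w, _ in counts.most_common(num_terms)]
    return terms + expansion


blocks = [
    "jaguar speed jaguar habitat speed",
    "jaguar car dealer",
    "site map and contact",
]
print(expand_query("jaguar", blocks, top_blocks=2, num_terms=1))
# ['jaguar', 'speed']
```

Drawing expansion terms from blocks rather than whole pages keeps words from unrelated regions, such as the site-map block above, out of the expanded query. The expanded term list would then drive the second retrieval pass.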
65. One or more computer readable media having stored thereon a plurality of instructions that, when executed by one or more processors of a device, causes the one or more processors to:
receive one or more search terms;
identify a plurality of documents that satisfy the one or more search terms;
perform vision-based document segmentation on each of the plurality of documents to identify blocks of each of the plurality of documents;
generate a rank for each of the identified blocks based on how well the block matches the one or more search terms;
derive one or more expansion terms from one or more of the identified blocks; and
identify another plurality of documents that satisfy the one or more search terms and the expansion terms.
Dependent claims: 66, 67.
68. A system comprising:
a visual block extractor to extract visual blocks from a document;
a visual separator detector coupled to receive the extracted visual blocks and detect, based on the extracted visual blocks, one or more visual separators between the extracted visual blocks; and
a content structure constructor coupled to receive the extracted visual blocks and the detected visual separators, and to use the extracted visual blocks and the detected visual separators to construct a content structure for the document.
Dependent claims: 69-73.
74. A system comprising:
means for identifying a plurality of visual blocks in the document;
means for detecting one or more separators between the visual blocks of the plurality of visual blocks; and
means for constructing, based at least in part on the plurality of visual blocks and the one or more separators, a content structure for the document, wherein the content structure identifies the different visual blocks as different portions of semantic content of the document.
Dependent claim: 75.
Specification