Systems and methods for tagging emails by discussions
First Claim
1. A method comprising:
- combining, by a processor, a plurality of electronically stored documents into a plurality of groups, wherein each group comprises a plurality of related documents, and wherein for each of the plurality of electronically stored documents the combining comprises;
comparing a feature vector of the electronically stored document to a set of potential categories, wherein the feature vector comprises one or more feature vector words of the electronically stored document,
determining a subset of the potential categories based on a relevance score between the electronically stored document and a first category within the subset of the potential categories exceeds a first threshold, and
assigning the electronically stored document to a second category within the subset of the potential categories based on a similarity distance between the feature vector of the electronically stored document and a feature vector of the second category exceeds a second threshold;
receiving, by the processor, review content provided by a user for a group of the plurality of groups;
associating the review content with the group; and
propagating the review content based on propagation information indicating documents in the group to which to propagate the review content based on a review of at least one of the related documents, wherein propagating the review content based on the propagation information comprises at least one of propagating the review content to each of the plurality of related documents or propagating the review content to a subset of the plurality of related documents, wherein the review content comprises redaction information to redact a portion of each of the plurality of related documents or to redact a portion of each document of the subset of the plurality of related documents, and wherein the redaction information specifies one or more locations corresponding to line numbers of each of the plurality of related documents or each document of the subset of the plurality of related documents that are not displayed in response to a query returning the documents.
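The two-threshold categorization recited in the combining step can be illustrated with a minimal Python sketch. It is not the patented implementation: the bag-of-words feature vector, the word-overlap relevance score, the use of cosine similarity as the "similarity distance", and the names `feature_vector`, `relevance`, `cosine`, and `assign` are all assumptions made for illustration.

```python
import math
from collections import Counter


def feature_vector(text):
    """Assumed representation: bag of words mapping each word to its count."""
    return Counter(text.lower().split())


def relevance(doc_vec, cat_vec):
    """Assumed relevance score: share of the document's distinct words
    that also occur in the category's feature vector."""
    if not doc_vec:
        return 0.0
    return len(set(doc_vec) & set(cat_vec)) / len(doc_vec)


def cosine(a, b):
    """Cosine similarity, standing in for the claim's similarity distance."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign(doc_vec, categories, first_threshold, second_threshold):
    """Return the name of the assigned category, or None if nothing qualifies."""
    # Stage 1: keep only categories whose relevance exceeds the first threshold.
    subset = {
        name: vec
        for name, vec in categories.items()
        if relevance(doc_vec, vec) > first_threshold
    }
    # Stage 2: within that subset, pick the most similar category, provided
    # its similarity exceeds the second threshold.
    best_name, best_sim = None, second_threshold
    for name, vec in subset.items():
        sim = cosine(doc_vec, vec)
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name
```

Under these assumptions the first stage acts as a cheap filter, so the vector comparison in the second stage runs only on categories that already share vocabulary with the document.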
Abstract
The invention provides for techniques to process and produce email documents. The techniques provide for organizing a first plurality of email documents into a plurality of document groups, reviewing a document group from the plurality of document groups, and associating a review content with the document group. The techniques provide for ways to propagate the review content to one or more email documents associated with the document group and producing a second plurality of email documents. The techniques provide for annotating one or more email documents in accordance with the review content. Depending on the embodiment, review content may include text, graphics, audio, tag, and multimedia information. Produced documents can be searched and browsed in accordance with information in the review content. Email documents can be grouped by information in meta information and/or header information associated with the email documents into various groups, including threads or conversations, for example.
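As a rough illustration of grouping email documents by header information and propagating review content across a group, the following Python sketch uses the standard-library `email` parser. The threading rule (first `References` entry, else `In-Reply-To`, else the message's own `Message-ID`) and the names `thread_key`, `group_by_thread`, and `propagate` are assumptions for illustration, not details taken from the patent.

```python
from collections import defaultdict
from email import message_from_string


def thread_key(msg):
    """Assumed rule: a thread is identified by the first References entry,
    falling back to In-Reply-To, then to the message's own Message-ID."""
    refs = msg.get("References", "").split()
    if refs:
        return refs[0]
    return (msg.get("In-Reply-To") or msg.get("Message-ID") or "").strip()


def group_by_thread(raw_emails):
    """Map each thread key to the Message-IDs of the emails in that thread."""
    groups = defaultdict(list)
    for raw in raw_emails:
        msg = message_from_string(raw)
        groups[thread_key(msg)].append(msg["Message-ID"])
    return dict(groups)


def propagate(review, group, annotations, only=None):
    """Attach review content to every email in the group, or only to the
    subset named in `only` (standing in for the propagation information)."""
    targets = group if only is None else [m for m in group if m in only]
    for message_id in targets:
        annotations.setdefault(message_id, []).append(review)
    return annotations
```

A reviewer who tags one thread as, say, "privileged" would call `propagate("privileged", groups[key], annotations)` once instead of annotating each message separately; passing `only` restricts the tag to selected messages.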
90 Citations
19 Claims
1. A method comprising: (independent claim; full text appears under First Claim above)
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A system comprising:
a processor; and
a memory to store processor-executable instructions that cause the processor to;
combine a plurality of electronically stored documents into a plurality of groups, wherein each group comprises a plurality of related documents, and wherein for each of the plurality of electronically stored documents the combining comprises;
comparing a feature vector of the electronically stored document to a set of potential categories, wherein the feature vector comprises one or more feature vector words of the electronically stored document,
determining a subset of the potential categories based on a relevance score between the electronically stored document and a first category within the subset of the potential categories exceeds a first threshold, and
assigning the electronically stored document to a second category within the subset of the potential categories based on a similarity distance between the feature vector of the electronically stored document and a feature vector of the second category exceeds a second threshold;
receive review content provided by a user for a group of the plurality of groups;
associate the review content with the group; and
propagate the review content based on propagation information indicating documents in the group to which to propagate the review content based on a review of at least one of the related documents, wherein propagating the review content based on the propagation information comprises at least one of propagating the review content to each of the plurality of related documents or propagating the review content to a subset of the plurality of related documents, wherein the review content comprises redaction information to redact a portion of each of the plurality of related documents or to redact a portion of each document of the subset of the plurality of related documents, and wherein the redaction information specifies one or more locations corresponding to line numbers of each of the plurality of related documents or each document of the subset of the plurality of related documents that are not displayed in response to a query returning the documents.
Dependent claims: 11, 12, 13
14. A non-transitory computer readable storage medium including instructions that, when executed by a processor, cause the processor to perform operations comprising:
combining a plurality of electronically stored documents into a plurality of groups, wherein each group comprises a plurality of related documents, and wherein for each of the plurality of electronically stored documents the combining comprises;
comparing a feature vector of the electronically stored document to a set of potential categories, wherein the feature vector comprises one or more feature vector words of the electronically stored document,
determining a subset of the potential categories based on a relevance score between the electronically stored document and a first category within the subset of the potential categories exceeds a first threshold, and
assigning the electronically stored document to a second category within the subset of the potential categories based on a similarity distance between the feature vector of the electronically stored document and a feature vector of the second category exceeds a second threshold;
receiving review content provided by a user for a group of the plurality of groups;
associating the review content with the group; and
propagating the review content based on propagation information indicating documents in the group to which to propagate the review content based on a review of at least one of the related documents, wherein propagating the review content based on the propagation information comprises at least one of propagating the review content to each of the plurality of related documents or propagating the review content to a subset of the plurality of related documents, wherein the review content comprises redaction information to redact a portion of each of the plurality of related documents or to redact a portion of each document of the subset of the plurality of related documents, and wherein the redaction information specifies one or more locations corresponding to line numbers of each of the plurality of related documents or each document of the subset of the plurality of related documents that are not displayed in response to a query returning the documents.
Dependent claims: 15, 16, 17, 18, 19
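The redaction limitation shared by the independent claims, in which line numbers named in the redaction information are not displayed when a query returns a document, might look roughly like the following Python sketch. The 1-indexed line numbering, the `[REDACTED]` mask, and the names `redact` and `search` are hypothetical. Matching the query against the redacted text rather than the original is a further assumption, chosen so that hidden content cannot be discovered through search terms.

```python
def redact(text, hidden_lines, mask="[REDACTED]"):
    """Replace each 1-indexed line listed in hidden_lines with a mask."""
    return "\n".join(
        mask if number in hidden_lines else line
        for number, line in enumerate(text.splitlines(), start=1)
    )


def search(documents, redactions, term):
    """Return matching documents with their redacted lines withheld.

    documents:  doc id -> full text
    redactions: doc id -> set of line numbers to hide (assumed format)
    """
    results = {}
    for doc_id, text in documents.items():
        visible = redact(text, redactions.get(doc_id, set()))
        # Assumption: match against the visible text only.
        if term.lower() in visible.lower():
            results[doc_id] = visible
    return results
```

Because the redaction information is keyed by document, propagating it to a group amounts to copying the same set of line numbers to each related document's entry in `redactions`.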
Specification