Identification of content by metadata

US 9,787,757 B2
Filed: 11/22/2016
Issued: 10/10/2017
Est. Priority Date: 12/31/2008
Status: Active Grant

First Claim

1. A method for filtering messages, the method comprising:

receiving a first electronic message via a network interface, the first electronic message including a first document file;

extracting a first metadata dataset characterizing the first document file;

retrieving a second metadata dataset from a database, the second metadata dataset characterizing a second document file included in a second electronic message;

identifying that the first metadata dataset matches the second metadata dataset within a previously specified margin of error, wherein the previously specified margin of error represents a previously specified range of variations between the first metadata dataset and the second metadata dataset; and

classifying the first electronic message as spam in response to the identification that the first metadata dataset matches the second metadata dataset within the previously specified margin of error.

16 Assignments

0 Petitions

Abstract

Systems and methods for identifying content in electronic messages are provided. An electronic message may include certain content. The content is detected and analyzed to identify any metadata. The metadata may include a numerical signature characterizing the content. A thumbprint is generated based on the numerical signature. The thumbprint may then be compared to thumbprints of previously received messages. The comparison allows for classification of the electronic message as spam or not spam.
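
The flow the abstract describes (numerical signature, then thumbprint, then comparison against earlier messages) can be illustrated with a minimal sketch. The abstract does not say how the signature or thumbprint is computed, so the choices here are assumptions made only for illustration: a CRC-32 checksum stands in for the numerical signature, a SHA-256 digest of that value stands in for the thumbprint, and the function names are hypothetical.

```python
import hashlib
import zlib


def numerical_signature(content: bytes) -> int:
    """Assumed signature: CRC-32 of the raw content bytes."""
    return zlib.crc32(content)


def thumbprint(signature: int) -> str:
    """Assumed thumbprint: SHA-256 hex digest of the signature's decimal text."""
    return hashlib.sha256(str(signature).encode("ascii")).hexdigest()


def classify(content: bytes, spam_thumbprints: set) -> str:
    """Label content as spam if its thumbprint was already seen in spam."""
    seen = thumbprint(numerical_signature(content)) in spam_thumbprints
    return "spam" if seen else "not spam"


# Thumbprints of content from previously received spam (illustrative data).
known = {thumbprint(numerical_signature(b"buy now"))}
label_repeat = classify(b"buy now", known)
label_fresh = classify(b"meeting notes", known)
```

This sketch only handles exact repeats of the same content; the claims additionally allow a match within a margin of error.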

51 Citations

20 Claims

1. A method for filtering messages, the method comprising:
- receiving a first electronic message via a network interface, the first electronic message including a first document file;
  
  extracting a first metadata dataset characterizing the first document file;
  
  retrieving a second metadata dataset from a database, the second metadata dataset characterizing a second document file included in a second electronic message;
  
  identifying that the first metadata dataset matches the second metadata dataset within a previously specified margin of error, wherein the previously specified margin of error represents a previously specified range of variations between the first metadata dataset and the second metadata dataset; and
  
  classifying the first electronic message as spam in response to the identification that the first metadata dataset matches the second metadata dataset within the previously specified margin of error.
- Dependent claims:
  - 2. The method of claim 1, wherein the first document file is a portable document format (PDF) file.
  - 3. The method of claim 1, wherein the first document file is a text file.
  - 4. The method of claim 1, wherein the first document file is a spreadsheet file.
  - 5. The method of claim 1, wherein the first document file is a rich text file.
  - 6. The method of claim 1, wherein the first document file includes at least one of video or audio.
  - 7. The method of claim 1, wherein the first metadata dataset identifies an author of the first document file.
  - 8. The method of claim 1, wherein the first metadata dataset identifies a date that the first document file was created.
  - 9. The method of claim 1, wherein the first metadata dataset identifies a date that the first document file was modified.
  - 10. The method of claim 1, wherein the first metadata dataset identifies a size of the first document file.
  - 11. The method of claim 1, wherein the first metadata dataset identifies a dimension of the first document file.
  - 12. The method of claim 1, wherein the first metadata dataset identifies one or more colors within the first document file.
  - 13. The method of claim 1, wherein the first metadata dataset identifies at least a subset of the content of the first document file.
  - 14. The method of claim 1, wherein the first metadata dataset includes a numerical value characterizing the content of the first document file.
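
The matching step of claim 1 can be sketched as follows, using the author, creation date, and size fields named in claims 7, 8, and 10. The claims do not define how the margin of error is expressed, so the `MarginOfError` fields, the rule that authors must match exactly, and all class and function names are assumptions for illustration, not the patented implementation. The sketch also assumes the database holds metadata taken from messages already treated as spam.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable


@dataclass(frozen=True)
class MetadataDataset:
    """Hypothetical metadata record for one attached document file."""
    author: str
    created: datetime
    size_bytes: int


@dataclass(frozen=True)
class MarginOfError:
    """Assumed form of the previously specified range of variations."""
    max_size_delta: int
    max_created_delta: timedelta


def matches(first: MetadataDataset, second: MetadataDataset,
            margin: MarginOfError) -> bool:
    """True if the two datasets agree within the configured margin."""
    return (
        first.author == second.author
        and abs(first.size_bytes - second.size_bytes) <= margin.max_size_delta
        and abs(first.created - second.created) <= margin.max_created_delta
    )


def classify_message(first: MetadataDataset,
                     database: Iterable[MetadataDataset],
                     margin: MarginOfError) -> str:
    """Classify as spam if any stored dataset matches within the margin."""
    hit = any(matches(first, stored, margin) for stored in database)
    return "spam" if hit else "not spam"


margin = MarginOfError(max_size_delta=512, max_created_delta=timedelta(minutes=5))
stored = [MetadataDataset("A. Sender", datetime(2008, 12, 1, 12, 0), 10_000)]
near = MetadataDataset("A. Sender", datetime(2008, 12, 1, 12, 2), 10_200)
far = MetadataDataset("A. Sender", datetime(2008, 12, 1, 12, 2), 20_000)
near_label = classify_message(near, stored, margin)
far_label = classify_message(far, stored, margin)
```

Here `near` differs from the stored record by 200 bytes and 2 minutes, which falls inside the assumed margin, while `far` differs by 10,000 bytes and does not.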

15. A system for filtering messages, the system comprising:
- a communication transceiver that receives a first electronic message over a communication network, the first electronic message including a first document file;
  
  a memory that stores instructions; and
  
  a processor, wherein execution of the instructions by the processor causes the system to:
  
  extract a first metadata dataset characterizing the first document file,
  
  retrieve a second metadata dataset from a database, the second metadata dataset characterizing a second document file included in a second electronic message,
  
  identify that the first metadata dataset matches the second metadata dataset within a previously specified margin of error, wherein the previously specified margin of error represents a previously specified range of variations between the first metadata dataset and the second metadata dataset, and
  
  classify the first electronic message as spam in response to the identification that the first metadata dataset matches the second metadata dataset within the previously specified margin of error.
- Dependent claims:
  - 16. The system of claim 15, wherein the system is a network device, wherein the network device is communicatively coupled to a client device, and wherein the network device implements a firewall protecting the client device.
  - 17. The system of claim 15, wherein the system is a server implementing a service used by a client device.
  - 18. The system of claim 15, wherein the first document file is selected from a group including a portable document format (PDF) file, a text file, a spreadsheet file, a rich text file, a video file, and an audio file.

19. A method for filtering audiovisual content, the method comprising:
- receiving a first electronic message via a network interface, the first electronic message including a first video file;
  
  extracting a first metadata dataset characterizing the first video file;
  
  retrieving a second metadata dataset from a database, the second metadata dataset characterizing a second video file included in a second electronic message;
  
  identifying that the first metadata dataset matches the second metadata dataset within a previously specified margin of error, wherein the previously specified margin of error represents a previously specified range of variations between the first metadata dataset and the second metadata dataset; and
  
  classifying the first electronic message as spam in response to the identification that the first metadata dataset matches the second metadata dataset within the previously specified margin of error.
- Dependent claims:
  - 20. The method of claim 19, wherein at least one of first video file or the second video file includes audio.

Current Assignee: SonicWALL US Holdings, Inc. (SonicWall Holdings Ltd.)
Original Assignee: SonicWALL, Inc. (SonicWall Holdings Ltd.)
Inventors: Yu, Sijie
Primary Examiner(s): Shayanfar, Ali
Application Number: US15/358,872
Publication Number: US 20170142050A1
Time in Patent Office: 322 Days
CPC Class Codes

G06F 16/24575   using context

G06F 16/285   Clustering or classification

G06F 16/35   Clustering; Classification

G06F 16/5838   using colour

G06F 16/5846   using extracted text

G06F 16/93   Document management systems

G06F 16/9535   Search customisation based ...

G06F 21/554   involving event detection a...

G06F 21/606   by securing the transmissio...

G06F 2221/2119   Authenticating web pages, e...

G06Q 10/107   Computer-aided management o...

H04L 51/212   using filtering or selectiv...

H04L 67/06   specially adapted for file ...

H04L 9/3247   involving digital signatures
