Methods and system for automatically obtaining information from a resume to update an online profile
First Claim
1. A method comprising:
- using at least one computer hardware processor to perform:
accessing an electronic version of a resume of a person;
automatically parsing the resume at least in part by:
identifying, based at least in part on formatting of the resume, a plurality of sections in the resume including a first section, wherein identifying the plurality of sections comprises identifying a plurality of section headings at least in part by:
identifying a plurality of section heading candidates at least in part by:
identifying a first phrase in the resume as a first section heading candidate based, at least in part, on content of the first phrase, and
identifying a second phrase in the resume as a second section heading candidate when at least a threshold number of formatting characteristics of the second phrase match those of the first phrase; and
selecting, based on one or more attributes of the plurality of section heading candidates, the plurality of section headings from the plurality of section heading candidates;
identifying, based at least in part on content in the first section and formatting of the content, a plurality of subsections of the first section including a first subsection and a second subsection, the identifying comprising:
generating one or more first formatting features and one or more first content features for each line of text of multiple lines of text in the first section,
clustering the multiple lines of text in the first section based on the one or more first formatting features and the one or more first content features to obtain a first plurality of clusters including a first cluster, the first cluster comprising at least one line of text from the first subsection of the plurality of subsections of the first section and at least one line of text from the second subsection of the plurality of subsections of the first section,
identifying, from the first plurality of clusters, a cluster containing beginning lines of text of multiple subsections of the first section, and
identifying the plurality of subsections of the first section based, at least in part, on the identified cluster containing the beginning lines of text of the multiple subsections of the first section; and
processing text in the plurality of subsections to identify a plurality of credentials and associated attributes; and
populating an online profile for the person to reflect the plurality of credentials and the associated attributes, wherein the online profile for the person is one of a plurality of online profiles associated with a plurality of users of an online service.
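The section heading steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Line` record, the `HEADING_KEYWORDS` vocabulary, the four formatting characteristics compared, the default threshold of three, and the word-count selection rule are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical seed vocabulary; the claim only requires that the first
# candidate be identified "based, at least in part, on content".
HEADING_KEYWORDS = {"education", "experience", "work experience", "skills"}


@dataclass(frozen=True)
class Line:
    text: str
    bold: bool = False
    font_size: float = 11.0
    indent: int = 0


def formatting(line):
    """Formatting characteristics compared between phrases (an assumed set)."""
    return (line.bold, line.text.isupper(), line.font_size, line.indent)


def heading_candidates(lines, threshold=3):
    """Return the indices of section heading candidates."""
    # Content-based candidates: the phrase itself reads like a known heading.
    seeds = [i for i, ln in enumerate(lines)
             if ln.text.strip().rstrip(":").lower() in HEADING_KEYWORDS]
    candidates = set(seeds)
    # Formatting-based candidates: enough characteristics match some seed.
    for i, ln in enumerate(lines):
        for s in seeds:
            matches = sum(a == b for a, b in
                          zip(formatting(ln), formatting(lines[s])))
            if matches >= threshold:
                candidates.add(i)
    return sorted(candidates)


def select_headings(lines, candidates, max_words=4):
    """Select headings by one candidate attribute: word count (assumed rule)."""
    return [i for i in candidates if len(lines[i].text.split()) <= max_words]
```

In this sketch a phrase such as "PROJECTS" is absent from the keyword list but still becomes a candidate when its bold, capitalization, font size, and indent match those of a content-identified heading such as "EDUCATION"; the selection step then discards candidates that are too long to be plausible headings.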
4 Assignments
0 Petitions
Abstract
Techniques involving accessing a resume of a person; automatically parsing the resume at least in part by: identifying, based at least in part on formatting of the resume, a plurality of sections in the resume including a first section; identifying, based at least in part on content in the first section and formatting of the content, a plurality of subsections of the first section; and processing text in the plurality of subsections to identify a plurality of credentials and associated attributes; and updating a profile for the person to reflect the plurality of credentials and the associated attributes.
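The subsection step (features per line, clustering, then locating the cluster of subsection-beginning lines) can be sketched as follows. This is a simplified illustration under stated assumptions: lines are `(text, bold, indent)` tuples, "clustering" is reduced to grouping lines with identical feature vectors, and the cluster that contains the section's first line is assumed to hold the beginning lines of the subsections. The claims do not specify any of these choices.

```python
import re
from collections import defaultdict

# Content feature (assumed): does the line mention a four-digit year?
YEAR = re.compile(r"\b(19|20)\d{2}\b")


def features(text, bold, indent):
    """Combine formatting features (bold, indent) with content features."""
    has_year = bool(YEAR.search(text))
    is_bullet = text.lstrip().startswith(("-", "*", "•"))
    return (bold, indent, has_year, is_bullet)


def split_subsections(lines):
    """Split a section, given as (text, bold, indent) tuples, into subsections."""
    if not lines:
        return []
    # Stand-in for clustering: lines with identical feature vectors share a cluster.
    clusters = defaultdict(list)
    for i, (text, bold, indent) in enumerate(lines):
        clusters[features(text, bold, indent)].append(i)
    # Assumed heuristic: the cluster holding line 0 marks subsection beginnings.
    starts = next(idx for idx in clusters.values() if 0 in idx)
    bounds = starts + [len(lines)]
    return [[lines[j][0] for j in range(a, b)]
            for a, b in zip(bounds, bounds[1:])]
```

For a work-history section whose entries each open with a bold, unindented line containing a date range, the opening lines of different entries land in one cluster, so that cluster contains lines from several subsections and its members give the split points.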
78 Citations
23 Claims
- Dependent claims of claim 1: 2, 3, 4, 5, 6, 18, 21
7. At least one non-transitory computer-readable storage medium storing processor executable instructions that, when executed using at least one computer hardware processor, cause the at least one computer hardware processor to perform a method comprising:
accessing an electronic version of a resume of a person;
automatically parsing the resume at least in part by:
identifying, based at least in part on formatting of the resume, a plurality of sections in the resume including a first section, wherein identifying the plurality of sections comprises identifying a plurality of section headings at least in part by:
identifying a plurality of section heading candidates at least in part by:
identifying a first phrase in the resume as a first section heading candidate based, at least in part, on content of the first phrase, and
identifying a second phrase in the resume as a second section heading candidate when at least a threshold number of formatting characteristics of the second phrase match those of the first phrase; and
selecting, based on one or more attributes of the plurality of section heading candidates, the plurality of section headings from the plurality of section heading candidates;
identifying, based at least in part on content in the first section and formatting of the content, a plurality of subsections of the first section including a first subsection and a second subsection, the identifying comprising:
generating one or more first formatting features and one or more first content features for each line of text of multiple lines of text in the first section,
clustering the multiple lines of text in the first section based on the one or more first formatting features and the one or more first content features to obtain a first plurality of clusters including a first cluster, the first cluster comprising at least one line of text from the first subsection of the plurality of subsections of the first section and at least one line of text from the second subsection of the plurality of subsections of the first section,
identifying, from the first plurality of clusters, a cluster containing beginning lines of text of multiple subsections of the first section, and
identifying the plurality of subsections of the first section based, at least in part, on the identified cluster containing the beginning lines of text of the multiple subsections of the first section; and
processing text in the plurality of subsections to identify a plurality of credentials and associated attributes; and
populating an online profile for the person to reflect the plurality of credentials and the associated attributes, wherein the online profile for the person is one of a plurality of online profiles associated with a plurality of users of an online service.
- Dependent claims of claim 7: 8, 9, 10, 11, 12, 19, 22
13. A system comprising:
at least one computer hardware processor configured to perform:
accessing an electronic version of a resume of a person;
automatically parsing the resume at least in part by:
identifying, based at least in part on formatting of the resume, a plurality of sections in the resume including a first section, wherein identifying the plurality of sections comprises identifying a plurality of section headings at least in part by:
identifying a plurality of section heading candidates at least in part by:
identifying a first phrase in the resume as a first section heading candidate based, at least in part, on content of the first phrase, and
identifying a second phrase in the resume as a second section heading candidate when at least a threshold number of formatting characteristics of the second phrase match those of the first phrase; and
selecting, based on one or more attributes of the plurality of section heading candidates, the plurality of section headings from the plurality of section heading candidates;
identifying, based at least in part on content in the first section and formatting of the content, a plurality of subsections of the first section including a first subsection and a second subsection, the identifying comprising:
generating one or more first formatting features and one or more first content features for each line of text of multiple lines of text in the first section,
clustering the multiple lines of text in the first section based on the one or more first formatting features and the one or more first content features to obtain a first plurality of clusters including a first cluster, the first cluster comprising at least one line of text from the first subsection of the plurality of subsections of the first section and at least one line of text from the second subsection of the plurality of subsections of the first section,
identifying, from the first plurality of clusters, a cluster containing beginning lines of text of multiple subsections of the first section, and
identifying the plurality of subsections of the first section based, at least in part, on the identified cluster containing the beginning lines of text of the multiple subsections of the first section; and
processing text in the plurality of subsections to identify a plurality of credentials and associated attributes; and
populating an online profile for the person to reflect the plurality of credentials and the associated attributes, wherein the online profile for the person is one of a plurality of online profiles associated with a plurality of users of an online service.
- Dependent claims of claim 13: 14, 15, 16, 17, 20, 23
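The last two steps recited in each independent claim, identifying credentials with their attributes and populating one profile among many, might look roughly like the sketch below. The "Organization, Title, YYYY-YYYY" line layout, the `HEADER` pattern, the dictionary-based profile store, and the function names are hypothetical; the claims do not prescribe any particular extraction rule or storage format.

```python
import re

# Assumed layout of a subsection's first line: "Organization, Title, YYYY-YYYY".
HEADER = re.compile(
    r"^(?P<org>[^,]+),\s*(?P<title>[^,]+),\s*"
    r"(?P<start>\d{4})\s*-\s*(?P<end>\d{4})$"
)


def extract_credential(subsection):
    """Return a credential dict with its attributes, or None if unrecognized."""
    if not subsection:
        return None
    match = HEADER.match(subsection[0].strip())
    if match is None:
        return None
    credential = match.groupdict()
    # Remaining lines become free-text details with bullet markers stripped.
    credential["details"] = [ln.lstrip("-*• ").strip() for ln in subsection[1:]]
    return credential


def populate_profile(profiles, user_id, subsections):
    """Add extracted credentials to one user's profile in a shared store."""
    profile = profiles.setdefault(user_id, {"credentials": []})
    for subsection in subsections:
        credential = extract_credential(subsection)
        # Skip unparsed subsections and credentials already on the profile.
        if credential is not None and credential not in profile["credentials"]:
            profile["credentials"].append(credential)
    return profile
```

Here `profiles` stands in for the plurality of online profiles kept by the online service; only the entry for `user_id` is modified, and subsections that do not match the assumed layout are left out rather than guessed at.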
Specification