Systems and methods to develop training set of data based on resume corpus
Abstract
Systems, methods, and non-transitory computer readable media are configured to acquire a resume corpus. The resume corpus is processed to generate resume tokens. A machine learning model is trained based on the resume tokens. The machine learning model is applied to recommend a job classification based on evaluation data.
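
The train-then-apply flow summarized in the abstract can be illustrated with a small sketch. The abstract does not name a model type, so the multinomial naive Bayes classifier below is an assumption chosen only because it fits in a few lines. The function names `train` and `recommend`, the job classes, and the sample tokens (including joined bigram tokens such as `machine_learning`) are likewise hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(labeled_resumes):
    """Count tokens per job class from (tokens, job_class) pairs."""
    class_counts = Counter()             # number of resumes per class
    token_counts = defaultdict(Counter)  # token frequencies per class
    vocab = set()
    for tokens, label in labeled_resumes:
        class_counts[label] += 1
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, token_counts, vocab


def recommend(model, tokens):
    """Return the job class with the highest naive Bayes log score."""
    class_counts, token_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_tokens = sum(token_counts[label].values())
        score = math.log(n_docs / total_docs)  # class prior
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore tokens never seen in training
            # Laplace (add-one) smoothing avoids log(0) for unseen pairs.
            count = token_counts[label][tok]
            score += math.log((count + 1) / (total_tokens + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data: resume tokens paired with a job class.
train_data = [
    (["software", "engineer", "machine_learning"], "engineering"),
    (["python", "machine_learning", "researcher"], "engineering"),
    (["registered", "nurse", "patient_care"], "healthcare"),
    (["nurse", "manager", "patient_care"], "healthcare"),
]
model = train(train_data)

# Evaluation data: tokens from a resume that was not used for training.
print(recommend(model, ["machine_learning", "intern"]))
print(recommend(model, ["patient_care", "nurse"]))
```

Any classifier that accepts token features could stand in for the one shown here; the sketch only makes the sequence of acquiring tokens, training, and applying the model concrete.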
20 Claims
1. A computer-implemented method comprising:
acquiring, by a computing system, a resume corpus;
processing, by the computing system, the resume corpus to generate resume tokens from the resume corpus, wherein the processing comprises:
  determining a ratio based on co-occurrence of a first word and a second word of the resume corpus versus individual occurrence of the first word and the second word; and
  determining, based on the ratio, the existence of a bigram including the first word and the second word to be used as training data;
training, by the computing system, a machine learning model to recommend a job classification based at least in part on the bigram; and
applying, by the computing system, the machine learning model to recommend a job classification based on evaluation data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10

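The ratio-based bigram step recited in claim 1 can be sketched as follows. The claim does not fix a formula, so the one used here (adjacent co-occurrence count divided by the product of the two individual word counts) is an assumption. The function name `find_bigrams`, the `threshold` and `min_count` parameters, and the sample resumes are also hypothetical.

```python
from collections import Counter


def find_bigrams(resumes, threshold=0.25, min_count=2):
    """Return {(first, second): ratio} for word pairs that pass the ratio test."""
    unigrams = Counter()  # individual occurrence of each word
    pairs = Counter()     # co-occurrence of adjacent word pairs
    for text in resumes:
        words = text.lower().split()
        unigrams.update(words)
        pairs.update(zip(words, words[1:]))

    bigrams = {}
    for (first, second), together in pairs.items():
        if together < min_count:
            continue  # assumed guard: skip pairs seen too rarely to trust
        # Assumed ratio: co-occurrence versus individual occurrence.
        ratio = together / (unigrams[first] * unigrams[second])
        if ratio >= threshold:
            bigrams[(first, second)] = ratio
    return bigrams


# Hypothetical resume corpus.
resumes = [
    "senior software engineer with machine learning experience",
    "machine learning researcher and software engineer",
    "registered nurse with patient care experience",
    "nurse manager with experience in patient care",
]

for pair, ratio in sorted(find_bigrams(resumes).items()):
    print(pair, ratio)
```

In this toy corpus, "machine learning", "patient care", and "software engineer" each appear together every time their component words appear, so they pass the test, while a pair such as "with experience" does not. Pairs that pass could then be emitted as single tokens for training.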
11. A system comprising:
at least one processor; and
a memory storing instructions that, when executed by the at least one processor, cause the system to perform:
acquiring a resume corpus;
processing the resume corpus to generate resume tokens from the resume corpus, wherein the processing comprises:
  determining a ratio based on co-occurrence of a first word and a second word of the resume corpus versus individual occurrence of the first word and the second word; and
  determining, based on the ratio, the existence of a bigram including the first word and the second word to be used as training data;
training a machine learning model to recommend a job classification based at least in part on the bigram; and
applying the machine learning model to recommend a job classification based on evaluation data.
Dependent claims: 12, 13, 14, 15

16. A non-transitory computer-readable storage medium including instructions that, when executed by at least one processor of a computing system, cause the computing system to perform a method comprising:
acquiring a resume corpus;
processing the resume corpus to generate resume tokens from the resume corpus, wherein the processing comprises:
  determining a ratio based on co-occurrence of a first word and a second word of the resume corpus versus individual occurrence of the first word and the second word; and
  determining, based on the ratio, the existence of a bigram including the first word and the second word to be used as training data;
training a machine learning model to recommend a job classification based at least in part on the bigram; and
applying the machine learning model to recommend a job classification based on evaluation data.
Dependent claims: 17, 18, 19, 20

Specification