METHODS AND APPARATUS TO IMPROVE FEATURE ENGINEERING EFFICIENCY WITH METADATA UNIT OPERATIONS
First Claim
1. A computer-implemented method to apply feature engineering with metadata-driven unit operations, comprising:
retrieving a log file in a first file format, the log file containing feature occurrence data;
generating a first unit operation based on the first file format to extract the feature occurrence data from the log file to a string, the first unit operation associated with a first metadata tag;
generating second unit operations to identify respective features from the feature occurrence data, the second unit operations associated with respective second metadata tags; and
generating a first sequence of the first metadata tag and the second metadata tags to create a first vector output file of the feature occurrence data.
Abstract
Methods, apparatus, systems and articles of manufacture are disclosed to improve feature engineering efficiency. An example method disclosed herein includes retrieving a log file in a first file format, the log file containing feature occurrence data, generating a first unit operation based on the first file format to extract the feature occurrence data from the log file to a string, the first unit operation associated with a first metadata tag, generating second unit operations to identify respective features from the feature occurrence data, the second unit operations associated with respective second metadata tags, and generating a first sequence of the first metadata tag and the second metadata tags to create a first vector output file of the feature occurrence data.
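The flow summarized in the abstract can be illustrated with a minimal sketch. This is not the implementation disclosed in the patent; it is one possible reading under stated assumptions. The registry, the tag naming scheme (`file_to_string:csv`, `count:open`), the supported formats, and every function name below are hypothetical. A unit operation is modeled as a plain function registered under a metadata tag, and the "sequence" is an ordered list of tags that is resolved and executed to produce a feature vector. Writing that vector to an output file is left out for brevity.

```python
import re
from typing import Callable, Dict, List

# Hypothetical registry: metadata tag -> unit operation (a plain callable).
REGISTRY: Dict[str, Callable] = {}


def register(tag: str, op: Callable) -> str:
    """Store a unit operation under its metadata tag and return the tag."""
    REGISTRY[tag] = op
    return tag


def build_file_to_string_op(file_format: str) -> str:
    """First unit operation: selected by file format, turns raw log content into a string."""
    if file_format == "csv":
        op = lambda raw: raw.replace(",", " ")  # assumed: commas only separate tokens
    elif file_format == "txt":
        op = lambda raw: raw
    else:
        raise ValueError(f"unsupported format: {file_format}")
    return register(f"file_to_string:{file_format}", op)


def build_extraction_ops(features: List[str]) -> List[str]:
    """Second unit operations: one occurrence counter per feature, each with its own tag."""
    tags = []
    for name in features:
        pattern = re.compile(rf"\b{re.escape(name)}\b")
        # Bind the pattern as a default argument so each lambda keeps its own pattern.
        op = lambda text, p=pattern: len(p.findall(text))
        tags.append(register(f"count:{name}", op))
    return tags


def run_sequence(tags: List[str], raw_log: str) -> List[int]:
    """Resolve the tag sequence: the first tag yields the string, the rest yield vector entries."""
    first, *rest = tags
    text = REGISTRY[first](raw_log)
    return [REGISTRY[tag](text) for tag in rest]


# Example: a tiny CSV-style log and four candidate features.
sequence = [build_file_to_string_op("csv")] + build_extraction_ops(
    ["open", "read", "write", "close"]
)
vector = run_sequence(sequence, "open,read,open\nwrite,open,read")
print(sequence)
print(vector)
```

Because the sequence is only a list of tags, supporting a log in a different file format would, in this sketch, mean swapping the first tag while reusing the extraction tags unchanged.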
33 Claims
1. A computer-implemented method to apply feature engineering with metadata-driven unit operations, comprising:
retrieving a log file in a first file format, the log file containing feature occurrence data;
generating a first unit operation based on the first file format to extract the feature occurrence data from the log file to a string, the first unit operation associated with a first metadata tag;
generating second unit operations to identify respective features from the feature occurrence data, the second unit operations associated with respective second metadata tags; and
generating a first sequence of the first metadata tag and the second metadata tags to create a first vector output file of the feature occurrence data.
Dependent claims: 2, 3, 4, 5, 6, 7
8. (canceled)
9. (canceled)
10. (canceled)
11. (canceled)
12. An apparatus to apply feature engineering with metadata-driven unit operations, comprising:
a log file retriever to retrieve a log file in a first file format, the log file containing feature occurrence data;
a file to string operation builder to generate a first unit operation based on the first file format to extract the feature occurrence data from the log file to a string, the first unit operation associated with a first metadata tag;
an extraction operation builder to generate second unit operations to identify respective features from the feature occurrence data, the second unit operations associated with respective second metadata tags; and
an operation flow builder to generate a first sequence of the first metadata tag and the second metadata tags to create a first vector output file of the feature occurrence data.
Dependent claims: 13, 14, 19, 20, 21, 22
15. (canceled)
16. (canceled)
17. (canceled)
18. (canceled)
23. A tangible computer readable storage medium comprising computer readable instructions which, when executed, cause a processor to at least:
retrieve a log file in a first file format, the log file containing feature occurrence data;
generate a first unit operation based on the first file format to extract the feature occurrence data from the log file to a string, the first unit operation associated with a first metadata tag;
generate second unit operations to identify respective features from the feature occurrence data, the second unit operations associated with respective second metadata tags; and
generate a first sequence of the first metadata tag and the second metadata tags to create a first vector output file of the feature occurrence data.
Dependent claims: 24, 25, 26, 27, 28
29. (canceled)
30. (canceled)
31. (canceled)
32. (canceled)
33. (canceled)
Specification