Phrase extraction text analysis method and system
Abstract
A system and method for extracting a relevant phrase from text. The system and method may build a plurality of n-gram phrases using a seed from a seed list as a start, a middle, or an end of each n-gram phrase. The seed list may be directed to a specific vehicle system and each seed may indicate a symptom, part, or action to extract relevant phrases from vehicle information verbatims. The plurality of n-gram phrases may be filtered to obtain one or more relevant phrases. The filtering process may include calculating an external relevance factor, an internal relevance factor, or a context pattern relevance factor.
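The n-gram building step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the whitespace tokenizer, the function names `tokenize` and `seeded_ngrams`, the `max_n` parameter, and the restriction to single-token seeds are all assumptions made for the example. Because a seed may sit at the start, middle, or end of a phrase, the sketch simply keeps every token window that contains a seed anywhere.

```python
def tokenize(verbatim: str) -> list[str]:
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return verbatim.lower().split()


def seeded_ngrams(tokens: list[str], seeds: set[str], max_n: int = 3) -> list[str]:
    """Return every n-gram (2 <= n <= max_n) that contains at least one seed.

    Assumption: each seed is a single token, so "start, middle, or end"
    reduces to "the seed appears somewhere in the window".
    """
    phrases = []
    for n in range(2, max_n + 1):
        # Slide a window of length n across the token list.
        for i in range(len(tokens) - n + 1):
            window = tokens[i:i + n]
            if any(tok in seeds for tok in window):
                phrases.append(" ".join(window))
    return phrases


# Hypothetical verbatim and seed list, for illustration only.
tokens = tokenize("Brake pedal feels soft")
phrases = seeded_ngrams(tokens, seeds={"pedal"}, max_n=3)
print(phrases)
```

In this toy input the window "feels soft" is dropped because it contains no seed, while the other bigrams and both trigrams are kept as candidate phrases for filtering.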
27 Citations
13 Claims
1. A method for extracting a relevant phrase from text, comprising the steps of:
- accessing a vehicle information verbatim from a database with a server having a processor that is at least partially configured as a special purpose text analyzer;
- tokenizing the vehicle information verbatim;
- building a plurality of n-gram phrases from the vehicle information verbatim with the server, wherein the plurality of n-gram phrases include a seed from a seed list as a start, a middle, or an end of each n-gram phrase of the plurality of n-gram phrases, wherein the seed list includes a plurality of seeds, each seed being directed to a vehicle-related component or a vehicle-related functionality; and
- filtering the plurality of n-gram phrases with the server to obtain the relevant phrase or an irrelevant phrase, wherein the filtering includes calculating an external relevance factor, an internal relevance factor, and a context pattern relevance factor, and the filtering includes using a weak filtering rule set or a strong filtering rule set,
  wherein the weak filtering rule set is used to conjunctively consider the external relevance factor, the internal relevance factor, and the context pattern relevance factor so that n-gram phrases are irrelevant if an irrelevance threshold is met for each of the external relevance factor, the internal relevance factor, and the context pattern relevance factor, and
  wherein the strong filtering rule set is used to disjunctively consider the external relevance factor, the internal relevance factor, and the context pattern relevance factor so that n-gram phrases are irrelevant if an irrelevance threshold is met for one of the external relevance factor, the internal relevance factor, or the context pattern relevance factor.
Dependent claims: 2, 3, 4, 5, 6, 7
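The difference between the weak and strong filtering rule sets recited in claim 1 comes down to a conjunction versus a disjunction over three per-factor tests. The sketch below illustrates that logic only. The claim does not say how a factor "meets" its irrelevance threshold, so the comparison direction (a score below the threshold counts as meeting it), the `RelevanceFactors` type, and all function names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelevanceFactors:
    external: float
    internal: float
    context_pattern: float


def irrelevance_flags(scores: RelevanceFactors, thresholds: RelevanceFactors) -> tuple:
    """One boolean per factor: True when that factor meets its irrelevance threshold.

    Assumption: a score below its threshold meets the irrelevance threshold.
    """
    return (
        scores.external < thresholds.external,
        scores.internal < thresholds.internal,
        scores.context_pattern < thresholds.context_pattern,
    )


def is_irrelevant_weak(scores: RelevanceFactors, thresholds: RelevanceFactors) -> bool:
    """Weak rule set: irrelevant only if every factor meets its threshold."""
    return all(irrelevance_flags(scores, thresholds))


def is_irrelevant_strong(scores: RelevanceFactors, thresholds: RelevanceFactors) -> bool:
    """Strong rule set: irrelevant if any one factor meets its threshold."""
    return any(irrelevance_flags(scores, thresholds))


# Hypothetical scores and thresholds, for illustration only.
thresholds = RelevanceFactors(0.5, 0.5, 0.5)
mixed = RelevanceFactors(external=0.9, internal=0.2, context_pattern=0.7)
print(is_irrelevant_weak(mixed, thresholds))    # False: only one factor is low
print(is_irrelevant_strong(mixed, thresholds))  # True: one low factor is enough
```

Under these assumptions the weak rule set discards fewer phrases than the strong one: a phrase that fails on a single factor survives weak filtering but not strong filtering.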
8. A method for extracting a relevant phrase from text, comprising the steps of:
- accessing a verbatim from a database with a server having a processor that is at least partially configured as a special purpose text analyzer;
- tokenizing the verbatim;
- building a plurality of n-gram phrases with the server from the verbatim, wherein the plurality of n-gram phrases include a seed from a seed list as a start, a middle, or an end of each n-gram phrase of the plurality of n-gram phrases, wherein the seed list includes a plurality of seeds;
- calculating an external relevance factor for each n-gram phrase of the plurality of n-gram phrases;
- calculating an internal relevance factor for each n-gram phrase of the plurality of n-gram phrases;
- calculating a context pattern relevance factor for each n-gram phrase of the plurality of n-gram phrases; and
- using the external relevance factor, the internal relevance factor, the context pattern relevance factor, or a combination of one or more of the external relevance factor, the internal relevance factor, and the context pattern relevance factor to identify the relevant phrase or an irrelevant phrase from the plurality of n-gram phrases,
  wherein a weak filtering rule set is used to conjunctively consider the external relevance factor, the internal relevance factor, and the context pattern relevance factor so that n-gram phrases are irrelevant if an irrelevance threshold is met for each of the external relevance factor, the internal relevance factor, and the context pattern relevance factor.
Dependent claims: 9, 11
10. A method for extracting a relevant phrase from text, comprising the steps of:
- accessing a verbatim from a database with a server having a processor that is at least partially configured as a special purpose text analyzer;
- tokenizing the verbatim;
- building a plurality of n-gram phrases with the server from the verbatim, wherein the plurality of n-gram phrases include a seed from a seed list as a start, a middle, or an end of each n-gram phrase of the plurality of n-gram phrases, wherein the seed list includes a plurality of seeds;
- calculating an external relevance factor for each n-gram phrase of the plurality of n-gram phrases;
- calculating an internal relevance factor for each n-gram phrase of the plurality of n-gram phrases;
- calculating a context pattern relevance factor for each n-gram phrase of the plurality of n-gram phrases; and
- using the external relevance factor, the internal relevance factor, the context pattern relevance factor, or a combination of one or more of the external relevance factor, the internal relevance factor, and the context pattern relevance factor to identify the relevant phrase or an irrelevant phrase from the plurality of n-gram phrases,
  wherein a strong filtering rule set is used to disjunctively consider the external relevance factor, the internal relevance factor, and the context pattern relevance factor so that n-gram phrases are irrelevant if an irrelevance threshold is met for one of the external relevance factor, the internal relevance factor, or the context pattern relevance factor.
Dependent claims: 12, 13
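Claims 8 and 10 both recite calculating the three relevance factors for each n-gram phrase before any rule set is applied. The claims do not define how the factors are computed, so the sketch below treats the three scorers as pluggable callables. The names `Scores`, `Scorer`, and `score_phrases` are hypothetical, and the toy scorers in the usage lines are placeholders that stand in for whatever calculation an actual system would use.

```python
from typing import Callable, NamedTuple


class Scores(NamedTuple):
    external: float
    internal: float
    context_pattern: float


# A scorer maps one n-gram phrase to a number; its internals are left open here.
Scorer = Callable[[str], float]


def score_phrases(
    phrases: list[str],
    external: Scorer,
    internal: Scorer,
    context_pattern: Scorer,
) -> dict[str, Scores]:
    """Compute all three factors for every phrase, keyed by the phrase text."""
    return {
        phrase: Scores(external(phrase), internal(phrase), context_pattern(phrase))
        for phrase in phrases
    }


# Placeholder scorers, for illustration only; they are not the patent's formulas.
scored = score_phrases(
    ["brake pedal", "pedal feels"],
    external=lambda p: 1.0 if "brake" in p else 0.0,
    internal=lambda p: float(len(p.split())),
    context_pattern=lambda p: 0.5,
)
for phrase, scores in scored.items():
    print(phrase, scores)
```

Keeping scoring separate from filtering means the same table of scores can be passed to either a conjunctive (weak) or a disjunctive (strong) rule set without recomputation.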
Specification