System and method for extracting information from unstructured text

  • US 10,002,129 B1
  • Filed: 03/30/2017
  • Issued: 06/19/2018
  • Est. Priority Date: 02/15/2017
  • Status: Active Grant
First Claim

1. A method for extracting subject-verb-object (SVO) chunked text from unstructured text, the method comprising:

  • identifying, by a SVO chunked text computing device, a plurality of part of speech (PoS) tokens in an unstructured text; and

  • determining, by the SVO chunked text computing device, a SVO chunked text directly from the plurality of PoS tokens using a machine learning chunker model, wherein the machine learning chunker model is trained on an SVO annotated training data, wherein the SVO annotated training data comprises a plurality of tokens, a plurality of corresponding PoS tags, and a plurality of corresponding SVO tags, the plurality of corresponding SVO tags comprises one or more of a subject tag, a verb tag, an object tag, or an object-subject tag, and the plurality of corresponding SVO tags is in beginning-inside-other (BIO) format, and wherein the SVO annotated training data is generated based on a plurality of corresponding span information for the plurality of tokens by, for each of a plurality of PoS tokens in each of a plurality of sets of syntactically related PoS tokens in a sentence, detecting a span information for a PoS token and tagging the PoS token as a subject, a verb, an object, or an object-subject based on the span information and a previous tagging of the PoS token.
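As a rough illustration of the training-data step recited in the claim, the sketch below turns span information into BIO-format SVO tags. It is a minimal sketch under stated assumptions, not the patented implementation: the function names (`merge_role`, `spans_to_bio`, `build_training_rows`), the role labels (`SUB`, `VERB`, `OBJ`, `OBJSUB`), the span representation, and the rule that a token previously tagged as an object becomes an object-subject when a later clause marks it as a subject are all hypothetical choices made for illustration. The PoS tags in the usage note are hand-written Penn Treebank style tags, not the output of any tagger.

```python
from typing import List, Tuple

# Hypothetical span representation: (start, end_exclusive, role),
# where role is one of "SUB", "VERB", "OBJ".
Span = Tuple[int, int, str]


def merge_role(previous: str, new: str) -> str:
    """Combine a new role with the token's previous tagging (assumed rule)."""
    if not previous:
        return new
    # Assumption: a token that is an object in one clause and a subject
    # in another is tagged as object-subject.
    if {previous, new} == {"OBJ", "SUB"}:
        return "OBJSUB"
    # Otherwise keep the earlier tag.
    return previous


def spans_to_bio(num_tokens: int, spans: List[Span]) -> List[str]:
    """Convert role spans into one BIO tag per token; "O" means other."""
    roles = [""] * num_tokens
    for start, end, role in spans:
        for i in range(start, end):
            roles[i] = merge_role(roles[i], role)

    tags = []
    for i, role in enumerate(roles):
        if not role:
            tags.append("O")
        elif i > 0 and roles[i - 1] == role:
            # Simplification: adjacent tokens with the same role are
            # treated as one chunk.
            tags.append("I-" + role)
        else:
            tags.append("B-" + role)
    return tags


def build_training_rows(
    tokens: List[str], pos_tags: List[str], spans: List[Span]
) -> List[Tuple[str, str, str]]:
    """Pair each token with its PoS tag and its SVO tag in BIO format."""
    if len(tokens) != len(pos_tags):
        raise ValueError("tokens and pos_tags must have the same length")
    svo_tags = spans_to_bio(len(tokens), spans)
    return list(zip(tokens, pos_tags, svo_tags))
```

For the sentence "The dog chased the cat that ate the fish", the first clause could supply the spans `(0, 2, "SUB")`, `(2, 3, "VERB")`, `(3, 5, "OBJ")` and the relative clause `(3, 5, "SUB")`, `(6, 7, "VERB")`, `(7, 9, "OBJ")`. Under the assumed merge rule, "the cat" is tagged `B-OBJSUB I-OBJSUB`, and "that" falls outside every span and is tagged `O`. Rows of this shape (token, PoS tag, SVO tag) are the kind of input a sequence-labeling chunker is typically trained on.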
