Labeling of work of art titles in text for natural language processing

  • US 20080071519A1
  • Filed: 09/19/2006
  • Published: 03/20/2008
  • Est. Priority Date: 09/19/2006
  • Status: Active Grant
First Claim

1. A parser for parsing text comprising:

    a tokenizing module which divides the text into an ordered sequence of linguistic tokens;

    a morphological module for associating parts of speech with the linguistic tokens;

    a detection module which applies rules for identifying expressions as candidate titles of works, each of the expressions comprising at least one of the linguistic tokens;

    a filtering module for filtering the candidate titles of works, the filtering module applying at least one rule which is formulated to exclude citations of direct speech from the candidate titles of works; and

    a comparison module for comparing remaining candidate titles of works with titles of works in an associated knowledge base and annotating the text to identify the candidate title as a nominative unit when a match with a title of a work is found in the associated knowledge base.
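
The five modules recited in claim 1 can be pictured as a simple pipeline. The following is only a minimal sketch under stated assumptions, not the patented implementation: the claim does not specify how candidates are detected or what the direct-speech rule looks like, so the sketch assumes that candidates are double-quoted spans and that a quote introduced by a verb of speech is direct speech. All names (SPEECH_VERBS, KNOWLEDGE_BASE, Token, parse, and the helper functions) are hypothetical and chosen for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical toy resources; the claim does not specify any of these.
SPEECH_VERBS = {"said", "says", "asked", "replied", "shouted"}
KNOWLEDGE_BASE = {"war and peace", "the old man and the sea"}


@dataclass
class Token:
    text: str
    pos: str = "UNK"


def tokenize(text):
    """Tokenizing module: split text into an ordered sequence of tokens."""
    return [Token(t) for t in re.findall(r"\w+|[^\w\s]", text)]


def tag(tokens):
    """Morphological module: attach a very coarse part of speech."""
    for tok in tokens:
        if tok.text.lower() in SPEECH_VERBS:
            tok.pos = "VERB_SPEECH"
        elif tok.text == '"':
            tok.pos = "QUOTE"
        elif not tok.text[0].isalnum():
            tok.pos = "PUNCT"
        else:
            tok.pos = "WORD"
    return tokens


def detect(tokens):
    """Detection module (assumed rule): each quoted span is a candidate.

    Returns (open, close) index pairs of the surrounding quote tokens.
    """
    quotes = [i for i, t in enumerate(tokens) if t.pos == "QUOTE"]
    return [(quotes[i], quotes[i + 1]) for i in range(0, len(quotes) - 1, 2)]


def is_direct_speech(tokens, start):
    """Filtering rule (assumed): a quote introduced by a speech verb,
    optionally followed by a comma or colon, is direct speech."""
    i = start - 1
    if i >= 0 and tokens[i].text in {",", ":"}:
        i -= 1
    return i >= 0 and tokens[i].pos == "VERB_SPEECH"


def parse(text):
    """Run the modules in order and return titles matched in the knowledge base."""
    tokens = tag(tokenize(text))
    candidates = detect(tokens)
    remaining = [(s, e) for s, e in candidates
                 if not is_direct_speech(tokens, s)]
    annotated = []
    for s, e in remaining:
        title = " ".join(t.text for t in tokens[s + 1:e])
        # Comparison module: annotate only when the knowledge base has a match.
        if title.lower() in KNOWLEDGE_BASE:
            annotated.append(title)
    return annotated
```

With these toy rules, parse('She read "War and Peace" last year.') keeps the quoted span as a title, while parse('He said, "War and Peace" and left.') discards it as direct speech before the knowledge-base lookup. A real parser would rely on a full morphological analyzer and a richer rule set; the sketch only shows the order in which the claimed modules hand data to one another.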
