Recognizing chemical names in a Chinese document

  • US 9,575,957 B2
  • Filed: 08/30/2012
  • Issued: 02/21/2017
  • Est. Priority Date: 08/31/2011
  • Status: Expired due to Fees
First Claim

1. A method comprising:

  • a computer device receiving a Chinese document including chemical names;

    the computer device recognizing chemical name segments in said document;

    the computer device recognizing non-chemical name segments in said document, wherein the computer device recognizing said non-chemical name segments in said document comprises:

    segmenting said document into words;

    checking whether each segmented word is in a non-chemical name segment dictionary;

    provided that said segmented word is in said non-chemical name segment dictionary, determining said segmented word to be a non-chemical name segment; and

    recording position information of said non-chemical name segment; and

    the computer device combining said chemical name segments to get said chemical names based on said recognized chemical name segments and non-chemical name segments to recognize said chemical names in Chinese documents, wherein the computer device recognizing said chemical name segments in said document comprises:

    segmenting said document into sentences;

    matching all of said chemical name segments appearing in sentences of said document based on a chemical name segment dictionary;

    recording position information of said chemical name segments; and

    reducing said chemical name segments in a same sentence, wherein reducing said chemical name segments in a same sentence is performed according to a principle of matching the most chemical name segments with the least number of chemical name segments; and

    wherein the computer device combining said chemical name segments to get said chemical name based on said recognized chemical name segments and non-chemical name segments comprises:

    determining adjacent chemical name segments in a same sentence according to said position information of said chemical name segments;

    checking whether there are non-chemical name segments between said adjacent chemical name segments based on said position information of said chemical name segments and non-chemical name segments; and

    provided that there are no non-chemical name segments between said adjacent chemical name segments, combining said adjacent chemical name segments to get a chemical name.
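
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the toy dictionaries `CHEM_DICT` and `NON_CHEM_DICT`, and the sentence delimiters are all hypothetical. Two simplifications are assumed. First, non-chemical segments are located by direct dictionary lookup in each sentence, standing in for the word segmentation step of the claim. Second, the "most segments matched with the fewest segments" reduction is approximated by dropping any match that lies wholly inside a longer match.

```python
import re

# Hypothetical toy dictionaries, for illustration only.
CHEM_DICT = {"甲基", "丙烯酸", "丙烯", "甲酯", "乙醇"}
NON_CHEM_DICT = {"将", "和", "混合"}


def split_sentences(text):
    """Split text into sentences on common Chinese end punctuation (assumed set)."""
    return re.findall(r"[^。！？]+[。！？]?", text)


def find_segments(sentence, dictionary):
    """Return sorted (start, end) spans of every dictionary entry found in the sentence."""
    spans = []
    for word in dictionary:
        start = sentence.find(word)
        while start != -1:
            spans.append((start, start + len(word)))
            start = sentence.find(word, start + 1)
    return sorted(spans)


def reduce_segments(spans):
    """Assumed reduction rule: drop any span fully contained in another span."""
    return [
        s for s in spans
        if not any(o != s and o[0] <= s[0] and s[1] <= o[1] for o in spans)
    ]


def combine(sentence, chem, non_chem):
    """Merge adjacent chemical spans unless a non-chemical span lies between them."""
    names = []
    if not chem:
        return names
    cur_start, cur_end = chem[0]
    for start, end in chem[1:]:
        # A non-chemical span blocks the merge if it overlaps the gap.
        blocked = any(ns < start and ne > cur_end for ns, ne in non_chem)
        if blocked:
            names.append(sentence[cur_start:cur_end])
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    names.append(sentence[cur_start:cur_end])
    return names


def recognize(text, chem_dict=CHEM_DICT, non_chem_dict=NON_CHEM_DICT):
    """Run the sketch sentence by sentence; spans never cross sentence boundaries."""
    names = []
    for sentence in split_sentences(text):
        chem = reduce_segments(find_segments(sentence, chem_dict))
        non_chem = find_segments(sentence, non_chem_dict)
        names.extend(combine(sentence, chem, non_chem))
    return names


print(recognize("将甲基丙烯酸甲酯和乙醇混合。"))
```

In this example the segments 甲基, 丙烯酸 and 甲酯 are adjacent with no non-chemical segment between them, so they are combined into one name, while 和 separates that name from 乙醇. Note that under this literal reading, characters found in neither dictionary do not block a merge.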
