
Generation of a semantic model from textual listings

  • US 9,244,908 B2
  • Filed: 08/24/2012
  • Issued: 01/26/2016
  • Est. Priority Date: 03/27/2012
  • Status: Active Grant
First Claim

1. A method comprising:

    receiving, by a processing device, a corpus of textual listings,
        textual listings, in the corpus, including text without a grammatical structure;

    tokenizing, by the processing device, each textual listing of the textual listings,
        tokenizing each textual listing including tokenizing at least one of an alphanumeric token or a token that comprises uppercase and lowercase characters;

    identifying, by the processing device, main concept words and attribute words in the corpus after tokenizing each textual listing of the textual listings,
        identifying the main concept words and the attribute words including:
            tagging, in each textual listing of the textual listings, at least one word as a head noun word based on at least one of:
                a previously identified main concept word, or
                a head noun identification rule,
            tagging, in the textual listing and after tagging the at least one word, remaining nouns as at least one modifier word, and
            assigning one word of the at least one head noun word as a main concept word and one word of the at least one modifier word as an attribute word;

    clustering, by the processing device, words in the corpus based on at least one of the main concept words or the attribute words according to at least one clustering rule,
        the at least one clustering rule including at least one of:
            a first rule preventing two quantitative attribute tokens from being clustered based on a frequency of appearance of the two quantitative attribute tokens in a same listing,
            a second rule preventing clustering of a quantitative attribute token with a qualitative attribute token, or
            a third rule indicating that a first token is to be clustered with a second token when characters of the first token are included in the second token; and

    providing, by the processing device and after clustering the words, the main concept words and the attribute words as at least a portion of a semantic model.
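
For illustration only, the tokenizing step and the three clustering rules recited above might be sketched as follows. This is not the patented implementation. The token pattern, the "contains a digit" test for quantitative tokens, the `max_together` threshold, the way the three rules are combined into a single check, and all function names are assumptions introduced here. Head noun tagging is not covered.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical token pattern: runs of letters and digits, so alphanumeric
# tokens ("32GB") and mixed-case tokens ("iPhone") stay in one piece.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+")


def tokenize(listing):
    """Split one ungrammatical listing into tokens."""
    return TOKEN_RE.findall(listing)


def is_quantitative(token):
    """Assumption: a token containing a digit is a quantitative attribute."""
    return any(ch.isdigit() for ch in token)


def co_occurrence(listings):
    """Count, per pair of lowercased tokens, the listings containing both."""
    counts = Counter()
    for listing in listings:
        tokens = sorted({t.lower() for t in tokenize(listing)})
        counts.update(combinations(tokens, 2))
    return counts


def may_cluster(a, b, counts, max_together=0):
    """Apply the three rules in one check (one possible combination)."""
    a, b = a.lower(), b.lower()
    qa, qb = is_quantitative(a), is_quantitative(b)
    # Second rule: never cluster a quantitative token with a qualitative one.
    if qa != qb:
        return False
    # First rule: two quantitative tokens that share a listing more often
    # than the (assumed) threshold are kept apart.
    if qa and qb and counts[tuple(sorted((a, b)))] > max_together:
        return False
    # Third rule: cluster when one token's characters appear in the other.
    return a in b or b in a


listings = [
    "Apple iPhone 5 32GB black",
    "apple iphone 5 16GB",
    "Samsung Galaxy 32GB 8MP camera",
]
counts = co_occurrence(listings)
print(tokenize(listings[0]))
print(may_cluster("phone", "iPhone", counts))
print(may_cluster("32GB", "8MP", counts))
```

In this sketch "32GB" and "8MP" are kept apart because they appear in the same listing, which suggests they describe different attributes, while "phone" and "iPhone" may be clustered because one is contained in the other.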
