Content search in complex language, such as Japanese

  • US 20060031207A1
  • Filed: 06/08/2005
  • Published: 02/09/2006
  • Est. Priority Date: 06/12/2004
  • Status: Active Grant
First Claim

1. A system for searching content requested using a text-based query including one or more scripts or orthographic forms associated with Japanese, the system comprising:

  • a hierarchically structured vocabulary knowledge base for storing vocabulary information associated with the one or more scripts or orthographic forms, wherein the vocabulary knowledge base is generated by a method comprising:

    assigning an identifier to a semantic concept;

    identifying a main orthographic form for the semantic concept, wherein the main orthographic form is based on kanji script, katakana script, hiragana script, or any combination of kanji script, katakana script, and hiragana script;

    for at least one of the one or more scripts or orthographic forms, associating at least one synonymous orthographic form with the semantic concept, wherein the synonymous orthographic form is at least partially distinct from the main orthographic form, and wherein the at least one synonymous orthographic form includes any one or more of:

    kanji script, katakana script, hiragana script, okurigana variant, romaji written form, phonetic variants associated with one or more of kanji script, katakana script, hiragana script, okurigana variant, and romaji written form, and/or hybrid variants associated with one or more of kanji script, katakana script, hiragana script, okurigana variant, and romaji written form;

    storing the identifier, the main orthographic form, and the at least one synonymous orthographic form in a data storage component associated with the system; and

    repeating the assigning, the identifying, the associating, and the storing for additional semantic concepts;

  • an asset repository for storing information associated with searchable assets;

  • a classification component for classifying the searchable assets in the asset repository to facilitate matching between the searchable assets and the vocabulary information, wherein the matching is based, at least in part, on identifiers assigned to semantic concepts included in the vocabulary knowledge base; and

  • a search engine for receiving and executing queries for the searchable assets.
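
The components recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`VocabularyKnowledgeBase`, `Concept`, `classify`, `search`, the `parent_id` field, the sample vocabulary and asset captions) is a hypothetical choice made for illustration, and the substring-based classification and NFKC normalization are assumptions that the claim does not specify.

```python
import unicodedata
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, Iterable, List, Optional, Set


def normalize(text: str) -> str:
    """Fold width and case differences (half-width katakana, full-width Latin)."""
    return unicodedata.normalize("NFKC", text).casefold().strip()


@dataclass
class Concept:
    concept_id: int
    main_form: str
    synonyms: Set[str] = field(default_factory=set)
    parent_id: Optional[int] = None  # hypothetical link that provides the hierarchy


class VocabularyKnowledgeBase:
    def __init__(self) -> None:
        self._next_id = count(1)
        self.concepts: Dict[int, Concept] = {}
        self.form_index: Dict[str, int] = {}  # normalized form -> concept identifier

    def add_concept(self, main_form: str, synonyms: Iterable[str] = (),
                    parent_id: Optional[int] = None) -> int:
        concept_id = next(self._next_id)        # assign an identifier
        variants = set(synonyms) - {main_form}  # synonymous forms differ from the main form
        self.concepts[concept_id] = Concept(concept_id, main_form, variants, parent_id)
        for form in {main_form, *variants}:     # store every form under the identifier
            self.form_index[normalize(form)] = concept_id
        return concept_id

    def lookup(self, text: str) -> Optional[int]:
        return self.form_index.get(normalize(text))


def classify(assets: Dict[str, str], kb: VocabularyKnowledgeBase) -> Dict[str, Set[int]]:
    """Tag each asset with identifiers of concepts whose forms occur in its caption.

    Substring matching is an assumption; short romaji forms could over-match.
    """
    tags: Dict[str, Set[int]] = {}
    for asset_id, caption in assets.items():
        text = normalize(caption)
        tags[asset_id] = {cid for form, cid in kb.form_index.items() if form in text}
    return tags


def search(query: str, kb: VocabularyKnowledgeBase, tags: Dict[str, Set[int]]) -> List[str]:
    """Resolve the query to a concept identifier, then return assets tagged with it."""
    concept_id = kb.lookup(query)
    if concept_id is None:
        return []
    return sorted(a for a, ids in tags.items() if concept_id in ids)


# Repeat the assign/identify/associate/store steps for each semantic concept.
kb = VocabularyKnowledgeBase()
animal = kb.add_concept("動物", ["どうぶつ", "doubutsu"])
cat = kb.add_concept("猫", ["ねこ", "ネコ", "neko"], parent_id=animal)
moving = kb.add_concept("引越し", ["引っ越し", "引越", "ひっこし", "hikkoshi"])

# Hypothetical asset repository: asset identifier -> caption text.
assets = {"a1": "ネコの写真", "a2": "引っ越しのトラック", "a3": "犬の写真"}
tags = classify(assets, kb)

print(search("neko", kb, tags))    # romaji query matches the katakana caption
print(search("引越し", kb, tags))  # okurigana variant of the caption's spelling
```

Because both queries and assets are reduced to concept identifiers, a query in one script can match an asset captioned in another without any direct string comparison between the two.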
