
Method and system for searching words in documents written in a source language as transcript of words in an origin language

  • US 10,042,843 B2
  • Filed: 06/10/2015
  • Issued: 08/07/2018
  • Est. Priority Date: 06/15/2014
  • Status: Active Grant
First Claim

1. Computer implemented method for searching words in documents written in a source language, words which are not meaningful in said source language, but are transcript of meaningful words in an origin language, the method is comprised of two processes:

  • a) preparation process executed for each new document, the preparation process is comprised of the following steps;

    i) reading the document;

    ii) extracting unrecognized words in the source language;

    iii) updating search indexes in the corpus for all document words;

    iv) for each new unrecognized word in the source language;

      1) removing prefixes and suffixes;

      2) performing phonetic conversion;

      3) checking frequency of unrecognized word spelling in System Hebraized Medical Lexicon (SHML);

      4) defining the most frequent spelling of the unrecognized word as the central term and connect it to other allowable and close spellings of that term;

      5) updating System Hebraized Medical Lexicon (SHML);

  • b) search process which is comprised of the following steps;

    i) reading the search request and perform auto-complete for terms from the System Hebraized Medical Lexicon (SHML);

    ii) generating phonetic conversion for all words in query;

    iii) for each word;

      1) searching for similar phonetics in the corpus and find central terms;

      2) calculating the distance to the found similar words and order them in ascending order; and

      3) displaying relevant documents according to the distance.
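
The two processes recited in the claim can be illustrated with a minimal sketch. This is not the patented implementation: the claim does not specify the phonetic conversion, the affix lists, or the distance measure, so everything named below is an assumption made for illustration. The sketch uses Latin-letter spellings for readability, a toy consonant-class key (`phonetic_key`) as the phonetic conversion, hypothetical `PREFIXES` and `SUFFIXES`, Levenshtein edit distance as the distance, and an in-memory `Lexicon` class standing in for the SHML and the corpus index. Auto-complete is omitted, and "similar phonetics" is simplified to an exact match on the phonetic key.

```python
from collections import Counter, defaultdict

# Hypothetical affix lists; the claim does not enumerate them.
PREFIXES = ("ha",)
SUFFIXES = ("im",)
# Assumed sound classes: letters that often alternate in transcription.
SOUND_CLASSES = str.maketrans("cqzvbd", "kksfpt")


def strip_affixes(word):
    """Remove at most one known prefix and one known suffix."""
    for p in PREFIXES:
        if word.startswith(p) and len(word) - len(p) >= 3:
            word = word[len(p):]
            break
    for s in SUFFIXES:
        if word.endswith(s) and len(word) - len(s) >= 3:
            word = word[:-len(s)]
            break
    return word


def phonetic_key(word):
    """Toy phonetic conversion: merge sound classes, drop vowels and repeats."""
    out = []
    for ch in word.lower().translate(SOUND_CLASSES):
        if ch in "aeiouy" or (out and out[-1] == ch):
            continue
        out.append(ch)
    return "".join(out)


def levenshtein(a, b):
    """Classic edit distance (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


class Lexicon:
    """In-memory stand-in for the SHML plus the corpus search index."""

    def __init__(self, known_words):
        self.known_words = set(known_words)  # recognized source-language words
        self.index = defaultdict(set)        # word -> ids of documents containing it
        self.counts = Counter()              # spelling -> frequency
        self.by_key = defaultdict(set)       # phonetic key -> known spellings

    def add_document(self, doc_id, text):
        """Preparation process: index all words, register unrecognized ones."""
        for raw in text.lower().split():
            word = raw.strip(".,;:()")
            if not word:
                continue
            self.index[word].add(doc_id)
            if word in self.known_words:
                continue
            stem = strip_affixes(word)
            self.index[stem].add(doc_id)
            self.counts[stem] += 1
            self.by_key[phonetic_key(stem)].add(stem)

    def central_term(self, key):
        """Most frequent spelling sharing a phonetic key (ties broken by name)."""
        variants = self.by_key.get(key)
        if not variants:
            return None
        return max(variants, key=lambda s: (self.counts[s], s))

    def search(self, query):
        """Search process: rank phonetically matching spellings by distance."""
        results = []
        for word in query.lower().split():
            stem = strip_affixes(word)
            variants = self.by_key.get(phonetic_key(stem), set())
            ranked = sorted((levenshtein(stem, v), v) for v in variants)
            results += [(d, v, sorted(self.index[v])) for d, v in ranked]
        return results


lex = Lexicon(known_words={"patient", "with", "follow", "up", "control"})
lex.add_document(1, "patient with diabetes")
lex.add_document(2, "diabetis follow up")
lex.add_document(3, "diabetes control")
```

With these three toy documents, `lex.central_term("tpts")` yields `"diabetes"` (the most frequent spelling), and `lex.search("dyabetes")` returns the variants in ascending distance order: `[(1, "diabetes", [1, 3]), (2, "diabetis", [2])]`.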

  • 4 Assignments