SYSTEM AND METHOD FOR BUILDING DIVERSE LANGUAGE MODELS

  • US 20120232885A1
  • Filed: 03/08/2011
  • Published: 09/13/2012
  • Est. Priority Date: 03/08/2011
  • Status: Active Grant
First Claim

1. A method of generating a diverse language model, the method comprising:

  • crawling, via a crawler operating on a computing device, a plurality of documents in a network of interconnected documents according to a visitation policy, wherein the visitation policy is configured to focus on novelty regions for a current language model built from at least one previous crawling cycle, and wherein the visitation policy is based on a vocabulary considered likely to fill gaps in the current language model; and

  • generating a diverse language model based on the current language model and the plurality of documents.
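
As an illustration only, the two claimed steps can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the patented implementation: the in-memory `WEB` dictionary stands in for the network of interconnected documents, the language model is a plain unigram count table, and "novelty" is assumed to mean the fraction of a document's tokens missing from the current model's vocabulary. All names (`WEB`, `novelty`, `crawl`, `build_diverse_model`) are hypothetical. A real crawler could not read a page before fetching it and would have to estimate novelty from signals such as link text; scoring the linked text directly is a simplification.

```python
import heapq
from collections import Counter

# Hypothetical toy network: document id -> (text, outgoing links).
WEB = {
    "a": ("the cat sat on the mat", ["b", "c"]),
    "b": ("the cat sat on the sofa", ["d"]),
    "c": ("quantum entanglement of photon pairs", ["d"]),
    "d": ("photon detectors and the cat", []),
}


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption, kept simple)."""
    return text.lower().split()


def novelty(text, model):
    """Fraction of tokens that are absent from the current model's vocabulary."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t not in model) / len(tokens)


def crawl(seeds, model, budget):
    """Visit up to `budget` documents, always taking the most novel one next."""
    # heapq is a min-heap, so novelty is negated to pop the highest score first.
    frontier = [(-novelty(WEB[s][0], model), s) for s in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    visited = []
    while frontier and len(visited) < budget:
        _, doc_id = heapq.heappop(frontier)
        visited.append(doc_id)
        for link in WEB[doc_id][1]:
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-novelty(WEB[link][0], model), link))
    return visited


def build_diverse_model(model, doc_ids):
    """Merge the current unigram counts with counts from the crawled documents."""
    merged = Counter(model)
    for doc_id in doc_ids:
        merged.update(tokenize(WEB[doc_id][0]))
    return merged


# Current model, as if built from a previous crawling cycle.
current_model = Counter(tokenize("the cat sat on the mat"))
visited = crawl(["a"], current_model, budget=3)
diverse_model = build_diverse_model(current_model, visited)
```

With a budget of three documents, the sketch skips the page that mostly repeats known vocabulary and spends its visits on the pages that introduce new words, which is the behavior the visitation policy in the claim is aiming for.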

  • 4 Assignments