Directed web crawler with machine learning

  • US 20020194161A1
  • Filed: 04/12/2002
  • Published: 12/19/2002
  • Est. Priority Date: 04/12/2001
  • Status: Abandoned Application
First Claim

1. A system having computer-readable code associated with a network computer environment and one or more servers having one or more databases associated therewith containing information about database content for providing a network search in response to a user's input, said system comprising:

  • at least one computer, for receiving one or more queries, searching a plurality of databases, and displaying a specialized collection of documents related to said one or more queries;

  • at least one network, operatively connected to said at least one computer, for accessing said plurality of databases and transferring information from said plurality of databases to said at least one network;

  • at least one server, operatively connected to said at least one network, for storing said plurality of databases; and

  • software means, operatively connected to said at least one computer, for preparing an affinity set related to said one or more queries, identifying information in said plurality of databases, creating an index relating to said information in said plurality of databases, creating a set of seed documents based on information in said plurality of databases, training a classifier to classify said information in said plurality of databases using said seed documents, searching said network for relevant documents using a binary system created by said classifier, creating said specialized collection of documents related to said one or more queries, creating a ranked list of said specialized collection of documents, and displaying said ranked list on said at least one computer.
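The software element of the claim (seed documents, a trained classifier, a classifier-directed search, and a ranked list) can be illustrated with a minimal sketch. The claim does not name a classifier, a crawl order, or a ranking function, so everything below is an assumption made for illustration: a small two-class naive Bayes model stands in for the classifier, the "binary system" is read as a relevant/irrelevant decision, the network is a hypothetical in-memory dictionary of pages and links, and the ranking is by classifier score. The affinity set and index steps are not modelled.

```python
import math
from collections import Counter, deque


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; the claim names none)."""
    return text.lower().split()


class NaiveBayes:
    """Two-class multinomial naive Bayes with add-one smoothing."""

    def __init__(self, relevant_docs, irrelevant_docs):
        self.counts = {True: Counter(), False: Counter()}
        for doc in relevant_docs:
            self.counts[True].update(tokenize(doc))
        for doc in irrelevant_docs:
            self.counts[False].update(tokenize(doc))
        self.vocab = set(self.counts[True]) | set(self.counts[False])
        self.totals = {c: sum(self.counts[c].values()) for c in (True, False)}
        n = len(relevant_docs) + len(irrelevant_docs)
        self.log_prior_odds = math.log(len(relevant_docs) / n) - math.log(
            len(irrelevant_docs) / n
        )

    def score(self, text):
        """Return the log-odds that text is relevant; positive means relevant."""
        log_odds = self.log_prior_odds
        v = len(self.vocab)
        for tok in tokenize(text):
            if tok not in self.vocab:
                continue  # ignore words never seen in training
            p_rel = (self.counts[True][tok] + 1) / (self.totals[True] + v)
            p_irr = (self.counts[False][tok] + 1) / (self.totals[False] + v)
            log_odds += math.log(p_rel / p_irr)
        return log_odds


def directed_crawl(pages, links, start_urls, clf):
    """Breadth-first crawl that follows links only out of relevant pages."""
    queue = deque(start_urls)
    seen = set(start_urls)
    collection = []
    while queue:
        url = queue.popleft()
        s = clf.score(pages[url])
        if s <= 0:
            continue  # binary decision: irrelevant pages are not expanded
        collection.append((s, url))
        for nxt in links.get(url, []):
            if nxt not in seen and nxt in pages:
                seen.add(nxt)
                queue.append(nxt)
    collection.sort(reverse=True)  # highest score first
    return [url for _, url in collection]


# Hypothetical toy network: page text and outgoing links.
PAGES = {
    "a": "machine learning classifier training",
    "b": "web crawler classifier index",
    "c": "cooking recipes pasta sauce",
    "d": "crawler machine learning index ranking",
}
LINKS = {"a": ["b", "c"], "b": [], "c": ["d"]}

# Seed documents (relevant) and hypothetical counter-examples (irrelevant).
clf = NaiveBayes(
    relevant_docs=["machine learning classifier", "web crawler index"],
    irrelevant_docs=["cooking recipes", "pasta sauce garden"],
)
ranked = directed_crawl(PAGES, LINKS, ["a"], clf)
print(ranked)
```

In this toy run, page "c" is classified as irrelevant, so its outgoing link is never followed and page "d" is never visited even though its text is on topic. That pruning is what distinguishes a directed crawl from an exhaustive one in this sketch.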

  • 0 Assignments