Apparatus and method to support management of uniform resource locators and/or contents of database servers

US 6,725,214 B2
Filed: 01/16/2001
Issued: 04/20/2004
Est. Priority Date: 01/14/2000
Status: Expired due to Fees



Abstract

Methods and systems for finding a Uniform Resource Locator (URL) that points to a most updated authoritative source of information contained in database systems, including crawling websites to determine likely publicly available records and processing the likely publicly available records to determine a unique list of URLs, each of which points to information content of crawled websites that are likely to be the most updated authoritative source of the information content.


28 Claims

1. A method of finding a Uniform Resource Locator (URL) that points to a most updated authoritative source of information contained in database systems, the method comprising:
- crawling websites to determine likely publicly available records;
  
  processing the likely publicly available records to determine a unique list of URLs each of which point to information content of crawled web sites that are likely to be the most updated authoritative source of the information content and wherein processing the likely publicly available records comprises applying an algorithm to information content of each crawled website to determine a likelihood of each crawled website as being the most authoritative source for providing specific information content.
2. The method according to claim 1, further comprising:
3. The method according to claim 2, wherein the Internet search engine provides at least one URL from the submitted unique lists of URLs for a user after a search query.
4. The method according to claim 1, further comprising:
- registering web entities as authoritative sources for specific information content.
5. The method according to claim 4, further comprising:
- sending specific information content of a registered web entity to an Internet search engine.
6. The method according to claim 5, wherein the specific information content of the registered web entity is provided to a user after a search query.
7. The method according to claim 1, wherein processing the likely publicly available records includes performing a sanity check to detect information content which may be at least one of politically incorrect content and offensive content to a user.
8. The method according to claim 1, wherein the information contained within the database systems relate to Notes Storage Facility (.NSF) files containing records.
9. The method according to claim 8, wherein the information contained in the Notes Storage Facility records are identified by a ReplicaID and UNiversal ID.
12. The system according to claim 4 further comprising a memory storing a module configured to parse content of each location and determine a likelihood of the records containing private information.
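
The processing step of claim 1 can be pictured with a small sketch. The claims do not disclose the actual algorithm, so everything below is an assumption made for illustration: the `Record` fields, the `content_id` key used to group copies of the same information, and the toy `authority_likelihood` score, which simply favors entities registered as authoritative (in the spirit of claim 4). Ties on the score are broken by the most recent modification date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    """One likely publicly available record found by the crawler (hypothetical shape)."""
    url: str
    content_id: str      # assumed key identifying "the same" information content
    last_modified: date
    registered: bool     # assumed flag: host registered as an authoritative source


def authority_likelihood(record: Record) -> float:
    """Toy score in [0, 1]; a stand-in, not the algorithm of the patent."""
    return 0.9 if record.registered else 0.5


def unique_authoritative_urls(records):
    """Keep one URL per content_id: highest likelihood first, then most recent."""
    best = {}
    for rec in records:
        rank = (authority_likelihood(rec), rec.last_modified)
        current = best.get(rec.content_id)
        if current is None or rank > current[0]:
            best[rec.content_id] = (rank, rec.url)
    return sorted(url for _, url in best.values())
```

A real scorer would presumably weigh many more signals; the point here is only the reduction from many crawled copies to one URL per piece of content.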

10. A system for obtaining a URL representing an authoritative source of information comprising:
- a web crawler configured to search websites to compile a list of content and location of records available for public viewing; and
  
  a processor configured to apply an algorithm to information content of each searched website to determine a likelihood of each searched website as being a most authoritative source for providing specific information content, the processor further configured to reduce the list of content and location into a unique list of URLs, each of which point to specific information content provided by the most likely authoritative source.
11. The system according to claim 10, further comprising at least one Internet search engine communicating with the processor and configured to provide at least one URL of the unique list of URLs to a user in response to a user search query inquiring about the specific information content.
13. The system according to claim 11, further comprising a memory communicating with the processor, the memory storing a module configured to remove hyped META tags from information submitted to the at least one Internet search engine to avoid problems affecting page positioning for declared contents, as opposed to actual contents, of the information submitted to the at least one Internet search engine.
14. The system according to claim 11, further comprising a memory storing a module configured to verify searched website information is virus free before automatic submission to the at least one Internet search engine.
15. The system according to claim 11, further comprising a memory communicating with the processor, the memory storing a module configured to determine all possible URLs hosted at a searched website using a native system call.
16. The system according to claim 11, further comprising, one of a user terminal or an agent communicating with the at least one Internet search engine, the user terminal operative to accept the user's search query.
17. The system according to claim 11, wherein the information pertains to Notes Storage Facility containers containing information.
18. The system according to claim 17, wherein the information in the Notes Storage Facility containers is uniquely identified by a ReplicaID and a UNIversalID.
19. The system of claim 10, further comprising a memory communicating with the processor, the memory storing a module configured to determine copyrighted content of a searched website that is unlikely to be authorized to republish the copyrighted content.
20. The system according to claim 10, further comprising a memory communicating with the processor, the memory storing a module configured to sanitize dynamic content of a searched website into static content.
21. The system according to claim 10, further comprising a memory communicating with the processor, the memory storing a module configured to deduplicate URLs of searched web sites by eliminating path information of the URLs and reducing each URL into its bare indispensable information to ensure uniqueness.
22. The system according to claim 10, further comprising a memory communicating with the processor, the memory storing a module configured to render URLs of searched websites to be independent of physical location within a server and allow relocation of information within the server for users consulting old addresses.
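
The deduplication described in claim 21, combined with the ReplicaID and UNIversalID identifiers of claim 18, might look roughly like the sketch below. It is an illustration under stated assumptions, not the patented module: the `REPLICA_IDS` lookup table, the example hosts, and the rule that the last path segment after the `.nsf` file name is the document identifier are all hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical lookup: (host, database path) -> ReplicaID of that database.
REPLICA_IDS = {
    ("www.example.com", "/sales/catalog.nsf"): "852562F3007ABFD6",
    ("mirror.example.com", "/old/archive/catalog.nsf"): "852562F3007ABFD6",
}


def bare_key(url):
    """Reduce a URL to (ReplicaID, document id), dropping path details; None if not possible."""
    parts = urlsplit(url)
    host = parts.hostname or ""              # urlsplit lowercases the hostname
    path = parts.path
    marker = path.lower().find(".nsf")
    if marker == -1:
        return None
    db_path = path[: marker + 4].lower()
    segments = [s for s in path[marker + 4:].split("/") if s]
    replica = REPLICA_IDS.get((host, db_path))
    if replica is None or not segments:
        return None
    return (replica, segments[-1].upper())   # assumed: last segment is the document id


def deduplicate(urls):
    """Keep the first URL seen for each bare key; URLs without a key are kept as-is."""
    seen = {}
    for url in urls:
        key = bare_key(url) or ("raw", url)
        seen.setdefault(key, url)
    return list(seen.values())
```

Two URLs that reach the same document through different views, directories, or replica servers collapse to a single entry, while unrelated pages pass through untouched.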

23. An information directory system accessible worldwide via the Internet, comprising:
- at least one website having a virtual container containing a list of virtualized URLs that are retrievable by an Internet entity, the virtualized URLs previously processed to identify a most likely authoritative source of content specific information stored in data base systems,
  
  a memory, accessible by the virtual container of the website, wherein the memory stores a virtualized replica of content specific information of at least one website of a register authoritative user, and
  
  a first module stored in the memory, the module configured to call an algorithm to determine a likelihood of whether a query to the commercial website, looking for specific information content, is likely to be one of a machine or crawler accessing the commercial website, or a human accessing the commercial website.
24. The system according to claim 23, wherein the first module is configured to provide human relevant information which includes a web page with frames pertaining to the specific information content query, if it is determined that the query is likely made by the human.
25. The system according to claim 24, wherein the first module is further configured to provide machine relevant information which includes text of a web page pertaining to the specific information content query, excluding information not necessary for machine processing, if it is determined that the query is likely made by the machine or crawler.
26. The system according to claim 25, wherein the first module is further configured to first provide the machine relevant information and then the human relevant information if it cannot be determined that the likelihood of the query was made by either a machine or human.
27. The system according to claim 26, wherein the machine relevant information is noframe format and wherein the human relevant information is frame format.
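
The decision logic of claims 23 to 27 can be sketched as follows. The claims only say that an algorithm estimates whether a query comes from a machine or a human; using the `User-Agent` header and the substring hints below is an assumption made for this example, as are the names `classify` and `responses_for`.

```python
from enum import Enum


class Requester(Enum):
    MACHINE = "machine"
    HUMAN = "human"
    UNKNOWN = "unknown"


# Assumed heuristic hints; the claims do not name any specific signal.
CRAWLER_HINTS = ("bot", "crawler", "spider")


def classify(user_agent):
    """Guess who is asking, based only on a User-Agent string (may be None)."""
    ua = (user_agent or "").lower()
    if any(hint in ua for hint in CRAWLER_HINTS):
        return Requester.MACHINE
    if "mozilla" in ua:
        return Requester.HUMAN
    return Requester.UNKNOWN


def responses_for(user_agent):
    """Return the formats to serve, in order: noframe for machines, frame for humans."""
    kind = classify(user_agent)
    if kind is Requester.MACHINE:
        return ["noframe"]
    if kind is Requester.HUMAN:
        return ["frame"]
    # Claim 26: when undetermined, machine relevant information first, then human.
    return ["noframe", "frame"]
```

The crawler check runs before the browser check because many crawlers also include a browser-like token in their identification string.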

28. A method for virtualizing URLs identifying information sources in an Internet comprising:
- separating a URL into a combination of DNS and HTTP protocols;
  
  rearranging the DNS and HTTP protocols of the URL; and
  
  rewriting the URL as a unique identifier that allows seamless switching between the DNS and HTTP protocols.
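
Claim 28 is terse, so the sketch below shows only one possible reading of it: the host name is treated as the DNS part, the path as the HTTP part, and the path segments are moved into host labels so that the same resource can be addressed either way. The function names, the `example.com` host, and the restriction that every path segment must already be a valid lowercase DNS label are assumptions made for this illustration.

```python
import re
from urllib.parse import urlsplit

# A DNS label: 1 to 63 characters, letters/digits/hyphens, no leading or trailing hyphen.
LABEL = re.compile(r"^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$")


def to_dns_form(url):
    """Move path segments into host labels: http://h/a/b -> http://b.a.h/"""
    parts = urlsplit(url)
    host = parts.hostname or ""
    segments = [s for s in parts.path.split("/") if s]
    for seg in segments:
        if not LABEL.match(seg):
            raise ValueError(f"path segment {seg!r} is not usable as a DNS label")
    return f"{parts.scheme}://" + ".".join(segments[::-1] + [host]) + "/"


def to_http_form(url, base_host):
    """Inverse rewrite: http://b.a.h/ -> http://h/a/b, given the base host h."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    suffix = "." + base_host
    if not host.endswith(suffix):
        return url                      # nothing to move back into the path
    labels = host[: -len(suffix)].split(".")
    return f"{parts.scheme}://{base_host}/" + "/".join(reversed(labels))
```

Because the two rewrites are inverses of each other under these assumptions, a server that understands both forms could answer requests addressed in either one.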


Current Assignee: Dotnsf
Original Assignee: Dotnsf
Inventors: Garcia-Chiesa, Jorge
Primary Examiner(s): Shah, Sanjiv
Application Number: US09/764,968
Publication Number: US 20020099723A1
Time in Patent Office: 1,190 Days
Field of Search: 707/3, 707/1, 707/100, 707/101, 707/7, 709/224, 715/513
US Class Current: 1/1
CPC Class Codes:
G06F 16/951   Indexing; Web crawling tech...
Y10S 707/99933   Query processing, i.e. sear...
Y10S 707/99937   Sorting
