Unguided application crawling architecture

US 10,120,876 B2
Filed: 09/09/2015
Issued: 11/06/2018
Est. Priority Date: 05/13/2015
Status: Expired due to Fees

4 Assignments

0 Petitions

Abstract

A system for automated acquisition of content from an application includes a link tracking module that controls an instance of the application executing within an emulator. For a selected state, the link tracking module controls the executing application instance to navigate to the selected state and identifies a first set of application states reachable by user interface interaction. A state storage module stores records based on the first set. A first state record includes content of a first state of the first set and a unique identifier that uniquely identifies the first state. The unique identifier indicates a path followed within the executing application instance from a default state to the first state, including corresponding user interface interaction. A scraper module, for each of the records in the state storage module, navigates to the state specified by the unique identifier using the indicated path and extracts text from the state.
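
The link tracking described in the abstract can be illustrated with a minimal sketch. It assumes a toy application modeled as a dictionary that maps each state to the user interface interactions available there; `TOY_APP`, `crawl`, `navigate`, and the `/`-joined identifier format are hypothetical names chosen for illustration and are not taken from the patent, which drives a real application instance inside an emulator.

```python
from collections import deque

# Hypothetical toy application: each state maps a UI interaction
# to the state that interaction opens.
TOY_APP = {
    "home": {"tap:search": "search", "tap:deals": "deals"},
    "search": {"tap:result_1": "item_42"},
    "deals": {"tap:deal_1": "item_42"},
    "item_42": {},
}


def crawl(app, default_state="home", max_depth=5):
    """Breadth-first walk from the default state.

    Returns {identifier: state}, where each identifier is the ordered
    list of interactions followed from the default state, joined by "/".
    """
    records = {}
    queue = deque([(default_state, ())])
    while queue:
        state, path = queue.popleft()
        if len(path) >= max_depth:  # guard against cycles in the app graph
            continue
        for interaction, next_state in app[state].items():
            next_path = path + (interaction,)
            records["/".join(next_path)] = next_state
            queue.append((next_state, next_path))
    return records


def navigate(app, identifier, default_state="home"):
    """Replay the path encoded in an identifier, as a scraper might."""
    state = default_state
    for interaction in identifier.split("/"):
        state = app[state][interaction]
    return state


records = crawl(TOY_APP)
print(sorted(records))
```

In this sketch the same state (`item_42`) is reached by two different paths and therefore receives two identifiers, which is the situation the duplicate-detection claims address.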

45 Citations

31 Claims

1. An apparatus for automated acquisition of content from an application and improved searching of the content in response to a query of a user device, the apparatus comprising:
- at least one processor, the at least one processor configured to track links, the tracking of links including:
  
  controlling an executing instance of the application; and
  
  for a selected state of the application:
  
  controlling the executing application instance to navigate to the selected state, and
  
  identifying a first set of application states reachable from the selected state, each of the first set of application states being reachable via a respective user interface interaction with the selected state,
  
  wherein the at least one processor is further configured to store records in a state storage based on the first set of application states, a first state record including:
  
  (i) a representation of content of a first state of the first set of application states, and
  
  (ii) a unique identifier that uniquely identifies the first state within the records of the state storage, the unique identifier of the first state indicating a path followed within the executing application instance from a default state of the application to the first state, and the path including the user interface interaction corresponding to the first state,
  
  wherein the at least one processor is further configured to scrape records, including, for each of the stored records, extract text and metadata from the state, information based on the extracted text and metadata being stored in a data store, and
  
  wherein the at least one processor is further configured to provide at least one record in response to the query from the user device.
  - 2. The apparatus of claim 1, wherein the link tracking by the at least one processor comprises executing the instance of the application within an emulator.
  - 3. The apparatus of claim 1 wherein the unique identifier of the first state is a uniform resource identifier (URI).
  - 4. The apparatus of claim 1 wherein the unique identifier of the first state is based on a concatenation, in order, of each user interface interaction triggered when navigating from the default state to the first state.
  - 5. The apparatus of claim 1 wherein the default state is a home state of the application and wherein, upon execution, the application presents the home state to a user of the application.
  - 6. The apparatus of claim 1 wherein the representation of the content of the first state requires less storage space than does the content of the first state.
  - 7. The apparatus of claim 6 wherein the representation of the content of the first state is based on a calculated hash of the content of the first state.
  - 8. The apparatus of claim 1 wherein:
    - the first state record includes a second unique identifier that uniquely identifies the first state within the records of the state storage; and
      
      the unique identifier of the first state indicates a second path followed within the application from the default state to the first state.
  - 9. The apparatus of claim 8 wherein the at least one processor is further configured to detect duplicate content, the detection comprising:
    - determining whether the path and the second path both arrive at the first state, and
      
      in response to determining that the path and the second path both arrive at the first state, adding the second unique identifier to the first state record instead of creating a second state record.
  - 10. The apparatus of claim 8, wherein the at least one processor is further configured to select a path, the selection of a path including selecting a single unique identifier for each of the records for use by the at least one processor when scraping records.
  - 11. The apparatus of claim 10 wherein the at least one processor is further configured to, when scraping records, for each of the records in the state storage, (i) navigate to the state specified by the selected single unique identifier using the path indicated by the selected single unique identifier and (ii) extract text and metadata from the state.
  - 12. The apparatus of claim 10 wherein the selection of a path comprises selecting, for each of the records, the single unique identifier that identifies a fastest path to the respective state.
  - 13. The apparatus of claim 1, wherein the link tracking by the at least one processor comprises:
    - identifying when an application programming interface call is available for navigating to the first state; and
      
      storing a second unique identifier in the first state record, wherein the second unique identifier indicates the application programming interface call.
  - 14. The apparatus of claim 13 wherein the link tracking by the at least one processor comprises, for a second state reachable from the first state via a first user interface interaction:
    - storing a second state record in the state storage including a unique identifier based on the application programming interface call and the first user interface interaction; and
      
      adding a second unique identifier to the second state record, wherein the second unique identifier indicates the first user interface interaction and user interface interactions followed from the default state to the first state.
  - 15. The apparatus of claim 1, wherein the providing of at least one record comprises:
    - in response to the query from the user device, selecting records from the data store to generate a set of records by forming a consideration set of records;
      
      processing the consideration set of records by assigning a score to each record of the consideration set of records; and
      
      generating results by responding to the user device with a subset of the consideration set of records,
      
      wherein the subset is selected based on the assigned scores, and
      
      wherein the subset identifies application states of applications that are relevant to the query.
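
Claims 6 through 12 describe storing a compact representation of a state's content, attaching several identifiers to one record when different paths arrive at the same state, and selecting a single identifier for scraping. A rough sketch under stated assumptions follows: SHA-256 stands in for the unspecified hash, identifiers are `/`-joined interaction strings, and "fastest path" is approximated by the fewest interactions. `StateRecord`, `store_state`, and `select_identifier` are hypothetical names, not terms from the patent.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StateRecord:
    content_hash: str  # compact representation of the state's content
    identifiers: list = field(default_factory=list)  # every known path to the state


def content_digest(content: str) -> str:
    """Fixed-size digest, typically much smaller than a full screen of content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def store_state(storage: dict, identifier: str, content: str) -> StateRecord:
    """Add an identifier to an existing record instead of creating a duplicate."""
    digest = content_digest(content)
    record = storage.get(digest)
    if record is None:
        record = StateRecord(content_hash=digest)
        storage[digest] = record
    if identifier not in record.identifiers:
        record.identifiers.append(identifier)
    return record


def select_identifier(record: StateRecord) -> str:
    """Assumption: the path with the fewest interactions is the fastest."""
    return min(record.identifiers, key=lambda ident: len(ident.split("/")))


storage = {}
store_state(storage, "tap:search/tap:filter/tap:result_1", "Item 42 details")
store_state(storage, "tap:deals/tap:deal_1", "Item 42 details")
record = next(iter(storage.values()))
print(len(storage), select_identifier(record))
```

Keying the storage by digest makes the duplicate check a dictionary lookup. A real crawler would likely need a fuzzier comparison, since two visits to the same screen can differ in timestamps or advertisements.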

16. A method for automated acquisition of content from an application and provision of content for searching by a user device, the method comprising:
- executing, using at least one processor of an apparatus for automated content acquisition and provision, an instance of the application;
  
  for a selected state of the application:
  
  (i) controlling, using the at least one processor, the executing application instance to navigate to the selected state, and
  
  (ii) identifying, using the at least one processor, a first set of application states reachable from the selected state, each of the first set of application states being reachable via a respective user interface interaction with the selected state;
  
  storing records in a state storage based on the first set of application states, wherein a first state record includes:
  
  (i) a representation of content of a first state of the first set of application states, and
  
  (ii) a unique identifier that uniquely identifies the first state within the stored records, the unique identifier of the first state indicating a path followed within the executing application instance from a default state of the application to the first state, and the path including the user interface interaction corresponding to the first state;
  
  for each of the stored records, extracting, using the at least one processor, text and metadata from the state; and
  
  storing information based on the extracted text and metadata in a data store.
  - 17. The method of claim 16 wherein the instance of the application is executed within an emulator.
  - 18. The method of claim 16 wherein the unique identifier of the first state is a uniform resource identifier (URI).
  - 19. The method of claim 16 further comprising forming the unique identifier of the first state by concatenating, in order, each user interface interaction triggered when navigating from the default state to the first state.
  - 20. The method of claim 16 wherein the default state is a home state of the application and wherein, upon execution, the application presents the home state to a user of the application.
  - 21. The method of claim 16 wherein the representation of the content of the first state requires less storage space than does the content of the first state.
  - 22. The method of claim 21 further comprising generating the representation of the content of the first state by calculating a hash of the content of the first state.
  - 23. The method of claim 16 wherein:
    - the first state record includes a second unique identifier that uniquely identifies the first state within the stored records; and
      
      the unique identifier of the first state indicates a second path followed within the application from the default state to the first state.
  - 24. The method of claim 23 further comprising:
    - determining whether the path and the second path both arrive at the first state; and
      
      in response to determining that the path and the second path both arrive at the first state, adding the second unique identifier to the first state record instead of creating a second state record.
  - 25. The method of claim 23 further comprising selecting a single unique identifier for each of the records for use in the extracting.
  - 26. The method of claim 25 further comprising, for each of the stored records, navigating to the state specified by the selected single unique identifier using the path indicated by the selected single unique identifier before extracting text and metadata from the state.
  - 27. The method of claim 25 wherein the selecting includes, for each of the records, selecting the single unique identifier that identifies a fastest path to the respective state.
  - 28. The method of claim 16 further comprising:
    - identifying when an application programming interface call is available for navigating to the first state; and
      
      storing a second unique identifier in the first state record, wherein the second unique identifier indicates the application programming interface call.
  - 29. The method of claim 28 further comprising, for a second state reachable from the first state via a first user interface interaction:
    - storing a second state record including a unique identifier based on the application programming interface call and the first user interface interaction; and
      
      adding a second unique identifier to the second state record, wherein the second unique identifier indicates the first user interface interaction and user interface interactions followed from the default state to the first state.
  - 30. The method of claim 16, further comprising:
    - in response to receiving a query from a user device, selecting records from the data store to form a consideration set of records;
      
      assigning a score to each record of the consideration set of records; and
      
      responding to the user device with a subset of the consideration set of records, wherein the subset is selected based on the assigned scores, and wherein the subset identifies application states of applications that are relevant to the query.
  - 31. A non-transitory computer-readable medium storing processor-executable instructions configured to perform the method of claim 16.
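
The query handling recited in claims 15 and 30 (form a consideration set, score each record, respond with a subset chosen by score) might look like the sketch below. The claims do not specify a scoring function, so simple term overlap is used here as a stand-in; the record layout, `DATA_STORE`, and `search` are hypothetical.

```python
# Hypothetical data store: one entry per scraped state.
DATA_STORE = [
    {"identifier": "tap:deals/tap:deal_1", "text": "pizza deal half price"},
    {"identifier": "tap:search/tap:result_1", "text": "thai noodle restaurant"},
    {"identifier": "tap:menu", "text": "pizza menu margherita"},
]


def search(data_store, query, limit=2):
    """Return identifiers of the best-scoring records for a query."""
    terms = set(query.lower().split())

    # Consideration set: records sharing at least one term with the query.
    consideration = [
        rec for rec in data_store if terms & set(rec["text"].lower().split())
    ]

    # Score: number of query terms present in the record text (an assumption).
    scored = [
        (len(terms & set(rec["text"].lower().split())), rec)
        for rec in consideration
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)

    # Respond with the top-scoring subset, identified by application state.
    return [rec["identifier"] for _, rec in scored[:limit]]


print(search(DATA_STORE, "pizza price"))
```

Returning state identifiers rather than text lets the requesting device open the matching application state directly, which is the purpose of recording the path in the first place.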

Current Assignee: Samsung Electronics Co. Ltd.
Original Assignee: Samsung Electronics Co. Ltd.
Inventors: Desineni, Kalyan; Sankaranarasimhan, Manikandan; Singh, Brahm; Mohan, Sudhir
Primary Examiner(s): Aspinwall, Evan
Application Number: US14/849,540
Publication Number: US 20160335333A1
Time in Patent Office: 1,154 Days
Field of Search

707602
US Class Current
CPC Class Codes

G06F 16/1748   De-duplication implemented ...

G06F 16/24556   Aggregation; Duplicate elim...

G06F 16/24578   using ranking

G06F 16/254   Extract, transform and load...

G06F 16/9024   Graphs; Linked lists G06F16...

G06F 16/951   Indexing; Web crawling tech...

G06F 16/9535   Search customisation based ...

G06F 16/9538   Presentation of query results

G06F 16/954   Navigation, e.g. using cate...

G06F 16/9558   Details of hyperlinks; Mana...

G06F 3/04842   Selection of displayed obje...

G06F 3/04847   Interaction techniques to c...

H04L 67/02   based on web technology, e....
