QUERY-BY-EXAMPLE IN LARGE-SCALE CODE REPOSITORIES
Abstract
Systems and methods for performing query-by-example are described. A query module executing on the system may maintain a source code repository containing a plurality of source code files. Each of the plurality of source code files is associated with a corresponding source syntax structure generated based on said each of the plurality of source code files. The query module may receive a query snippet, and generate a query syntax structure based on the query snippet. The query module may then identify a first source code file from the plurality of source code files for being relevant to the query snippet. The being relevant to the query snippet is determined by a first relevance score which is calculated based on the query syntax structure and the first source code file's corresponding source syntax structure.
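The flow described in the abstract (maintain a repository of parsed files, parse the query snippet, score each file, pick the most relevant one) can be sketched as follows. This is a minimal illustration, not the patented method: it assumes Python source files, uses Python's standard ast module as the syntax structure, and uses a made-up relevance score (the fraction of the query's parent/child node-type pairs that also occur in the source file). The function names build_repository, edge_profile, relevance_score and most_relevant are hypothetical.

```python
import ast
from collections import Counter


def syntax_structure(source: str) -> ast.AST:
    """Parse source text into a syntax tree (here, a Python AST)."""
    return ast.parse(source)


def edge_profile(tree: ast.AST) -> Counter:
    """Count (parent node type, child node type) pairs in the tree."""
    profile = Counter()
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            profile[(type(parent).__name__, type(child).__name__)] += 1
    return profile


def relevance_score(query_tree: ast.AST, source_tree: ast.AST) -> float:
    """Assumed score: share of the query's edges that the source also has."""
    query_edges = edge_profile(query_tree)
    source_edges = edge_profile(source_tree)
    total = sum(query_edges.values())
    if total == 0:
        return 0.0
    # Counter & Counter keeps the minimum count of each shared edge.
    return sum((query_edges & source_edges).values()) / total


def build_repository(files: dict) -> dict:
    """Associate each file name with its pre-computed syntax structure."""
    return {name: syntax_structure(text) for name, text in files.items()}


def most_relevant(repository: dict, snippet: str) -> tuple:
    """Return (file name, score) for the highest-scoring file."""
    query_tree = syntax_structure(snippet)
    scores = {
        name: relevance_score(query_tree, tree)
        for name, tree in repository.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


files = {
    "loop.py": "for i in range(10):\n    print(i)\n",
    "add.py": "def add(a, b):\n    return a + b\n",
}
repository = build_repository(files)
print(most_relevant(repository, "for x in items:\n    print(x)\n"))
```

Because the score compares node types rather than identifiers, the query loop over items still matches the loop over range(10); a real system would likely need a more discriminating score.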
20 Claims
1. A system configured to perform query-by-example, the system comprising a processor and a memory coupled with the processor, wherein the memory is configured to provide the processor with instructions for:
maintaining, by a query module executing on the system, a source code repository containing a plurality of source code files, wherein each of the plurality of source code files is associated with a corresponding source syntax structure generated based on said each of the plurality of source code files;
receiving, by the query module, a query snippet;
generating, by the query module, a query syntax structure based on the query snippet; and
identifying, by the query module, a first source code file from the plurality of source code files for being relevant to the query snippet, wherein the being relevant to the query snippet is determined by a first relevance score which is calculated based on the query syntax structure and the first source code file's corresponding source syntax structure.
2. The system of claim 1, wherein the query syntax structure is a syntax tree containing a plurality of hierarchical nodes associated with syntactic elements extracted from the query snippet.
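To make the notion of a syntax tree with hierarchical nodes concrete, the sketch below parses a one-line query snippet and prints its nodes indented by depth. It assumes the snippet is Python and relies on the standard ast module; the helper names to_hierarchy and render are hypothetical and are not taken from the patent.

```python
import ast


def to_hierarchy(node: ast.AST) -> tuple:
    """Return (node type name, [child hierarchies]) for a syntax-tree node."""
    children = [to_hierarchy(child) for child in ast.iter_child_nodes(node)]
    return (type(node).__name__, children)


def render(hierarchy: tuple, depth: int = 0) -> list:
    """Flatten the hierarchy into lines, indenting two spaces per level."""
    name, children = hierarchy
    lines = ["  " * depth + name]
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines


query_snippet = "total = price * quantity"
tree = to_hierarchy(ast.parse(query_snippet))
print("\n".join(render(tree)))
```

Each printed line is one node; its indentation shows which syntactic element (assignment, binary operation, name) contains which.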
11. A method for performing query-by-example, the method being performed in a system comprising a processor and a memory coupled with the processor, the method comprising:
maintaining, by a query module executing on the system, a plurality of source code files, wherein each of the plurality of source code files is associated with a corresponding plurality of source characteristic vectors generated based on said each of the plurality of source code files;
receiving, by a query module, a query snippet;
generating, by the query module, a plurality of query characteristic vectors based on the query snippet; and
identifying, by the query module, a first source code file from the plurality of source code files for being relevant to the query snippet, wherein the being relevant to the query snippet is determined by a first relevance score calculated based on the plurality of query characteristic vectors and the first source code file's corresponding plurality of source characteristic vectors.
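The claim does not say how characteristic vectors are built or compared, so the following is only one possible reading, stated as an assumption: each statement subtree is summarized as a vector of node-type counts, and the relevance score is the fraction of query vectors that lie within a Euclidean distance threshold of some source vector. The feature list NODE_TYPES, the min_nodes cutoff and the max_distance threshold are all hypothetical choices made for this sketch.

```python
import ast
import math

# Hypothetical feature set: one vector dimension per node type of interest.
NODE_TYPES = ("For", "While", "If", "Call", "Assign",
              "BinOp", "Compare", "Return", "Name")


def subtree_vector(node: ast.AST) -> tuple:
    """Count occurrences of each tracked node type inside one subtree."""
    counts = dict.fromkeys(NODE_TYPES, 0)
    for inner in ast.walk(node):
        name = type(inner).__name__
        if name in counts:
            counts[name] += 1
    return tuple(counts[t] for t in NODE_TYPES)


def characteristic_vectors(source: str, min_nodes: int = 3) -> list:
    """One vector per statement subtree with at least min_nodes counted nodes."""
    vectors = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.stmt):
            vector = subtree_vector(node)
            if sum(vector) >= min_nodes:
                vectors.append(vector)
    return vectors


def relevance_score(query_vectors, source_vectors, max_distance=1.5) -> float:
    """Assumed score: share of query vectors with a nearby source vector."""
    if not query_vectors:
        return 0.0
    matched = sum(
        1 for q in query_vectors
        if any(math.dist(q, s) <= max_distance for s in source_vectors)
    )
    return matched / len(query_vectors)


query = "for v in values:\n    if v > 0:\n        print(v)\n"
source_a = "def f(items):\n    for x in items:\n        if x > 0:\n            print(x)\n"
source_b = "def g(a, b):\n    return a + b\n"

query_vectors = characteristic_vectors(query)
score_a = relevance_score(query_vectors, characteristic_vectors(source_a))
score_b = relevance_score(query_vectors, characteristic_vectors(source_b))
print(score_a, score_b)
```

Precomputing the source vectors once per file and comparing only vectors at query time is what would make this style of matching practical for a large repository; an approximate nearest-neighbor index could replace the linear scan used here.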
18. A non-transitory machine readable storage medium embodying computer software, the computer software causing a computer to perform a method, the method comprising:
maintaining, by a query module executing on the system, a plurality of source code files, wherein each of the plurality of source code files is associated with a corresponding plurality of source characteristic vectors generated based on said each of the plurality of source code files;
receiving, by a query module, a query snippet;
generating, by the query module, a plurality of query characteristic vectors based on the query snippet; and
identifying, by the query module, a first source code file from the plurality of source code files for being relevant to the query snippet, wherein the being relevant to the query snippet is determined by a first relevance score calculated based on the plurality of query characteristic vectors and the first source code file's corresponding plurality of source characteristic vectors.
Specification