Systems and methods for finding project-related information by clustering applications into related concept categories
Abstract
A system, method, and computer-readable medium are described that find similarities among programming applications based on semantic anchors found within the source code of those applications. The semantic anchors may be API calls, such as calls to the packages and classes of Java's JDK. Latent Semantic Indexing may be used to process the application and semantic anchor data and automatically develop a similarity matrix that contains numbers representing the similarity of one program to another.
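The following is a minimal sketch of how a similarity matrix of this kind might be computed; it is not the patented implementation. It assumes the semantic anchors (API names) have already been extracted from each application's source code, weights each anchor by a standard inverse document frequency (one common way to weight a term by the number of applications in which it is present), and compares applications with plain cosine similarity. The Latent Semantic Indexing step mentioned above is omitted. The application names, anchor lists, and function names are all hypothetical.

```python
import math
from collections import Counter


def build_weighted_matrix(anchors_by_app):
    """Return (terms, {app: weight vector}) for a weighted term-document matrix.

    Each weight is (count of the anchor in the app) * log(N / df), where df is
    the number of applications whose source code contains that anchor.
    """
    n_apps = len(anchors_by_app)
    doc_freq = Counter()
    for anchors in anchors_by_app.values():
        doc_freq.update(set(anchors))  # count each anchor once per application
    terms = sorted(doc_freq)
    matrix = {}
    for app, anchors in anchors_by_app.items():
        counts = Counter(anchors)
        matrix[app] = [counts[t] * math.log(n_apps / doc_freq[t]) for t in terms]
    return terms, matrix


def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(anchors_by_app):
    """Assign a number to every pair of applications."""
    _, matrix = build_weighted_matrix(anchors_by_app)
    apps = list(matrix)
    return {a: {b: cosine(matrix[a], matrix[b]) for b in apps} for a in apps}


# Hypothetical anchors, assumed to be extracted from source code beforehand.
apps = {
    "editor": ["java.io.File", "javax.swing.JFrame", "java.util.List"],
    "viewer": ["javax.swing.JFrame", "java.awt.Image", "java.util.List"],
    "server": ["java.net.ServerSocket", "java.io.File", "java.util.List"],
}
sim = similarity_matrix(apps)
```

With this weighting, an anchor that appears in every application (here `java.util.List`) receives a weight of zero and so does not contribute to any similarity score, while anchors shared by only a few applications dominate the comparison.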
44 Claims
1. A computer-implemented method of determining whether applications are similar, comprising:
- receiving, by a computer, source code for a plurality of applications;
- associating, for each application, semantic anchors found in the source code for that application with the application, wherein associating semantic anchors comprises building at least one weighted term document matrix from the semantic anchors, the at least one weighted term document matrix comprising at least a first term weighted based on at least a number of the plurality of applications in which a first semantic anchor is present in the source code for those applications;
- comparing, based on the semantic anchors, a similarity of the first application to a second application; and
- assigning, based on the comparison, a number representing the similarity of the first and second applications.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A system of calculating whether program applications have similarities, comprising:
- a non-transitory memory storing instructions; and
- a processor executing the instructions to cause the system to perform a method comprising:
  - receiving, by a computer, source code for a plurality of applications;
  - associating, for each application, semantic anchors found in the source code for that application with the application, wherein associating semantic anchors comprises building at least one weighted term document matrix from the semantic anchors and source code, the at least one weighted term document matrix comprising at least a first term weighted based on at least a number of the plurality of applications in which a first semantic anchor is present in the source code for those applications;
  - comparing, based on the semantic anchors, a similarity of the first application to a second application; and
  - assigning, based on the comparison, a number representing the similarity of the first and second applications.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
23. A non-transitory computer-readable storage medium containing instructions which, when executed on a processor, perform a method comprising:
- receiving, by a computer, source code for a plurality of applications;
- associating, for each application, semantic anchors found in the source code for that application with the application, wherein associating semantic anchors comprises building at least one weighted term document matrix from the semantic anchors, the at least one weighted term document matrix comprising at least a first term weighted based on at least a number of the plurality of applications in which a first semantic anchor is present in the source code for those applications;
- comparing, based on the semantic anchors, a similarity of the first application to a second application; and
- assigning, based on the comparison, a number representing the similarity of the first and second applications.
Dependent claims: 24, 25, 26, 27, 28, 29, 30, 31, 32, 33
34. A method for providing similar applications to a user, comprising:
- receiving from the user a search request;
- sending to the user, a first list of applications based on the search request;
- receiving from the user a selection of one of the applications on the first list;
- finding related applications, based on a similarity matrix and the selection; and
- sending to the user, a second list of related applications,
wherein the similarity matrix is determined by a method comprising:
  - receiving, by a computer, source code for a plurality of applications;
  - associating, for each application, semantic anchors found in the source code for that application with the application, wherein associating semantic anchors comprises building at least one weighted term document matrix from the semantic anchors, the at least one weighted term document matrix comprising at least a first term weighted based on at least a number of the plurality of applications in which a first semantic anchor is present in the source code for those applications;
  - comparing, based on the semantic anchors, a similarity of the first application to a second application; and
  - assigning, based on the comparison, a number representing the similarity of the first and second applications.
Dependent claims: 35, 36, 37, 38, 39, 40, 41, 42, 43, 44
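The request and response flow recited in claim 34 can be illustrated with a short sketch. This is an illustration under stated assumptions, not the claimed implementation: the keyword search, the catalog, the precomputed similarity values, and the `top_k` and `threshold` parameters are all hypothetical, and the claim does not specify how the first list is produced or how many related applications are returned.

```python
def search(catalog, query):
    """First list: applications whose description mentions the query.

    A plain substring match stands in for whatever search engine is used.
    """
    q = query.lower()
    return [name for name, desc in catalog.items() if q in desc.lower()]


def related(similarity, selected, top_k=2, threshold=0.0):
    """Second list: applications ranked by similarity to the selected one."""
    scores = [
        (other, score)
        for other, score in similarity[selected].items()
        if other != selected and score > threshold
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in scores[:top_k]]


# Hypothetical catalog and precomputed similarity matrix.
catalog = {
    "editor": "Text and image editor",
    "viewer": "Image viewer",
    "server": "HTTP file server",
}
similarity = {
    "editor": {"editor": 1.0, "viewer": 0.6, "server": 0.1},
    "viewer": {"editor": 0.6, "viewer": 1.0, "server": 0.0},
    "server": {"editor": 0.1, "viewer": 0.0, "server": 1.0},
}

first_list = search(catalog, "image")          # sent to the user
selection = first_list[0]                      # the user's selection
second_list = related(similarity, selection)   # sent back as related applications
```

Because the similarity matrix is computed ahead of time from the source code, the lookup at request time is only a row read and a sort, independent of how the first list was obtained.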
Specification