Systems and methods for finding project-related information by clustering applications into related concept categories
Abstract
A system, method, and computer-readable medium are described that find similarities among programming applications based on semantic anchors found within the source code of such applications. The semantic anchors may be API calls, such as calls to the packages and classes of Java's JDK. Latent Semantic Indexing may be used to process the application and semantic anchor data and automatically develop a similarity matrix that contains numbers representing the similarity of one program to another.
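As an illustration of how Latent Semantic Indexing can turn anchor counts into a similarity matrix, the following is a minimal sketch and not the patented implementation. It assumes a small term-document matrix whose rows are API packages and whose columns are applications, reduces it with a truncated singular value decomposition, and compares applications by cosine similarity in the reduced space. The function name `lsi_similarity`, the rank `k`, and the counts are all hypothetical; NumPy is assumed to be available.

```python
import numpy as np


def lsi_similarity(tdm, k):
    """Return an apps-by-apps cosine similarity matrix from a rank-k LSI projection.

    tdm: 2-D array-like, rows = semantic anchors (e.g. API packages),
         columns = applications.
    k:   number of singular values to keep (assumed 1 <= k <= min(tdm.shape)).
    """
    u, s, vt = np.linalg.svd(np.asarray(tdm, dtype=float), full_matrices=False)
    # Coordinates of each application in the k-dimensional latent space.
    docs = (np.diag(s[:k]) @ vt[:k, :]).T
    norms = np.linalg.norm(docs, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0  # avoid dividing by zero for apps with no anchors
    unit = docs / norms
    return unit @ unit.T


# Hypothetical counts: rows are java.io, java.net, javax.swing;
# columns are applications A, B, and C.
TDM = [
    [3, 2, 0],
    [1, 2, 0],
    [0, 0, 4],
]
SIM = lsi_similarity(TDM, k=2)
```

In this toy input, applications A and B share the I/O and networking packages while C uses only the GUI package, so `SIM` rates A and B as far more similar to each other than either is to C.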
20 Claims
1. A system, comprising:
- a computer including:
  - a non-transitory memory device storing instructions and a similarity matrix, the similarity matrix combining:
    - a first categorization matrix defining a similarity between computer applications in a plurality of computer applications according to a first categorization of computer program calls,
    - a second categorization matrix defining a similarity between the computer applications according to a second categorization of the computer program calls, and
    - wherein the computer program calls are defined hierarchically and the first categorization corresponds to a first level of a hierarchy and the second categorization corresponds to a second level of the hierarchy; and
  - a processor that executes the instructions, causing the computer to:
    - receive a selection of one of the plurality of computer applications, and
    - indicate at least one of the plurality of computer applications using the stored similarity matrix and based on the selected one of the plurality of computer applications.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
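The lookup recited in claim 1 can be pictured with a short sketch. The claim states only that the similarity matrix combines the two categorization matrices; the element-wise weighted sum below is an assumption made for illustration, as are the names `combine` and `most_similar`, the weight, the application names, and every matrix value.

```python
def combine(first, second, weight=0.5):
    """Element-wise weighted sum of two equally sized square matrices.

    The weighted sum is an assumed combination rule, not one taken from the claim.
    """
    return [
        [weight * a + (1.0 - weight) * b for a, b in zip(row1, row2)]
        for row1, row2 in zip(first, second)
    ]


def most_similar(apps, similarity, selected, top_n=2):
    """Rank the other applications by similarity to the selected one."""
    i = apps.index(selected)
    ranked = sorted(
        (j for j in range(len(apps)) if j != i),
        key=lambda j: similarity[i][j],
        reverse=True,
    )
    return [apps[j] for j in ranked[:top_n]]


# Hypothetical applications and per-level similarity matrices.
APPS = ["editor", "browser", "ftp-client"]
FIRST_LEVEL = [
    [1.0, 0.2, 0.1],
    [0.2, 1.0, 0.8],
    [0.1, 0.8, 1.0],
]
SECOND_LEVEL = [
    [1.0, 0.4, 0.1],
    [0.4, 1.0, 0.6],
    [0.1, 0.6, 1.0],
]
SIMILARITY = combine(FIRST_LEVEL, SECOND_LEVEL)
```

Calling `most_similar(APPS, SIMILARITY, "browser")` ranks `ftp-client` ahead of `editor`, because the combined row for `browser` scores them at roughly 0.7 and 0.3.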
13. A system for determining similar computer applications, comprising:
- one or more memories to store a plurality of computer applications;
- one or more processors to implement:
  - a metadata extractor configured to receive the plurality of computer applications from the memory and to generate tuples associating at least one of the plurality of computer applications with computer program calls, the computer program calls being defined hierarchically with a first categorization corresponding to a first level of a hierarchy and a second categorization corresponding to a second level of the hierarchy;
  - a term document matrix builder configured to receive the tuples and to generate a first term document matrix according to the first categorization and a second term document matrix according to the second categorization;
  - a similarity matrix builder configured to receive the first term document matrix and the second term document matrix and to generate a similarity matrix using singular value decomposition of the first term document matrix and the second term document matrix; and
  - a search engine configured to receive a selection of one of the plurality of computer applications and provide an indication of at least one of the computer applications using the similarity matrix and based on the selected one of the plurality of computer applications.
Dependent claims: 14, 15, 16, 17, 18
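The tuple and term-document-matrix stages of claim 13 might look like the sketch below. It assumes that the two hierarchy levels are Java packages and classes, as the abstract suggests, and that each tuple is an `(application, package, class)` triple. The tuple layout, the function name `build_tdm`, and the sample data are hypothetical, and the singular value decomposition step is left out.

```python
from collections import Counter


def build_tdm(tuples, level):
    """Build a term-document matrix of call counts at one hierarchy level.

    tuples: iterable of (application, package, class_name) triples.
    level:  "package" for the coarser level, "class" for the finer one.
    Returns (terms, apps, matrix), where matrix[t][a] is how often
    application apps[a] references terms[t].
    """
    counts = Counter()
    for app, package, class_name in tuples:
        term = package if level == "package" else f"{package}.{class_name}"
        counts[(term, app)] += 1
    terms = sorted({term for term, _ in counts})
    apps = sorted({app for _, app in counts})
    # A Counter returns 0 for pairs that never occurred.
    matrix = [[counts[(term, app)] for app in apps] for term in terms]
    return terms, apps, matrix


# Hypothetical extractor output for two applications.
TUPLES = [
    ("app1", "java.io", "File"),
    ("app1", "java.io", "FileReader"),
    ("app2", "java.io", "File"),
    ("app2", "java.net", "URL"),
]
PKG_TERMS, PKG_APPS, PKG_TDM = build_tdm(TUPLES, "package")
CLS_TERMS, CLS_APPS, CLS_TDM = build_tdm(TUPLES, "class")
```

Building both matrices from the same tuples keeps their application columns aligned, which a later step would need in order to merge the per-level similarities into one matrix.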
19. A method comprising:
- generating a similarity matrix using a computer server, comprising:
  - receiving a plurality of computer applications from a computer application database,
  - generating tuples associating at least one of the plurality of computer applications with computer program calls, categories of a first categorization, and categories of a second categorization, wherein the computer program calls are defined hierarchically, the first categorization corresponding to a first level of the hierarchy and the second categorization corresponding to a second level of the hierarchy,
  - generating, based on the tuples, a first term document matrix according to the first categorization and a second term document matrix according to the second categorization, and
  - generating the similarity matrix from the first term document matrix and the second term document matrix, elements of the similarity matrix indicating a similarity between computer applications corresponding to rows of the similarity matrix and computer applications corresponding to columns of the similarity matrix; and
- providing an indication of relevant computer applications using the computer server, comprising:
  - providing a web service for processing code search requests,
  - indicating first relevant computer applications based on a code search request, and
  - indicating second relevant computer applications using the similarity matrix based on a selection of one of the first relevant computer applications.
Dependent claims: 20
Specification