Methods and Systems for Protein and Peptide Evidence Assembly
Abstract
The present teachings provide methods and systems for the identification of proteins via peptide analysis. Some embodiments analyze proteins identified by analysis techniques such as mass spectrometry and build protein groups out of the results. Groups can be formed by collecting like proteins and examining each group to determine whether it is likely that only one form of a protein is present or whether there is enough evidence to support the presence of alternate forms. Various embodiments provide visual reports that can be interactive. These reports can allow a user to visualize relationships between proteins both within a group and between groups. Methods are also introduced that can reduce the identification of false positives by taking into account a priori information.
1 Claim
1. A method of identifying proteins comprising,
a. receiving mass spectrometry data comprising a list of putative proteins, and for each protein in said list, a list of peptides contained in each protein and an associated confidence value for each peptide in said list of peptides in each protein in said list,
b. calculating a first score for each putative protein based on the confidence values associated with each peptide in each putative protein,
c. setting a second score for each putative protein equal to said first score,
d. creating a ranked list of the putative proteins where the ranking is in descending order of each putative protein's second score,
e. associating a first protein group with the first putative protein on the ranked list, where the members of said first group are all other putative proteins that have a peptide in common with said first putative protein on the ranked list,
f. for all putative proteins except the putative protein with the highest second score, subtracting from their second score any contributions to the second score that are based on the confidence values associated with any peptides in common with the putative protein with the highest score,
g. creating one or more additional protein groups using steps e-g for subsequent putative proteins on said ranked list,
h. reporting to the end-user all putative proteins with a non-zero second score.
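Steps a through h of the claim can be sketched in Python as follows. This is an illustrative reading of the claim, not the patented implementation, and it rests on several assumptions that the claim does not state: the first score is taken to be the plain sum of peptide confidence values, the list is re-ranked after every subtraction, ties are broken alphabetically, and a group member can still lead a later group if it keeps a non-zero second score. The function name `assemble_protein_groups` and the input layout (a dict mapping each protein to a dict of peptide confidences) are hypothetical.

```python
def assemble_protein_groups(peptide_conf):
    """Greedy grouping sketch. peptide_conf maps protein -> {peptide: confidence}."""
    # Steps b and c: first score (ASSUMED to be a plain sum), copied to the second score.
    first = {prot: sum(peps.values()) for prot, peps in peptide_conf.items()}
    second = dict(first)

    # Peptides that still contribute to each protein's second score.
    live = {prot: set(peps) for prot, peps in peptide_conf.items()}
    unprocessed = set(peptide_conf)
    groups = []

    while unprocessed:
        # Step d: top of the current ranking (ASSUMED re-ranked each pass;
        # sorted() makes the tie-break alphabetical).
        leader = max(sorted(unprocessed), key=second.get)
        if second[leader] <= 0:
            break  # no remaining protein has independent evidence
        leader_peps = set(live[leader])

        # Step e: members share at least one still-live peptide with the leader.
        members = sorted(
            p for p in unprocessed if p != leader and live[p] & leader_peps
        )

        # Step f: remove the shared peptides' contributions from every other protein.
        # Recomputing the sum from the remaining peptides is equivalent to
        # subtracting and avoids floating-point residue around zero.
        for p in unprocessed:
            if p == leader:
                continue
            live[p] -= leader_peps
            second[p] = sum(peptide_conf[p][q] for q in sorted(live[p]))

        groups.append((leader, members))
        unprocessed.remove(leader)  # step g: continue with the next protein

    # Step h: report proteins whose second score is still non-zero.
    reported = sorted((p for p in second if second[p] > 0),
                      key=lambda p: (-second[p], p))
    return groups, second, reported


# Hypothetical example data: P2 is fully explained by P1's peptides,
# while P3 keeps one peptide (pepD) that P1 does not have.
example = {
    "P1": {"pepA": 0.99, "pepB": 0.95, "pepC": 0.90},
    "P2": {"pepA": 0.99, "pepB": 0.95},
    "P3": {"pepC": 0.90, "pepD": 0.80},
    "P4": {"pepE": 0.70},
}
groups, second, reported = assemble_protein_groups(example)
```

With this example data, P1 leads the first group with P2 and P3 as members, P2's second score drops to zero because all of its peptides are shared with P1, and P1, P3, and P4 are reported. Tracking the set of still-contributing peptides per protein, rather than subtracting confidence values directly, is a design choice of this sketch that keeps a fully explained protein at exactly zero.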
Specification