Quality Control Analysis
Introduction
In shotgun experiments, it is vital to verify key quality control metrics in order to be sure that nothing went wrong in
the identification process. PeptideShaker allows you to screen metrics at the protein, peptide and peptide to spectrum match (PSM)
level. If a metric of interest is missing, please contact the developers and we will try to add it.
PeptideShaker sorts the selected population according to the selected metric. Matches are separated into three classes:
-
Confident: the number of validated, confident matches. Note that the validation settings can be tuned in the Validation tab.
-
Doubtful: the number of doubtful matches, meaning proteins, peptides and PSMs that have passed the validation filter, but
for which the amount of evidence is low. An example is a protein identified by only one validated peptide.
-
Not Validated: the number of matches that did not pass the statistical validation.
These plots will also help you define thresholds for downstream analyses such as inclusion/exclusion list generation,
quantification analysis or PTM investigations.
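The sorting of matches by metric and class described above can be sketched as follows. This is only an illustration, not PeptideShaker's implementation: the function name, the input format (pairs of metric value and class label) and the `CLASSES` tuple are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical labels mirroring the three classes described above.
CLASSES = ("Confident", "Doubtful", "Not Validated")


def count_classes_per_metric_value(matches):
    """Group matches by metric value and count each class within a group.

    `matches` is an iterable of (metric_value, class_label) pairs.
    Returns a dict ordered by metric value: {value: Counter of class labels}.
    """
    histogram = {}
    for value, label in matches:
        if label not in CLASSES:
            raise ValueError(f"unknown class: {label!r}")
        histogram.setdefault(value, Counter())[label] += 1
    # Sort by metric value so the result reads like the x-axis of a QC plot.
    return dict(sorted(histogram.items()))


# Example: number of validated peptides per protein as the metric.
example = [(2, "Confident"), (1, "Doubtful"), (1, "Not Validated"), (2, "Confident")]
per_value = count_classes_per_metric_value(example)
```

Each entry of `per_value` then holds the three class counts for one metric value, which is the information a stacked QC histogram would display.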
More on Quality Control
Proteins
For all proteins, it is possible to inspect:
-
Number of Validated Peptides: Protein matches are most trustworthy when they are supported by many different peptides. The more
peptides per protein, the better the identification. Proteins with many confident peptides are particularly
interesting for follow-up analyses such as quantification studies, or they can be excluded from downstream research using
exclusion lists.
-
MS/MS Quantification Scores: Proteins are usually present over a high dynamic range, indicated by the spread in
spectrum counting scores. The wider the spread, the wider the quantification dynamic range. Note that both
spectrum counting methods usually present similar QC plots.
-
Sequence Coverage: A protein with high sequence coverage is identified with more confidence. Its coverage is also relevant for
further processing such as PTM screening.
-
Sequence Length: Gives an overview of the protein sequence lengths.
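As an illustration of the sequence coverage metric, the sketch below computes the percentage of residues of a protein covered by at least one peptide. It assumes exact substring matching of peptide sequences against the protein sequence; the function name and inputs are hypothetical, and PeptideShaker's own calculation may differ.

```python
def sequence_coverage(protein_sequence, peptides):
    """Return the percentage of protein residues covered by the peptides.

    Every exact occurrence of each peptide in the protein is marked as
    covered; overlapping peptides are counted only once per residue.
    """
    covered = [False] * len(protein_sequence)
    for peptide in peptides:
        if not peptide:
            continue  # ignore empty sequences
        start = protein_sequence.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein_sequence.find(peptide, start + 1)
    if not covered:
        return 0.0
    return 100.0 * sum(covered) / len(covered)


# Two peptides covering 9 of the 10 residues of a toy protein.
coverage = sequence_coverage("MKTAYIAKQR", ["MKTAY", "AKQR"])
```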
Peptides
For all peptides, it is possible to inspect:
-
Number of PSMs per Peptide:
If this number is too low, the confidence in the peptide identification might be limited. This is especially dangerous
for PTM analysis. If this number is too high, the number of different peptides found might be limited. The correct
balance is typically tuned using the number of most intense peaks selected for fragmentation and the exclusion time
applied after a spectrum is recorded.
-
Missed Cleavages: The number of missed cleavages found per peptide indicates the quality of the digestion.
Note however that the number of false positives will be underestimated for PSMs generated in the second pass search of X!Tandem.
-
Peptide Length: Displays the distribution of the length of the detected peptides. If there are numerous
peptides below the minimum length and/or above the maximum peptide length it may be worth looking into extending the
allowed peptide length range in the Import Filters settings found in the New Project dialog when opening
a new project.
-
Number of Modifications: The number of peptides modified by the searched modifications.
-
Modifications Efficiency: The percentage of possible modification sites modified by the searched modifications.
-
Modifications Specificity: The percentage of peptides that could be targeted by the given modifications that were actually modified by the given modifications.
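Two of the peptide metrics above can be sketched in a few lines. The example assumes a trypsin-like rule (cleavage after K or R unless followed by P) for counting missed cleavages, and treats a modification as targeting a fixed set of residues for the efficiency calculation. Both functions and their input formats are hypothetical illustrations, not PeptideShaker's implementation.

```python
def count_missed_cleavages(peptide):
    """Count internal cleavage sites left uncut, assuming a trypsin-like rule.

    A site is a K or R that is not the last residue and is not followed by P.
    """
    missed = 0
    for i in range(len(peptide) - 1):
        if peptide[i] in "KR" and peptide[i + 1] != "P":
            missed += 1
    return missed


def modification_efficiency(peptides, target_residues):
    """Percentage of possible modification sites that are actually modified.

    `peptides` is an iterable of (sequence, modified_positions) pairs, with
    0-based positions. `target_residues` is a string of targeted amino acids.
    """
    possible = 0
    modified = 0
    for sequence, positions in peptides:
        sites = {i for i, aa in enumerate(sequence) if aa in target_residues}
        possible += len(sites)
        modified += len(sites & set(positions))
    return 100.0 * modified / possible if possible else 0.0


# Toy example: a modification targeting M, with 2 of 3 possible sites modified.
efficiency = modification_efficiency(
    [("AMKM", {1}), ("MPEK", {0}), ("PEPK", set())], "M"
)
```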
PSMs
For all Peptide to Spectrum Matches (PSMs), it is possible to inspect:
-
Precursor m/z Error: this plot can indicate a problem in the instrument calibration. Also, it can help to fine
tune the search settings as detailed in Vaudel et al.:
Peptide identification quality control, Proteomics 2011;11(10):2105-14.
-
Precursor Charge: the precursor charge plot allows you to verify the quality of the peptide ionization.
Modifications like iTRAQ or phosphorylation might have an impact on the precursor's ability to carry a charge. This
plot also allows you to fine-tune your search settings. Note that PSMs found in second pass searches (typically
high charges) are not compatible with Target/Decoy analysis. The number of false positive PSMs is thus underestimated!
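For reference, a precursor m/z error is commonly expressed in parts per million (ppm) relative to the theoretical m/z. The sketch below shows that calculation together with the theoretical m/z of a protonated precursor. The function names are hypothetical, and the units and conventions used by PeptideShaker's plot depend on the search settings.

```python
# Mass of a proton in Da (CODATA value, rounded).
PROTON_MASS = 1.007276466879


def theoretical_mz(neutral_mass, charge):
    """m/z of a precursor with the given neutral mass and positive charge."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def precursor_error_ppm(measured_mz, expected_mz):
    """Relative deviation of the measured m/z from the expected m/z, in ppm."""
    return (measured_mz - expected_mz) / expected_mz * 1e6


# A precursor measured 0.005 m/z units above an expected m/z of 500.
error = precursor_error_ppm(500.005, 500.0)
```

A distribution of such errors centered away from zero would be one sign of the calibration problem mentioned above.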