The standard groups inspected by PeptideShaker are PSMs, peptides and proteins. However, if the resulting groups are
large enough to ensure statistical significance, PSMs will be separated according to their charge. Similarly, peptides
can be separated based on their modification status.
This grouping strategy will allow you to increase the sensitivity of the processing without compromising robustness.
Note, however, that changes at the PSM level will affect results at the peptide and protein levels. Similarly, changes
at the peptide level affect the protein level.
It is thus important to apply the upstream changes first: PSMs, then peptides, then proteins.
For more information about peptide grouping see
Vaudel et al.: Peptide identification quality control,
Proteomics 2011;11(10):2105-14.
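The charge-based separation of PSMs described above can be sketched as follows. This is only an illustration, not PeptideShaker's own code: the function name `group_psms_by_charge`, the group labels and the `min_size` cut-off (a simple stand-in for the statistical significance check) are all assumptions.

```python
from collections import defaultdict


def group_psms_by_charge(psms, min_size=100):
    """Split PSMs by charge and pool the groups that are too small.

    psms:     list of (psm_id, charge) pairs.
    min_size: hypothetical minimal group size, used here as a stand-in
              for the statistical significance check.
    """
    by_charge = defaultdict(list)
    for psm_id, charge in psms:
        by_charge[charge].append(psm_id)

    groups = {}
    pooled = []
    for charge, ids in sorted(by_charge.items()):
        if len(ids) >= min_size:
            # Large enough to be processed as a group of its own.
            groups[f"charge {charge}"] = ids
        else:
            # Too small: processed together with the other small groups.
            pooled.extend(ids)
    if pooled:
        groups["other"] = pooled
    return groups
```

With a very small cut-off, `group_psms_by_charge([("a", 2), ("b", 2), ("c", 3)], min_size=2)` keeps the two charge 2 PSMs in a group of their own and pools the single charge 3 PSM into "other".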
The estimator plots will help you improve the accuracy of confidence estimation by adjusting the bin size used to estimate the PEP.
When the PEP value is confidently estimated, probabilistic estimators provide a smoothed version of the classical estimators. However, sometimes the PEP cannot be
accurately estimated, e.g., for small populations. The confidence and probabilistic estimators will then no longer
be reliable.
It is advised to keep the PEP and FDR estimator advanced settings at the default values.
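To illustrate why the bin size matters, here is a minimal sketch of a window-based PEP estimate on a target/decoy hit list. It assumes hits sorted from best to worst score and estimates the PEP in each window as the number of decoys divided by the number of targets. The function name `estimate_pep` and the windowing scheme are assumptions made for this example; the estimator used by PeptideShaker may differ.

```python
def estimate_pep(is_decoy, bin_size):
    """Estimate a PEP for every hit from its neighbours in the ranked list.

    is_decoy: list of booleans, hits sorted from best to worst score.
    bin_size: approximate number of neighbouring hits used per estimate.
    """
    n = len(is_decoy)
    half = bin_size // 2
    peps = []
    for i in range(n):
        # Window centred on hit i, truncated at both ends of the list.
        window = is_decoy[max(0, i - half):min(n, i + half + 1)]
        decoys = sum(window)
        targets = len(window) - decoys
        # Decoy/target ratio, capped at 1; all-decoy windows give PEP = 1.
        peps.append(1.0 if targets == 0 else min(1.0, decoys / targets))
    return peps
```

A small bin follows local changes closely but fluctuates when few hits fall in a window, which is typically the case for small populations. A larger bin gives a smoother but coarser estimate.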
The score threshold used, illustrated by a red vertical line in the confidence plot, can be changed to meet three kinds of requirements:
Confidence: all hits with a confidence greater than the threshold will be validated.
By default the threshold is set to 1% FDR.
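As a rough illustration of an FDR-based threshold such as the 1% default, the sketch below walks down a ranked target/decoy list and keeps the largest set of hits whose classical FDR (decoys divided by targets) stays within the limit. The function name `validate_at_fdr` and the input layout are assumptions for this example, not PeptideShaker's implementation.

```python
def validate_at_fdr(is_decoy, max_fdr=0.01):
    """Return how many top-ranked hits can be retained at max_fdr.

    is_decoy: list of booleans, hits sorted from best to worst score.
    max_fdr:  maximal accepted classical FDR, e.g. 0.01 for 1%.
    """
    decoys = 0
    targets = 0
    best_k = 0
    for k, decoy in enumerate(is_decoy, start=1):
        if decoy:
            decoys += 1
        else:
            targets += 1
        # Classical FDR estimate among the k best hits.
        if targets > 0 and decoys / targets <= max_fdr:
            best_k = k
    return best_k
```

Because the estimated FDR is not monotonic along the list, the loop keeps the last position that satisfies the limit rather than stopping at the first violation.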
The identification summary provides essential metrics for the selected group:
This plot displays the confidence plotted against the score of the selected group's identifications. If the confidence is fluctuating,
the confidence estimation might not be robust enough and should be optimized as described above.
The red vertical line indicates the chosen threshold. The red area on the left of the threshold illustrates the amount of retained
false positives. The green area on the right of the threshold illustrates the amount of potential true positives not validated, i.e., the
false negatives.
Tip: It is important to verify that the confidence reaches 0, otherwise the total number of true positives will be under-estimated.
No red line is displayed? You should use a less restrictive threshold.
This plot displays the two FDR estimators and the FNR estimator plotted against the score of the selected group's identifications.
If the two FDR estimators do not agree, the confidence estimation might not be robust enough and should be optimized.
Three points indicate the FDR and FNR of the validated identifications.
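The three values can be sketched as follows, under the assumption that the classical FDR is the decoy/target ratio among validated hits, the probabilistic FDR is the mean PEP of the validated targets, and the FNR is the share of expected true positives left unvalidated. The function name `error_rates` and these exact formulas are assumptions for illustration and may differ from PeptideShaker's estimators.

```python
def error_rates(peps, is_decoy, k):
    """Classical FDR, probabilistic FDR and FNR when the k best hits are kept.

    peps:     PEP of every hit, hits sorted from best to worst score.
    is_decoy: booleans in the same order as peps.
    k:        number of top-ranked hits that are validated (k <= len(peps)).
    """
    retained = [p for p, d in zip(peps[:k], is_decoy[:k]) if not d]
    all_targets = [p for p, d in zip(peps, is_decoy) if not d]
    n_targets = len(retained)
    n_decoys = sum(1 for d in is_decoy[:k] if d)
    if n_targets == 0:
        raise ValueError("no target hit retained")

    classical_fdr = n_decoys / n_targets
    probabilistic_fdr = sum(retained) / n_targets

    # Expected number of true positives: sum of (1 - PEP) over target hits.
    total_true = sum(1 - p for p in all_targets)
    retained_true = sum(1 - p for p in retained)
    if total_true == 0:
        raise ValueError("no true positive expected")
    fnr = (total_true - retained_true) / total_true
    return classical_fdr, probabilistic_fdr, fnr
```

When the PEP is well estimated, the two FDR values should be close to each other. A large gap between them hints at an unreliable confidence estimation.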
This plot displays the expected benefit, i.e., the proportion of retained true positives (1-FNR), plotted against its
cost, i.e., the proportion of false positive identifications (FDR). In other words, it is a
ROC-like curve for the selected group.
A point indicates the performance at the selected threshold. It is possible to move this point along the curve (by moving the slider
below the plot) in order to optimize the threshold, balancing quality against quantity. If the point diverges from the curve,
the confidence estimation should be optimized.
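The benefit/cost curve described above can be approximated from the PEPs alone. The sketch below assumes target hits sorted from best to worst score, takes the mean PEP of the retained hits as the cost (FDR), and the retained share of expected true positives as the benefit (1-FNR). The function name `benefit_cost_curve` is hypothetical and the formulas are a simplification for illustration.

```python
def benefit_cost_curve(peps):
    """Return one (cost, benefit) point per possible cut-off.

    peps: PEP of the target hits, sorted from best to worst score.
    """
    # Expected number of true positives in the whole group.
    total_true = sum(1 - p for p in peps)
    if total_true == 0:
        raise ValueError("no true positive expected")

    curve = []
    false_sum = 0.0
    true_sum = 0.0
    for k, pep in enumerate(peps, start=1):
        false_sum += pep
        true_sum += 1 - pep
        cost = false_sum / k             # estimated FDR among the k best hits
        benefit = true_sum / total_true  # 1 - FNR
        curve.append((cost, benefit))
    return curve
```

Moving the threshold corresponds to picking another point of this list: later points recover more true positives at the price of a higher FDR.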
This plot displays the Posterior Error Probability (PEP) plotted against the score of the selected group. If the PEP is fluctuating, the confidence estimation is not robust enough and should be optimized.
This plot displays the probabilistic FDR plotted against the classical FDR for identifications with a confidence > 0. The curve should closely follow the black diagonal. If this is not the case, the confidence estimation should be optimized.