Class PeptideUtils
java.lang.Object
com.compomics.util.experiment.identification.utils.PeptideUtils
This class groups functions that can be used to work with peptides.
- Author:
- Marc Vaudel, Harald Barsnes
-
Constructor Summary
Constructors
PeptideUtils(): Empty default constructor.
Method Summary
Modifier and Type, Method: Description
static TreeMap<String,String[]> getAaAfter(Peptide peptide, int nAa, SequenceProvider sequenceProvider): Returns the amino acids after the given peptide as a string in a map based on the peptide protein mapping.
static String getAaAfter(Peptide peptide, String accession, int index, int nAa, SequenceProvider sequenceProvider): Returns the amino acids after the given peptide as a string.
static TreeMap<String,String[]> getAaBefore(Peptide peptide, int nAa, SequenceProvider sequenceProvider): Returns the amino acids before the given peptide as a string in a map based on the peptide protein mapping.
static String getAaBefore(Peptide peptide, String accession, int index, int nAa, SequenceProvider sequenceProvider): Returns the amino acids before the given peptide as a string.
static String getCtermAsString(boolean useShortName, int length, String[]... modificationArrays): Returns the C-terminal annotation as string.
static String getFixedModificationsAsString(Peptide peptide, ModificationParameters modificationParameters, SequenceProvider sequenceProvider, SequenceMatchingParameters modificationSequenceMatchingParameters): Returns the fixed modifications of the peptide as a string.
static int getModifiedAaIndex(int modSite, int sequenceLength): Returns the index of a modification on the amino acid sequence, where 0 is the first amino acid.
static int getNEnzymaticTermini(int peptideStart, int peptideEnd, String proteinSequence, Enzyme enzyme): Returns the number of enzymatic termini for the given peptide coordinates and enzyme on this protein.
static String getNtermAsString(boolean useShortName, String[]... modificationArrays): Returns the N-terminal annotation as string.
static String getTaggedModifiedSequence(Peptide peptide, ModificationParameters modificationParameters, String[] allFixedModifications, String[] allVariableModifications, String[] confidentModificationSites, String[] representativeAmbiguousModificationSites, String[] secondaryAmbiguousModificationSites, String[] fixedModificationSites, boolean useHtmlColorCoding, boolean includeHtmlStartEndTags, boolean useShortName): Returns the modified sequence as a tagged string with potential modification sites color coded or with modification tags, e.g., <mox>.
static String getVariableModificationsAsString(Peptide peptide): Returns the variable modifications of the peptide as a string.
static String getVariableModificationsAsString(ModificationMatch[] modificationMatches): Returns the given modification matches as a string.
static boolean isCterm(Peptide peptide, SequenceProvider sequenceProvider): Indicates whether a peptide is at the C-terminus of a protein.
static boolean isCterm(Peptide peptide, String proteinAccession, SequenceProvider sequenceProvider): Indicates whether a peptide is at the C-terminus of a given protein.
static boolean isCtermEnzymatic(int peptideStart, int peptideEnd, String proteinSequence, Enzyme enzyme): Returns whether the C-terminus of the given peptide is enzymatic for the given coordinates and enzyme on this protein.
static boolean isDecoy(Peptide peptide, SequenceProvider sequenceProvider): Returns a boolean indicating whether the peptide matches a decoy sequence.
static boolean isEnzymatic(Peptide peptide, SequenceProvider sequenceProvider, ArrayList<Enzyme> enzymes): Returns a boolean indicating whether the peptide is enzymatic in at least one protein using one of the given enzymes.
static boolean isEnzymatic(Peptide peptide, String proteinAccession, String proteinSequence, ArrayList<Enzyme> enzymes): Returns a boolean indicating whether the peptide is enzymatic using one of the given enzymes.
static boolean isNterm(Peptide peptide, SequenceProvider sequenceProvider): Indicates whether a peptide is at the N-terminus of a protein.
static boolean isNterm(Peptide peptide, String proteinAccession, SequenceProvider sequenceProvider): Indicates whether a peptide is at the N-terminus of a given protein.
static boolean isNtermEnzymatic(int peptideStart, int peptideEnd, String proteinSequence, Enzyme enzyme): Returns whether the N-terminus of the given peptide is enzymatic for the given coordinates and enzyme on this protein.
static boolean isVariant(Peptide peptide, String accession): Returns a boolean indicating whether the peptide needs variants to be mapped to the given protein.
-
Constructor Details
-
PeptideUtils
public PeptideUtils()
Empty default constructor.
-
-
Method Details
-
isDecoy
public static boolean isDecoy(Peptide peptide, SequenceProvider sequenceProvider)
Returns a boolean indicating whether the peptide matches a decoy sequence.
- Parameters:
peptide - the peptide
sequenceProvider - a sequence provider
- Returns:
- a boolean indicating whether the peptide matches a decoy sequence
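As a rough illustration of the check described above, the sketch below uses plain strings and a set in place of the library's Peptide and SequenceProvider types. It assumes that the peptide is considered a decoy as soon as one of the proteins it maps to is a decoy, and that the set of decoy accessions is known up front. Both points are assumptions, and the class and method names are hypothetical.

```java
final class DecoySketch {

    /**
     * Returns true if at least one of the accessions the peptide maps to
     * is listed as a decoy accession (assumed semantics, not the library code).
     */
    static boolean isDecoy(String[] peptideProteinAccessions, java.util.Set<String> decoyAccessions) {
        for (String accession : peptideProteinAccessions) {
            if (decoyAccessions.contains(accession)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        java.util.Set<String> decoys = new java.util.HashSet<>();
        decoys.add("P12345_REVERSED"); // hypothetical decoy accession

        System.out.println(isDecoy(new String[]{"P12345", "P12345_REVERSED"}, decoys)); // true
        System.out.println(isDecoy(new String[]{"P12345"}, decoys)); // false
    }
}
```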
-
getAaBefore
public static String getAaBefore(Peptide peptide, String accession, int index, int nAa, SequenceProvider sequenceProvider)
Returns the amino acids before the given peptide as a string.
- Parameters:
peptide - the peptide
accession - the accession of the protein
index - the position of the peptide on the protein sequence
nAa - the number of amino acids to include
sequenceProvider - the sequence provider
- Returns:
- the amino acids before the given peptide as a string
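A minimal sketch of looking up the residues that precede a peptide, written against a plain protein sequence string rather than the library's Peptide and SequenceProvider types. It assumes that the peptide position is the 0-based index of its first residue on the protein and that fewer than nAa residues are returned when the peptide sits close to the protein start. Neither point is confirmed by this page, and the class and method names are hypothetical.

```java
final class FlankingBeforeSketch {

    /**
     * Returns up to nAa residues that precede the peptide on the protein.
     * Assumption: peptideStart is the 0-based index of the first peptide residue.
     */
    static String aaBefore(String proteinSequence, int peptideStart, int nAa) {
        // Clamp so that we never read before the first residue of the protein.
        int from = Math.max(0, peptideStart - nAa);
        return proteinSequence.substring(from, peptideStart);
    }

    public static void main(String[] args) {
        String protein = "MKTAYIAKQRQISFVK";
        // The peptide "AYIAK" starts at 0-based index 3.
        System.out.println(aaBefore(protein, 3, 2)); // KT
        System.out.println(aaBefore(protein, 3, 5)); // MKT, truncated at the protein start
    }
}
```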
-
getAaBefore
public static TreeMap<String,String[]> getAaBefore(Peptide peptide, int nAa, SequenceProvider sequenceProvider)
Returns the amino acids before the given peptide as a string in a map based on the peptide protein mapping.
- Parameters:
peptide - the peptide
nAa - the number of amino acids to include
sequenceProvider - the sequence provider
- Returns:
- the amino acids before the given peptide as a string in a map based on the peptide protein mapping
-
getAaAfter
public static String getAaAfter(Peptide peptide, String accession, int index, int nAa, SequenceProvider sequenceProvider)
Returns the amino acids after the given peptide as a string.
- Parameters:
peptide - the peptide
accession - the accession of the protein
index - the position of the peptide on the protein sequence
nAa - the number of amino acids to include
sequenceProvider - the sequence provider
- Returns:
- the amino acids after the given peptide as a string
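The lookup of the residues that follow a peptide can be sketched in the same spirit, again on a plain protein sequence string instead of the library types. The sketch assumes a 0-based peptide start, takes the peptide length as an explicit argument, and returns fewer than nAa residues when the peptide sits close to the protein end. These are assumptions, and the class and method names are hypothetical.

```java
final class FlankingAfterSketch {

    /**
     * Returns up to nAa residues that follow the peptide on the protein.
     * Assumption: peptideStart is the 0-based index of the first peptide residue
     * and the peptide lies entirely within the protein.
     */
    static String aaAfter(String proteinSequence, int peptideStart, int peptideLength, int nAa) {
        int from = peptideStart + peptideLength;
        // Clamp so that we never read past the last residue of the protein.
        int to = Math.min(proteinSequence.length(), from + nAa);
        return proteinSequence.substring(from, to);
    }

    public static void main(String[] args) {
        String protein = "MKTAYIAKQRQISFVK";
        // The peptide "AYIAK" covers 0-based indices 3 to 7.
        System.out.println(aaAfter(protein, 3, 5, 2)); // QR
        // The peptide "ISFVK" ends at the protein C-terminus, so nothing follows it.
        System.out.println(aaAfter(protein, 11, 5, 3).isEmpty()); // true
    }
}
```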
-
getAaAfter
public static TreeMap<String,String[]> getAaAfter(Peptide peptide, int nAa, SequenceProvider sequenceProvider)
Returns the amino acids after the given peptide as a string in a map based on the peptide protein mapping.
- Parameters:
peptide - the peptide
nAa - the number of amino acids to include
sequenceProvider - the sequence provider
- Returns:
- the amino acids after the given peptide as a string in a map based on the peptide protein mapping
-
getVariableModificationsAsString
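public static String getVariableModificationsAsString(ModificationMatch[] modificationMatches)
Returns the given modification matches as a string.
- Parameters:
modificationMatches - the modification matches
- Returns: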
- the peptide modifications as a string
-
getVariableModificationsAsString
public static String getVariableModificationsAsString(Peptide peptide)
Returns the variable modifications of the peptide as a string.
- Parameters:
peptide - the peptide
- Returns:
- the peptide modifications as a string
-
getFixedModificationsAsString
public static String getFixedModificationsAsString(Peptide peptide, ModificationParameters modificationParameters, SequenceProvider sequenceProvider, SequenceMatchingParameters modificationSequenceMatchingParameters)
Returns the fixed modifications of the peptide as a string.
- Parameters:
peptide - the peptide
modificationParameters - the modification parameters
sequenceProvider - a provider for the protein sequences
modificationSequenceMatchingParameters - the sequence matching preferences for modification to peptide mapping
- Returns:
- the peptide modifications as a string
-
getTaggedModifiedSequence
public static String getTaggedModifiedSequence(Peptide peptide, ModificationParameters modificationParameters, String[] allFixedModifications, String[] allVariableModifications, String[] confidentModificationSites, String[] representativeAmbiguousModificationSites, String[] secondaryAmbiguousModificationSites, String[] fixedModificationSites, boolean useHtmlColorCoding, boolean includeHtmlStartEndTags, boolean useShortName)
Returns the modified sequence as a tagged string with potential modification sites color coded or with modification tags, e.g., <mox>.
Warning: this method works only if the modifications found in the peptide are in the ModificationFactory.
Modifications should be provided indexed by site as follows: N-term modifications are at index 0, C-term modifications at index sequence length + 1, and amino acid modifications at the 1-based index of the residue on the sequence.
- Parameters:
peptide - the peptide to annotate
modificationParameters - the modification profile of the search
allFixedModifications - all fixed modifications in an array representing the amino acid sequence
allVariableModifications - all the variable modifications
confidentModificationSites - the confidently localized variable modification sites indexed by site
representativeAmbiguousModificationSites - the representative site of the ambiguously localized variable modifications in a map: aa number > list of modifications (1 is the first AA) (can be null)
secondaryAmbiguousModificationSites - the secondary sites of the ambiguously localized variable modifications in a map: aa number > list of modifications (1 is the first AA) (can be null)
fixedModificationSites - the fixed modifications to display in an array representing the amino acid sequence
useHtmlColorCoding - if true, color coded HTML is used, otherwise modification tags, e.g., <mox>, are used
includeHtmlStartEndTags - if true, start and end HTML tags are added
useShortName - if true, the short names are used in the tags
- Returns:
- the tagged modified sequence as a string
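The site-indexed layout described above (index 0 for the N-terminus, 1-based indices for residues, sequence length + 1 for the C-terminus) implies arrays of length sequence length + 2. The sketch below illustrates that layout with a plain sequence string and a single modification array. The tag placement directly after the modified residue is an assumption made for illustration and may differ from the library's actual output. The class and method names are hypothetical.

```java
final class TaggedSequenceSketch {

    /**
     * Builds a tagged sequence from a site-indexed modification array.
     * Layout: index 0 = N-term, 1..length = residues, length + 1 = C-term.
     * Only residue modifications are rendered in this sketch.
     */
    static String tag(String sequence, String[] modificationsBySite) {
        if (modificationsBySite.length != sequence.length() + 2) {
            throw new IllegalArgumentException("Expected an array of length sequence length + 2");
        }
        StringBuilder tagged = new StringBuilder();
        for (int site = 1; site <= sequence.length(); site++) {
            tagged.append(sequence.charAt(site - 1));
            String modification = modificationsBySite[site];
            if (modification != null) {
                // Hypothetical tag placement: directly after the modified residue.
                tagged.append('<').append(modification).append('>');
            }
        }
        return tagged.toString();
    }

    public static void main(String[] args) {
        String sequence = "PEPMTIDE";
        String[] variableModifications = new String[sequence.length() + 2];
        variableModifications[4] = "mox"; // the fourth residue, M, carries the modification
        System.out.println(tag(sequence, variableModifications)); // PEPM<mox>TIDE
    }
}
```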
-
getNtermAsString
public static String getNtermAsString(boolean useShortName, String[]... modificationArrays)
Returns the N-terminal annotation as string.
- Parameters:
useShortName - if true, the short names are used in the tags
modificationArrays - modifications to annotate in arrays corresponding to the peptide sequence with N-terminus at index 0
- Returns:
- the N-terminal annotation as string
-
getCtermAsString
public static String getCtermAsString(boolean useShortName, int length, String[]... modificationArrays)
Returns the C-terminal annotation as string.
- Parameters:
useShortName - if true, the short names are used in the tags
length - the length of the peptide
modificationArrays - modifications to annotate in arrays corresponding to the peptide sequence with C-terminus at index length + 1
- Returns:
- the C-terminal annotation as string
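A possible reading of the terminal annotation methods is sketched below: each array passed in is inspected at the terminal index, and the first modification found is used as the annotation. The sketch follows the indexing stated for getTaggedModifiedSequence (N-terminus at index 0, C-terminus at index length + 1). The default labels "NH2" and "COOH", the first-match rule, and the omission of the useShortName flag are assumptions. The class and method names are hypothetical.

```java
final class TerminalAnnotationSketch {

    /** Returns the first N-terminal modification found at index 0, or "NH2" (assumed default). */
    static String nTerm(String[]... modificationArrays) {
        for (String[] modificationArray : modificationArrays) {
            if (modificationArray != null && modificationArray[0] != null) {
                return modificationArray[0];
            }
        }
        return "NH2";
    }

    /** Returns the first C-terminal modification found at index length + 1, or "COOH" (assumed default). */
    static String cTerm(int length, String[]... modificationArrays) {
        for (String[] modificationArray : modificationArrays) {
            if (modificationArray != null && modificationArray[length + 1] != null) {
                return modificationArray[length + 1];
            }
        }
        return "COOH";
    }

    public static void main(String[] args) {
        int length = 4; // peptide of four residues, so arrays have length 6
        String[] fixedModifications = new String[length + 2];
        String[] variableModifications = new String[length + 2];
        variableModifications[0] = "Acetyl"; // hypothetical N-terminal modification name

        System.out.println(nTerm(fixedModifications, variableModifications)); // Acetyl
        System.out.println(cTerm(length, fixedModifications, variableModifications)); // COOH
    }
}
```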
-
isNtermEnzymatic
public static boolean isNtermEnzymatic(int peptideStart, int peptideEnd, String proteinSequence, Enzyme enzyme)
Returns whether the N-terminus of the given peptide is enzymatic for the given coordinates and enzyme on this protein.
- Parameters:
peptideStart - the 0-based index of the peptide start on the protein
peptideEnd - the 0-based index of the peptide end on the protein
proteinSequence - the protein sequence
enzyme - the enzyme to use
- Returns:
- a boolean indicating whether the N-terminus of the given peptide is enzymatic for the given coordinates and enzyme on this protein
-
isCtermEnzymatic
public static boolean isCtermEnzymatic(int peptideStart, int peptideEnd, String proteinSequence, Enzyme enzyme)
Returns whether the C-terminus of the given peptide is enzymatic for the given coordinates and enzyme on this protein.
- Parameters:
peptideStart - the 0-based index of the peptide start on the protein
peptideEnd - the 0-based index of the peptide end on the protein
proteinSequence - the protein sequence
enzyme - the enzyme to use
- Returns:
- a boolean indicating whether the C-terminus of the given peptide is enzymatic for the given coordinates and enzyme on this protein
-
getNEnzymaticTermini
public static int getNEnzymaticTermini(int peptideStart, int peptideEnd, String proteinSequence, Enzyme enzyme)
Returns the number of enzymatic termini for the given peptide coordinates and enzyme on this protein.
- Parameters:
peptideStart - the 0-based index of the peptide start on the protein
peptideEnd - the 0-based index of the peptide end on the protein
proteinSequence - the protein sequence
enzyme - the enzyme to use
- Returns:
- the number of enzymatic termini for the given peptide coordinates and enzyme on this protein
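To make the relation between the two terminus checks and the termini count concrete, the sketch below replaces the library's Enzyme class with a simplified trypsin-like rule that cleaves after K or R and ignores any restriction on the following residue. It assumes that peptideEnd is the inclusive 0-based index of the last peptide residue and that a terminus coinciding with a protein terminus counts as enzymatic. These are assumptions, not statements about the library, and the class and method names are hypothetical.

```java
final class EnzymaticTerminiSketch {

    /** Simplified stand-in for an enzyme: cleavage after K or R. */
    static boolean cleavesAfter(char aminoAcid) {
        return aminoAcid == 'K' || aminoAcid == 'R';
    }

    /** The N-terminus is enzymatic if the residue before the peptide is a cleavage site. */
    static boolean isNtermEnzymatic(int peptideStart, String proteinSequence) {
        // Assumption: a peptide starting at the protein N-terminus counts as enzymatic.
        return peptideStart == 0 || cleavesAfter(proteinSequence.charAt(peptideStart - 1));
    }

    /** The C-terminus is enzymatic if the last peptide residue is a cleavage site. */
    static boolean isCtermEnzymatic(int peptideEnd, String proteinSequence) {
        // Assumption: a peptide ending at the protein C-terminus counts as enzymatic.
        return peptideEnd == proteinSequence.length() - 1
                || cleavesAfter(proteinSequence.charAt(peptideEnd));
    }

    /** Counts the enzymatic termini: 0, 1, or 2. */
    static int nEnzymaticTermini(int peptideStart, int peptideEnd, String proteinSequence) {
        int count = 0;
        if (isNtermEnzymatic(peptideStart, proteinSequence)) {
            count++;
        }
        if (isCtermEnzymatic(peptideEnd, proteinSequence)) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String protein = "MKTAYIAKQRQISFVK";
        System.out.println(nEnzymaticTermini(2, 7, protein)); // 2, "TAYIAK" is fully enzymatic
        System.out.println(nEnzymaticTermini(3, 7, protein)); // 1, "AYIAK" has an enzymatic C-terminus only
        System.out.println(nEnzymaticTermini(3, 6, protein)); // 0, "AYIA" has no enzymatic terminus
    }
}
```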
-
isEnzymatic
public static boolean isEnzymatic(Peptide peptide, String proteinAccession, String proteinSequence, ArrayList<Enzyme> enzymes)
Returns a boolean indicating whether the peptide is enzymatic using one of the given enzymes.
- Parameters:
peptide - the peptide
proteinAccession - the accession of the protein
proteinSequence - the sequence of the protein
enzymes - the enzymes used for digestion
- Returns:
- a boolean indicating whether the peptide is enzymatic using one of the given enzymes
-
isEnzymatic
public static boolean isEnzymatic(Peptide peptide, SequenceProvider sequenceProvider, ArrayList<Enzyme> enzymes)
Returns a boolean indicating whether the peptide is enzymatic in at least one protein using one of the given enzymes.
- Parameters:
peptide - the peptide
sequenceProvider - the sequence provider
enzymes - the enzymes used for digestion
- Returns:
- a boolean indicating whether the peptide is enzymatic in at least one protein using one of the given enzymes
-
isVariant
public static boolean isVariant(Peptide peptide, String accession)
Returns a boolean indicating whether the peptide needs variants to be mapped to the given protein.
- Parameters:
peptide - the peptide
accession - the accession of the protein
- Returns:
- a boolean indicating whether the peptide needs variants to be mapped to the given protein
-
isNterm
public static boolean isNterm(Peptide peptide, SequenceProvider sequenceProvider)
Indicates whether a peptide is at the N-terminus of a protein.
- Parameters:
peptide - the peptide
sequenceProvider - a sequence provider
- Returns:
- a boolean indicating whether a peptide is at the N-terminus of a protein
-
isNterm
public static boolean isNterm(Peptide peptide, String proteinAccession, SequenceProvider sequenceProvider)
Indicates whether a peptide is at the N-terminus of a given protein.
- Parameters:
peptide - the peptide
proteinAccession - the accession of the protein
sequenceProvider - a sequence provider
- Returns:
- a boolean indicating whether a peptide is at the N-terminus of a given protein
-
isCterm
public static boolean isCterm(Peptide peptide, SequenceProvider sequenceProvider)
Indicates whether a peptide is at the C-terminus of a protein.
- Parameters:
peptide - the peptide
sequenceProvider - a sequence provider
- Returns:
- a boolean indicating whether a peptide is at the C-terminus of a protein
-
isCterm
public static boolean isCterm(Peptide peptide, String proteinAccession, SequenceProvider sequenceProvider)
Indicates whether a peptide is at the C-terminus of a given protein.
- Parameters:
peptide - the peptide
proteinAccession - the accession of the protein
sequenceProvider - a sequence provider
- Returns:
- a boolean indicating whether a peptide is at the C-terminus of a given protein
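The protein terminus checks can be pictured with plain strings, as in the sketch below. It uses exact prefix and suffix matching on the protein sequence, which is an assumption: the library works from the peptide to protein mapping and a SequenceProvider, and may treat cases such as ambiguous residues differently. The variant that takes several proteins mirrors the overloads without an accession, under the assumed reading that one matching protein is enough. The class and method names are hypothetical.

```java
final class ProteinTerminusSketch {

    /** True if the peptide is found at the very start of the protein sequence. */
    static boolean isNterm(String peptideSequence, String proteinSequence) {
        return proteinSequence.startsWith(peptideSequence);
    }

    /** True if the peptide is found at the very end of the protein sequence. */
    static boolean isCterm(String peptideSequence, String proteinSequence) {
        return proteinSequence.endsWith(peptideSequence);
    }

    /** True if the peptide is N-terminal in at least one of the given proteins (assumed semantics). */
    static boolean isNtermInAnyProtein(String peptideSequence, String... proteinSequences) {
        for (String proteinSequence : proteinSequences) {
            if (isNterm(peptideSequence, proteinSequence)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String protein = "MKTAYIAKQRQISFVK";
        System.out.println(isNterm("MKTAYIAK", protein)); // true
        System.out.println(isCterm("QISFVK", protein));   // true
        System.out.println(isNterm("TAYIAK", protein));   // false
    }
}
```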
-
getModifiedAaIndex
public static int getModifiedAaIndex(int modSite, int sequenceLength)
Returns the index of a modification on the amino acid sequence, where 0 is the first amino acid. For modifications on amino acids, the 1-based index on the sequence is expected as site. For terminal modifications, 0 and sequenceLength + 1 are expected for N-term and C-term modifications, respectively.
- Parameters:
modSite - the modification site
sequenceLength - the length of the peptide sequence
- Returns:
- the index of a modification on the amino acid sequence
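The conversion from a modification site to a 0-based residue index can be sketched as follows. Residue sites (1 to sequenceLength) map to site - 1, as the description states. Mapping the N-terminal site 0 to the first residue and the C-terminal site sequenceLength + 1 to the last residue is an assumption about how terminal modifications are anchored. The class and method names are hypothetical.

```java
final class ModifiedAaIndexSketch {

    /**
     * Converts a modification site to a 0-based index on the sequence.
     * Sites: 0 = N-term, 1..sequenceLength = residues, sequenceLength + 1 = C-term.
     */
    static int modifiedAaIndex(int modSite, int sequenceLength) {
        if (modSite == 0) {
            return 0; // assumption: N-terminal modifications are anchored on the first residue
        }
        if (modSite == sequenceLength + 1) {
            return sequenceLength - 1; // assumption: C-terminal modifications are anchored on the last residue
        }
        return modSite - 1; // residue sites are 1-based, indices are 0-based
    }

    public static void main(String[] args) {
        int sequenceLength = 8; // e.g. the peptide PEPMTIDE
        System.out.println(modifiedAaIndex(0, sequenceLength)); // 0
        System.out.println(modifiedAaIndex(4, sequenceLength)); // 3
        System.out.println(modifiedAaIndex(9, sequenceLength)); // 7
    }
}
```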
-