Class PeptideUtils
java.lang.Object
com.compomics.util.experiment.identification.utils.PeptideUtils
public class PeptideUtils extends Object
This class groups functions that can be used to work with peptides.
Author:
Marc Vaudel, Harald Barsnes
Constructor Summary
Constructor, Description:
PeptideUtils()
Empty default constructor.
Method Summary
Modifier and Type, Method, Description:
static TreeMap<String,String[]> getAaAfter(Peptide peptide, int nAa, SequenceProvider sequenceProvider)
Returns the amino acids after the given peptide as a string in a map based on the peptide protein mapping.
static String getAaAfter(Peptide peptide, String accession, int index, int nAa, SequenceProvider sequenceProvider)
Returns the amino acids after the given peptide as a string.
static TreeMap<String,String[]> getAaBefore(Peptide peptide, int nAa, SequenceProvider sequenceProvider)
Returns the amino acids before the given peptide as a string in a map based on the peptide protein mapping.
static String getAaBefore(Peptide peptide, String accession, int index, int nAa, SequenceProvider sequenceProvider)
Returns the amino acids before the given peptide as a string.
static String getCtermAsString(boolean useShortName, int length, String[]... modificationArrays)
Returns the C-terminal annotation as a string.
static String getFixedModificationsAsString(Peptide peptide, ModificationParameters modificationParameters, SequenceProvider sequenceProvider, SequenceMatchingParameters modificationSequenceMatchingParameters)
Returns the fixed modifications of the peptide as a string.
static int getModifiedAaIndex(int modSite, int sequenceLength)
Returns the index of a modification on the amino acid sequence.
static int getNEnzymaticTermini(int peptideStart, int peptideEnd, String proteinSequence, Enzyme enzyme)
Returns the number of enzymatic termini for the given peptide coordinates and enzyme on this protein.
static String getNtermAsString(boolean useShortName, String[]... modificationArrays)
Returns the N-terminal annotation as a string.
static String getTaggedModifiedSequence(Peptide peptide, ModificationParameters modificationParameters, String[] allFixedModifications, String[] allVariableModifications, String[] confidentModificationSites, String[] representativeAmbiguousModificationSites, String[] secondaryAmbiguousModificationSites, String[] fixedModificationSites, boolean useHtmlColorCoding, boolean includeHtmlStartEndTags, boolean useShortName)
Returns the modified sequence as a tagged string with potential modification sites color coded or with modification tags, e.g., <mox>.
static String getVariableModificationsAsString(Peptide peptide)
Returns the variable modifications of the peptide as a string.
static String getVariableModificationsAsString(ModificationMatch[] modificationMatches)
Returns the variable modifications in the given matches as a string.
static boolean isCterm(Peptide peptide, SequenceProvider sequenceProvider)
Indicates whether a peptide is at the C-terminus of a protein.
static boolean isCterm(Peptide peptide, String proteinAccession, SequenceProvider sequenceProvider)
Indicates whether a peptide is at the C-terminus of a given protein.
static boolean isDecoy(Peptide peptide, SequenceProvider sequenceProvider)
Returns a boolean indicating whether the peptide matches a decoy sequence.
static boolean isEnzymatic(Peptide peptide, SequenceProvider sequenceProvider, ArrayList<Enzyme> enzymes)
Returns a boolean indicating whether the peptide is enzymatic in at least one protein using one of the given enzymes.
static boolean isEnzymatic(Peptide peptide, String proteinAccession, String proteinSequence, ArrayList<Enzyme> enzymes)
Returns a boolean indicating whether the peptide is enzymatic using one of the given enzymes.
static boolean isNterm(Peptide peptide, SequenceProvider sequenceProvider)
Indicates whether a peptide is at the N-terminus of a protein.
static boolean isNterm(Peptide peptide, String proteinAccession, SequenceProvider sequenceProvider)
Indicates whether a peptide is at the N-terminus of a given protein.
static boolean isVariant(Peptide peptide, String accession)
Returns a boolean indicating whether the peptide needs variants to be mapped to the given protein.
-
Constructor Details
-
PeptideUtils
public PeptideUtils()
Empty default constructor.
-
-
Method Details
-
isDecoy
public static boolean isDecoy(Peptide peptide, SequenceProvider sequenceProvider)
Returns a boolean indicating whether the peptide matches a decoy sequence.
Parameters:
peptide - the peptide
sequenceProvider - a sequence provider
Returns:
a boolean indicating whether the peptide matches a decoy sequence
-
getAaBefore
public static String getAaBefore(Peptide peptide, String accession, int index, int nAa, SequenceProvider sequenceProvider)
Returns the amino acids before the given peptide as a string.
Parameters:
peptide - the peptide
accession - the accession of the protein
index - the position of the peptide on the protein sequence
nAa - the number of amino acids to include
sequenceProvider - the sequence provider
Returns:
the amino acids before the given peptide as a string
-
getAaBefore
public static TreeMap<String,String[]> getAaBefore(Peptide peptide, int nAa, SequenceProvider sequenceProvider)
Returns the amino acids before the given peptide as a string in a map based on the peptide protein mapping.
Parameters:
peptide - the peptide
nAa - the number of amino acids to include
sequenceProvider - the sequence provider
Returns:
the amino acids before the given peptide as a string in a map based on the peptide protein mapping
-
getAaAfter
public static String getAaAfter(Peptide peptide, String accession, int index, int nAa, SequenceProvider sequenceProvider)
Returns the amino acids after the given peptide as a string.
Parameters:
peptide - the peptide
accession - the accession of the protein
index - the position of the peptide on the protein sequence
nAa - the number of amino acids to include
sequenceProvider - the sequence provider
Returns:
the amino acids after the given peptide as a string
-
getAaAfter
public static TreeMap<String,String[]> getAaAfter(Peptide peptide, int nAa, SequenceProvider sequenceProvider)
Returns the amino acids after the given peptide as a string in a map based on the peptide protein mapping.
Parameters:
peptide - the peptide
nAa - the number of amino acids to include
sequenceProvider - the sequence provider
Returns:
the amino acids after the given peptide as a string in a map based on the peptide protein mapping
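The flanking-residue lookups above can be pictured with a small standalone sketch. It does not use the library: FlankingResidues and its methods are hypothetical, protein sequences are plain strings, and the peptide position is assumed to be a 0-based start index (the documentation does not state the base). The map variant assumes a mapping of accession to start indexes, with one flanking string per occurrence.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper, not part of compomics-utilities. */
class FlankingResidues {

    /** Up to nAa residues preceding a peptide that starts at the 0-based index. */
    static String before(String protein, int index, int nAa) {
        int from = Math.max(0, index - nAa); // clip at the protein N-terminus
        return protein.substring(from, index);
    }

    /** Up to nAa residues following a peptide of the given length. */
    static String after(String protein, int index, int peptideLength, int nAa) {
        int start = index + peptideLength;
        int to = Math.min(protein.length(), start + nAa); // clip at the protein C-terminus
        return protein.substring(start, to);
    }

    /** Map variant: accession to one preceding-residue string per occurrence. */
    static TreeMap<String, String[]> beforeAll(Map<String, String> proteins,
            Map<String, int[]> mapping, int nAa) {
        TreeMap<String, String[]> result = new TreeMap<>();
        for (Map.Entry<String, int[]> entry : mapping.entrySet()) {
            String sequence = proteins.get(entry.getKey());
            int[] starts = entry.getValue();
            String[] flanks = new String[starts.length];
            for (int i = 0; i < starts.length; i++) {
                flanks[i] = before(sequence, starts[i], nAa);
            }
            result.put(entry.getKey(), flanks);
        }
        return result;
    }
}
```

Near a protein terminus the sketch returns fewer than nAa residues; whether the library pads or truncates in that case is not stated here.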
-
getVariableModificationsAsString
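public static String getVariableModificationsAsString(ModificationMatch[] modificationMatches)
Returns the variable modifications in the given matches as a string.
Parameters:
modificationMatches - the modification matches
Returns:
the variable modifications as a string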
-
getVariableModificationsAsString
public static String getVariableModificationsAsString(Peptide peptide)
Returns the variable modifications of the peptide as a string.
Parameters:
peptide - the peptide
Returns:
the variable modifications of the peptide as a string
-
getFixedModificationsAsString
public static String getFixedModificationsAsString(Peptide peptide, ModificationParameters modificationParameters, SequenceProvider sequenceProvider, SequenceMatchingParameters modificationSequenceMatchingParameters)
Returns the fixed modifications of the peptide as a string.
Parameters:
peptide - the peptide
modificationParameters - the modification parameters
sequenceProvider - a provider for the protein sequences
modificationSequenceMatchingParameters - the sequence matching preferences for modification to peptide mapping
Returns:
the fixed modifications of the peptide as a string
-
getTaggedModifiedSequence
public static String getTaggedModifiedSequence(Peptide peptide, ModificationParameters modificationParameters, String[] allFixedModifications, String[] allVariableModifications, String[] confidentModificationSites, String[] representativeAmbiguousModificationSites, String[] secondaryAmbiguousModificationSites, String[] fixedModificationSites, boolean useHtmlColorCoding, boolean includeHtmlStartEndTags, boolean useShortName)
Returns the modified sequence as a tagged string with potential modification sites color coded or with modification tags, e.g., <mox>. Warning: this method works only if the modifications found in the peptide are in the ModificationFactory. Modifications should be provided indexed by site as follows: N-term modifications are at index 0, C-term modifications at sequence length + 1, and amino acid modifications at the 1-based index on the sequence.
Parameters:
peptide - the peptide to annotate
modificationParameters - the modification profile of the search
allFixedModifications - all the fixed modifications
allVariableModifications - all the variable modifications
confidentModificationSites - the confidently localized variable modification sites, indexed by site
representativeAmbiguousModificationSites - the representative sites of the ambiguously localized variable modifications, indexed by site (1 is the first amino acid; can be null)
secondaryAmbiguousModificationSites - the secondary sites of the ambiguously localized variable modifications, indexed by site (1 is the first amino acid; can be null)
fixedModificationSites - the fixed modification sites, indexed by site (1 is the first amino acid; can be null)
useHtmlColorCoding - if true, color coded HTML is used; otherwise modification tags, e.g., <mox>, are used
includeHtmlStartEndTags - if true, start and end HTML tags are added
useShortName - if true, the short names are used in the tags
Returns:
the tagged modified sequence as a string
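The site indexing described for this method (N-terminus at index 0, residues at their 1-based position, C-terminus at sequence length + 1) implies arrays of length sequence length + 2. The sketch below only illustrates that indexing. SiteIndexedTags is hypothetical, and the way tags are placed in the output string is invented for the example; it is not the library's actual output format.

```java
/** Hypothetical illustration of site-indexed modification arrays. */
class SiteIndexedTags {

    /** Array with one slot per residue plus one slot for each terminus. */
    static String[] emptySites(String sequence) {
        return new String[sequence.length() + 2];
    }

    /** Writes a <tag> after each annotated residue; terminal tags go at the ends. */
    static String tag(String sequence, String[] sites) {
        StringBuilder sb = new StringBuilder();
        if (sites[0] != null) {                       // index 0: N-terminus
            sb.append('<').append(sites[0]).append('>');
        }
        for (int i = 0; i < sequence.length(); i++) {
            sb.append(sequence.charAt(i));
            String mod = sites[i + 1];                // residue i sits at 1-based site i + 1
            if (mod != null) {
                sb.append('<').append(mod).append('>');
            }
        }
        String cTerm = sites[sequence.length() + 1];  // index length + 1: C-terminus
        if (cTerm != null) {
            sb.append('<').append(cTerm).append('>');
        }
        return sb.toString();
    }
}
```

For example, annotating the fourth residue of a five-residue peptide means writing to sites[4], while a C-terminal modification goes to sites[6].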
-
getNtermAsString
public static String getNtermAsString(boolean useShortName, String[]... modificationArrays)
Returns the N-terminal annotation as a string.
Parameters:
useShortName - if true, the short names are used in the tags
modificationArrays - modifications to annotate in arrays corresponding to the peptide sequence with N-terminus at index 0
Returns:
the N-terminal annotation as a string
-
getCtermAsString
public static String getCtermAsString(boolean useShortName, int length, String[]... modificationArrays)
Returns the C-terminal annotation as a string.
Parameters:
useShortName - if true, the short names are used in the tags
length - the length of the peptide
modificationArrays - modifications to annotate in arrays corresponding to the peptide sequence with C-terminus at index length + 1 (arrays of size length + 2)
Returns:
the C-terminal annotation as a string
-
getNEnzymaticTermini
public static int getNEnzymaticTermini(int peptideStart, int peptideEnd, String proteinSequence, Enzyme enzyme)
Returns the number of enzymatic termini for the given peptide coordinates and enzyme on this protein.
Parameters:
peptideStart - the 0-based index of the peptide start on the protein
peptideEnd - the 0-based index of the peptide end on the protein
proteinSequence - the protein sequence
enzyme - the enzyme to use
Returns:
the number of enzymatic termini for the given peptide coordinates and enzyme on this protein
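To make "number of enzymatic termini" concrete, here is a minimal standalone sketch that counts them for a trypsin-like rule (cleavage after K or R unless the next residue is P). Everything in it is an assumption made for illustration: the class EnzymaticTermini is hypothetical, peptideEnd is treated as the inclusive 0-based index of the last residue, and a peptide terminus that coincides with a protein terminus is counted as enzymatic. The library's Enzyme class may apply different rules.

```java
/** Hypothetical sketch, not the library's implementation. */
class EnzymaticTermini {

    /** Trypsin-like rule: cleave after K or R, except when followed by P. */
    static boolean isCleavageSite(char before, char after) {
        return (before == 'K' || before == 'R') && after != 'P';
    }

    /** Returns 0, 1 or 2; peptideStart and peptideEnd are 0-based, peptideEnd inclusive. */
    static int countTermini(int peptideStart, int peptideEnd, String protein) {
        int n = 0;
        // N-terminal side: protein start, or a cleavage site just before the peptide
        if (peptideStart == 0
                || isCleavageSite(protein.charAt(peptideStart - 1), protein.charAt(peptideStart))) {
            n++;
        }
        // C-terminal side: protein end, or a cleavage site just after the peptide
        if (peptideEnd == protein.length() - 1
                || isCleavageSite(protein.charAt(peptideEnd), protein.charAt(peptideEnd + 1))) {
            n++;
        }
        return n;
    }
}
```

A count of 2 corresponds to a fully enzymatic peptide, 1 to a semi-enzymatic one, and 0 to a peptide with no enzymatic terminus under the chosen rule.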
-
isEnzymatic
public static boolean isEnzymatic(Peptide peptide, String proteinAccession, String proteinSequence, ArrayList<Enzyme> enzymes)
Returns a boolean indicating whether the peptide is enzymatic using one of the given enzymes.
Parameters:
peptide - the peptide
proteinAccession - the accession of the protein
proteinSequence - the sequence of the protein
enzymes - the enzymes used for digestion
Returns:
a boolean indicating whether the peptide is enzymatic using one of the given enzymes
-
isEnzymatic
public static boolean isEnzymatic(Peptide peptide, SequenceProvider sequenceProvider, ArrayList<Enzyme> enzymes)
Returns a boolean indicating whether the peptide is enzymatic in at least one protein using one of the given enzymes.
Parameters:
peptide - the peptide
sequenceProvider - the sequence provider
enzymes - the enzymes used for digestion
Returns:
a boolean indicating whether the peptide is enzymatic in at least one protein using one of the given enzymes
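The "using one of the given enzymes" wording can be read as an any-match over cleavage rules. The sketch below follows that reading under loud assumptions: AnyEnzymatic and both rules are hypothetical stand-ins for the library's Enzyme objects, "enzymatic" is taken to mean that a single rule explains both termini, and protein termini count as valid ends.

```java
import java.util.List;
import java.util.function.BiPredicate;

/** Hypothetical sketch of an any-enzyme check, not the library's implementation. */
class AnyEnzymatic {

    /** A rule tests the residues on either side of a candidate cleavage bond. */
    static final BiPredicate<Character, Character> TRYPSIN_LIKE =
            (before, after) -> (before == 'K' || before == 'R') && after != 'P';
    static final BiPredicate<Character, Character> LYS_C_LIKE =
            (before, after) -> before == 'K';

    /** True if the rule accounts for both ends; start and end are 0-based, end inclusive. */
    static boolean bothTerminiCleaved(String protein, int start, int end,
            BiPredicate<Character, Character> rule) {
        boolean nOk = start == 0
                || rule.test(protein.charAt(start - 1), protein.charAt(start));
        boolean cOk = end == protein.length() - 1
                || rule.test(protein.charAt(end), protein.charAt(end + 1));
        return nOk && cOk;
    }

    /** True as soon as one rule explains both termini. */
    static boolean isEnzymatic(String protein, int start, int end,
            List<BiPredicate<Character, Character>> rules) {
        for (BiPredicate<Character, Character> rule : rules) {
            if (bothTerminiCleaved(protein, start, end, rule)) {
                return true;
            }
        }
        return false;
    }
}
```

Extending this to "at least one protein" would wrap the same check in a loop over every protein the peptide maps to.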
-
isVariant
public static boolean isVariant(Peptide peptide, String accession)
Returns a boolean indicating whether the peptide needs variants to be mapped to the given protein.
Parameters:
peptide - the peptide
accession - the accession of the protein
Returns:
a boolean indicating whether the peptide needs variants to be mapped to the given protein
-
isNterm
public static boolean isNterm(Peptide peptide, SequenceProvider sequenceProvider)
Indicates whether a peptide is at the N-terminus of a protein.
Parameters:
peptide - the peptide
sequenceProvider - a sequence provider
Returns:
a boolean indicating whether a peptide is at the N-terminus of a protein
-
isNterm
public static boolean isNterm(Peptide peptide, String proteinAccession, SequenceProvider sequenceProvider)
Indicates whether a peptide is at the N-terminus of a given protein.
Parameters:
peptide - the peptide
proteinAccession - the accession of the protein
sequenceProvider - a sequence provider
Returns:
a boolean indicating whether a peptide is at the N-terminus of a given protein
-
isCterm
public static boolean isCterm(Peptide peptide, SequenceProvider sequenceProvider)
Indicates whether a peptide is at the C-terminus of a protein.
Parameters:
peptide - the peptide
sequenceProvider - a sequence provider
Returns:
a boolean indicating whether a peptide is at the C-terminus of a protein
-
isCterm
public static boolean isCterm(Peptide peptide, String proteinAccession, SequenceProvider sequenceProvider)
Indicates whether a peptide is at the C-terminus of a given protein.
Parameters:
peptide - the peptide
proteinAccession - the accession of the protein
sequenceProvider - a sequence provider
Returns:
a boolean indicating whether a peptide is at the C-terminus of a given protein
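The terminus checks can be sketched on plain strings. TerminusCheck is a hypothetical stand-in: it compares sequences directly instead of using a Peptide and a SequenceProvider, and it reads the two-argument overloads as "terminal in at least one of the proteins", which is an assumption. Special cases such as initiator methionine removal are not handled.

```java
import java.util.Collection;

/** Hypothetical sketch of the terminus checks on plain sequences. */
class TerminusCheck {

    /** True if the protein sequence begins with the peptide sequence. */
    static boolean isNterm(String peptide, String protein) {
        return protein.startsWith(peptide);
    }

    /** True if the protein sequence ends with the peptide sequence. */
    static boolean isCterm(String peptide, String protein) {
        return protein.endsWith(peptide);
    }

    /** True if the peptide is N-terminal in at least one protein. */
    static boolean isNtermOfAny(String peptide, Collection<String> proteins) {
        for (String protein : proteins) {
            if (isNterm(peptide, protein)) {
                return true;
            }
        }
        return false;
    }

    /** True if the peptide is C-terminal in at least one protein. */
    static boolean isCtermOfAny(String peptide, Collection<String> proteins) {
        for (String protein : proteins) {
            if (isCterm(peptide, protein)) {
                return true;
            }
        }
        return false;
    }
}
```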
-
getModifiedAaIndex
public static int getModifiedAaIndex(int modSite, int sequenceLength)
Returns the index of a modification on the amino acid sequence. 0 is the first amino acid. The modification site is expected to be the zero-based index on the sequence, with -1 and sequenceLength for N-term and C-term modifications, respectively.
Parameters:
modSite - the modification site
sequenceLength - the length of the peptide sequence
Returns:
the index of a modification on the amino acid sequence
-