java.lang.Object
com.compomics.util.experiment.identification.utils.PeptideUtils

public class PeptideUtils
extends Object
This class groups functions that can be used to work with peptides.
Author:
Marc Vaudel, Harald Barsnes
  • Constructor Details

    • PeptideUtils

      public PeptideUtils()
      Empty default constructor.
  • Method Details

    • isDecoy

      public static boolean isDecoy(Peptide peptide, SequenceProvider sequenceProvider)
      Returns a boolean indicating whether the peptide matches a decoy sequence.
      Parameters:
      peptide - the peptide
      sequenceProvider - a sequence provider.
      Returns:
      a boolean indicating whether the peptide matches a decoy sequence
    • getAaBefore

      public static String getAaBefore(Peptide peptide, String accession, int index, int nAa, SequenceProvider sequenceProvider)
      Returns the amino acids before the given peptide on the given protein as a string.
      Parameters:
      peptide - the peptide
      accession - the accession of the protein
      index - the position of the peptide on the protein sequence
      nAa - the number of amino acids to include
      sequenceProvider - the sequence provider
      Returns:
      the amino acids before the given peptide as a string
    • getAaBefore

      public static TreeMap<String,String[]> getAaBefore(Peptide peptide, int nAa, SequenceProvider sequenceProvider)
      Returns the amino acids before the given peptide as strings in a map based on the peptide protein mapping.
      Parameters:
      peptide - the peptide
      nAa - the number of amino acids to include
      sequenceProvider - the sequence provider
      Returns:
      the amino acids before the given peptide as strings in a map based on the peptide protein mapping
    • getAaAfter

      public static String getAaAfter(Peptide peptide, String accession, int index, int nAa, SequenceProvider sequenceProvider)
      Returns the amino acids after the given peptide on the given protein as a string.
      Parameters:
      peptide - the peptide
      accession - the accession of the protein
      index - the position of the peptide on the protein sequence
      nAa - the number of amino acids to include
      sequenceProvider - the sequence provider
      Returns:
      the amino acids after the given peptide as a string
    • getAaAfter

      public static TreeMap<String,String[]> getAaAfter(Peptide peptide, int nAa, SequenceProvider sequenceProvider)
      Returns the amino acids after the given peptide as strings in a map based on the peptide protein mapping.
      Parameters:
      peptide - the peptide
      nAa - the number of amino acids to include
      sequenceProvider - the sequence provider
      Returns:
      the amino acids after the given peptide as strings in a map based on the peptide protein mapping
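
The single-protein overloads of getAaBefore and getAaAfter can be pictured as substring operations on the protein sequence. The following is a minimal, self-contained sketch and not the library's implementation: it works on plain strings instead of Peptide and SequenceProvider, assumes that the index is the 0-based start of the peptide on the protein, and assumes that fewer than nAa residues are returned when the peptide sits close to a protein terminus. The class and method names are hypothetical.

```java
final class FlankingResiduesSketch {

    /** Up to nAa residues immediately before the peptide start (0-based, assumed). */
    static String aaBefore(String protein, int peptideStart, int nAa) {
        // Clamp at the protein N-terminus (assumed boundary behavior).
        int from = Math.max(0, peptideStart - nAa);
        return protein.substring(from, peptideStart);
    }

    /** Up to nAa residues immediately after the last residue of the peptide. */
    static String aaAfter(String protein, int peptideStart, int peptideLength, int nAa) {
        int end = peptideStart + peptideLength; // exclusive end of the peptide
        // Clamp at the protein C-terminus (assumed boundary behavior).
        int to = Math.min(protein.length(), end + nAa);
        return protein.substring(end, to);
    }

    public static void main(String[] args) {
        String protein = "MKTAYIAKQRQISFVK";
        String peptide = "AYIAK";
        int start = protein.indexOf(peptide); // 3

        System.out.println(aaBefore(protein, start, 2));                   // KT
        System.out.println(aaAfter(protein, start, peptide.length(), 2));  // QR
    }
}
```

The map-returning overloads would then collect one such string per occurrence of the peptide, grouped by protein.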
    • getVariableModificationsAsString

      public static String getVariableModificationsAsString(ModificationMatch[] modificationMatches)
      Returns the variable modifications in the given modification matches as a string.
      Parameters:
      modificationMatches - the modification matches
      Returns:
      the variable modifications as a string
    • getVariableModificationsAsString

      public static String getVariableModificationsAsString(Peptide peptide)
      Returns the variable modifications of the peptide as a string.
      Parameters:
      peptide - the peptide
      Returns:
      the variable modifications of the peptide as a string
    • getFixedModificationsAsString

      public static String getFixedModificationsAsString(Peptide peptide, ModificationParameters modificationParameters, SequenceProvider sequenceProvider, SequenceMatchingParameters modificationSequenceMatchingParameters)
      Returns the fixed modifications of the peptide as a string.
      Parameters:
      peptide - the peptide
      modificationParameters - the modification parameters
      sequenceProvider - a provider for the protein sequences
      modificationSequenceMatchingParameters - the sequence matching preferences for modification to peptide mapping
      Returns:
      the fixed modifications of the peptide as a string
    • getTaggedModifiedSequence

      public static String getTaggedModifiedSequence(Peptide peptide, ModificationParameters modificationParameters, String[] allFixedModifications, String[] allVariableModifications, String[] confidentModificationSites, String[] representativeAmbiguousModificationSites, String[] secondaryAmbiguousModificationSites, String[] fixedModificationSites, boolean useHtmlColorCoding, boolean includeHtmlStartEndTags, boolean useShortName)
      Returns the modified sequence as a tagged string with potential modification sites color coded or with modification tags, e.g., <mox>. Warning: this method works only if the modifications found in the peptide are in the ModificationFactory. Modifications should be provided indexed by site as follows: N-term modifications at index 0, C-term modifications at sequence length + 1, and amino acid modifications at the 1-based index on the sequence.
      Parameters:
      modificationParameters - the modification profile of the search
      includeHtmlStartEndTags - if true, start and end HTML tags are added
      peptide - the peptide to annotate
      allFixedModifications - all the fixed modifications
      allVariableModifications - all the variable modifications
      confidentModificationSites - the confidently localized variable modification sites, indexed by site
      representativeAmbiguousModificationSites - the representative sites of the ambiguously localized variable modifications, indexed by site, where 1 is the first amino acid (can be null)
      secondaryAmbiguousModificationSites - the secondary sites of the ambiguously localized variable modifications, indexed by site, where 1 is the first amino acid (can be null)
      fixedModificationSites - the fixed modification sites, indexed by site, where 1 is the first amino acid (can be null)
      useHtmlColorCoding - if true, color-coded HTML is used; otherwise modification tags, e.g., <mox>, are used
      useShortName - if true the short names are used in the tags
      Returns:
      the tagged modified sequence as a string
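
The site indexing described for getTaggedModifiedSequence (N-terminus at index 0, residues at their 1-based position, C-terminus at sequence length + 1) can be illustrated with a small stand-alone sketch. This is not the library's implementation: it takes a single array of modification names instead of the separate fixed, confident and ambiguous arrays, it emits plain tags only (no HTML color coding), and the "NH2" and "COOH" terminus labels are assumptions made for the example. The class and method names are hypothetical.

```java
final class TaggedSequenceSketch {

    /**
     * Builds a tagged sequence from modification names indexed by site.
     * modsBySite must have length sequence.length() + 2; null means "no modification".
     */
    static String tag(String sequence, String[] modsBySite) {
        if (modsBySite.length != sequence.length() + 2) {
            throw new IllegalArgumentException("Expected one slot per residue plus two termini.");
        }
        StringBuilder sb = new StringBuilder();

        // Index 0 holds the N-terminal modification, if any.
        sb.append(modsBySite[0] == null ? "NH2" : modsBySite[0]).append('-');

        // Residue i (0-based in the string) is site i + 1 in the array.
        for (int i = 0; i < sequence.length(); i++) {
            sb.append(sequence.charAt(i));
            String mod = modsBySite[i + 1];
            if (mod != null) {
                sb.append('<').append(mod).append('>');
            }
        }

        // Index length + 1 holds the C-terminal modification, if any.
        String cTermMod = modsBySite[sequence.length() + 1];
        sb.append('-').append(cTermMod == null ? "COOH" : cTermMod);
        return sb.toString();
    }

    public static void main(String[] args) {
        String sequence = "PEMTK";
        String[] mods = new String[sequence.length() + 2];
        mods[0] = "ace"; // N-terminal modification
        mods[3] = "ox";  // third residue (M), 1-based
        System.out.println(tag(sequence, mods)); // ace-PEM<ox>TK-COOH
    }
}
```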
    • getNtermAsString

      public static String getNtermAsString(boolean useShortName, String[]... modificationArrays)
      Returns the N-terminal annotation as string.
      Parameters:
      useShortName - if true the short names are used in the tags
      modificationArrays - modifications to annotate in arrays corresponding to the peptide sequence with N-terminus at index 0
      Returns:
      the N-terminal annotation as string
    • getCtermAsString

      public static String getCtermAsString(boolean useShortName, int length, String[]... modificationArrays)
      Returns the C-terminal annotation as string.
      Parameters:
      useShortName - if true the short names are used in the tags
      length - the length of the peptide
      modificationArrays - modifications to annotate in arrays corresponding to the peptide sequence with C-terminus at index length + 1
      Returns:
      the C-terminal annotation as string
    • getNEnzymaticTermini

      public static int getNEnzymaticTermini(int peptideStart, int peptideEnd, String proteinSequence, Enzyme enzyme)
      Returns the number of enzymatic termini for the given peptide coordinates and enzyme on this protein.
      Parameters:
      peptideStart - the 0-based index of the peptide start on the protein
      peptideEnd - the 0-based index of the peptide end on the protein
      proteinSequence - the protein sequence
      enzyme - the enzyme to use
      Returns:
      the number of enzymatic termini for the given peptide coordinates and enzyme on this protein
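
Counting enzymatic termini from peptide coordinates can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: a hard-coded trypsin-like rule (cleavage after K or R unless the next residue is P) stands in for the Enzyme parameter, peptideEnd is assumed to be the inclusive 0-based index of the last residue, and a peptide terminus that coincides with a protein terminus is assumed to count as enzymatic. The class and method names are hypothetical.

```java
final class EnzymaticTerminiSketch {

    /** True if the trypsin-like rule cleaves between positions i and i + 1 of the protein. */
    static boolean cleavesAfter(String protein, int i) {
        char aa = protein.charAt(i);
        boolean cleavable = aa == 'K' || aa == 'R';
        boolean blockedByProline = i + 1 < protein.length() && protein.charAt(i + 1) == 'P';
        return cleavable && !blockedByProline;
    }

    /** Returns 0, 1 or 2 depending on how many peptide termini match a cleavage site. */
    static int nEnzymaticTermini(int peptideStart, int peptideEnd, String protein) {
        int n = 0;
        // N-terminal side: protein start, or a cleavage right before the peptide.
        if (peptideStart == 0 || cleavesAfter(protein, peptideStart - 1)) {
            n++;
        }
        // C-terminal side: protein end, or a cleavage right after the last residue.
        if (peptideEnd == protein.length() - 1 || cleavesAfter(protein, peptideEnd)) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String protein = "MKTAYIAKQRQISFVK";
        System.out.println(nEnzymaticTermini(2, 7, protein)); // TAYIAK: 2
        System.out.println(nEnzymaticTermini(3, 7, protein)); // AYIAK: 1
        System.out.println(nEnzymaticTermini(3, 6, protein)); // AYIA: 0
    }
}
```

A check in the spirit of isEnzymatic could then compare this count against the number of termini required by the digestion settings, for each enzyme in turn.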
    • isEnzymatic

      public static boolean isEnzymatic(Peptide peptide, String proteinAccession, String proteinSequence, ArrayList<Enzyme> enzymes)
      Returns a boolean indicating whether the peptide is enzymatic using one of the given enzymes.
      Parameters:
      peptide - the peptide
      proteinAccession - the accession of the protein
      proteinSequence - the sequence of the protein
      enzymes - the enzymes used for digestion
      Returns:
      a boolean indicating whether the peptide is enzymatic using one of the given enzymes
    • isEnzymatic

      public static boolean isEnzymatic(Peptide peptide, SequenceProvider sequenceProvider, ArrayList<Enzyme> enzymes)
      Returns a boolean indicating whether the peptide is enzymatic in at least one protein using one of the given enzymes.
      Parameters:
      peptide - the peptide
      sequenceProvider - the sequence provider
      enzymes - the enzymes used for digestion
      Returns:
      a boolean indicating whether the peptide is enzymatic in at least one protein using one of the given enzymes
    • isVariant

      public static boolean isVariant(Peptide peptide, String accession)
      Returns a boolean indicating whether the peptide needs variants to be mapped to the given protein.
      Parameters:
      peptide - the peptide
      accession - the accession of the protein
      Returns:
      a boolean indicating whether the peptide needs variants to be mapped to the given protein
    • isNterm

      public static boolean isNterm(Peptide peptide, SequenceProvider sequenceProvider)
      Indicates whether a peptide is at the N-terminus of a protein.
      Parameters:
      peptide - the peptide
      sequenceProvider - a sequence provider
      Returns:
      a boolean indicating whether a peptide is at the N-terminus of a protein
    • isNterm

      public static boolean isNterm(Peptide peptide, String proteinAccession, SequenceProvider sequenceProvider)
      Indicates whether a peptide is at the N-terminus of a given protein.
      Parameters:
      peptide - the peptide
      proteinAccession - the accession of the protein
      sequenceProvider - a sequence provider
      Returns:
      a boolean indicating whether a peptide is at the N-terminus of a given protein
    • isCterm

      public static boolean isCterm(Peptide peptide, SequenceProvider sequenceProvider)
      Indicates whether a peptide is at the C-terminus of a protein.
      Parameters:
      peptide - the peptide
      sequenceProvider - a sequence provider
      Returns:
      a boolean indicating whether a peptide is at the C-terminus of a protein
    • isCterm

      public static boolean isCterm(Peptide peptide, String proteinAccession, SequenceProvider sequenceProvider)
      Indicates whether a peptide is at the C-terminus of a given protein.
      Parameters:
      peptide - the peptide
      proteinAccession - the accession of the protein
      sequenceProvider - a sequence provider
      Returns:
      a boolean indicating whether a peptide is at the C-terminus of a given protein
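
The difference between the two overloads of isNterm and isCterm (one protein versus any protein) can be sketched with plain strings. This is a simplified illustration, not the library's implementation: a Map from accession to sequence stands in for SequenceProvider, the peptide is a plain string, and the "any protein" variants scan every protein in the map rather than only those the peptide is mapped to. The class and method names are hypothetical.

```java
import java.util.Map;

final class TerminusCheckSketch {

    /** True if the peptide starts at the first residue of the given protein. */
    static boolean isNterm(String peptide, String accession, Map<String, String> proteins) {
        String protein = proteins.get(accession);
        return protein != null && protein.startsWith(peptide);
    }

    /** True if the peptide ends at the last residue of the given protein. */
    static boolean isCterm(String peptide, String accession, Map<String, String> proteins) {
        String protein = proteins.get(accession);
        return protein != null && protein.endsWith(peptide);
    }

    /** True if the peptide is N-terminal in at least one protein. */
    static boolean isNterm(String peptide, Map<String, String> proteins) {
        return proteins.values().stream().anyMatch(p -> p.startsWith(peptide));
    }

    /** True if the peptide is C-terminal in at least one protein. */
    static boolean isCterm(String peptide, Map<String, String> proteins) {
        return proteins.values().stream().anyMatch(p -> p.endsWith(peptide));
    }

    public static void main(String[] args) {
        Map<String, String> proteins = Map.of("P1", "MKTAYIAK", "P2", "QRTAYIAK");
        System.out.println(isNterm("MKTA", "P1", proteins)); // true
        System.out.println(isNterm("MKTA", "P2", proteins)); // false
        System.out.println(isCterm("YIAK", proteins));       // true
    }
}
```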
    • getModifiedAaIndex

      public static int getModifiedAaIndex(int modSite, int sequenceLength)
      Returns the index of a modification on the amino acid sequence, where 0 is the first amino acid. The modification site is expected to be the zero-based index on the sequence, with -1 and sequenceLength for N-term and C-term modifications, respectively.
      Parameters:
      modSite - the modification site
      sequenceLength - the length of the peptide sequence
      Returns:
      the index of a modification on the amino acid sequence