com.compomics.util.experiment.biology
Class Peptide

java.lang.Object
  extended by com.compomics.util.experiment.personalization.ExperimentObject
      extended by com.compomics.util.experiment.biology.Peptide
All Implemented Interfaces:
Serializable, Cloneable

public class Peptide
extends ExperimentObject

This class models a peptide.

Author:
Marc Vaudel
See Also:
Serialized Form

Field Summary
static String MODIFICATION_LOCALIZATION_SEPARATOR
          Separator preceding the confident localization of a modification
static String MODIFICATION_SEPARATOR
          Separator used to separate modifications in peptide keys
 
Constructor Summary
Peptide()
          Constructor for the peptide.
Peptide(String aSequence, ArrayList<ModificationMatch> modifications)
          Constructor for the peptide.
Peptide(String aSequence, ArrayList<String> parentProteins, ArrayList<ModificationMatch> modifications)
          Constructor for the peptide.
Peptide(String aSequence, Double mass, ArrayList<String> parentProteins, ArrayList<ModificationMatch> modifications)
          Deprecated. Use the constructor without mass; the mass will be recalculated.
 
Method Summary
 void addModificationMatch(ModificationMatch modificationMatch)
          Adds a modification match.
 void clearModificationMatches()
          Clears the list of imported modification matches.
 void estimateTheoreticMass()
          Estimates the theoretical mass of the peptide.
 String getCTerminal()
          Returns the C-terminal of the peptide as a String.
 HashMap<Integer,ArrayList<String>> getIndexedFixedModifications()
          Returns a map of all fixed modifications, indexed by amino acid position (1 is the first) -> list of modification names.
 String getKey()
          Returns the key (index) of the peptide.
 Double getMass()
          Getter for the mass.
static int getModificationCount(String peptideKey, String modification)
          Returns how many times the given modification was found in the given peptide.
static ArrayList<String> getModificationFamily(String peptideKey)
          Returns a list of names of the variable modifications found in the key of a peptide.
 ArrayList<ModificationMatch> getModificationMatches()
          Getter for the modifications carried by this peptide.
 ArrayList<Integer> getModifiedIndexes()
          Returns the indexes of the residues in the peptide that contain at least one variable modification.
 ArrayList<Integer> getModifiedIndexes(boolean excludeFixed)
          Returns the indexes of the residues in the peptide that contain at least one modification.
 int getNMissedCleavages(Enzyme enzyme)
          Returns the number of missed cleavages using the specified enzyme.
static int getNMissedCleavages(String sequence, Enzyme enzyme)
          Returns the number of missed cleavages using the specified enzyme for the given sequence.
static ArrayList<Integer> getNModificationLocalized(String peptideKey, String modification)
          Returns the list of modifications confidently localized or inferred for the peptide indexed by the given key.
static Peptide getNoModPeptide(Peptide peptide, ArrayList<PTM> ptms)
          Returns a version of the peptide which does not contain the inspected PTMs.
 String getNTerminal()
          Returns the N-terminal of the peptide as a String.
 ArrayList<String> getParentProteins()
          Getter for the parent proteins.
 ArrayList<Integer> getPotentialModificationSites(PTM ptm, ProteinMatch.MatchingType matchingType, Double massTolerance)
          Returns the potential modification sites as an ordered list of indexes.
static ArrayList<Integer> getPotentialModificationSites(String sequence, PTM ptm)
          Returns the potential modification sites as an ordered list of indexes.
 String getSequence()
          Getter for the sequence.
static String getSequence(String peptideKey)
          Returns the sequence of the peptide indexed by the given key.
 AminoAcidPattern getSequenceAsPattern()
          Returns the sequence of this peptide as AminoAcidPattern.
static AminoAcidPattern getSequenceAsPattern(String sequence)
          Returns the given sequence as AminoAcidPattern.
 String getTaggedModifiedSequence(ModificationProfile modificationProfile, boolean useHtmlColorCoding, boolean includeHtmlStartEndTags, boolean useShortName)
          Returns the modified sequence as a tagged string with potential modification sites color coded or with PTM tags, e.g., <mox>.
 String getTaggedModifiedSequence(ModificationProfile modificationProfile, boolean useHtmlColorCoding, boolean includeHtmlStartEndTags, boolean useShortName, boolean excludeAllFixedPtms)
          Returns the modified sequence as a tagged string with potential modification sites color coded or with PTM tags, e.g., <mox>.
static String getTaggedModifiedSequence(ModificationProfile modificationProfile, Peptide peptide, HashMap<Integer,ArrayList<String>> mainModificationSites, HashMap<Integer,ArrayList<String>> secondaryModificationSites, HashMap<Integer,ArrayList<String>> fixedModificationSites, boolean useHtmlColorCoding, boolean includeHtmlStartEndTags, boolean useShortName)
          Returns the modified sequence as a tagged string with potential modification sites color coded or with PTM tags, e.g., <mox>.
 ArrayList<String> isCterm(ProteinMatch.MatchingType matchingType, Double massTolerance)
          Returns a list of proteins where this peptide can be found in the C-terminus.
 boolean isDecoy()
          Indicates whether a peptide can be derived from a decoy protein.
 boolean isModifiable(PTM ptm, ProteinMatch.MatchingType matchingType, Double massTolerance)
          Indicates whether the given modification can be found on the peptide.
static boolean isModified(String peptideKey)
          Returns a boolean indicating whether the peptide has variable modifications based on its key.
static boolean isModified(String peptideKey, String modification)
          Returns a boolean indicating whether the peptide has the given variable modification based on its key.
 ArrayList<String> isNterm(ProteinMatch.MatchingType matchingType, Double massTolerance)
          Returns a list of proteins where this peptide can be found in the N-terminus.
 boolean isSameAs(Peptide anotherPeptide)
          A method which compares two peptides.
 boolean isSameModificationStatus(Peptide anotherPeptide)
          Indicates whether another peptide has the same variable modifications as this peptide.
 boolean isSameSequence(Peptide anotherPeptide)
          Returns a boolean indicating whether another peptide has the same sequence as this peptide.
 boolean isSameSequenceAndModificationStatus(Peptide anotherPeptide)
          Indicates whether another peptide has the same sequence and modification status without accounting for modification localization.
 boolean sameModificationsAs(Peptide anotherPeptide)
          Indicates whether another peptide has the same modifications at the same localization as this peptide.
 boolean sameModificationsAs(Peptide anotherPeptide, ArrayList<String> ptms)
          Indicates whether another peptide has the same modifications at the same localization as this peptide.
 void setParentProteins(ArrayList<String> parentProteins)
          Sets the parent proteins.
 
Methods inherited from class com.compomics.util.experiment.personalization.ExperimentObject
addUrParam, getParameterKey, getUrParam
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

MODIFICATION_LOCALIZATION_SEPARATOR

public static final String MODIFICATION_LOCALIZATION_SEPARATOR
Separator preceding the confident localization of a modification

See Also:
Constant Field Values

MODIFICATION_SEPARATOR

public static final String MODIFICATION_SEPARATOR
Separator used to separate modifications in peptide keys

See Also:
Constant Field Values
Constructor Detail

Peptide

public Peptide()
Constructor for the peptide.


Peptide

public Peptide(String aSequence,
               ArrayList<String> parentProteins,
               ArrayList<ModificationMatch> modifications)
        throws IllegalArgumentException
Constructor for the peptide.

Parameters:
aSequence - The peptide sequence
parentProteins - The parent proteins, cannot be null or empty
modifications - The PTM of this peptide
Throws:
IllegalArgumentException - Thrown if the peptide sequence contains unknown amino acids

Peptide

public Peptide(String aSequence,
               ArrayList<ModificationMatch> modifications)
        throws IllegalArgumentException
Constructor for the peptide.

Parameters:
aSequence - The peptide sequence
modifications - The PTM of this peptide
Throws:
IllegalArgumentException - Thrown if the peptide sequence contains unknown amino acids

Peptide

public Peptide(String aSequence,
               Double mass,
               ArrayList<String> parentProteins,
               ArrayList<ModificationMatch> modifications)
Deprecated. Use the constructor without mass; the mass will be recalculated.

Constructor for the peptide.

Parameters:
aSequence - The peptide sequence
mass - The peptide mass
parentProteins - The parent proteins, cannot be null or empty
modifications - The PTM of this peptide
Method Detail

getMass

public Double getMass()
Getter for the mass.

Returns:
the peptide mass

getModificationMatches

public ArrayList<ModificationMatch> getModificationMatches()
Getter for the modifications carried by this peptide.

Returns:
the modifications matches as found by the search engine

clearModificationMatches

public void clearModificationMatches()
Clears the list of imported modification matches.


addModificationMatch

public void addModificationMatch(ModificationMatch modificationMatch)
Adds a modification match.

Parameters:
modificationMatch - the modification match to add

getSequence

public String getSequence()
Getter for the sequence.

Returns:
the peptide sequence

getNMissedCleavages

public int getNMissedCleavages(Enzyme enzyme)
Returns the number of missed cleavages using the specified enzyme.

Parameters:
enzyme - the enzyme used
Returns:
the number of missed cleavages

getNMissedCleavages

public static int getNMissedCleavages(String sequence,
                                      Enzyme enzyme)
Returns the number of missed cleavages using the specified enzyme for the given sequence.

Parameters:
sequence - the peptide sequence
enzyme - the enzyme used
Returns:
the number of missed cleavages
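
As an illustration of what a missed-cleavage count means, here is a minimal, self-contained sketch. It does not use the Enzyme class: it hard-codes a trypsin-like rule (cleavage after K or R unless followed by P), which is an assumption made for this example only. The class and method names are hypothetical.

```java
final class MissedCleavageSketch {

    /**
     * Counts internal sites where a trypsin-like enzyme could have cleaved.
     * Assumed rule (illustrative only): cleave after K or R, except before P.
     */
    static int countMissedCleavages(String sequence) {
        int count = 0;
        // The last residue is the peptide's own C-terminus, so it is not inspected.
        for (int i = 0; i < sequence.length() - 1; i++) {
            char aa = sequence.charAt(i);
            char next = sequence.charAt(i + 1);
            if ((aa == 'K' || aa == 'R') && next != 'P') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countMissedCleavages("PEPKTIDER"));
    }
}
```

With this rule, "PEPKTIDER" contains one internal K followed by T, so one missed cleavage is counted; the terminal R is ignored.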

getParentProteins

public ArrayList<String> getParentProteins()
Getter for the parent proteins.

Returns:
the parent proteins

setParentProteins

public void setParentProteins(ArrayList<String> parentProteins)
Sets the parent proteins.

Parameters:
parentProteins - the parent proteins as list, cannot be null or empty

getKey

public String getKey()
Returns the key (index) of the peptide. The key has the form SEQUENCE_mod1_mod2, with modifications ordered alphabetically.

Returns:
the key of the peptide
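
The key format described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the "_" separator is taken from the SEQUENCE_mod1_mod2 pattern (the actual value of MODIFICATION_SEPARATOR is not stated here), localization information is ignored, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class PeptideKeySketch {

    // Assumed separator, inferred from the "SEQUENCE_mod1_mod2" pattern.
    static final String SEPARATOR = "_";

    /** Builds SEQUENCE_mod1_mod2 with modification names in natural String order. */
    static String buildKey(String sequence, List<String> modificationNames) {
        List<String> sorted = new ArrayList<>(modificationNames);
        Collections.sort(sorted); // case-sensitive alphabetical order
        StringBuilder key = new StringBuilder(sequence);
        for (String name : sorted) {
            key.append(SEPARATOR).append(name);
        }
        return key.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildKey("PEPTIDE", Arrays.asList("oxidation of m", "acetylation")));
    }
}
```

Sorting the names makes the key independent of the order in which modifications were added, so two peptides carrying the same modifications map to the same key.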

isModified

public static boolean isModified(String peptideKey)
Returns a boolean indicating whether the peptide has variable modifications based on its key.

Parameters:
peptideKey - the peptide key
Returns:
a boolean indicating whether the peptide has variable modifications

isModified

public static boolean isModified(String peptideKey,
                                 String modification)
Returns a boolean indicating whether the peptide has the given variable modification based on its key.

Parameters:
peptideKey - the peptide key
modification - the name of the modification
Returns:
a boolean indicating whether the peptide has the given variable modification

getModificationCount

public static int getModificationCount(String peptideKey,
                                       String modification)
Returns how many times the given modification was found in the given peptide.

Parameters:
peptideKey - the peptide key
modification - the name of the modification
Returns:
the number of modifications

getNModificationLocalized

public static ArrayList<Integer> getNModificationLocalized(String peptideKey,
                                                           String modification)
Returns the list of modifications confidently localized or inferred for the peptide indexed by the given key.

Parameters:
peptideKey - the peptide key
modification - the name of the modification
Returns:
the list of modifications confidently localized or inferred

getSequence

public static String getSequence(String peptideKey)
Returns the sequence of the peptide indexed by the given key.

Parameters:
peptideKey - the peptide key
Returns:
the corresponding sequence

getModificationFamily

public static ArrayList<String> getModificationFamily(String peptideKey)
Returns a list of names of the variable modifications found in the key of a peptide.

Parameters:
peptideKey - the key of a peptide
Returns:
a list of names of the variable modifications found in the key
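
The static, key-based methods above all read information back out of a peptide key. A rough sketch of that idea is given below, assuming a plain SEQUENCE_mod1_mod2 key with "_" as the separator and no localization suffix. It is not the library's parsing code, the names are hypothetical, and a modification name containing the separator would break it.

```java
import java.util.ArrayList;
import java.util.List;

final class PeptideKeyParsingSketch {

    // Assumed separator, inferred from the "SEQUENCE_mod1_mod2" pattern.
    static final String SEPARATOR = "_";

    /** The sequence is everything before the first separator. */
    static String sequenceOf(String key) {
        int i = key.indexOf(SEPARATOR);
        return i < 0 ? key : key.substring(0, i);
    }

    /** Every modification name found in the key, one entry per occurrence. */
    static List<String> modificationsOf(String key) {
        String[] parts = key.split(SEPARATOR);
        List<String> names = new ArrayList<>();
        for (int i = 1; i < parts.length; i++) {
            names.add(parts[i]);
        }
        return names;
    }

    /** How many times the given modification name occurs in the key. */
    static int countOf(String key, String modification) {
        int count = 0;
        for (String name : modificationsOf(key)) {
            if (name.equals(modification)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String key = "PEPTIDEM_oxidation of m_oxidation of m";
        System.out.println(sequenceOf(key) + " carries " + countOf(key, "oxidation of m") + " oxidation(s)");
    }
}
```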

isModifiable

public boolean isModifiable(PTM ptm,
                            ProteinMatch.MatchingType matchingType,
                            Double massTolerance)
                     throws IOException,
                            IllegalArgumentException,
                            InterruptedException,
                            FileNotFoundException,
                            ClassNotFoundException
Indicates whether the given modification can be found on the peptide. For instance, 'oxidation of M' cannot be found on sequence "PEPTIDE". For the inspection of protein termini and of the peptide C-terminus, the protein sequences must be accessible from the sequence factory.

Parameters:
ptm - the PTM of interest
matchingType - the type of sequence matching
massTolerance - the mass tolerance for matching type 'indistiguishibleAminoAcids'. Can be null otherwise
Returns:
a boolean indicating whether the given modification can be found on the peptide
Throws:
IOException - exception thrown whenever an error occurred while reading a protein sequence
IllegalArgumentException - exception thrown whenever an error occurred while reading a protein sequence
InterruptedException - exception thrown whenever an error occurred while reading a protein sequence
FileNotFoundException
ClassNotFoundException

getPotentialModificationSites

public ArrayList<Integer> getPotentialModificationSites(PTM ptm,
                                                        ProteinMatch.MatchingType matchingType,
                                                        Double massTolerance)
                                                 throws IOException,
                                                        IllegalArgumentException,
                                                        InterruptedException,
                                                        FileNotFoundException,
                                                        ClassNotFoundException
Returns the potential modification sites as an ordered list of indexes, where 1 is the first amino acid. An empty list is returned if no possible site is found. This method does not account for protein terminal modifications.

Parameters:
ptm - the PTM considered
matchingType - the type of sequence matching
massTolerance - the mass tolerance for matching type 'indistiguishibleAminoAcids'. Can be null otherwise
Returns:
a list of potential modification sites
Throws:
IOException - exception thrown whenever an error occurred while reading a protein sequence
IllegalArgumentException - exception thrown whenever an error occurred while reading a protein sequence
InterruptedException - exception thrown whenever an error occurred while reading a protein sequence
FileNotFoundException
ClassNotFoundException

getPotentialModificationSites

public static ArrayList<Integer> getPotentialModificationSites(String sequence,
                                                               PTM ptm)
                                                        throws IllegalArgumentException
Returns the potential modification sites as an ordered list of indexes, where 1 is the first amino acid. An empty list is returned if no possible site is found. This method does not account for protein terminal modifications. It only works if the modification pattern can be fully found in the sequence (a single amino acid, or terminal patterns smaller than the sequence); otherwise an IllegalArgumentException is thrown, in which case the non-static method should be used.

Parameters:
sequence - the sequence of the peptide of interest
ptm - the PTM considered
Returns:
a list of potential modification sites
Throws:
IllegalArgumentException - thrown if the modification pattern cannot be fully found in the sequence
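
For the simplest case, a modification targeting a single amino acid, the 1-based site list can be sketched as below. This is only an illustration of the indexing convention: the real method takes a PTM with a pattern, whereas this hypothetical helper takes the target residue as a plain char.

```java
import java.util.ArrayList;
import java.util.List;

final class ModificationSiteSketch {

    /** Returns the 1-based positions of the target residue, in increasing order. */
    static List<Integer> potentialSites(String sequence, char targetResidue) {
        List<Integer> sites = new ArrayList<>();
        for (int i = 0; i < sequence.length(); i++) {
            if (sequence.charAt(i) == targetResidue) {
                sites.add(i + 1); // 1 is the first amino acid
            }
        }
        return sites;
    }

    public static void main(String[] args) {
        System.out.println(potentialSites("MPEPTIMDE", 'M'));
        System.out.println(potentialSites("PEPTIDE", 'M').isEmpty());
    }
}
```

For "PEPTIDE" and target M the list is empty, which matches the 'oxidation of M' example given for isModifiable.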

isSameAs

public boolean isSameAs(Peptide anotherPeptide)
Compares two peptides. Two peptides are considered the same if they have the same sequence and the same modifications. The localization of a modification is taken into account only if its modification match is confidently localized.

Parameters:
anotherPeptide - another peptide
Returns:
a boolean indicating if the other peptide is the same.

isSameSequenceAndModificationStatus

public boolean isSameSequenceAndModificationStatus(Peptide anotherPeptide)
Indicates whether another peptide has the same sequence and modification status without accounting for modification localization.

Parameters:
anotherPeptide - the other peptide to compare to this instance
Returns:
a boolean indicating whether the other peptide has the same sequence and modification status.

isSameSequence

public boolean isSameSequence(Peptide anotherPeptide)
Returns a boolean indicating whether another peptide has the same sequence as this peptide.

Parameters:
anotherPeptide - the other peptide to compare
Returns:
a boolean indicating whether the other peptide has the same sequence

isSameModificationStatus

public boolean isSameModificationStatus(Peptide anotherPeptide)
Indicates whether another peptide has the same variable modifications as this peptide. The localization of the PTM is not accounted for.

Parameters:
anotherPeptide - the other peptide
Returns:
a boolean indicating whether the other peptide has the same variable modifications as the peptide of interest

sameModificationsAs

public boolean sameModificationsAs(Peptide anotherPeptide,
                                   ArrayList<String> ptms)
Indicates whether another peptide has the same modifications at the same localization as this peptide. This method complements isSameAs: here the localization of all PTMs is taken into account.

Parameters:
anotherPeptide - another peptide
ptms - the PTMs
Returns:
true if the other peptide has the same modifications at the same locations as this peptide

sameModificationsAs

public boolean sameModificationsAs(Peptide anotherPeptide)
Indicates whether another peptide has the same modifications at the same localization as this peptide. This method complements isSameAs: here the localization of all PTMs is taken into account.

Parameters:
anotherPeptide - another peptide
Returns:
true if the other peptide has the same modifications at the same locations as this peptide
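
The difference between comparing modification status (localization ignored) and comparing modifications with their localization can be illustrated with a small sketch. Here each peptide's modifications are reduced to a hypothetical map from 1-based site to modification name; the class, the methods and that representation are assumptions for this example, not the library's data model.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ModificationComparisonSketch {

    /** Same modification names, whatever their positions. */
    static boolean sameStatus(Map<Integer, String> a, Map<Integer, String> b) {
        List<String> namesA = new ArrayList<>(a.values());
        List<String> namesB = new ArrayList<>(b.values());
        Collections.sort(namesA);
        Collections.sort(namesB);
        return namesA.equals(namesB);
    }

    /** Same modification names at the same positions. */
    static boolean sameLocalization(Map<Integer, String> a, Map<Integer, String> b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Map<Integer, String> first = new HashMap<>();
        first.put(3, "phosphorylation");
        Map<Integer, String> second = new HashMap<>();
        second.put(5, "phosphorylation");
        System.out.println(sameStatus(first, second));
        System.out.println(sameLocalization(first, second));
    }
}
```

Two peptides carrying one phosphorylation each, at sites 3 and 5, have the same status but not the same localization.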

getNTerminal

public String getNTerminal()
Returns the N-terminal of the peptide as a String. Returns "NH2" if the terminal is not modified, otherwise returns the name of the modification. Warning: this method only works if the PTMs found in the peptide are in the PTMFactory.

Returns:
the N-terminal of the peptide as a String, e.g., "NH2"

getCTerminal

public String getCTerminal()
Returns the C-terminal of the peptide as a String. Returns "COOH" if the terminal is not modified, otherwise returns the name of the modification. Warning: this method only works if the PTMs found in the peptide are in the PTMFactory.

Returns:
the C-terminal of the peptide as a String, e.g., "COOH"

getTaggedModifiedSequence

public String getTaggedModifiedSequence(ModificationProfile modificationProfile,
                                        boolean useHtmlColorCoding,
                                        boolean includeHtmlStartEndTags,
                                        boolean useShortName,
                                        boolean excludeAllFixedPtms)
Returns the modified sequence as a tagged string with potential modification sites color coded or with PTM tags, e.g., <mox>. Warning: this method only works if the PTMs found in the peptide are in the PTMFactory. It uses the modifications as set in the modification matches of this peptide and displays all of them.

Parameters:
modificationProfile - the modification profile of the search
useHtmlColorCoding - if true, color coded HTML is used, otherwise PTM tags, e.g, <mox>, are used
includeHtmlStartEndTags - if true, start and end HTML tags are added
useShortName - if true the short names are used in the tags
excludeAllFixedPtms - if true, all fixed PTMs are excluded
Returns:
the modified sequence as a tagged string

getTaggedModifiedSequence

public String getTaggedModifiedSequence(ModificationProfile modificationProfile,
                                        boolean useHtmlColorCoding,
                                        boolean includeHtmlStartEndTags,
                                        boolean useShortName)
Returns the modified sequence as a tagged string with potential modification sites color coded or with PTM tags, e.g., <mox>. Warning: this method only works if the PTMs found in the peptide are in the PTMFactory. It uses the modifications as set in the modification matches of this peptide and displays all of them.

Parameters:
modificationProfile - the modification profile of the search
useHtmlColorCoding - if true, color coded HTML is used, otherwise PTM tags, e.g, <mox>, are used
includeHtmlStartEndTags - if true, start and end HTML tags are added
useShortName - if true the short names are used in the tags
Returns:
the modified sequence as a tagged string

getTaggedModifiedSequence

public static String getTaggedModifiedSequence(ModificationProfile modificationProfile,
                                               Peptide peptide,
                                               HashMap<Integer,ArrayList<String>> mainModificationSites,
                                               HashMap<Integer,ArrayList<String>> secondaryModificationSites,
                                               HashMap<Integer,ArrayList<String>> fixedModificationSites,
                                               boolean useHtmlColorCoding,
                                               boolean includeHtmlStartEndTags,
                                               boolean useShortName)
Returns the modified sequence as a tagged string with potential modification sites color coded or with PTM tags, e.g., <mox>. Warning: this method only works if the PTMs found in the peptide are in the PTMFactory. It uses the modifications as set in the modification matches of this peptide and displays all of them.

Parameters:
modificationProfile - the modification profile of the search
includeHtmlStartEndTags - if true, start and end HTML tags are added
peptide - the peptide to annotate
mainModificationSites - the main variable modification sites in a map: aa number -> list of modifications (1 is the first AA) (can be null)
secondaryModificationSites - the secondary variable modification sites in a map: aa number -> list of modifications (1 is the first AA) (can be null)
fixedModificationSites - the fixed modification sites in a map: aa number -> list of modifications (1 is the first AA) (can be null)
useHtmlColorCoding - if true, color coded HTML is used, otherwise PTM tags, e.g, <mox>, are used
useShortName - if true the short names are used in the tags
Returns:
the tagged modified sequence as a string
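
To make the PTM-tag style (as opposed to HTML color coding) more concrete, here is a minimal sketch that works on a map of the same shape as the site maps above: aa number -> list of modification short names. It assumes that each tag is written directly after the modified residue. That placement, the class name and the method are assumptions for illustration; termini and HTML output are left out, and this is not the library's implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class TaggedSequenceSketch {

    /** Appends <shortName> after each residue listed in the site map (1 is the first AA). */
    static String tag(String sequence, Map<Integer, List<String>> sites) {
        StringBuilder tagged = new StringBuilder();
        for (int i = 0; i < sequence.length(); i++) {
            tagged.append(sequence.charAt(i));
            List<String> shortNames = sites.get(i + 1);
            if (shortNames != null) {
                for (String shortName : shortNames) {
                    tagged.append('<').append(shortName).append('>');
                }
            }
        }
        return tagged.toString();
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> sites = Collections.singletonMap(5, Arrays.asList("mox"));
        System.out.println(tag("PEPTMIDE", sites));
    }
}
```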

getModifiedIndexes

public ArrayList<Integer> getModifiedIndexes()
Returns the indexes of the residues in the peptide that contain at least one variable modification.

Returns:
the indexes of the modified residues

getModifiedIndexes

public ArrayList<Integer> getModifiedIndexes(boolean excludeFixed)
Returns the indexes of the residues in the peptide that contain at least one modification.

Parameters:
excludeFixed - exclude fixed PTMs
Returns:
the indexes of the modified residues

getIndexedFixedModifications

public HashMap<Integer,ArrayList<String>> getIndexedFixedModifications()
Returns a map of all fixed modifications, indexed by amino acid position (1 is the first) -> list of modification names.

Returns:
a map of all fixed modifications indexed by amino acid position

estimateTheoreticMass

public void estimateTheoreticMass()
                           throws IllegalArgumentException
Estimates the theoretical mass of the peptide. Any previously stored value is silently overwritten.

Throws:
IllegalArgumentException - if the peptide sequence contains unknown amino acids
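
For orientation, a conventional way to estimate a peptide's monoisotopic mass is to sum the residue masses and add one water. The sketch below follows that convention for unmodified sequences only. Whether the library computes the mass exactly this way is not stated here, and the class, method and mass table are assumptions of this example. Like the method above, it throws an IllegalArgumentException for unknown amino acids.

```java
import java.util.HashMap;
import java.util.Map;

final class PeptideMassSketch {

    // Monoisotopic mass of water (H2O), added once for the two termini.
    static final double WATER = 18.010565;

    // Monoisotopic residue masses of the 20 standard amino acids.
    static final Map<Character, Double> RESIDUE_MASS = new HashMap<>();

    static {
        String residues = "GASPVTCLINDQKEMHFRYW";
        double[] masses = {
            57.02146, 71.03711, 87.03203, 97.05276, 99.06841,
            101.04768, 103.00919, 113.08406, 113.08406, 114.04293,
            115.02694, 128.05858, 128.09496, 129.04259, 131.04049,
            137.05891, 147.06841, 156.10111, 163.06333, 186.07931
        };
        for (int i = 0; i < residues.length(); i++) {
            RESIDUE_MASS.put(residues.charAt(i), masses[i]);
        }
    }

    /** Sums residue masses plus water; modifications are not considered. */
    static double estimateMass(String sequence) {
        double mass = WATER;
        for (int i = 0; i < sequence.length(); i++) {
            Double residueMass = RESIDUE_MASS.get(sequence.charAt(i));
            if (residueMass == null) {
                throw new IllegalArgumentException("Unknown amino acid: " + sequence.charAt(i));
            }
            mass += residueMass;
        }
        return mass;
    }

    public static void main(String[] args) {
        System.out.println(estimateMass("PEPTIDE"));
    }
}
```

For "PEPTIDE" this gives approximately 799.36 Da; modification masses would have to be added on top for a modified peptide.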

isNterm

public ArrayList<String> isNterm(ProteinMatch.MatchingType matchingType,
                                 Double massTolerance)
                          throws IOException,
                                 IllegalArgumentException,
                                 InterruptedException,
                                 FileNotFoundException,
                                 ClassNotFoundException
Returns a list of proteins where this peptide can be found in the N-terminus. The proteins must be accessible via the sequence factory. If none is found, an empty list is returned.

Parameters:
matchingType - the type of sequence matching
massTolerance - the mass tolerance for matching type 'indistiguishibleAminoAcids'. Can be null otherwise
Returns:
a list of proteins where this peptide can be found in the N-terminus
Throws:
IOException - exception thrown whenever an error occurred while reading the protein sequence
IllegalArgumentException - exception thrown whenever an error occurred while reading the protein sequence
InterruptedException - exception thrown whenever an error occurred while reading the protein sequence
FileNotFoundException
ClassNotFoundException

isCterm

public ArrayList<String> isCterm(ProteinMatch.MatchingType matchingType,
                                 Double massTolerance)
                          throws IOException,
                                 IllegalArgumentException,
                                 InterruptedException,
                                 FileNotFoundException,
                                 ClassNotFoundException
Returns a list of proteins where this peptide can be found in the C-terminus. The proteins must be accessible via the sequence factory. If none is found, an empty list is returned.

Parameters:
matchingType - the type of sequence matching
massTolerance - the mass tolerance for matching type 'indistiguishibleAminoAcids'. Can be null otherwise
Returns:
a list of proteins where this peptide can be found in the C-terminus
Throws:
IOException - exception thrown whenever an error occurred while reading a protein sequence
IllegalArgumentException - exception thrown whenever an error occurred while reading a protein sequence
InterruptedException - exception thrown whenever an error occurred while reading a protein sequence
FileNotFoundException
ClassNotFoundException

getSequenceAsPattern

public AminoAcidPattern getSequenceAsPattern()
Returns the sequence of this peptide as AminoAcidPattern.

Returns:
the sequence of this peptide as AminoAcidPattern

getSequenceAsPattern

public static AminoAcidPattern getSequenceAsPattern(String sequence)
Returns the given sequence as AminoAcidPattern.

Parameters:
sequence - the sequence of interest
Returns:
the sequence as AminoAcidPattern

isDecoy

public boolean isDecoy()
Indicates whether a peptide can be derived from a decoy protein.

Returns:
whether a peptide can be derived from a decoy protein

getNoModPeptide

public static Peptide getNoModPeptide(Peptide peptide,
                                      ArrayList<PTM> ptms)
Returns a version of the peptide which does not contain the inspected PTMs.

Parameters:
peptide - the original peptide
ptms - list of inspected PTMs
Returns:
an unmodified version of the peptide


Copyright © 2013. All Rights Reserved.