java.lang.Object
com.compomics.util.experiment.personalization.ExperimentObject
com.compomics.util.experiment.biology.aminoacids.sequence.AminoAcidPattern
All Implemented Interfaces:
Serializable

public class AminoAcidPattern
extends ExperimentObject
An amino acid pattern is a sequence of amino acids. For example, the pattern for trypsin is: target R or K, not followed by P. IMPORTANT: by default, the index of the target residue is 0.
Author:
Marc Vaudel, Dominik Kopczynski
See Also:
Serialized Form
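As a minimal sketch of the idea in the class description, the trypsin rule can be modelled with two plain maps, one for targeted and one for excluded residues per index. The class TrypsinRuleSketch and its members are hypothetical and use only the Java standard library; they do not reproduce the internals of AminoAcidPattern.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical stand-alone model of "target R or K, not followed by P". */
final class TrypsinRuleSketch {

    // Index 0 is the target residue, as in the class description.
    static final Map<Integer, List<Character>> TARGETED = Map.of(0, List.of('R', 'K'));
    // Index 1 must not be proline.
    static final Map<Integer, List<Character>> EXCLUDED = Map.of(1, List.of('P'));

    /** Returns true if the rule holds with index 0 of the rule placed at position pos (0-based). */
    static boolean matchesAt(String sequence, int pos) {
        if (pos < 0 || pos + 1 >= sequence.length()) {
            return false; // the whole two-residue rule must fit in the sequence
        }
        char target = Character.toUpperCase(sequence.charAt(pos));
        char next = Character.toUpperCase(sequence.charAt(pos + 1));
        return TARGETED.get(0).contains(target) && !EXCLUDED.get(1).contains(next);
    }

    public static void main(String[] args) {
        System.out.println(matchesAt("AKTG", 1)); // K followed by T
        System.out.println(matchesAt("AKPG", 1)); // K followed by P
    }
}
```

The sketch rejects a K or R in the last position because the rule cannot be mapped entirely onto the sequence, mirroring the behaviour documented for matchesAt.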
  • Constructor Details

    • AminoAcidPattern

      public AminoAcidPattern()
      Creates a blank pattern. All maps are null.
    • AminoAcidPattern

      public AminoAcidPattern​(AminoAcidPattern aminoAcidPattern)
      Creates a pattern from another pattern.
      Parameters:
      aminoAcidPattern - the other pattern
    • AminoAcidPattern

      public AminoAcidPattern​(ArrayList<String> targetResidues)
      Convenience constructor taking a list of targeted residues as input, for instance (S, T, Y).
      Parameters:
      targetResidues - a list of targeted residues
  • Method Details

    • getAminoAcidPatternFromString

      public static AminoAcidPattern getAminoAcidPatternFromString​(String aminoAcidPatternAsString)
      Parses the amino acid pattern from the given string as created by the toString() method.
      Parameters:
      aminoAcidPatternAsString - the amino acid pattern as created by the toString() method
      Returns:
      the amino acid pattern
    • getAminoAcidPatternFromString

      public static AminoAcidPattern getAminoAcidPatternFromString​(String aminoAcidPatternAsString, int startIndex)
      Parses the amino acid pattern from the given string as created by the toString() method.
      Parameters:
      aminoAcidPatternAsString - the amino acid pattern as created by the toString() method
      startIndex - the start index of the pattern
      Returns:
      the amino acid pattern
    • getAaTargeted

      public HashMap<Integer,​ArrayList<Character>> getAaTargeted()
      Returns the map of targeted amino acids. Null if not set.
      Returns:
      the map of targeted amino acids
    • swapRows

      public void swapRows​(int fromRow, int toRow)
      Swap two rows in the pattern. The first amino acid is 0.
      Parameters:
      fromRow - from row
      toRow - to row
    • getTarget

      public int getTarget()
      Returns the index of the amino acid of interest in the pattern. The default is 0.
      Returns:
      the index of the amino acid of interest in the pattern.
    • getMinIndex

      public int getMinIndex()
      Returns the minimal index where amino acids are found.
      Returns:
      the minimal index where amino acids are found
    • getMaxIndex

      public int getMaxIndex()
      Returns the maximal index where amino acids are found.
      Returns:
      the maximal index where amino acids are found
    • setTarget

      public void setTarget​(Integer target)
      Sets the index of the amino acid of interest in the pattern.
      Parameters:
      target - the index of the amino acid of interest in the pattern.
    • getAminoAcidsAtTarget

      public ArrayList<Character> getAminoAcidsAtTarget()
      Returns the targeted amino acids at position "target". An empty list if none.
      Returns:
      the targeted amino acids at position "target"
    • getAminoAcidsAtTargetSet

      public HashSet<Character> getAminoAcidsAtTargetSet()
      Returns a set containing the amino acids at target.
      Returns:
      a set containing the amino acids at target
    • setTargeted

      public void setTargeted​(int index, ArrayList<Character> targets)
      Sets the amino acids targeted at a given index. The first amino acid is 0. Previous value will be silently overwritten.
      Parameters:
      index - the index in the pattern
      targets - the amino acids targeted
    • setExcluded

      public void setExcluded​(int index, ArrayList<Character> exceptions)
      Excludes the given amino acids from the targeted amino acids at the given index.
      Parameters:
      index - the index of the excluded amino acid
      exceptions - the amino acids to exclude
    • getTargetedAA

      public ArrayList<Character> getTargetedAA​(int index)
      Returns the targeted amino acids at a given index in the pattern. The first amino acid is 0.
      Parameters:
      index - the index in the pattern
      Returns:
      the targeted amino acids
    • getNTargetedAA

      public int getNTargetedAA​(int index)
      Returns the number of targeted amino acids at the given index. The first amino acid is 0.
      Parameters:
      index - the index of interest
      Returns:
      the number of targeted amino acids
    • removeAA

      public void removeAA​(int index)
      Removes an amino acid index from the pattern. The first amino acid is 0.
      Parameters:
      index - the index of the amino acid to remove
    • getAsStringPattern

      public Pattern getAsStringPattern​(SequenceMatchingParameters sequenceMatchingParameters, boolean includeMutations)
      Returns the amino acid pattern as a case-insensitive pattern for String matching.
      Parameters:
      sequenceMatchingParameters - the sequence matching preferences
      includeMutations - if true mutated amino acids will be included
      Returns:
      the amino acid pattern as a Java string pattern
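To illustrate what a case-insensitive pattern for string matching can look like, the sketch below builds a java.util.regex.Pattern from per-index lists of targeted and excluded residues. The helper toRegex and its map-based inputs are assumptions made for illustration; how getAsStringPattern uses the sequence matching parameters and mutations is not covered here.

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of AminoAcidPattern. */
final class RegexPatternSketch {

    /** Builds one regex element per index from 0 to length - 1. */
    static Pattern toRegex(Map<Integer, List<Character>> targeted,
                           Map<Integer, List<Character>> excluded,
                           int length) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < length; i++) {
            List<Character> allowed = targeted.get(i);
            List<Character> forbidden = excluded.get(i);
            if (allowed != null && !allowed.isEmpty()) {
                regex.append('[');
                for (char aa : allowed) {
                    regex.append(aa);
                }
                regex.append(']');
            } else if (forbidden != null && !forbidden.isEmpty()) {
                // Negative lookahead: the next residue must not be one of these.
                regex.append("(?![");
                for (char aa : forbidden) {
                    regex.append(aa);
                }
                regex.append("])[A-Z]");
            } else {
                regex.append("[A-Z]"); // no constraint at this index
            }
        }
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE);
    }

    public static void main(String[] args) {
        Pattern trypsin = toRegex(Map.of(0, List.of('R', 'K')), Map.of(1, List.of('P')), 2);
        System.out.println(trypsin.matcher("akt").find());
    }
}
```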
    • getPrositeFormat

      public String getPrositeFormat()
      Returns the pattern in the PROSITE format.
      Returns:
      the pattern in the PROSITE format
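For orientation, PROSITE notation separates positions with '-', lists alternatives in square brackets, lists excluded residues in curly braces, and writes 'x' for any residue. The sketch below renders per-index residue lists in that notation. The helper toProsite is hypothetical, and the exact output of getPrositeFormat is not shown in this documentation.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of AminoAcidPattern. */
final class PrositeSketch {

    static String toProsite(Map<Integer, List<Character>> targeted,
                            Map<Integer, List<Character>> excluded,
                            int length) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++) {
            if (i > 0) {
                sb.append('-'); // positions are separated by '-'
            }
            List<Character> allowed = targeted.get(i);
            List<Character> forbidden = excluded.get(i);
            if (allowed != null && allowed.size() == 1) {
                sb.append(allowed.get(0)); // single residue: plain letter
            } else if (allowed != null && !allowed.isEmpty()) {
                sb.append('[').append(join(allowed)).append(']'); // alternatives
            } else if (forbidden != null && !forbidden.isEmpty()) {
                sb.append('{').append(join(forbidden)).append('}'); // exclusions
            } else {
                sb.append('x'); // any residue
            }
        }
        return sb.toString();
    }

    private static String join(List<Character> residues) {
        StringBuilder sb = new StringBuilder();
        for (char aa : residues) {
            sb.append(aa);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Trypsin rule: R or K at index 0, not P at index 1.
        System.out.println(toProsite(Map.of(0, List.of('R', 'K')), Map.of(1, List.of('P')), 2)); // prints [RK]-{P}
    }
}
```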
    • getIndexes

      public int[] getIndexes​(String input, SequenceMatchingParameters sequenceMatchingParameters)
      Returns the indexes where the amino acid pattern was found in the input. 1 is the first amino acid.
      Parameters:
      input - the amino acid input sequence as string
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      an array of indexes where the amino acid pattern was found
    • getIndexes

      public ArrayList<Integer> getIndexes​(AminoAcidPattern input, SequenceMatchingParameters sequenceMatchingParameters)
      Returns the indexes where the amino acid pattern was found in the input. 1 is the first amino acid.
      Parameters:
      input - the amino acid input sequence as AminoAcidPattern
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      a list of indexes where the amino acid pattern was found
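Note that getIndexes counts from 1 whereas the firstIndex methods on this page count from 0. The sketch below shows the two conventions side by side on a plain regular expression. Both helpers are hypothetical, and the sketch reports the start of each match; whether the library reports the start of the pattern or the position of the target residue is not stated here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helpers contrasting 1-based and 0-based reporting. */
final class IndexConventionSketch {

    /** All match start positions, 1-based, including overlapping matches. */
    static List<Integer> oneBasedIndexes(Pattern pattern, String sequence) {
        List<Integer> indexes = new ArrayList<>();
        Matcher matcher = pattern.matcher(sequence);
        int from = 0;
        while (from <= sequence.length() && matcher.find(from)) {
            indexes.add(matcher.start() + 1); // convert to 1-based
            from = matcher.start() + 1;       // resume right after the last match start
        }
        return indexes;
    }

    /** First match start position, 0-based, or -1 if there is no match. */
    static int zeroBasedFirstIndex(Pattern pattern, String sequence) {
        Matcher matcher = pattern.matcher(sequence);
        return matcher.find() ? matcher.start() : -1;
    }

    public static void main(String[] args) {
        Pattern trypsin = Pattern.compile("[RK](?!P)");
        System.out.println(oneBasedIndexes(trypsin, "AKRPK"));     // prints [2, 5]
        System.out.println(zeroBasedFirstIndex(trypsin, "AKRPK")); // prints 1
    }
}
```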
    • firstIndex

      public int firstIndex​(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
      Returns the first index where the amino acid pattern is found. -1 if not found. 0 is the first amino acid.
      Parameters:
      aminoAcidSequence - the amino acid sequence to look into
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      the first index where the amino acid pattern is found
    • firstIndex

      public int firstIndex​(AminoAcidSequence aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
      Returns the first index where the amino acid pattern is found. -1 if not found. 0 is the first amino acid.
      Parameters:
      aminoAcidSequence - the amino acid sequence to look into
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      the first index where the amino acid pattern is found
    • firstIndex

      public int firstIndex​(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
      Returns the first index where the amino acid pattern is found. -1 if not found. 0 is the first amino acid.
      Parameters:
      aminoAcidPattern - the amino acid sequence to look into, as an AminoAcidPattern
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      the first index where the amino acid pattern is found
    • contains

      public boolean contains​(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the pattern contains a subsequence of amino acids.
      Parameters:
      aminoAcidSequence - the amino acid sequence to look for
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      true if the pattern contains the given subsequence of amino acids
    • contains

      public boolean contains​(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the pattern contains a subsequence of amino acids.
      Parameters:
      aminoAcidPattern - the amino acid sequence to look for, as an AminoAcidPattern
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      true if the pattern contains the given subsequence of amino acids
    • firstIndex

      public int firstIndex​(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters, int startIndex)
      Returns the first index where the amino acid pattern is found. -1 if not found. 0 is the first amino acid.
      Parameters:
      aminoAcidSequence - the amino acid sequence to look into
      sequenceMatchingParameters - the sequence matching preferences
      startIndex - the index at which to start searching
      Returns:
      the first index where the amino acid pattern is found
    • firstIndex

      public int firstIndex​(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters, int startIndex)
      Returns the first index where the amino acid pattern is found in the given pattern. -1 if not found. 0 is the first amino acid.
      Parameters:
      aminoAcidPattern - the amino acid pattern to look into
      sequenceMatchingParameters - the sequence matching preferences
      startIndex - the index at which to start searching
      Returns:
      the first index where the amino acid pattern is found
    • isTargeted

      public boolean isTargeted​(Character aa, int index, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the given amino acid at the given index of the pattern is targeted without accounting for mutations.
      Parameters:
      aa - the amino acid as character
      index - the index in the pattern
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      true if the given amino acid at the given index of the pattern is targeted
    • matchesIn

      public boolean matchesIn​(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the pattern is found in the given amino acid sequence.
      Parameters:
      aminoAcidSequence - the amino acid sequence
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      a boolean indicating whether the pattern is found in the given amino acid sequence
    • matchesIn

      public boolean matchesIn​(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the pattern is found in the given amino acid sequence.
      Parameters:
      aminoAcidPattern - the amino acid sequence
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      a boolean indicating whether the pattern is found in the given amino acid sequence
    • matchesAt

      public boolean matchesAt​(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters, int index)
      Indicates whether the pattern is found in the given amino acid sequence at the given index, where 0 is the first amino acid. Returns false if the entire pattern cannot be mapped to the sequence.
      Parameters:
      aminoAcidSequence - the amino acid sequence
      sequenceMatchingParameters - the sequence matching preferences
      index - the index at which the matching should be done
      Returns:
      a boolean indicating whether the pattern is found in the given amino acid sequence at the given index
    • matches

      public boolean matches​(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the pattern matches the given amino acid sequence.
      Parameters:
      aminoAcidSequence - the amino acid sequence
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      a boolean indicating whether the pattern matches the given amino acid sequence
    • matches

      public boolean matches​(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the pattern matches the given amino acid sequence.
      Parameters:
      aminoAcidPattern - the amino acid sequence
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      a boolean indicating whether the pattern matches the given amino acid sequence
    • isStarting

      public boolean isStarting​(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the given amino acid sequence starts with the pattern.
      Parameters:
      aminoAcidSequence - the amino acid sequence
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      a boolean indicating whether the given amino acid sequence starts with the pattern
    • isStarting

      public boolean isStarting​(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the given amino acid sequence starts with the pattern.
      Parameters:
      aminoAcidPattern - the amino acid sequence
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      a boolean indicating whether the given amino acid sequence starts with the pattern
    • isEnding

      public boolean isEnding​(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the given amino acid sequence ends with the pattern.
      Parameters:
      aminoAcidPattern - the amino acid sequence
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      a boolean indicating whether the given amino acid sequence ends with the pattern
    • isEnding

      public boolean isEnding​(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the given amino acid sequence ends with the pattern.
      Parameters:
      aminoAcidSequence - the amino acid sequence
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      a boolean indicating whether the given amino acid sequence ends with the pattern
    • isSameAs

      public boolean isSameAs​(AminoAcidPattern anotherPattern, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether another AminoAcidPattern targets the same pattern. Modifications are considered equal when of same mass. Modifications should be loaded in the Modification factory.
      Parameters:
      anotherPattern - the other AminoAcidPattern
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      true if the other AminoAcidPattern targets the same pattern
    • length

      public int length()
      Returns the length of the pattern in amino acids.
      Returns:
      the length of the pattern in amino acids
    • getStandardSearchPattern

      public AminoAcidPattern getStandardSearchPattern()
      Computes a pattern which can be searched by standard search engines, i.e., a pattern targeting a single amino acid and not a complex pattern.
      Returns:
      a pattern which can be searched by standard search engines
    • getTrypsinExample

      public static AminoAcidPattern getTrypsinExample()
      Returns the trypsin example as amino acid pattern.
      Returns:
      the trypsin example as amino acid pattern
    • merge

      public void merge​(AminoAcidPattern otherPattern)
      Simple merger for two patterns. Example: this: target{0>S}; otherPattern: target{0>T}; result (this): target{0>S|T}.
      Parameters:
      otherPattern - another pattern to be merged with this
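The example in the description can be sketched on plain maps as an index-by-index union of residue lists. The helper mergeInto is hypothetical, and skipping duplicate residues is an assumption of this sketch rather than documented behaviour of merge.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of AminoAcidPattern. */
final class MergeSketch {

    /** Adds the residues of other to target, index by index, skipping duplicates. */
    static void mergeInto(Map<Integer, List<Character>> target,
                          Map<Integer, List<Character>> other) {
        for (Map.Entry<Integer, List<Character>> entry : other.entrySet()) {
            List<Character> residues = target.computeIfAbsent(entry.getKey(), k -> new ArrayList<>());
            for (Character aa : entry.getValue()) {
                if (!residues.contains(aa)) {
                    residues.add(aa);
                }
            }
        }
    }

    public static void main(String[] args) {
        Map<Integer, List<Character>> thisPattern = new HashMap<>();
        thisPattern.put(0, new ArrayList<>(List.of('S'))); // target{0>S}
        mergeInto(thisPattern, Map.of(0, List.of('T')));   // otherPattern: target{0>T}
        System.out.println(thisPattern);                   // prints {0=[S, T]}
    }
}
```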
    • append

      public void append​(AminoAcidPattern otherPattern)
      Appends another pattern at the end of this pattern.
      Parameters:
      otherPattern - the other pattern to append.
    • merge

      public static AminoAcidPattern merge​(AminoAcidPattern pattern1, AminoAcidPattern pattern2)
      Convenience method merging two different patterns (see public void merge(AminoAcidPattern otherPattern) for details of the merging procedure).
      Parameters:
      pattern1 - the first pattern
      pattern2 - the second pattern
      Returns:
      a merged version of the two patterns
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • asStringBuilder

      public StringBuilder asStringBuilder()
      Returns the sequence represented by this amino acid pattern in a new string builder.
      Returns:
      the sequence represented by this amino acid pattern in a new string builder
    • asSequence

      public String asSequence​(int index)
      Returns the component of the amino acid pattern at the given index. 0 is the first amino acid.
      Parameters:
      index - the index in the pattern. 0 is the first amino acid
      Returns:
      the component of the amino acid pattern at the given index
    • addModificationSite

      public void addModificationSite​(int localization, ArrayList<Character> ModificationSite)
      Adds a modification site to the amino acid pattern: the given amino acids are set as valid for the site at the given index.
      Parameters:
      localization - the index of the amino acid residue site
      ModificationSite - valid amino acids for this site
    • getAllPossibleSequences

      public ArrayList<String> getAllPossibleSequences()
      Returns all possible sequences which can be obtained from the targeted amino acids. Missing amino acids will be denoted as 'X'. This does not implement excluded amino acids.
      Returns:
      all possible sequences which can be obtained from the targeted amino acids
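The enumeration described above amounts to a Cartesian product over the targeted residues at each index, with 'X' standing in where nothing is targeted. The sketch below shows this on a plain map; the helper allSequences and the explicit length parameter are assumptions made for illustration, and, as in the description, excluded residues are ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of AminoAcidPattern. */
final class AllSequencesSketch {

    static List<String> allSequences(Map<Integer, List<Character>> targeted, int length) {
        List<String> result = new ArrayList<>();
        result.add(""); // start with one empty prefix
        for (int i = 0; i < length; i++) {
            List<Character> residues = targeted.get(i);
            if (residues == null || residues.isEmpty()) {
                residues = List.of('X'); // missing amino acid
            }
            // Extend every prefix with every residue allowed at index i.
            List<String> extended = new ArrayList<>(result.size() * residues.size());
            for (String prefix : result) {
                for (char aa : residues) {
                    extended.add(prefix + aa);
                }
            }
            result = extended;
        }
        return result;
    }

    public static void main(String[] args) {
        // S or T at index 0, nothing at index 1, P at index 2.
        System.out.println(allSequences(Map.of(0, List.of('S', 'T'), 2, List.of('P')), 3)); // prints [SXP, TXP]
    }
}
```

The number of sequences is the product of the list sizes, so patterns with many alternatives at many indexes grow quickly.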
    • getSubPattern

      public AminoAcidPattern getSubPattern​(int startIndex, int endIndex, boolean updateTarget)
      Returns a sub pattern of the pattern.
      Parameters:
      startIndex - the start index, inclusive (0 is the first amino acid)
      endIndex - the end index, inclusive
      updateTarget - boolean indicating whether the target of the pattern shall be updated. If true, the target is shifted by startIndex; otherwise it is copied unchanged.
      Returns:
      a sub pattern
    • getSubPattern

      public AminoAcidPattern getSubPattern​(int startIndex, boolean updateTarget)
      Returns a sub pattern of the pattern.
      Parameters:
      startIndex - the start index, inclusive (0 is the first amino acid)
      updateTarget - boolean indicating whether the target of the pattern shall be updated. If true, the target is shifted by startIndex; otherwise it is copied unchanged.
      Returns:
      a sub pattern
    • reverse

      public AminoAcidPattern reverse()
      Returns an amino acid pattern which is a reversed version of the current pattern.
      Returns:
      an amino acid pattern which is a reversed version of the current pattern
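On a plain map, reversing a pattern can be sketched as mirroring each index around the pattern length. The helper reverse below is hypothetical and assumes indexes run from 0 to length - 1; how the library treats the target index when reversing is not stated here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of AminoAcidPattern. */
final class ReverseSketch {

    /** Returns a new map in which index i of the input becomes index length - 1 - i. */
    static Map<Integer, List<Character>> reverse(Map<Integer, List<Character>> targeted, int length) {
        Map<Integer, List<Character>> reversed = new HashMap<>();
        for (Map.Entry<Integer, List<Character>> entry : targeted.entrySet()) {
            // Copy the list so the input map is left untouched.
            reversed.put(length - 1 - entry.getKey(), new ArrayList<>(entry.getValue()));
        }
        return reversed;
    }

    public static void main(String[] args) {
        Map<Integer, List<Character>> pattern = Map.of(0, List.of('R', 'K'), 1, List.of('P'));
        System.out.println(reverse(pattern, 2));
    }
}
```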