java.lang.Object
com.compomics.util.experiment.personalization.ExperimentObject
com.compomics.util.experiment.biology.aminoacids.sequence.AminoAcidPattern
All Implemented Interfaces:
Serializable

public class AminoAcidPattern extends ExperimentObject
An amino acid pattern is a sequence of amino acids. For example, for trypsin: target R or K, not followed by P. IMPORTANT: by default, the index of the target residue is 0.
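The trypsin example can be pictured without the library as two per-index maps, one holding the targeted residues and one holding the excluded residues. The following is a minimal sketch under that reading: the class PatternSketch and its methods are hypothetical and are not the implementation of AminoAcidPattern, and treating an index with no targeted residues as accepting any residue is an assumption.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of a pattern; not the library's implementation.
class PatternSketch {

    // Residues allowed at each pattern index (index 0 is the target by default).
    final Map<Integer, List<Character>> targeted = new HashMap<>();
    // Residues forbidden at each pattern index.
    final Map<Integer, List<Character>> excluded = new HashMap<>();
    // Number of positions covered by the pattern.
    final int length;

    PatternSketch(int length) {
        this.length = length;
    }

    // True if the residue is acceptable at the given pattern index.
    boolean accepts(char aa, int index) {
        List<Character> allowed = targeted.get(index);
        List<Character> forbidden = excluded.get(index);
        if (forbidden != null && forbidden.contains(aa)) {
            return false;
        }
        // Assumption: an index without targeted residues accepts anything.
        return allowed == null || allowed.isEmpty() || allowed.contains(aa);
    }

    // True if the whole pattern maps onto the sequence starting at 'start' (0-based).
    boolean matchesAt(String sequence, int start) {
        if (start < 0 || start + length > sequence.length()) {
            return false; // the entire pattern cannot be mapped to the sequence
        }
        for (int i = 0; i < length; i++) {
            if (!accepts(sequence.charAt(start + i), i)) {
                return false;
            }
        }
        return true;
    }

    // Trypsin: R or K at index 0, not followed by P at index 1.
    static PatternSketch trypsin() {
        PatternSketch pattern = new PatternSketch(2);
        pattern.targeted.put(0, List.of('R', 'K'));
        pattern.excluded.put(1, List.of('P'));
        return pattern;
    }

    public static void main(String[] args) {
        PatternSketch trypsin = trypsin();
        System.out.println(trypsin.matchesAt("AKTG", 1)); // K followed by T: true
        System.out.println(trypsin.matchesAt("AKPG", 1)); // K followed by P: false
    }
}
```

The sketch returns false when the pattern would run past the end of the sequence, which mirrors the behaviour documented for matchesAt.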
Author:
Marc Vaudel, Dominik Kopczynski
  • Constructor Details

    • AminoAcidPattern

      public AminoAcidPattern()
      Creates a blank pattern. All maps are null.
    • AminoAcidPattern

      public AminoAcidPattern(AminoAcidPattern aminoAcidPattern)
      Creates a pattern from another pattern.
      Parameters:
      aminoAcidPattern - the other pattern
    • AminoAcidPattern

      public AminoAcidPattern(ArrayList<String> targetResidues)
      Convenience constructor taking a list of targeted residues as input, for instance (S, T, Y).
      Parameters:
      targetResidues - a list of targeted residues
  • Method Details

    • getAminoAcidPatternFromString

      public static AminoAcidPattern getAminoAcidPatternFromString(String aminoAcidPatternAsString)
      Parses the amino acid pattern from the given string as created by the toString() method.
      Parameters:
      aminoAcidPatternAsString - the amino acid pattern as created by the toString() method
      Returns:
      the amino acid pattern
    • getAminoAcidPatternFromString

      public static AminoAcidPattern getAminoAcidPatternFromString(String aminoAcidPatternAsString, int startIndex)
      Parses the amino acid pattern from the given string as created by the toString() method.
      Parameters:
      aminoAcidPatternAsString - the amino acid pattern as created by the toString() method
      startIndex - the start index of the pattern
      Returns:
      the amino acid pattern
    • getAaTargeted

      public HashMap<Integer,ArrayList<Character>> getAaTargeted()
      Returns the map of targeted amino acids. Null if not set.
      Returns:
      the map of targeted amino acids
    • swapRows

      public void swapRows(int fromRow, int toRow)
      Swaps two rows in the pattern. The first amino acid is 0.
      Parameters:
      fromRow - from row
      toRow - to row
    • getTarget

      public int getTarget()
      Returns the index of the amino acid of interest in the pattern. Null if none.
      Returns:
      the index of the amino acid of interest in the pattern.
    • getMinIndex

      public int getMinIndex()
      Returns the minimal index where amino acids are found.
      Returns:
      the minimal index where amino acids are found
    • getMaxIndex

      public int getMaxIndex()
      Returns the maximal index where amino acids are found.
      Returns:
      the maximal index where amino acids are found
    • setTarget

      public void setTarget(Integer target)
      Sets the index of the amino acid of interest in the pattern.
      Parameters:
      target - the index of the amino acid of interest in the pattern.
    • getAminoAcidsAtTarget

      public ArrayList<Character> getAminoAcidsAtTarget()
      Returns the targeted amino acids at position "target". An empty list if none.
      Returns:
      the targeted amino acids at position "target"
    • getAminoAcidsAtTargetSet

      public HashSet<Character> getAminoAcidsAtTargetSet()
      Returns a set containing the amino acids at target.
      Returns:
      a set containing the amino acids at target
    • setTargeted

      public void setTargeted(int index, ArrayList<Character> targets)
      Sets the amino acids targeted at a given index. The first amino acid is 0. Previous value will be silently overwritten.
      Parameters:
      index - the index in the pattern
      targets - the amino acids targeted
    • setExcluded

      public void setExcluded(int index, ArrayList<Character> exceptions)
      Excludes the given amino acids from the targeted amino acids at the given index.
      Parameters:
      index - the index of the excluded amino acid
      exceptions - the amino acids to exclude
    • getTargetedAA

      public ArrayList<Character> getTargetedAA(int index)
      Returns the targeted amino acids at a given index in the pattern. The first amino acid is 0.
      Parameters:
      index - the index in the pattern
      Returns:
      the targeted amino acids
    • getNTargetedAA

      public int getNTargetedAA(int index)
      Returns the number of targeted amino acids at the given index. The first amino acid is 0.
      Parameters:
      index - the index of interest
      Returns:
      the number of targeted amino acids
    • removeAA

      public void removeAA(int index)
      Removes an amino acid index from the pattern. The first amino acid is 0.
      Parameters:
      index - the index of the amino acid to remove
    • getAsStringPattern

      public Pattern getAsStringPattern(SequenceMatchingParameters sequenceMatchingParameters, boolean includeMutations)
      Returns the amino acid pattern as a case-insensitive pattern for String matching.
      Parameters:
      sequenceMatchingParameters - the sequence matching preferences
      includeMutations - if true mutated amino acids will be included
      Returns:
      the amino acid pattern as a Java string pattern
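      One way to picture such a string pattern is a java.util.regex.Pattern with one character class per pattern position, compiled with the CASE_INSENSITIVE flag. The sketch below is hypothetical: the class RegexSketch and the helper toRegex are illustrative names, and the regular expression actually produced by the library, including the effect of sequenceMatchingParameters and includeMutations, is not described here.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: one character class per pattern position.
class RegexSketch {

    // Builds e.g. "[STY]" from the residues targeted at one position.
    // Assumes the list holds at least one residue.
    static String characterClass(List<Character> residues) {
        StringBuilder sb = new StringBuilder("[");
        for (char aa : residues) {
            sb.append(aa);
        }
        return sb.append(']').toString();
    }

    // Compiles the per-position classes into a case-insensitive pattern.
    static Pattern toRegex(List<List<Character>> positions) {
        StringBuilder regex = new StringBuilder();
        for (List<Character> residues : positions) {
            regex.append(characterClass(residues));
        }
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE);
    }

    public static void main(String[] args) {
        // S, T or Y followed by P (illustrative input, not taken from the library).
        Pattern pattern = toRegex(List.of(List.of('S', 'T', 'Y'), List.of('P')));
        System.out.println(pattern.pattern());               // [STY][P]
        System.out.println(pattern.matcher("aatpk").find()); // true, case is ignored
    }
}
```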
    • getPrositeFormat

      public String getPrositeFormat()
      Returns the pattern in the PROSITE format.
      Returns:
      the pattern in the PROSITE format
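      In PROSITE pattern syntax, alternative residues at a position are written in square brackets, excluded residues in curly brackets, 'x' stands for any residue, and positions are separated by '-'. The sketch below formats the trypsin example that way. The class PrositeSketch and its helpers are hypothetical, and the exact string returned by getPrositeFormat() may differ.

```java
import java.util.List;

// Hypothetical sketch of PROSITE-style formatting; not the library's code.
class PrositeSketch {

    // Concatenates residues without separators, e.g. [R, K] -> "RK".
    static String join(List<Character> residues) {
        StringBuilder sb = new StringBuilder();
        for (char aa : residues) {
            sb.append(aa);
        }
        return sb.toString();
    }

    // Formats one position: allowed residues in [], forbidden in {}, 'x' for any.
    static String element(List<Character> allowed, List<Character> forbidden) {
        if (!allowed.isEmpty()) {
            return allowed.size() == 1
                    ? String.valueOf(allowed.get(0))
                    : "[" + join(allowed) + "]";
        }
        if (!forbidden.isEmpty()) {
            return "{" + join(forbidden) + "}";
        }
        return "x";
    }

    public static void main(String[] args) {
        // Trypsin: R or K, then anything but P.
        String prosite = element(List.of('R', 'K'), List.of())
                + "-" + element(List.of(), List.of('P'));
        System.out.println(prosite); // [RK]-{P}
    }
}
```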
    • getIndexes

      public int[] getIndexes(String input, SequenceMatchingParameters sequenceMatchingParameters)
      Returns the indexes where the amino acid pattern was found in the input. 1 is the first amino acid.
      Parameters:
      input - the amino acid input sequence as string
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      an array of indexes where the amino acid pattern was found
    • getIndexes

      public ArrayList<Integer> getIndexes(AminoAcidPattern input, SequenceMatchingParameters sequenceMatchingParameters)
      Returns the indexes where the amino acid pattern was found in the input. 1 is the first amino acid.
      Parameters:
      input - the amino acid input sequence as AminoAcidPattern
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      a list of indexes where the amino acid pattern was found
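      Note that getIndexes counts from 1, whereas the firstIndex methods count from 0. The difference can be sketched with a plain java.util.regex.Pattern. The class IndexSketch and both helpers are hypothetical, and reporting the position of the first residue of each match is an assumption: whether the library reports the start of the match or the position of the target residue is not stated here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch contrasting 1-based and 0-based index conventions.
class IndexSketch {

    // Returns the 1-based start positions of all matches, overlapping ones included.
    static List<Integer> oneBasedIndexes(Pattern pattern, String sequence) {
        List<Integer> indexes = new ArrayList<>();
        Matcher matcher = pattern.matcher(sequence);
        int from = 0;
        while (from <= sequence.length() && matcher.find(from)) {
            indexes.add(matcher.start() + 1); // convert 0-based to 1-based
            from = matcher.start() + 1;       // resume one residue further
        }
        return indexes;
    }

    // Returns the 0-based start of the first match, or -1 if there is none.
    static int zeroBasedFirstIndex(Pattern pattern, String sequence) {
        Matcher matcher = pattern.matcher(sequence);
        return matcher.find() ? matcher.start() : -1;
    }

    public static void main(String[] args) {
        // R or K followed by any residue other than P.
        Pattern trypsinLike = Pattern.compile("[RK][^P]");
        String sequence = "AKTRPKG";
        System.out.println(oneBasedIndexes(trypsinLike, sequence));     // [2, 6]
        System.out.println(zeroBasedFirstIndex(trypsinLike, sequence)); // 1
    }
}
```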
    • firstIndex

      public int firstIndex(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
      Returns the first index where the amino acid pattern is found. -1 if not found. 0 is the first amino acid.
      Parameters:
      aminoAcidSequence - the amino acid sequence to look into
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      the first index where the amino acid pattern is found
    • firstIndex

      public int firstIndex(AminoAcidSequence aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
      Returns the first index where the amino acid pattern is found. -1 if not found. 0 is the first amino acid.
      Parameters:
      aminoAcidSequence - the amino acid sequence to look into
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      the first index where the amino acid pattern is found
    • firstIndex

      public int firstIndex(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
      Returns the first index where the amino acid pattern is found. -1 if not found. 0 is the first amino acid.
      Parameters:
      aminoAcidPattern - the amino acid sequence to look into
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      the first index where the amino acid pattern is found
    • contains

      public boolean contains(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the pattern contains a subsequence of amino acids.
      Parameters:
      aminoAcidSequence - the amino acid sequence to look for
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      true if the pattern contains the given subsequence of amino acids
    • contains

      public boolean contains(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the pattern contains a subsequence of amino acids.
      Parameters:
      aminoAcidPattern - the amino acid sequence to look for
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      true if the pattern contains the given subsequence of amino acids
    • firstIndex

      public int firstIndex(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters, int startIndex)
      Returns the first index where the amino acid pattern is found. -1 if not found. 0 is the first amino acid.
      Parameters:
      aminoAcidSequence - the amino acid sequence to look into
      sequenceMatchingParameters - the sequence matching preferences
      startIndex - the index at which to start looking
      Returns:
      the first index where the amino acid pattern is found
    • firstIndex

      public int firstIndex(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters, int startIndex)
      Returns the first index where the amino acid pattern is found in the given pattern. -1 if not found. 0 is the first amino acid.
      Parameters:
      aminoAcidPattern - the amino acid sequence to look into
      sequenceMatchingParameters - the sequence matching preferences
      startIndex - the index at which to start looking
      Returns:
      the first index where the amino acid pattern is found
    • isTargeted

      public boolean isTargeted(Character aa, int index, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the given amino acid at the given index of the pattern is targeted without accounting for mutations.
      Parameters:
      aa - the amino acid as character
      index - the index in the pattern
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      true if the given amino acid at the given index of the pattern is targeted
    • matchesIn

      public boolean matchesIn(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the pattern is found in the given amino acid sequence.
      Parameters:
      aminoAcidSequence - the amino acid sequence
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      a boolean indicating whether the pattern is found in the given amino acid sequence
    • matchesIn

      public boolean matchesIn(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the pattern is found in the given amino acid sequence.
      Parameters:
      aminoAcidPattern - the amino acid sequence
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      a boolean indicating whether the pattern is found in the given amino acid sequence
    • matchesAt

      public boolean matchesAt(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters, int index)
      Indicates whether the pattern is found in the given amino acid sequence at the given index, where 0 is the first amino acid. Returns false if the entire pattern cannot be mapped to the sequence.
      Parameters:
      aminoAcidSequence - the amino acid sequence
      sequenceMatchingParameters - the sequence matching preferences
      index - the index at which the matching should be done
      Returns:
      a boolean indicating whether the pattern is found in the given amino acid sequence at the given index
    • matches

      public boolean matches(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the pattern matches the given amino acid sequence.
      Parameters:
      aminoAcidSequence - the amino acid sequence
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      a boolean indicating whether the pattern matches the given amino acid sequence
    • matches

      public boolean matches(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the pattern matches the given amino acid sequence.
      Parameters:
      aminoAcidPattern - the amino acid sequence
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      a boolean indicating whether the pattern matches the given amino acid sequence
    • isStarting

      public boolean isStarting(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the given amino acid sequence starts with the pattern.
      Parameters:
      aminoAcidSequence - the amino acid sequence
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      a boolean indicating whether the given amino acid sequence starts with the pattern
    • isStarting

      public boolean isStarting(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the given amino acid sequence starts with the pattern.
      Parameters:
      aminoAcidPattern - the amino acid sequence
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      a boolean indicating whether the given amino acid sequence starts with the pattern
    • isEnding

      public boolean isEnding(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the given amino acid sequence ends with the pattern.
      Parameters:
      aminoAcidPattern - the amino acid sequence
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      a boolean indicating whether the given amino acid sequence ends with the pattern
    • isEnding

      public boolean isEnding(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether the given amino acid sequence ends with the pattern.
      Parameters:
      aminoAcidSequence - the amino acid sequence
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      a boolean indicating whether the given amino acid sequence ends with the pattern
    • isSameAs

      public boolean isSameAs(AminoAcidPattern anotherPattern, SequenceMatchingParameters sequenceMatchingParameters)
      Indicates whether another AminoAcidPattern targets the same pattern. Modifications are considered equal when of same mass. Modifications should be loaded in the Modification factory.
      Parameters:
      anotherPattern - the other AminoAcidPattern
      sequenceMatchingParameters - the sequence matching preferences
      Returns:
      true if the other AminoAcidPattern targets the same pattern
    • length

      public int length()
      Returns the length of the pattern in amino acids.
      Returns:
      the length of the pattern in amino acids
    • getStandardSearchPattern

      public AminoAcidPattern getStandardSearchPattern()
      Computes a pattern which can be searched by standard search engines, i.e., a pattern targeting a single amino acid and not a complex pattern.
      Returns:
      a pattern which can be searched by standard search engines
    • getTrypsinExample

      public static AminoAcidPattern getTrypsinExample()
      Returns the trypsin example as amino acid pattern.
      Returns:
      the trypsin example as amino acid pattern
    • merge

      public void merge(AminoAcidPattern otherPattern)
      Simple merger for two patterns. Example: this: target{0>S}, otherPattern: target{0>T}, result (this): target{0>S|T}.
      Parameters:
      otherPattern - another pattern to be merged with this
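      The example above, target{0>S} merged with target{0>T} giving target{0>S|T}, can be sketched on plain maps from pattern index to residue list. The class MergeSketch is hypothetical and is not the library's merging code; skipping residues that are already present is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of an index-by-index merge of targeted residues.
class MergeSketch {

    // Adds the residues of 'other' to 'self' index by index, skipping duplicates.
    static void merge(Map<Integer, List<Character>> self,
                      Map<Integer, List<Character>> other) {
        for (Map.Entry<Integer, List<Character>> entry : other.entrySet()) {
            List<Character> residues =
                    self.computeIfAbsent(entry.getKey(), k -> new ArrayList<>());
            for (Character aa : entry.getValue()) {
                if (!residues.contains(aa)) {
                    residues.add(aa);
                }
            }
        }
    }

    public static void main(String[] args) {
        Map<Integer, List<Character>> self = new HashMap<>();
        self.put(0, new ArrayList<>(List.of('S')));
        Map<Integer, List<Character>> other = new HashMap<>();
        other.put(0, new ArrayList<>(List.of('T')));

        merge(self, other);
        System.out.println(self); // {0=[S, T]}
    }
}
```

      As in the instance method, the first map is modified in place and the second is left untouched.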
    • append

      public void append(AminoAcidPattern otherPattern)
      Appends another pattern at the end of this pattern.
      Parameters:
      otherPattern - the other pattern to append.
    • merge

      public static AminoAcidPattern merge(AminoAcidPattern pattern1, AminoAcidPattern pattern2)
      Convenience method merging two different patterns (see merge(AminoAcidPattern otherPattern) for details on the merging procedure).
      Parameters:
      pattern1 - the first pattern
      pattern2 - the second pattern
      Returns:
      a merged version of the two patterns
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • asStringBuilder

      public StringBuilder asStringBuilder()
      Returns the sequence represented by this amino acid pattern in a new string builder.
      Returns:
      the sequence represented by this amino acid pattern in a new string builder
    • asSequence

      public String asSequence(int index)
      Returns the component of the amino acid pattern at the given index. 0 is the first amino acid.
      Parameters:
      index - the index in the pattern. 0 is the first amino acid
      Returns:
      the component of the amino acid pattern at the given index
    • addModificationSite

      public void addModificationSite(int localization, ArrayList<Character> ModificationSite)
      Adds a list of amino acids valid for a modification site at the given index of the amino acid pattern.
      Parameters:
      localization - the index of the amino acid residue site
      ModificationSite - valid amino acids for this site
    • getAllPossibleSequences

      public ArrayList<String> getAllPossibleSequences()
      Returns all possible sequences which can be obtained from the targeted amino acids. Missing amino acids will be denoted as 'X'. This does not implement excluded amino acids.
      Returns:
      all possible sequences which can be obtained from the targeted amino acids
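      Enumerating the possible sequences amounts to a Cartesian product over the residues targeted at each position, with 'X' standing in where no residue is targeted. The sketch below illustrates this on a plain map from pattern index to residue list. The class SequencesSketch is hypothetical, the order of the returned sequences is an artefact of the sketch, and, as in the method described above, excluded amino acids are ignored.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: Cartesian product over the targeted residues.
class SequencesSketch {

    // Builds every sequence of the given length; positions without residues yield 'X'.
    static List<String> allSequences(Map<Integer, List<Character>> targeted, int length) {
        List<String> sequences = new ArrayList<>();
        sequences.add(""); // start from the empty prefix
        for (int i = 0; i < length; i++) {
            List<Character> residues = targeted.get(i);
            if (residues == null || residues.isEmpty()) {
                residues = List.of('X'); // missing amino acid
            }
            // Extend every prefix built so far with every residue at position i.
            List<String> extended = new ArrayList<>();
            for (String prefix : sequences) {
                for (char aa : residues) {
                    extended.add(prefix + aa);
                }
            }
            sequences = extended;
        }
        return sequences;
    }

    public static void main(String[] args) {
        Map<Integer, List<Character>> targeted = new HashMap<>();
        targeted.put(0, List.of('R', 'K'));
        targeted.put(2, List.of('S', 'T')); // nothing targeted at index 1
        System.out.println(allSequences(targeted, 3)); // [RXS, RXT, KXS, KXT]
    }
}
```

      The number of sequences is the product of the list sizes, so patterns with many alternatives per position grow quickly.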
    • getSubPattern

      public AminoAcidPattern getSubPattern(int startIndex, int endIndex, boolean updateTarget)
      Returns a sub pattern of the pattern.
      Parameters:
      startIndex - the start index, inclusive (0 is the first amino acid)
      endIndex - the end index, inclusive
      updateTarget - boolean indicating whether the target of the pattern shall be updated. If true, the target is shifted by startIndex; otherwise it is simply copied.
      Returns:
      a sub pattern
    • getSubPattern

      public AminoAcidPattern getSubPattern(int startIndex, boolean updateTarget)
      Returns a sub pattern of the pattern.
      Parameters:
      startIndex - the start index, inclusive (0 is the first amino acid)
      updateTarget - boolean indicating whether the target of the pattern shall be updated. If true, the target is shifted by startIndex; otherwise it is simply copied.
      Returns:
      a sub pattern
    • reverse

      public AminoAcidPattern reverse()
      Returns an amino acid pattern which is a reversed version of the current pattern.
      Returns:
      an amino acid pattern which is a reversed version of the current pattern