Class AminoAcidPattern
java.lang.Object
com.compomics.util.experiment.personalization.ExperimentObject
com.compomics.util.experiment.biology.aminoacids.sequence.AminoAcidPattern
- All Implemented Interfaces:
Serializable
public class AminoAcidPattern extends ExperimentObject
An amino acid pattern is a sequence of amino acids. For example, for trypsin:
target R or K not followed by P. IMPORTANT: by default, the index of the
target residue is 0.
- Author:
- Marc Vaudel, Dominik Kopczynski
- See Also:
- Serialized Form
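To illustrate the trypsin rule from the class description ("target R or K not followed by P"), here is a minimal sketch that uses plain java.util.regex instead of AminoAcidPattern itself. The class name TrypsinRuleSketch and the method targetIndexes are hypothetical and are not part of the library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not part of the library.
class TrypsinRuleSketch {

    // Target R or K (pattern index 0) not followed by P (pattern index 1).
    private static final Pattern RULE = Pattern.compile("[RK](?!P)");

    // Returns the 0-based indexes of the target residues in the sequence.
    static List<Integer> targetIndexes(String sequence) {
        List<Integer> indexes = new ArrayList<>();
        Matcher matcher = RULE.matcher(sequence);
        while (matcher.find()) {
            indexes.add(matcher.start());
        }
        return indexes;
    }

    public static void main(String[] args) {
        System.out.println(targetIndexes("AKPLRGK")); // prints [4, 6]
    }
}
```

Note that the negative lookahead also accepts a K or R at the very end of the sequence. A position-by-position matcher that requires the whole pattern to fit inside the sequence may behave differently there.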
-
Field Summary
-
Constructor Summary
Constructors
Constructor Description
AminoAcidPattern()
Creates a blank pattern.
AminoAcidPattern(AminoAcidPattern aminoAcidPattern)
Creates a pattern from another pattern.
AminoAcidPattern(ArrayList<String> targetResidues)
Convenience constructor giving a list of targeted residues as input.
-
Method Summary
Modifier and Type Method Description
void addModificationSite(int localization, ArrayList<Character> ModificationSite)
Adds a modification site, given as a list of valid amino acids, at one position of the amino acid pattern.
void append(AminoAcidPattern otherPattern)
Appends another pattern at the end of this pattern.
String asSequence(int index)
Returns the component of the amino acid pattern at the given index.
StringBuilder asStringBuilder()
Returns the sequence represented by this amino acid pattern in a new string builder.
boolean contains(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the pattern contains a subsequence of amino acids.
boolean contains(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the pattern contains a subsequence of amino acids.
int firstIndex(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
Returns the first index where the amino acid pattern is found.
int firstIndex(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters, int startIndex)
Returns the first index where the amino acid pattern is found in the given pattern.
int firstIndex(AminoAcidSequence aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
Returns the first index where the amino acid pattern is found.
int firstIndex(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
Returns the first index where the amino acid pattern is found.
int firstIndex(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters, int startIndex)
Returns the first index where the amino acid pattern is found.
HashMap<Integer,ArrayList<Character>> getAaTargeted()
Returns the map of targeted amino acids.
ArrayList<String> getAllPossibleSequences()
Returns all possible sequences which can be obtained from the targeted amino acids.
static AminoAcidPattern getAminoAcidPatternFromString(String aminoAcidPatternAsString)
Parses the amino acid pattern from the given string as created by the toString() method.
static AminoAcidPattern getAminoAcidPatternFromString(String aminoAcidPatternAsString, int startIndex)
Parses the amino acid pattern from the given string as created by the toString() method.
ArrayList<Character> getAminoAcidsAtTarget()
Returns the targeted amino acids at position "target".
HashSet<Character> getAminoAcidsAtTargetSet()
Returns a set containing the amino acids at target.
Pattern getAsStringPattern(SequenceMatchingParameters sequenceMatchingParameters, boolean includeMutations)
Returns the amino acid pattern as a case-insensitive pattern for String matching.
ArrayList<Integer> getIndexes(AminoAcidPattern input, SequenceMatchingParameters sequenceMatchingParameters)
Returns the indexes where the amino acid pattern was found in the input.
int[] getIndexes(String input, SequenceMatchingParameters sequenceMatchingParameters)
Returns the indexes where the amino acid pattern was found in the input.
int getMaxIndex()
Returns the maximal index where amino acids are found.
int getMinIndex()
Returns the minimal index where amino acids are found.
int getNTargetedAA(int index)
Returns the number of targeted amino acids at the given index.
String getPrositeFormat()
Returns the pattern in the PROSITE format.
AminoAcidPattern getStandardSearchPattern()
Computes a pattern which can be searched by standard search engines, i.e., a pattern targeting a single amino acid and not a complex pattern.
AminoAcidPattern getSubPattern(int startIndex, boolean updateTarget)
Returns a sub pattern of the pattern.
AminoAcidPattern getSubPattern(int startIndex, int endIndex, boolean updateTarget)
Returns a sub pattern of the pattern.
int getTarget()
Returns the index of the amino acid of interest in the pattern.
ArrayList<Character> getTargetedAA(int index)
Returns the targeted amino acids at a given index in the pattern.
static AminoAcidPattern getTrypsinExample()
Returns the trypsin example as amino acid pattern.
boolean isEnding(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the given amino acid sequence ends with the pattern.
boolean isEnding(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the given amino acid sequence ends with the pattern.
boolean isSameAs(AminoAcidPattern anotherPattern, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether another AminoAcidPattern targets the same pattern.
boolean isStarting(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the given amino acid sequence starts with the pattern.
boolean isStarting(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the given amino acid sequence starts with the pattern.
boolean isTargeted(Character aa, int index, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the given amino acid at the given index of the pattern is targeted without accounting for mutations.
int length()
Returns the length of the pattern in amino acids.
boolean matches(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the pattern matches the given amino acid sequence.
boolean matches(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the pattern matches the given amino acid sequence.
boolean matchesAt(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters, int index)
Indicates whether the pattern is found in the given amino acid sequence at the given index, where 0 is the first amino acid.
boolean matchesIn(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the pattern is found in the given amino acid sequence.
boolean matchesIn(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the pattern is found in the given amino acid sequence.
void merge(AminoAcidPattern otherPattern)
Simple merger for two patterns.
static AminoAcidPattern merge(AminoAcidPattern pattern1, AminoAcidPattern pattern2)
Convenience method merging two different patterns (see public void merge(AminoAcidPattern otherPattern) for detailed information of the merging procedure).
void removeAA(int index)
Removes an amino acid index from the pattern.
AminoAcidPattern reverse()
Returns an amino acid pattern which is a reversed version of the current pattern.
void setExcluded(int index, ArrayList<Character> exceptions)
Excludes the given amino acids from the targeted amino acids at the given index.
void setTarget(Integer target)
Sets the index of the amino acid of interest in the pattern.
void setTargeted(int index, ArrayList<Character> targets)
Sets the amino acids targeted at a given index.
void swapRows(int fromRow, int toRow)
Swaps two rows in the pattern.
String toString()
Methods inherited from class com.compomics.util.experiment.personalization.ExperimentObject
addUrParam, asLong, clearParametersMap, getId, getUrParam, getUrParams, removeUrParam, setId, setUrParams
-
Constructor Details
-
AminoAcidPattern
public AminoAcidPattern()
Creates a blank pattern. All maps are null.
-
AminoAcidPattern
public AminoAcidPattern(AminoAcidPattern aminoAcidPattern)
Creates a pattern from another pattern.
- Parameters:
aminoAcidPattern - the other pattern
-
AminoAcidPattern
public AminoAcidPattern(ArrayList<String> targetResidues)
Convenience constructor giving a list of targeted residues as input, for instance (S, T, Y).
- Parameters:
targetResidues - a list of targeted residues
-
-
Method Details
-
getAminoAcidPatternFromString
public static AminoAcidPattern getAminoAcidPatternFromString(String aminoAcidPatternAsString)
Parses the amino acid pattern from the given string as created by the toString() method.
- Parameters:
aminoAcidPatternAsString - the amino acid pattern as created by the toString() method
- Returns:
- the amino acid pattern
-
getAminoAcidPatternFromString
public static AminoAcidPattern getAminoAcidPatternFromString(String aminoAcidPatternAsString, int startIndex)
Parses the amino acid pattern from the given string as created by the toString() method.
- Parameters:
aminoAcidPatternAsString - the amino acid pattern as created by the toString() method
startIndex - the start index of the pattern
- Returns:
- the amino acid pattern
-
getAaTargeted
public HashMap<Integer,ArrayList<Character>> getAaTargeted()
Returns the map of targeted amino acids. Null if not set.
- Returns:
- the map of targeted amino acids
-
swapRows
public void swapRows(int fromRow, int toRow)
Swaps two rows in the pattern. The first amino acid is 0.
- Parameters:
fromRow - from row
toRow - to row
-
getTarget
public int getTarget()
Returns the index of the amino acid of interest in the pattern. Null if none.
- Returns:
- the index of the amino acid of interest in the pattern
-
getMinIndex
public int getMinIndex()
Returns the minimal index where amino acids are found.
- Returns:
- the minimal index where amino acids are found
-
getMaxIndex
public int getMaxIndex()
Returns the maximal index where amino acids are found.
- Returns:
- the maximal index where amino acids are found
-
setTarget
public void setTarget(Integer target)
Sets the index of the amino acid of interest in the pattern.
- Parameters:
target - the index of the amino acid of interest in the pattern
-
getAminoAcidsAtTarget
public ArrayList<Character> getAminoAcidsAtTarget()
Returns the targeted amino acids at position "target". An empty list if none.
- Returns:
- the targeted amino acids at position "target"
-
getAminoAcidsAtTargetSet
public HashSet<Character> getAminoAcidsAtTargetSet()
Returns a set containing the amino acids at target.
- Returns:
- a set containing the amino acids at target
-
setTargeted
public void setTargeted(int index, ArrayList<Character> targets)
Sets the amino acids targeted at a given index. The first amino acid is 0. Any previous value is silently overwritten.
- Parameters:
index - the index in the pattern
targets - the amino acids targeted
-
setExcluded
public void setExcluded(int index, ArrayList<Character> exceptions)
Excludes the given amino acids from the targeted amino acids at the given index.
- Parameters:
index - the index of the excluded amino acid
exceptions - the amino acids to exclude
-
getTargetedAA
public ArrayList<Character> getTargetedAA(int index)
Returns the targeted amino acids at a given index in the pattern. The first amino acid is 0.
- Parameters:
index - the index in the pattern
- Returns:
- the targeted amino acids
-
getNTargetedAA
public int getNTargetedAA(int index)
Returns the number of targeted amino acids at the given index. The first amino acid is 0.
- Parameters:
index - the index of interest
- Returns:
- the number of targeted amino acids
-
removeAA
public void removeAA(int index)
Removes an amino acid index from the pattern. The first amino acid is 0.
- Parameters:
index - the index of the amino acid to remove
-
getAsStringPattern
public Pattern getAsStringPattern(SequenceMatchingParameters sequenceMatchingParameters, boolean includeMutations)
Returns the amino acid pattern as a case-insensitive pattern for String matching.
- Parameters:
sequenceMatchingParameters - the sequence matching preferences
includeMutations - if true, mutated amino acids will be included
- Returns:
- the amino acid pattern as java string pattern
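As a rough illustration of what a case-insensitive java.util.regex.Pattern for a residue pattern can look like, the sketch below builds one character class per position. It is not the library's implementation: the class StringPatternSketch, the method toRegex and the string-per-position input are assumptions made for this example, and mutations and sequence matching preferences are ignored.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration; not part of the library.
class StringPatternSketch {

    // residuesPerPosition.get(i) holds the residues accepted at position i;
    // an empty string means that any residue is accepted.
    static Pattern toRegex(List<String> residuesPerPosition) {
        StringBuilder regex = new StringBuilder();
        for (String residues : residuesPerPosition) {
            if (residues.isEmpty()) {
                regex.append("[A-Z]");
            } else {
                regex.append('[').append(residues).append(']');
            }
        }
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE);
    }

    public static void main(String[] args) {
        Pattern pattern = toRegex(Arrays.asList("KR", ""));
        System.out.println(pattern.pattern());               // prints [KR][A-Z]
        System.out.println(pattern.matcher("akp").find());   // prints true
    }
}
```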
-
getPrositeFormat
public String getPrositeFormat()
Returns the pattern in the PROSITE format.
- Returns:
- the pattern in the PROSITE format
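For readers unfamiliar with PROSITE notation, the following sketch renders per-position residue lists using the usual PROSITE conventions: positions joined by "-", accepted residues in square brackets, rejected residues in curly braces, and "x" for any residue. The class PrositeSketch and its input layout are hypothetical, and the exact string produced by getPrositeFormat() may differ.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of the library.
class PrositeSketch {

    // allowed.get(i): residues accepted at position i ("" for no restriction).
    // excluded.get(i): residues rejected at position i ("" for none).
    static String toProsite(List<String> allowed, List<String> excluded) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < allowed.size(); i++) {
            if (i > 0) {
                sb.append('-');
            }
            String in = allowed.get(i);
            String out = excluded.get(i);
            if (in.length() == 1) {
                sb.append(in);                          // single residue
            } else if (!in.isEmpty()) {
                sb.append('[').append(in).append(']');  // any of these residues
            } else if (!out.isEmpty()) {
                sb.append('{').append(out).append('}'); // anything but these
            } else {
                sb.append('x');                         // any residue
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Trypsin-like rule: R or K, not followed by P.
        System.out.println(toProsite(Arrays.asList("KR", ""), Arrays.asList("", "P"))); // prints [KR]-{P}
    }
}
```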
-
getIndexes
public int[] getIndexes(String input, SequenceMatchingParameters sequenceMatchingParameters)
Returns the indexes where the amino acid pattern was found in the input. 1 is the first amino acid.
- Parameters:
input - the amino acid input sequence as string
sequenceMatchingParameters - the sequence matching preferences
- Returns:
- the indexes where the amino acid pattern was found
-
getIndexes
public ArrayList<Integer> getIndexes(AminoAcidPattern input, SequenceMatchingParameters sequenceMatchingParameters)
Returns the indexes where the amino acid pattern was found in the input. 1 is the first amino acid.
- Parameters:
input - the amino acid input sequence as AminoAcidPattern
sequenceMatchingParameters - the sequence matching preferences
- Returns:
- a list of indexes where the amino acid pattern was found
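The index conventions differ between methods: getIndexes counts from 1, whereas firstIndex counts from 0 and returns -1 when nothing is found. The sketch below mimics both conventions with a plain regular expression so the off-by-one is easy to see. The class IndexConventionSketch is hypothetical, and reporting overlapping matches is an assumption of this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not part of the library.
class IndexConventionSketch {

    // All match positions, counting the first amino acid as 1.
    static List<Integer> oneBasedIndexes(Pattern pattern, String sequence) {
        List<Integer> result = new ArrayList<>();
        Matcher matcher = pattern.matcher(sequence);
        int from = 0;
        while (from <= sequence.length() && matcher.find(from)) {
            result.add(matcher.start() + 1);
            from = matcher.start() + 1; // continue right after the last match start
        }
        return result;
    }

    // First match position, counting the first amino acid as 0; -1 if not found.
    static int firstZeroBasedIndex(Pattern pattern, String sequence) {
        Matcher matcher = pattern.matcher(sequence);
        return matcher.find() ? matcher.start() : -1;
    }

    public static void main(String[] args) {
        Pattern serineOrThreonine = Pattern.compile("[ST]");
        System.out.println(oneBasedIndexes(serineOrThreonine, "ASTAS"));     // prints [2, 3, 5]
        System.out.println(firstZeroBasedIndex(serineOrThreonine, "ASTAS")); // prints 1
    }
}
```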
-
firstIndex
public int firstIndex(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
Returns the first index where the amino acid pattern is found. -1 if not found. 0 is the first amino acid.
- Parameters:
aminoAcidSequence - the amino acid sequence to look into
sequenceMatchingParameters - the sequence matching preferences
- Returns:
- the first index where the amino acid pattern is found
-
firstIndex
public int firstIndex(AminoAcidSequence aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
Returns the first index where the amino acid pattern is found. -1 if not found. 0 is the first amino acid.
- Parameters:
aminoAcidSequence - the amino acid sequence to look into
sequenceMatchingParameters - the sequence matching preferences
- Returns:
- the first index where the amino acid pattern is found
-
firstIndex
public int firstIndex(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
Returns the first index where the amino acid pattern is found. -1 if not found. 0 is the first amino acid.
- Parameters:
aminoAcidPattern - the amino acid sequence to look into
sequenceMatchingParameters - the sequence matching preferences
- Returns:
- the first index where the amino acid pattern is found
-
contains
public boolean contains(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the pattern contains a subsequence of amino acids.
- Parameters:
aminoAcidSequence - the amino acid sequence to look for
sequenceMatchingParameters - the sequence matching preferences
- Returns:
- a boolean indicating whether the pattern contains the given subsequence of amino acids
-
contains
public boolean contains(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the pattern contains a subsequence of amino acids.
- Parameters:
aminoAcidPattern - the amino acid sequence to look for
sequenceMatchingParameters - the sequence matching preferences
- Returns:
- a boolean indicating whether the pattern contains the given subsequence of amino acids
-
firstIndex
public int firstIndex(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters, int startIndex)
Returns the first index where the amino acid pattern is found. -1 if not found. 0 is the first amino acid.
- Parameters:
aminoAcidSequence - the amino acid sequence to look into
sequenceMatchingParameters - the sequence matching preferences
startIndex - the index at which to start looking
- Returns:
- the first index where the amino acid pattern is found
-
firstIndex
public int firstIndex(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters, int startIndex)
Returns the first index where the amino acid pattern is found in the given pattern. -1 if not found. 0 is the first amino acid.
- Parameters:
aminoAcidPattern - the amino acid sequence to look into
sequenceMatchingParameters - the sequence matching preferences
startIndex - the index at which to start looking
- Returns:
- the first index where the amino acid pattern is found
-
isTargeted
public boolean isTargeted(Character aa, int index, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the given amino acid at the given index of the pattern is targeted without accounting for mutations.
- Parameters:
aa - the amino acid as character
index - the index in the pattern
sequenceMatchingParameters - the sequence matching preferences
- Returns:
- true if the given amino acid at the given index of the pattern is targeted
-
matchesIn
public boolean matchesIn(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the pattern is found in the given amino acid sequence.
- Parameters:
aminoAcidSequence - the amino acid sequence
sequenceMatchingParameters - the sequence matching preferences
- Returns:
- a boolean indicating whether the pattern is found in the given amino acid sequence
-
matchesIn
public boolean matchesIn(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the pattern is found in the given amino acid sequence.
- Parameters:
aminoAcidPattern - the amino acid sequence
sequenceMatchingParameters - the sequence matching preferences
- Returns:
- a boolean indicating whether the pattern is found in the given amino acid sequence
-
matchesAt
public boolean matchesAt(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters, int index)
Indicates whether the pattern is found in the given amino acid sequence at the given index, where 0 is the first amino acid. Returns false if the entire pattern cannot be mapped to the sequence.
- Parameters:
aminoAcidSequence - the amino acid sequence
sequenceMatchingParameters - the sequence matching preferences
index - the index at which the matching should be done
- Returns:
- a boolean indicating whether the pattern is found in the given amino acid sequence at the given index
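The rule that matchesAt returns false when the entire pattern cannot be mapped to the sequence can be sketched as a bounds check followed by a position-by-position comparison. The class MatchesAtSketch and its string-per-position input are hypothetical; sequence matching preferences are left out of this example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of the library.
class MatchesAtSketch {

    // residuesPerPosition.get(i) lists the residues accepted at pattern position i;
    // an empty string accepts any residue.
    static boolean matchesAt(List<String> residuesPerPosition, String sequence, int index) {
        if (index < 0 || index + residuesPerPosition.size() > sequence.length()) {
            return false; // the entire pattern cannot be mapped to the sequence
        }
        for (int i = 0; i < residuesPerPosition.size(); i++) {
            String accepted = residuesPerPosition.get(i);
            char aa = sequence.charAt(index + i);
            if (!accepted.isEmpty() && accepted.indexOf(aa) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> pattern = Arrays.asList("KR", "G");
        System.out.println(matchesAt(pattern, "AKG", 1)); // prints true
        System.out.println(matchesAt(pattern, "AKG", 2)); // prints false
    }
}
```

In the second call the pattern would run past the end of the sequence, so the check fails before any residue is compared.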
-
matches
public boolean matches(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the pattern matches the given amino acid sequence.
- Parameters:
aminoAcidSequence - the amino acid sequence
sequenceMatchingParameters - the sequence matching preferences
- Returns:
- a boolean indicating whether the pattern matches the given amino acid sequence
-
matches
public boolean matches(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the pattern matches the given amino acid sequence.
- Parameters:
aminoAcidPattern - the amino acid sequence
sequenceMatchingParameters - the sequence matching preferences
- Returns:
- a boolean indicating whether the pattern matches the given amino acid sequence
-
isStarting
public boolean isStarting(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the given amino acid sequence starts with the pattern.
- Parameters:
aminoAcidSequence - the amino acid sequence
sequenceMatchingParameters - the sequence matching preferences
- Returns:
- a boolean indicating whether the given amino acid sequence starts with the pattern
-
isStarting
public boolean isStarting(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the given amino acid sequence starts with the pattern.
- Parameters:
aminoAcidPattern - the amino acid sequence
sequenceMatchingParameters - the sequence matching preferences
- Returns:
- a boolean indicating whether the given amino acid sequence starts with the pattern
-
isEnding
public boolean isEnding(AminoAcidPattern aminoAcidPattern, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the given amino acid sequence ends with the pattern.
- Parameters:
aminoAcidPattern - the amino acid sequence
sequenceMatchingParameters - the sequence matching preferences
- Returns:
- a boolean indicating whether the given amino acid sequence ends with the pattern
-
isEnding
public boolean isEnding(String aminoAcidSequence, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether the given amino acid sequence ends with the pattern.
- Parameters:
aminoAcidSequence - the amino acid sequence
sequenceMatchingParameters - the sequence matching preferences
- Returns:
- a boolean indicating whether the given amino acid sequence ends with the pattern
-
isSameAs
public boolean isSameAs(AminoAcidPattern anotherPattern, SequenceMatchingParameters sequenceMatchingParameters)
Indicates whether another AminoAcidPattern targets the same pattern. Modifications are considered equal when they have the same mass. Modifications should be loaded in the Modification factory.
- Parameters:
anotherPattern - the other AminoAcidPattern
sequenceMatchingParameters - the sequence matching preferences
- Returns:
- true if the other AminoAcidPattern targets the same pattern
-
length
public int length()
Returns the length of the pattern in amino acids.
- Returns:
- the length of the pattern in amino acids
-
getStandardSearchPattern
public AminoAcidPattern getStandardSearchPattern()
Computes a pattern which can be searched by standard search engines, i.e., a pattern targeting a single amino acid and not a complex pattern.
- Returns:
- a pattern which can be searched by standard search engines
-
getTrypsinExample
public static AminoAcidPattern getTrypsinExample()
Returns the trypsin example as amino acid pattern.
- Returns:
- the trypsin example as amino acid pattern
-
merge
public void merge(AminoAcidPattern otherPattern)
Simple merger for two patterns. Example:
this: target{0>S}
otherPattern: target{0>T}
result (this): target{0>S|T}
- Parameters:
otherPattern - another pattern to be merged with this
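The merge example above (S at index 0 merged with T at index 0 gives S|T at index 0) amounts to a per-index union of the targeted residues. The sketch below shows that idea on plain maps. The class MergeSketch is hypothetical, and how the library treats excluded residues or the target index during a merge is not covered here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of the library.
class MergeSketch {

    // Adds the residues of `other` to `target`, index by index, skipping duplicates.
    static void mergeInto(Map<Integer, List<Character>> target, Map<Integer, List<Character>> other) {
        for (Map.Entry<Integer, List<Character>> entry : other.entrySet()) {
            List<Character> residues = target.computeIfAbsent(entry.getKey(), k -> new ArrayList<>());
            for (Character aa : entry.getValue()) {
                if (!residues.contains(aa)) {
                    residues.add(aa);
                }
            }
        }
    }

    public static void main(String[] args) {
        Map<Integer, List<Character>> thisPattern = new HashMap<>();
        thisPattern.put(0, new ArrayList<>(Arrays.asList('S')));
        Map<Integer, List<Character>> otherPattern = new HashMap<>();
        otherPattern.put(0, Arrays.asList('T'));
        mergeInto(thisPattern, otherPattern);
        System.out.println(thisPattern); // prints {0=[S, T]}
    }
}
```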
-
append
public void append(AminoAcidPattern otherPattern)
Appends another pattern at the end of this pattern.
- Parameters:
otherPattern - the other pattern to append
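In contrast to a merge, appending places the other pattern after the last position of this one, which on an index-to-residues map means shifting the other pattern's indexes. The sketch below assumes the offset is the highest occupied index plus one; the class AppendSketch is hypothetical and the library may compute the offset differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not part of the library.
class AppendSketch {

    // Copies the entries of `other` behind the last occupied index of `pattern`.
    // The two maps must be distinct objects.
    static void append(Map<Integer, List<Character>> pattern, Map<Integer, List<Character>> other) {
        int offset = pattern.isEmpty() ? 0 : Collections.max(pattern.keySet()) + 1;
        for (Map.Entry<Integer, List<Character>> entry : other.entrySet()) {
            pattern.put(entry.getKey() + offset, new ArrayList<>(entry.getValue()));
        }
    }

    public static void main(String[] args) {
        Map<Integer, List<Character>> pattern = new TreeMap<>();
        pattern.put(0, new ArrayList<>(Arrays.asList('K', 'R')));
        Map<Integer, List<Character>> other = new TreeMap<>();
        other.put(0, Arrays.asList('P'));
        append(pattern, other);
        System.out.println(pattern); // prints {0=[K, R], 1=[P]}
    }
}
```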
-
merge
public static AminoAcidPattern merge(AminoAcidPattern pattern1, AminoAcidPattern pattern2)
Convenience method merging two different patterns (see public void merge(AminoAcidPattern otherPattern) for detailed information of the merging procedure).
- Parameters:
pattern1 - the first pattern
pattern2 - the second pattern
- Returns:
- a merged version of the two patterns
-
toString
-
asStringBuilder
public StringBuilder asStringBuilder()
Returns the sequence represented by this amino acid pattern in a new string builder.
- Returns:
- the sequence represented by this amino acid pattern in a new string builder
-
asSequence
public String asSequence(int index)
Returns the component of the amino acid pattern at the given index. 0 is the first amino acid.
- Parameters:
index - the index in the pattern. 0 is the first amino acid
- Returns:
- the component of the amino acid pattern at the given index
-
addModificationSite
public void addModificationSite(int localization, ArrayList<Character> ModificationSite)
Adds a modification site, given as a list of valid amino acids, at one position of the amino acid pattern.
- Parameters:
localization - the index of the amino acid residue site
ModificationSite - valid amino acids for this site
-
getAllPossibleSequences
public ArrayList<String> getAllPossibleSequences()
Returns all possible sequences which can be obtained from the targeted amino acids. Missing amino acids will be denoted as 'X'. This does not implement excluded amino acids.
- Returns:
- all possible sequences which can be obtained from the targeted amino acids
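Enumerating all possible sequences is a Cartesian product over the targeted residues of each position, with 'X' standing in for positions that have none. The sketch below shows that expansion on a plain map; the class AllSequencesSketch and the explicit length parameter are assumptions of this example, and, as in the description above, excluded amino acids are not handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of the library.
class AllSequencesSketch {

    // Expands the targeted residues position by position; 'X' marks positions without residues.
    static List<String> allSequences(Map<Integer, List<Character>> targeted, int length) {
        List<String> sequences = new ArrayList<>();
        sequences.add("");
        for (int i = 0; i < length; i++) {
            List<Character> residues = targeted.get(i);
            List<String> extended = new ArrayList<>();
            for (String prefix : sequences) {
                if (residues == null || residues.isEmpty()) {
                    extended.add(prefix + 'X');
                } else {
                    for (char aa : residues) {
                        extended.add(prefix + aa);
                    }
                }
            }
            sequences = extended;
        }
        return sequences;
    }

    public static void main(String[] args) {
        Map<Integer, List<Character>> targeted = new HashMap<>();
        targeted.put(0, Arrays.asList('S', 'T'));
        targeted.put(2, Arrays.asList('P'));
        System.out.println(allSequences(targeted, 3)); // prints [SXP, TXP]
    }
}
```

The number of sequences is the product of the residue counts per position, so patterns with many alternatives at many positions grow quickly.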
-
getSubPattern
public AminoAcidPattern getSubPattern(int startIndex, int endIndex, boolean updateTarget)
Returns a sub pattern of the pattern.
- Parameters:
startIndex - the start index, inclusive (0 is the first amino acid)
endIndex - the end index, inclusive
updateTarget - boolean indicating whether the target of the pattern shall be updated. If yes it will be shifted by startIndex, simply copied otherwise.
- Returns:
- a sub pattern
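The effect of updateTarget can be pictured with a small stand-in class: positions from startIndex to endIndex (both inclusive) are copied and re-indexed from 0, and the target is either shifted or copied unchanged. The class SubPatternSketch is hypothetical, and interpreting "shifted by startIndex" as target minus startIndex is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of the library.
class SubPatternSketch {

    final Map<Integer, List<Character>> targeted = new HashMap<>();
    int target = 0;

    SubPatternSketch subPattern(int startIndex, int endIndex, boolean updateTarget) {
        SubPatternSketch sub = new SubPatternSketch();
        for (int i = startIndex; i <= endIndex; i++) { // endIndex is inclusive
            List<Character> residues = targeted.get(i);
            if (residues != null) {
                sub.targeted.put(i - startIndex, new ArrayList<>(residues));
            }
        }
        // Assumption: shifting means subtracting startIndex from the target.
        sub.target = updateTarget ? target - startIndex : target;
        return sub;
    }

    public static void main(String[] args) {
        SubPatternSketch pattern = new SubPatternSketch();
        pattern.targeted.put(0, Arrays.asList('A'));
        pattern.targeted.put(1, Arrays.asList('K', 'R'));
        pattern.targeted.put(2, Arrays.asList('P'));
        pattern.target = 1;
        SubPatternSketch sub = pattern.subPattern(1, 2, true);
        System.out.println(sub.targeted + " target=" + sub.target); // prints {0=[K, R], 1=[P]} target=0
    }
}
```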
-
getSubPattern
public AminoAcidPattern getSubPattern(int startIndex, boolean updateTarget)
Returns a sub pattern of the pattern.
- Parameters:
startIndex - the start index, inclusive (0 is the first amino acid)
updateTarget - boolean indicating whether the target of the pattern shall be updated. If yes it will be shifted by startIndex, simply copied otherwise.
- Returns:
- a sub pattern
-
reverse
public AminoAcidPattern reverse()
Returns an amino acid pattern which is a reversed version of the current pattern.
- Returns:
- an amino acid pattern which is a reversed version of the current pattern
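On an index-to-residues map, reversing a pattern of a given length maps index i to length - 1 - i. The sketch below shows only that re-indexing. The class ReverseSketch and the explicit length parameter are hypothetical, and what happens to the target index when the library reverses a pattern is not modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not part of the library.
class ReverseSketch {

    // Returns a new map in which index i of the input is stored at index length - 1 - i.
    static Map<Integer, List<Character>> reverse(Map<Integer, List<Character>> targeted, int length) {
        Map<Integer, List<Character>> reversed = new TreeMap<>();
        for (Map.Entry<Integer, List<Character>> entry : targeted.entrySet()) {
            reversed.put(length - 1 - entry.getKey(), new ArrayList<>(entry.getValue()));
        }
        return reversed;
    }

    public static void main(String[] args) {
        Map<Integer, List<Character>> targeted = new TreeMap<>();
        targeted.put(0, Arrays.asList('K', 'R'));
        targeted.put(1, Arrays.asList('P'));
        System.out.println(reverse(targeted, 2)); // prints {0=[P], 1=[K, R]}
    }
}
```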
-