com.compomics.util.experiment.identification.protein_inference.proteintree
Class ProteinTree

java.lang.Object
  extended by com.compomics.util.experiment.identification.protein_inference.proteintree.ProteinTree

public class ProteinTree
extends Object

This class sorts the proteins into groups.

Author:
Marc Vaudel

Nested Class Summary
 class ProteinTree.PeptideIterator
          Alphabetical iterator for the tree.
 
Field Summary
static String version
          The version of the protein tree.
 
Constructor Summary
ProteinTree(int memoryAllocation)
          Creates a tree based on the proteins present in the sequence factory.
 
Method Summary
 void close()
          Closes all connections to files.
 void emptyCache()
          Empties the cache.
 int getCacheSize()
          Returns the size of the cache used for peptide mappings (note that there are two of them).
 HashMap<String,ArrayList<Integer>> getMatchedPeptideSequences(String peptideSequence, String proteinAccession, ProteinMatch.MatchingType matchingType, Double massTolerance)
          Returns the peptides matched using the given peptide sequence in the given protein, according to the provided matching settings.
 ProteinTree.PeptideIterator getPeptideIterator()
          Returns a PeptideIterator which iterates alphabetically over all peptides corresponding to the end of a branch in the tree.
 HashMap<String,ArrayList<Integer>> getProteinMapping(String peptideSequence)
          Returns the protein mapping in the sequence factory for the given peptide sequence based on string matching only.
 HashMap<String,HashMap<String,ArrayList<Integer>>> getProteinMapping(String peptideSequence, ProteinMatch.MatchingType matchingType, Double massTolerance)
          Returns the protein mapping in the sequence factory for the given peptide sequence.
 void initiateTree(int initialTagSize, int maxNodeSize, int maxPeptideSize, Enzyme enzyme, WaitingHandler waitingHandler, boolean printExpectedImportTime)
          Initiates the tree.
 void initiateTree(int initialTagSize, int maxNodeSize, int maxPeptideSize, WaitingHandler waitingHandler, boolean printExpectedImportTime)
          Initiates the tree.
 void setCacheSize(int cacheSize)
          Sets the size of the cache used for peptide mappings (note that there are two of them).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

version

public static final String version
The version of the protein tree.

See Also:
Constant Field Values
Constructor Detail

ProteinTree

public ProteinTree(int memoryAllocation)
            throws IOException
Creates a tree based on the proteins present in the sequence factory.

Parameters:
memoryAllocation - the number of MB available for the tree in memory.
Throws:
IOException
Method Detail

initiateTree

public void initiateTree(int initialTagSize,
                         int maxNodeSize,
                         int maxPeptideSize,
                         WaitingHandler waitingHandler,
                         boolean printExpectedImportTime)
                  throws IOException,
                         IllegalArgumentException,
                         InterruptedException,
                         ClassNotFoundException,
                         SQLException
Initiates the tree.

Parameters:
initialTagSize - the initial tag size
maxNodeSize - the maximal size of a node. Large nodes are fast to initiate but slow to query. I typically use 500, giving an approximate query time below 20 ms.
maxPeptideSize - the maximum peptide size
waitingHandler - the waiting handler used to display progress to the user. Can be null, but providing one is strongly recommended.
printExpectedImportTime - if true the expected import time will be printed to the waiting handler
Throws:
IOException
IllegalArgumentException
InterruptedException
ClassNotFoundException
SQLException

initiateTree

public void initiateTree(int initialTagSize,
                         int maxNodeSize,
                         int maxPeptideSize,
                         Enzyme enzyme,
                         WaitingHandler waitingHandler,
                         boolean printExpectedImportTime)
                  throws IOException,
                         IllegalArgumentException,
                         InterruptedException,
                         ClassNotFoundException,
                         SQLException
Initiates the tree. Note: speed and memory are calibrated for the no-enzyme case.

Parameters:
initialTagSize - the initial size of the peptide tag. Large initial sizes are fast to query, small initial sizes are fast to initiate. I typically use 3 for databases containing fewer than 100,000 proteins, giving an approximate initiation time of 60 ms per accession.
maxNodeSize - the maximal size of a node. Large nodes are fast to initiate but slow to query. I typically use 500, giving an approximate query time below 20 ms.
maxPeptideSize - the maximum peptide size
enzyme - the enzyme used to select peptides. If null all possible peptides will be indexed
waitingHandler - the waiting handler used to display progress to the user. Can be null.
printExpectedImportTime - if true the expected import time will be printed to the waiting handler
Throws:
IOException
IllegalArgumentException
InterruptedException
ClassNotFoundException
SQLException
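
The trade-off behind initialTagSize can be pictured with a small, self-contained sketch. It is not the ProteinTree implementation: the class TagIndexSketch and its methods are hypothetical names, a single in-memory protein stands in for the sequence factory, and indexes are 0-based by assumption. It indexes every tag of a fixed length and then resolves a peptide by looking up its leading tag, so only the candidate positions need to be checked.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of compomics-utilities. */
class TagIndexSketch {

    /** Maps every tag of length tagSize to the 0-based positions where it starts in the sequence. */
    static Map<String, List<Integer>> indexTags(String sequence, int tagSize) {
        Map<String, List<Integer>> index = new HashMap<>();
        for (int i = 0; i + tagSize <= sequence.length(); i++) {
            String tag = sequence.substring(i, i + tagSize);
            index.computeIfAbsent(tag, k -> new ArrayList<>()).add(i);
        }
        return index;
    }

    /** Looks up the peptide's leading tag, then verifies the peptide at each candidate position. */
    static List<Integer> find(String sequence, Map<String, List<Integer>> index,
                              int tagSize, String peptide) {
        List<Integer> hits = new ArrayList<>();
        if (peptide.length() < tagSize) {
            return hits; // peptides shorter than a tag are not handled in this sketch
        }
        List<Integer> candidates =
                index.getOrDefault(peptide.substring(0, tagSize), new ArrayList<>());
        for (int start : candidates) {
            if (sequence.startsWith(peptide, start)) {
                hits.add(start);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String protein = "MKTAYIAKQRMKTAY";
        Map<String, List<Integer>> index = indexTags(protein, 3);
        System.out.println(find(protein, index, 3, "MKTAY")); // prints [0, 10]
    }
}
```

In this toy model a longer tag yields more distinct keys and fewer candidates per key, which loosely mirrors the note above that large initial sizes are fast to query while small ones are fast to initiate.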

getProteinMapping

public HashMap<String,ArrayList<Integer>> getProteinMapping(String peptideSequence)
                                                     throws IOException,
                                                            InterruptedException,
                                                            ClassNotFoundException,
                                                            SQLException
Returns the protein mapping in the sequence factory for the given peptide sequence based on string matching only.

Parameters:
peptideSequence - the peptide sequence
Returns:
the peptide to protein mapping: Accession -> list of indexes where the peptide can be found on the sequence. An empty map if not found.
Throws:
IOException
InterruptedException
ClassNotFoundException
SQLException
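
To make the shape of the returned map concrete, here is a minimal sketch of string-only matching. It assumes a plain in-memory map of accession to sequence in place of the sequence factory, and 0-based indexes; StringMappingSketch and its map method are hypothetical names and do not reflect how ProteinTree performs the lookup internally.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not part of compomics-utilities. */
class StringMappingSketch {

    /** Returns accession -> 0-based indexes of every exact occurrence of the peptide. */
    static HashMap<String, ArrayList<Integer>> map(Map<String, String> proteins, String peptide) {
        HashMap<String, ArrayList<Integer>> mapping = new HashMap<>();
        if (peptide.isEmpty()) {
            return mapping; // nothing to match; also avoids looping forever on indexOf("")
        }
        for (Map.Entry<String, String> entry : proteins.entrySet()) {
            String sequence = entry.getValue();
            int index = sequence.indexOf(peptide);
            while (index >= 0) {
                mapping.computeIfAbsent(entry.getKey(), k -> new ArrayList<>()).add(index);
                index = sequence.indexOf(peptide, index + 1); // allow overlapping occurrences
            }
        }
        return mapping; // empty when the peptide is found in no protein
    }

    public static void main(String[] args) {
        Map<String, String> proteins = new HashMap<>();
        proteins.put("P1", "AAKPEPTIDEK");
        proteins.put("P2", "MPEPTIDEPEPTIDE");
        proteins.put("P3", "GGGG");
        HashMap<String, ArrayList<Integer>> mapping = map(proteins, "PEPTIDE");
        System.out.println(mapping.get("P2"));         // prints [1, 8]
        System.out.println(mapping.containsKey("P3")); // prints false
    }
}
```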

getProteinMapping

public HashMap<String,HashMap<String,ArrayList<Integer>>> getProteinMapping(String peptideSequence,
                                                                            ProteinMatch.MatchingType matchingType,
                                                                            Double massTolerance)
                                                                     throws IOException,
                                                                            InterruptedException,
                                                                            ClassNotFoundException,
                                                                            SQLException
Returns the protein mapping in the sequence factory for the given peptide sequence.

Parameters:
peptideSequence - the peptide sequence
matchingType - the matching type
massTolerance - the mass tolerance for matching type 'indistiguishibleAminoAcids'. Can be null otherwise
Returns:
the peptide to protein mapping: peptide sequence -> protein accession -> indexes in the protein. An empty map if not found.
Throws:
IOException
InterruptedException
ClassNotFoundException
SQLException
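
The nested return type can be illustrated with a simplified sketch. The matching rules of ProteinMatch.MatchingType are not described on this page, so the example makes a loud assumption: it treats only I and L (which have the same mass) as interchangeable and ignores the mass tolerance entirely. NestedMappingSketch is a hypothetical class; the outer key is the sequence actually found in the protein, the inner key is the accession, and indexes are 0-based.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not part of compomics-utilities. */
class NestedMappingSketch {

    /** Assumption of this sketch: I and L are the only indistinguishable residues. */
    static String normalize(String sequence) {
        return sequence.replace('I', 'L');
    }

    /** Returns matched sequence -> accession -> 0-based indexes in that protein. */
    static HashMap<String, HashMap<String, ArrayList<Integer>>> map(
            Map<String, String> proteins, String peptide) {
        HashMap<String, HashMap<String, ArrayList<Integer>>> result = new HashMap<>();
        int length = peptide.length();
        if (length == 0) {
            return result;
        }
        String target = normalize(peptide);
        for (Map.Entry<String, String> entry : proteins.entrySet()) {
            String sequence = entry.getValue();
            // Slide a window of the peptide's length over the protein sequence.
            for (int i = 0; i + length <= sequence.length(); i++) {
                String window = sequence.substring(i, i + length);
                if (normalize(window).equals(target)) {
                    result.computeIfAbsent(window, k -> new HashMap<>())
                          .computeIfAbsent(entry.getKey(), k -> new ArrayList<>())
                          .add(i);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> proteins = new HashMap<>();
        proteins.put("P1", "AALEVELK");
        proteins.put("P2", "GGIEVELR");
        HashMap<String, HashMap<String, ArrayList<Integer>>> result = map(proteins, "LEVEL");
        System.out.println(result.get("LEVEL")); // prints {P1=[2]}
        System.out.println(result.get("IEVEL")); // prints {P2=[2]}
    }
}
```

The point of the outer key is that one queried peptide can correspond to several distinct sequences in the database once ambiguous residues are allowed.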

close

public void close()
           throws IOException,
                  SQLException
Closes all connections to files.

Throws:
IOException
SQLException

getCacheSize

public int getCacheSize()
Returns the size of the cache used for peptide mappings (note that there are two of them).

Returns:
the size of the cache used for peptide mappings

setCacheSize

public void setCacheSize(int cacheSize)
Sets the size of the cache used for peptide mappings (note that there are two of them).

Parameters:
cacheSize - the size of the cache used for peptide mappings

emptyCache

public void emptyCache()
Empties the cache.
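
The three cache methods above describe a size-bounded cache that can be resized and emptied. How ProteinTree evicts entries is not stated on this page, so the following sketch simply assumes a least-recently-used policy built on java.util.LinkedHashMap. BoundedCacheSketch and its put and get methods are hypothetical; only getCacheSize, setCacheSize and emptyCache mirror names from this class.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; the LRU eviction policy is an assumption. */
class BoundedCacheSketch<K, V> {

    private int cacheSize;

    // Access-ordered map: the eldest entry is the least recently used one.
    private final LinkedHashMap<K, V> entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > cacheSize;
        }
    };

    BoundedCacheSketch(int cacheSize) {
        this.cacheSize = cacheSize;
    }

    int getCacheSize() {
        return cacheSize;
    }

    /** Changes the bound and drops least recently used entries if the cache is now too large. */
    void setCacheSize(int cacheSize) {
        this.cacheSize = cacheSize;
        Iterator<K> iterator = entries.keySet().iterator();
        while (entries.size() > cacheSize && iterator.hasNext()) {
            iterator.next();
            iterator.remove();
        }
    }

    void emptyCache() {
        entries.clear();
    }

    void put(K key, V value) {
        entries.put(key, value);
    }

    V get(K key) {
        return entries.get(key); // counts as an access and refreshes the entry
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        BoundedCacheSketch<String, Integer> cache = new BoundedCacheSketch<>(2);
        cache.put("PEPTIDEK", 1);
        cache.put("AAKR", 2);
        cache.get("PEPTIDEK");  // refresh, so "AAKR" becomes the eldest entry
        cache.put("LEVELK", 3); // exceeds the bound and evicts "AAKR"
        System.out.println(cache.get("AAKR")); // prints null
        System.out.println(cache.size());      // prints 2
    }
}
```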


getMatchedPeptideSequences

public HashMap<String,ArrayList<Integer>> getMatchedPeptideSequences(String peptideSequence,
                                                                     String proteinAccession,
                                                                     ProteinMatch.MatchingType matchingType,
                                                                     Double massTolerance)
                                                              throws IOException,
                                                                     InterruptedException,
                                                                     ClassNotFoundException,
                                                                     SQLException
Returns the peptides matched using the given peptide sequence in the given protein, according to the provided matching settings.

Parameters:
peptideSequence - the original peptide sequence
proteinAccession - the accession of the protein of interest
matchingType - the matching type
massTolerance - the mass tolerance for indistinguishable amino acids matching mode
Returns:
a map of the matched peptide sequences to their indexes in the protein sequence
Throws:
IOException
InterruptedException
SQLException
ClassNotFoundException

getPeptideIterator

public ProteinTree.PeptideIterator getPeptideIterator()
                                               throws SQLException,
                                                      IOException,
                                                      ClassNotFoundException
Returns a PeptideIterator which iterates alphabetically over all peptides corresponding to the end of a branch in the tree.

Returns:
a PeptideIterator which iterates alphabetically over all peptides corresponding to the end of a branch in the tree
Throws:
SQLException
IOException
ClassNotFoundException
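
As an illustration of what iterating alphabetically over branch ends can mean, the sketch below builds a small letter tree with sorted children and collects the sequences at its leaves by depth-first traversal. It is an assumption-laden toy: AlphabeticalTrieSketch is a hypothetical class, and the actual node layout and iterator of ProteinTree are not described on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration only; not part of compomics-utilities. */
class AlphabeticalTrieSketch {

    // TreeMap keeps the children sorted by letter, so traversal order is alphabetical.
    private final TreeMap<Character, AlphabeticalTrieSketch> children = new TreeMap<>();

    /** Adds a sequence, creating one node per letter along its branch. */
    void add(String sequence) {
        AlphabeticalTrieSketch node = this;
        for (char letter : sequence.toCharArray()) {
            node = node.children.computeIfAbsent(letter, k -> new AlphabeticalTrieSketch());
        }
    }

    /** Returns the sequences spelled out by each branch end, in alphabetical order. */
    List<String> branchEnds() {
        List<String> result = new ArrayList<>();
        collect("", result);
        return result;
    }

    private void collect(String prefix, List<String> result) {
        if (children.isEmpty()) {
            if (!prefix.isEmpty()) {
                result.add(prefix); // a node without children is the end of a branch
            }
            return;
        }
        for (Map.Entry<Character, AlphabeticalTrieSketch> child : children.entrySet()) {
            child.getValue().collect(prefix + child.getKey(), result);
        }
    }

    public static void main(String[] args) {
        AlphabeticalTrieSketch tree = new AlphabeticalTrieSketch();
        tree.add("PEK");
        tree.add("AAK");
        tree.add("PEPT");
        tree.add("AAKR");
        // "AAK" is a prefix of "AAKR", so it is not a branch end.
        System.out.println(tree.branchEnds()); // prints [AAKR, PEK, PEPT]
    }
}
```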


Copyright © 2013. All Rights Reserved.