public class SequenceFactory extends Object
| Modifier and Type | Class and Description |
|---|---|
| class | SequenceFactory.HeaderIterator: Convenience iterator over the headers of a FASTA file that does not use the cache. |
| class | SequenceFactory.ProteinIterator: Convenience iterator over all proteins in a FASTA file that does not use the index or the cache. |
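
The cache-free iteration described for SequenceFactory.HeaderIterator can be pictured with a small stand-alone sketch. It does not use the library: the class name `FastaHeaderIteratorSketch` and the helper `headersOf` are hypothetical, and it assumes a header is any line starting with `>`. The actual iterator returns parsed Header objects and may behave differently.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a cache-free FASTA header iterator. */
final class FastaHeaderIteratorSketch implements Iterator<String> {

    private final BufferedReader reader;
    private String nextHeader;

    FastaHeaderIteratorSketch(Reader source) {
        this.reader = new BufferedReader(source);
        advance();
    }

    /** Reads forward until the next line starting with '>' or the end of input. */
    private void advance() {
        nextHeader = null;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith(">")) {
                    nextHeader = line.substring(1).trim();
                    return;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return nextHeader != null;
    }

    @Override
    public String next() {
        if (nextHeader == null) {
            throw new NoSuchElementException();
        }
        String current = nextHeader;
        advance();
        return current;
    }

    /** Collects all headers of an in-memory FASTA text (assumed helper for the demo). */
    static List<String> headersOf(String fastaText) {
        List<String> headers = new ArrayList<>();
        Iterator<String> it = new FastaHeaderIteratorSketch(new StringReader(fastaText));
        while (it.hasNext()) {
            headers.add(it.next());
        }
        return headers;
    }

    public static void main(String[] args) {
        String fasta = ">P1 first protein\nMKTAY\nIAKQR\n>P2 second protein\nGGAL\n";
        System.out.println(headersOf(fasta));
    }
}
```

Only one header is held in memory at a time, which is the point of iterating without a cache.
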
| Modifier and Type | Field and Description |
|---|---|
| static int | minProteinCount: The minimal protein count required for reliable target/decoy based statistics. |
| static long | TIME_OUT: The timeout in milliseconds when querying the file. |
| Modifier and Type | Method and Description |
|---|---|
| void | appendDecoySequences(File destinationFile): Appends decoy sequences to the desired file. |
| void | appendDecoySequences(File destinationFile, WaitingHandler waitingHandler): Appends decoy sequences to the desired file while displaying progress. |
| void | clearFactory(): Clears the factory; getInstance() needs to be called afterwards. |
| void | closeFile(): Closes the opened file. |
| double | computeMolecularWeight(String accession): Returns the protein's molecular weight in kDa. |
| boolean | concatenatedTargetDecoy(): Indicates whether the database loaded contains decoy sequences. |
| boolean | deleteProteinTree(ExceptionHandler exceptionHandler): Tries to delete the default protein tree. |
| void | emptyCache(): Empties the cache of the factory. |
| HashMap<String,Integer> | getAAOccurrences(JProgressBar progressBar): Returns the occurrence of every amino acid in the database. |
| Set<String> | getAccessions(): Returns the accessions of the sequences present in the database. |
| File | getCurrentFastaFile(): Returns the currently loaded FASTA file. |
| FastaIndex | getCurrentFastaIndex(): Returns the FASTA index of the currently loaded file. |
| Protein | getDecoyProteinFromTarget(String accession, boolean reindex): Returns a decoy protein from a target protein or looks for the sequence in the cache if not found. |
| Protein | getDecoyProteinFromTargetSynchronized(String accession, boolean reindex): Returns a decoy protein from a target protein or looks for the sequence in the cache if not found. |
| static String | getDefaultDecoyAccession(String targetAccession): Returns the default decoy accession for a target accession. |
| static String | getDefaultDecoyAccessionSuffix(): Returns the default suffix for a decoy accession. |
| static String | getDefaultDecoyDescription(String targetDescription): Returns the default description for a decoy protein. |
| ProteinTree | getDefaultProteinTree(): Returns the default protein tree. |
| ProteinTree | getDefaultProteinTree(int nThreads, WaitingHandler waitingHandler, ExceptionHandler exceptionHandler): Returns the default protein tree corresponding to the database loaded in the factory; creates a new one if none is found. |
| ProteinTree | getDefaultProteinTree(int nThreads, WaitingHandler waitingHandler, ExceptionHandler exceptionHandler, boolean displayProgress): Returns the default protein tree corresponding to the database loaded in the factory; creates a new one if none is found. |
| ProteinTree | getDefaultProteinTree(WaitingHandler waitingHandler, ExceptionHandler exceptionHandler): Returns the default protein tree corresponding to the database loaded in the factory; creates a new one if none is found. |
| static String | getDefaultTargetAccession(String decoyAccession): Returns the default target accession of a given decoy protein. |
| static FastaIndex | getFastaIndex(File fastaFile, boolean overwrite, WaitingHandler waitingHandler): Returns the file index of the given FASTA file. |
| String | getFileName(): Returns the name of the loaded FASTA file. |
| Header | getHeader(String accession): Returns the desired header for the protein in the FASTA file. |
| SequenceFactory.HeaderIterator | getHeaderIterator(boolean targetOnly): Returns an iterator of all the headers in the FASTA file. |
| static String | getIndexName(String fastaName): Returns the name of the FASTA index corresponding to the given FASTA file name. |
| static SequenceFactory | getInstance(): Static method returning the instance of the factory. |
| static SequenceFactory | getInstance(int nCache): Returns the instance of the factory with the specified cache size. |
| int | getnCache(): Returns the size of the cache. |
| int | getNodesInCache(): Returns the number of nodes currently loaded in cache. |
| int | getNSequences(): Returns the number of sequences in the FASTA file. |
| int | getNTargetSequences(): Returns the number of target sequences in the database. |
| Protein | getProtein(String accession): Returns the desired protein. |
| SequenceFactory.ProteinIterator | getProteinIterator(boolean targetOnly): Returns an iterator of all the proteins in the FASTA file. |
| boolean | hasEnoughSequences(): Indicates whether the database contains enough protein sequences for reliable target/decoy based statistics. |
| boolean | isClosed(): Indicates whether the connection to the random access file has been closed. |
| static boolean | isDecoy(String proteinAccession, String decoyFlag): Returns a boolean indicating whether a protein is a decoy or not, based on the protein accession and a given decoy flag. |
| boolean | isDecoyAccession(String proteinAccession): Indicates whether a protein is a decoy in the selected loaded FASTA file. |
| boolean | isDecoyInMemory(): Returns whether decoys should be kept in memory. |
| boolean | isDefaultReversed(): Indicates whether the decoy sequences are reversed versions of the target and the decoy accessions are built based on the sequence factory methods. |
| void | loadFastaFile(File fastaFile): Loads a new FASTA file in the factory. |
| void | loadFastaFile(File fastaFile, WaitingHandler waitingHandler): Loads a new FASTA file in the factory. |
| void | reduceNodeCacheSize(double share): Reduces the node cache size of the protein tree by the given share. |
| void | resetConnection(): Resets the connection to the random access file. |
| static String | reverseSequence(String sequence): Reverses a protein sequence. |
| void | saveIndex(): Saves the index. |
| void | setDecoyInMemory(boolean decoyInMemory): Sets whether decoys should be kept in memory. |
| void | setnCache(int nCache): Sets the size of the cache. |
| static void | writeIndex(FastaIndex fastaIndex, File directory): Serializes the FASTA file index in a given directory. |
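
To make the return type of getAAOccurrences(JProgressBar) concrete, the following stand-alone sketch counts single-letter residues over a set of sequences and returns them in a `HashMap<String,Integer>`. The class `AminoAcidOccurrencesSketch` and the method `countOccurrences` are hypothetical names; the library method reads the loaded database and can report progress, which is left out here.

```java
import java.util.Arrays;
import java.util.HashMap;

/** Hypothetical illustration of counting amino acid occurrences. */
final class AminoAcidOccurrencesSketch {

    /** Counts every single-letter residue over all given sequences. */
    static HashMap<String, Integer> countOccurrences(Iterable<String> sequences) {
        HashMap<String, Integer> counts = new HashMap<>();
        for (String sequence : sequences) {
            for (int i = 0; i < sequence.length(); i++) {
                String aminoAcid = String.valueOf(sequence.charAt(i));
                // Start at 1 for a new residue, otherwise add 1 to the stored count.
                counts.merge(aminoAcid, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countOccurrences(Arrays.asList("MKTAY", "GGAL")));
    }
}
```
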
public static final long TIME_OUT
public static int minProteinCount
public static SequenceFactory getInstance()
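
The getInstance() and clearFactory() pair suggests a singleton that can be discarded and recreated. The sketch below shows that pattern in isolation. It is not the library's code: `SingletonFactorySketch`, `clear()` and the default cache size of 1000 are assumptions made for illustration.

```java
/** Hypothetical singleton with a configurable cache size. */
final class SingletonFactorySketch {

    // Assumed default, not taken from the library.
    private static final int DEFAULT_CACHE_SIZE = 1000;

    private static SingletonFactorySketch instance = null;

    private int nCache;

    private SingletonFactorySketch(int nCache) {
        this.nCache = nCache;
    }

    /** Returns the shared instance, creating it on first use. */
    static synchronized SingletonFactorySketch getInstance() {
        if (instance == null) {
            instance = new SingletonFactorySketch(DEFAULT_CACHE_SIZE);
        }
        return instance;
    }

    /** Returns the shared instance and applies the requested cache size. */
    static synchronized SingletonFactorySketch getInstance(int nCache) {
        if (instance == null) {
            instance = new SingletonFactorySketch(nCache);
        } else {
            instance.nCache = nCache;
        }
        return instance;
    }

    /** Drops the shared instance, so the next getInstance() call builds a new one. */
    static synchronized void clear() {
        instance = null;
    }

    int getnCache() {
        return nCache;
    }

    public static void main(String[] args) {
        SingletonFactorySketch first = getInstance();
        System.out.println(first == getInstance());
        clear();
        System.out.println(first == getInstance());
    }
}
```

Any reference obtained before the clear call points to the discarded object, which is why getInstance() needs to be called again afterwards.
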
public static SequenceFactory getInstance(int nCache)
nCache - the new cache size
public boolean hasEnoughSequences()
public void clearFactory() throws IOException, SQLException
IOException - if an IOException occurs
SQLException - if an SQLException occurs
public void emptyCache()
public void reduceNodeCacheSize(double share)
share - the share of the cache to remove, where 0.5 means 50%
public int getNodesInCache()
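
The meaning of the share parameter of reduceNodeCacheSize(double) can be illustrated on an ordinary map. This sketch assumes the oldest entries are evicted first and that the number of entries to remove is rounded down; the library does not state either, and `CacheShareSketch.reduceByShare` is a hypothetical helper.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of shrinking a cache by a share of its size. */
final class CacheShareSketch {

    /**
     * Removes share * size entries (rounded down), oldest insertions first.
     * Returns the number of entries removed.
     */
    static <K, V> int reduceByShare(LinkedHashMap<K, V> cache, double share) {
        if (share < 0.0 || share > 1.0) {
            throw new IllegalArgumentException("share must be between 0 and 1: " + share);
        }
        int toRemove = (int) (share * cache.size());
        Iterator<Map.Entry<K, V>> it = cache.entrySet().iterator();
        int removed = 0;
        while (removed < toRemove && it.hasNext()) {
            it.next();
            it.remove();
            removed++;
        }
        return removed;
    }

    public static void main(String[] args) {
        LinkedHashMap<String, Integer> cache = new LinkedHashMap<>();
        cache.put("a", 1);
        cache.put("b", 2);
        cache.put("c", 3);
        cache.put("d", 4);
        reduceByShare(cache, 0.5);
        System.out.println(cache.keySet());
    }
}
```
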
public Protein getProtein(String accession) throws IOException, IllegalArgumentException, InterruptedException, FileNotFoundException
accession - accession of the desired protein
IOException - thrown whenever an error is encountered while reading the FASTA file
IllegalArgumentException - thrown whenever an error is encountered while reading the FASTA file
InterruptedException - if an InterruptedException occurs
FileNotFoundException - if a FileNotFoundException occurs
public Protein getDecoyProteinFromTargetSynchronized(String accession, boolean reindex) throws IOException, IllegalArgumentException, FileNotFoundException
accession - the accession of the decoy protein to look for
reindex - a boolean indicating whether the database should be re-indexed in case the protein is not found.
IOException - if an IOException occurs
IllegalArgumentException - if an IllegalArgumentException occurs
FileNotFoundException - if a FileNotFoundException occurs
public Protein getDecoyProteinFromTarget(String accession, boolean reindex) throws IOException, IllegalArgumentException, FileNotFoundException
accession - the accession of the decoy protein to look for
reindex - a boolean indicating whether the database should be re-indexed in case the protein is not found.
IOException - if an IOException occurs
IllegalArgumentException - if an IllegalArgumentException occurs
FileNotFoundException - if a FileNotFoundException occurs
public Header getHeader(String accession) throws IOException, InterruptedException
accession - accession of the desired protein
IOException - exception thrown whenever an error occurred while reading the FASTA file
InterruptedException - exception thrown whenever an error occurred while waiting for the connection to the FASTA file to recover.
public void loadFastaFile(File fastaFile) throws IOException, ClassNotFoundException, StringIndexOutOfBoundsException
fastaFile - the FASTA file to load
IOException - exception thrown if an error occurred while reading the FASTA file
ClassNotFoundException - exception thrown whenever an error occurred while deserializing the file index
StringIndexOutOfBoundsException - thrown if issues occur during the parsing of the protein headers
public void loadFastaFile(File fastaFile, WaitingHandler waitingHandler) throws IOException, ClassNotFoundException, StringIndexOutOfBoundsException
fastaFile - the FASTA file to load
waitingHandler - a waitingHandler showing the progress
IOException - exception thrown if an error occurred while reading the FASTA file
ClassNotFoundException - exception thrown whenever an error occurred while deserializing the file index
StringIndexOutOfBoundsException - thrown if issues occur during the parsing of the protein headers
public boolean isClosed()
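
The combination of a FASTA index and a random access file connection suggests that proteins are looked up by file position rather than by scanning the file. The sketch below illustrates that general technique only. It is not the FastaIndex implementation: `FastaOffsetIndexSketch`, its methods, and the rule that the accession is the first whitespace-delimited token after `>` are all assumptions.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical accession-to-offset index over a FASTA file. */
final class FastaOffsetIndexSketch {

    /** Maps each accession to the byte offset of its header line. */
    static Map<String, Long> buildIndex(File fasta) throws IOException {
        Map<String, Long> index = new LinkedHashMap<>();
        try (RandomAccessFile raf = new RandomAccessFile(fasta, "r")) {
            long offset = raf.getFilePointer();
            String line;
            while ((line = raf.readLine()) != null) {
                if (line.startsWith(">")) {
                    // Assumption: the accession is the first token of the header.
                    String accession = line.substring(1).trim().split("\\s+")[0];
                    if (index.put(accession, offset) != null) {
                        throw new IllegalArgumentException("Non unique accession: " + accession);
                    }
                }
                offset = raf.getFilePointer();
            }
        }
        return index;
    }

    /** Seeks to the indexed header and reads sequence lines up to the next header. */
    static String readSequence(File fasta, Map<String, Long> index, String accession) throws IOException {
        Long offset = index.get(accession);
        if (offset == null) {
            throw new IllegalArgumentException("Unknown accession: " + accession);
        }
        StringBuilder sequence = new StringBuilder();
        try (RandomAccessFile raf = new RandomAccessFile(fasta, "r")) {
            raf.seek(offset);
            raf.readLine(); // skip the header line itself
            String line;
            while ((line = raf.readLine()) != null && !line.startsWith(">")) {
                sequence.append(line.trim());
            }
        }
        return sequence.toString();
    }

    /** Demo helper: writes the text to a temporary file, indexes it, and looks up one protein. */
    static String demo(String fastaText, String accession) {
        try {
            File tmp = File.createTempFile("sketch", ".fasta");
            try {
                Files.write(tmp.toPath(), fastaText.getBytes(StandardCharsets.US_ASCII));
                return readSequence(tmp, buildIndex(tmp), accession);
            } finally {
                tmp.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(">P1 first\nMKTAY\nIAKQR\n>P2 second\nGGAL\n", "P2"));
    }
}
```

The sketch rejects duplicate accessions with an IllegalArgumentException, in the same spirit as the exception documented for getFastaIndex.
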
public void resetConnection() throws IOException
IOException - if an IOException occurs
public static FastaIndex getFastaIndex(File fastaFile, boolean overwrite, WaitingHandler waitingHandler) throws IOException, StringIndexOutOfBoundsException
fastaFile - the FASTA file to index
overwrite - boolean indicating whether the index .cui file shall be overwritten if present, even if the file has not been changed
waitingHandler - a waitingHandler showing the progress
IOException - exception thrown if an error occurred while reading the FASTA file
StringIndexOutOfBoundsException - thrown if issues occur during the parsing of the protein headers
IllegalArgumentException - if non unique accession numbers are found
public static void writeIndex(FastaIndex fastaIndex, File directory) throws IOException
fastaIndex - the index of the FASTA file
directory - the directory where to write the file
IOException - exception thrown whenever an error occurred while writing the file
public static String getIndexName(String fastaName)
fastaName - the name of the FASTA file
public void saveIndex() throws IOException
IOException - if an IOException occurs
public void closeFile() throws IOException, SQLException
IOException - exception thrown whenever an error occurred while closing the file
SQLException - if an SQLException occurs
public static boolean isDecoy(String proteinAccession, String decoyFlag)
proteinAccession - the accession of the protein
decoyFlag - the decoy flag
public boolean isDecoyAccession(String proteinAccession)
proteinAccession - the protein accession of interest.
public boolean concatenatedTargetDecoy()
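
A flag-based decoy test such as isDecoy(String, String) can be sketched as a prefix or suffix match on the accession. The matching rule is an assumption, since the documentation only says the result is based on the accession and the flag. `DecoyFlagSketch` and the flag values used in the demo are hypothetical.

```java
/** Hypothetical flag-based decoy check. */
final class DecoyFlagSketch {

    /** Assumes a decoy accession starts or ends with the given flag. */
    static boolean isDecoy(String proteinAccession, String decoyFlag) {
        if (decoyFlag == null || decoyFlag.isEmpty()) {
            // An empty flag would match every accession, so treat it as "no decoys".
            return false;
        }
        return proteinAccession.startsWith(decoyFlag) || proteinAccession.endsWith(decoyFlag);
    }

    public static void main(String[] args) {
        System.out.println(isDecoy("P12345_REVERSED", "_REVERSED"));
        System.out.println(isDecoy("P12345", "_REVERSED"));
    }
}
```
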
public boolean isDefaultReversed()
public int getNTargetSequences()
public int getNSequences()
public void appendDecoySequences(File destinationFile) throws IOException, InterruptedException, ClassNotFoundException
destinationFile - the destination file
IOException - exception thrown whenever an error occurred while reading or writing a file
InterruptedException - if an InterruptedException occurs
ClassNotFoundException - if a ClassNotFoundException occurs
public void appendDecoySequences(File destinationFile, WaitingHandler waitingHandler) throws IOException, InterruptedException, ClassNotFoundException
destinationFile - the destination file
waitingHandler - the waiting handler
IOException - exception thrown whenever an error occurred while reading or writing a file
InterruptedException - if an InterruptedException occurs
ClassNotFoundException - if a ClassNotFoundException occurs
public static String reverseSequence(String sequence)
sequence - the protein sequence
public Set<String> getAccessions()
public int getnCache()
public void setnCache(int nCache)
nCache - the new size of the cache
public HashMap<String,Integer> getAAOccurrences(JProgressBar progressBar) throws IOException, InterruptedException, ClassNotFoundException
progressBar - a progress bar, can be null
IOException - exception thrown whenever an error occurred while reading the database
InterruptedException - if an InterruptedException occurs
ClassNotFoundException - if a ClassNotFoundException occurs
public double computeMolecularWeight(String accession) throws IOException, InterruptedException, ClassNotFoundException
accession - the protein's accession number
IOException - exception thrown whenever an error occurred while reading the protein sequence
InterruptedException - exception thrown whenever an error occurred while reading the protein sequence
ClassNotFoundException - exception thrown whenever an error occurred while reading the protein sequence
public String getFileName()
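
A molecular weight in kDa, as returned by computeMolecularWeight(String), is typically the sum of the residue masses plus one water molecule, divided by 1000. The sketch below works under that assumption with approximate average residue masses. Whether the library uses average or monoisotopic masses is not stated here, and `MolecularWeightSketch` is a hypothetical class that takes a sequence rather than an accession.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical molecular weight computation from a protein sequence. */
final class MolecularWeightSketch {

    // Approximate average residue masses in Da for the 20 standard amino acids.
    private static final Map<Character, Double> RESIDUE_MASS = new HashMap<>();

    static {
        RESIDUE_MASS.put('G', 57.0519);
        RESIDUE_MASS.put('A', 71.0788);
        RESIDUE_MASS.put('S', 87.0782);
        RESIDUE_MASS.put('P', 97.1167);
        RESIDUE_MASS.put('V', 99.1326);
        RESIDUE_MASS.put('T', 101.1051);
        RESIDUE_MASS.put('C', 103.1388);
        RESIDUE_MASS.put('L', 113.1594);
        RESIDUE_MASS.put('I', 113.1594);
        RESIDUE_MASS.put('N', 114.1038);
        RESIDUE_MASS.put('D', 115.0886);
        RESIDUE_MASS.put('Q', 128.1307);
        RESIDUE_MASS.put('K', 128.1741);
        RESIDUE_MASS.put('E', 129.1155);
        RESIDUE_MASS.put('M', 131.1926);
        RESIDUE_MASS.put('H', 137.1411);
        RESIDUE_MASS.put('F', 147.1766);
        RESIDUE_MASS.put('R', 156.1875);
        RESIDUE_MASS.put('Y', 163.1760);
        RESIDUE_MASS.put('W', 186.2132);
    }

    // Average mass of one water molecule in Da, added once for the chain termini.
    private static final double WATER_MASS = 18.01528;

    /** Sums the residue masses plus one water and converts Da to kDa. */
    static double molecularWeightKDa(String sequence) {
        double massInDa = WATER_MASS;
        for (char aminoAcid : sequence.toCharArray()) {
            Double residueMass = RESIDUE_MASS.get(aminoAcid);
            if (residueMass == null) {
                throw new IllegalArgumentException("Unsupported residue: " + aminoAcid);
            }
            massInDa += residueMass;
        }
        return massInDa / 1000.0;
    }

    public static void main(String[] args) {
        System.out.println(molecularWeightKDa("MKTAYIAKQR"));
    }
}
```
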
public File getCurrentFastaFile()
public static String getDefaultDecoyAccessionSuffix()
public static String getDefaultDecoyAccession(String targetAccession)
targetAccession - the target accession
public static String getDefaultDecoyDescription(String targetDescription)
targetDescription - the description of a target protein
public static String getDefaultTargetAccession(String decoyAccession)
decoyAccession - the decoy accession
public FastaIndex getCurrentFastaIndex()
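
The static helpers getDefaultDecoyAccessionSuffix(), getDefaultDecoyAccession(String), getDefaultTargetAccession(String) and reverseSequence(String) together describe how a reversed decoy entry relates to its target. The sketch below shows one way such a mapping could work. The suffix value `_REVERSED` is an assumption chosen for illustration, as are the class and method names; the actual suffix should be taken from getDefaultDecoyAccessionSuffix().

```java
/** Hypothetical target/decoy naming and sequence reversal. */
final class DecoyNamingSketch {

    // Assumed suffix, not confirmed by the documentation.
    static final String DECOY_SUFFIX = "_REVERSED";

    /** Builds a decoy accession by appending the suffix to the target accession. */
    static String decoyAccession(String targetAccession) {
        return targetAccession + DECOY_SUFFIX;
    }

    /** Recovers the target accession by stripping the suffix from a decoy accession. */
    static String targetAccession(String decoyAccession) {
        if (!decoyAccession.endsWith(DECOY_SUFFIX)) {
            throw new IllegalArgumentException("Not a decoy accession: " + decoyAccession);
        }
        return decoyAccession.substring(0, decoyAccession.length() - DECOY_SUFFIX.length());
    }

    /** Reverses the order of the residues in a protein sequence. */
    static String reverseSequence(String sequence) {
        return new StringBuilder(sequence).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(decoyAccession("P12345") + " " + reverseSequence("MKTAY"));
    }
}
```

Keeping the two accession methods as exact inverses of each other lets a decoy hit be traced back to its target without any lookup table.
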
public ProteinTree getDefaultProteinTree()
public ProteinTree getDefaultProteinTree(WaitingHandler waitingHandler, ExceptionHandler exceptionHandler) throws IOException, InterruptedException, ClassNotFoundException, SQLException
waitingHandler - waiting handler displaying progress to the user during the initiation of the tree
exceptionHandler - handler for the exceptions encountered while creating the tree
IOException - exception thrown whenever an error occurs while reading or writing a file.
ClassNotFoundException - exception thrown whenever an error occurs while deserializing an object.
InterruptedException - exception thrown whenever a threading issue occurred while interacting with the tree.
SQLException - exception thrown whenever a problem occurred while interacting with the tree database.
public ProteinTree getDefaultProteinTree(int nThreads, WaitingHandler waitingHandler, ExceptionHandler exceptionHandler) throws IOException, InterruptedException, ClassNotFoundException, SQLException
nThreads - the number of threads to use
waitingHandler - waiting handler displaying progress to the user during the initiation of the tree
exceptionHandler - handler for the exceptions encountered while creating the tree
IOException - exception thrown whenever an error occurs while reading or writing a file.
ClassNotFoundException - exception thrown whenever an error occurs while deserializing an object.
InterruptedException - exception thrown whenever a threading issue occurred while interacting with the tree.
SQLException - exception thrown whenever a problem occurred while interacting with the tree database.
public ProteinTree getDefaultProteinTree(int nThreads, WaitingHandler waitingHandler, ExceptionHandler exceptionHandler, boolean displayProgress) throws IOException, InterruptedException, ClassNotFoundException, SQLException
nThreads - the number of threads to use
waitingHandler - waiting handler displaying progress to the user during the initiation of the tree
exceptionHandler - handler for the exceptions encountered while creating the tree
displayProgress - whether progress should be displayed
IOException - exception thrown whenever an error occurs while reading or writing a file.
ClassNotFoundException - exception thrown whenever an error occurs while deserializing an object.
InterruptedException - exception thrown whenever a threading issue occurred while interacting with the tree.
SQLException - exception thrown whenever a problem occurred while interacting with the tree database.
public boolean deleteProteinTree(ExceptionHandler exceptionHandler)
exceptionHandler - handler for the exceptions encountered while creating the tree
public SequenceFactory.HeaderIterator getHeaderIterator(boolean targetOnly) throws FileNotFoundException
targetOnly - boolean indicating whether only target accessions shall be iterated
FileNotFoundException - if a FileNotFoundException occurs
public SequenceFactory.ProteinIterator getProteinIterator(boolean targetOnly) throws FileNotFoundException
targetOnly - boolean indicating whether only target accessions shall be iterated
FileNotFoundException - if a FileNotFoundException occurs
public boolean isDecoyInMemory()
public void setDecoyInMemory(boolean decoyInMemory)
decoyInMemory - true if decoys should be kept in memory
Copyright © 2016. All rights reserved.