public class SequenceFactory extends Object
Modifier and Type | Class and Description |
---|---|
class | SequenceFactory.HeaderIterator: Convenience iterator iterating the headers of a FASTA file without using the cache. |
class | SequenceFactory.ProteinIterator: Convenience iterator iterating all proteins in a FASTA file without using index or cache. |
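
To illustrate what iterating headers "without using the cache" can look like, the following is a minimal, self-contained sketch that streams a FASTA source line by line and keeps only the header lines. The class `FastaHeaderSketch`, its `headers` method, and the `decoyFlag` parameter are hypothetical stand-ins and not part of SequenceFactory; the actual HeaderIterator may work differently, and the sketch collects results into a list for brevity rather than exposing an iterator.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not part of SequenceFactory.
class FastaHeaderSketch {

    // Collects the header lines (those starting with '>') of a FASTA source.
    // When targetOnly is true, headers containing decoyFlag are skipped.
    // Assumption: decoys are recognizable by a flag in the header text.
    static List<String> headers(Reader source, boolean targetOnly, String decoyFlag) {
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.startsWith(">")) {
                    continue; // sequence line, not a header
                }
                String header = line.substring(1).trim();
                if (targetOnly && header.contains(decoyFlag)) {
                    continue; // decoy entry, skipped on request
                }
                result.add(header);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String fasta = ">P1 first protein\nMKV\n>P1_REVERSED decoy\nVKM\n";
        System.out.println(headers(new StringReader(fasta), true, "_REVERSED"));
    }
}
```

Because nothing is stored beyond the current line, memory use stays flat regardless of the file size, which is the usual reason for bypassing a cache in a one-pass scan.
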
Modifier and Type | Field and Description |
---|---|
static int | minProteinCount: The minimal protein count required for reliable target/decoy based statistics. |
static long | TIME_OUT: The time out in milliseconds when querying the file. |
Modifier and Type | Method and Description |
---|---|
void | appendDecoySequences(File destinationFile): Appends decoy sequences to the desired file. |
void | appendDecoySequences(File destinationFile, WaitingHandler waitingHandler): Appends decoy sequences to the desired file while displaying progress. |
void | clearFactory(): Clears the factory; getInstance() needs to be called afterwards. |
void | closeFile(): Closes the opened file. |
double | computeMolecularWeight(String accession): Returns the protein's molecular weight in kDa. |
boolean | concatenatedTargetDecoy(): Indicates whether the database loaded contains decoy sequences. |
boolean | deleteProteinTree(ExceptionHandler exceptionHandler): Tries to delete the default protein tree. |
void | emptyCache(): Empties the cache of the factory. |
HashMap<String,Integer> | getAAOccurrences(JProgressBar progressBar): Returns the occurrence of every amino acid in the database. |
Set<String> | getAccessions(): Returns the accessions of the sequences present in the database. |
File | getCurrentFastaFile(): Returns the currently loaded FASTA file. |
FastaIndex | getCurrentFastaIndex(): Returns the FASTA index of the currently loaded file. |
Protein | getDecoyProteinFromTarget(String accession, boolean reindex): Returns a decoy protein from a target protein or looks for the sequence in the cache if not found. |
Protein | getDecoyProteinFromTargetSynchronized(String accession, boolean reindex): Returns a decoy protein from a target protein or looks for the sequence in the cache if not found. |
static String | getDefaultDecoyAccession(String targetAccession): Returns the default decoy accession for a target accession. |
static String | getDefaultDecoyAccessionSuffix(): Returns the default suffix for a decoy accession. |
static String | getDefaultDecoyDescription(String targetDescription): Returns the default description for a decoy protein. |
ProteinTree | getDefaultProteinTree(): Returns the default protein tree. |
ProteinTree | getDefaultProteinTree(int nThreads, WaitingHandler waitingHandler, ExceptionHandler exceptionHandler): Returns the default protein tree corresponding to the database loaded in the factory; creates a new one if none is found. |
ProteinTree | getDefaultProteinTree(int nThreads, WaitingHandler waitingHandler, ExceptionHandler exceptionHandler, boolean displayProgress): Returns the default protein tree corresponding to the database loaded in the factory; creates a new one if none is found. |
ProteinTree | getDefaultProteinTree(WaitingHandler waitingHandler, ExceptionHandler exceptionHandler): Returns the default protein tree corresponding to the database loaded in the factory; creates a new one if none is found. |
static String | getDefaultTargetAccession(String decoyAccession): Returns the default target accession of a given decoy protein. |
static FastaIndex | getFastaIndex(File fastaFile, boolean overwrite, WaitingHandler waitingHandler): Returns the file index of the given FASTA file. |
String | getFileName(): Returns the name of the loaded FASTA file. |
Header | getHeader(String accession): Returns the desired header for the protein in the FASTA file. |
SequenceFactory.HeaderIterator | getHeaderIterator(boolean targetOnly): Returns an iterator of all the headers in the FASTA file. |
static String | getIndexName(String fastaName): Returns the name of the FASTA index corresponding to the given FASTA file name. |
static SequenceFactory | getInstance(): Static method returning the instance of the factory. |
static SequenceFactory | getInstance(int nCache): Returns the instance of the factory with the specified cache size. |
int | getnCache(): Returns the size of the cache. |
int | getNodesInCache(): Returns the number of nodes currently loaded in cache. |
int | getNSequences(): Returns the number of sequences in the FASTA file. |
int | getNTargetSequences(): Returns the number of target sequences in the database. |
Protein | getProtein(String accession): Returns the desired protein. |
SequenceFactory.ProteinIterator | getProteinIterator(boolean targetOnly): Returns an iterator of all the proteins in the FASTA file. |
boolean | hasEnoughSequences(): Indicates whether the database contains enough protein sequences for reliable target/decoy based statistics. |
boolean | isClosed(): Indicates whether the connection to the random access file has been closed. |
static boolean | isDecoy(String proteinAccession, String decoyFlag): Returns a boolean indicating whether a protein is decoy or not based on the protein accession and a given decoy flag. |
boolean | isDecoyAccession(String proteinAccession): Indicates whether a protein is a decoy in the currently loaded FASTA file. |
boolean | isDecoyInMemory(): Returns whether decoys should be kept in memory. |
boolean | isDefaultReversed(): Indicates whether the decoy sequences are reversed versions of the target and the decoy accessions built based on the sequence factory methods. |
void | loadFastaFile(File fastaFile): Loads a new FASTA file in the factory. |
void | loadFastaFile(File fastaFile, WaitingHandler waitingHandler): Loads a new FASTA file in the factory. |
void | reduceNodeCacheSize(double share): Reduces the node cache size of the protein tree by the given share. |
void | resetConnection(): Resets the connection to the random access file. |
static String | reverseSequence(String sequence): Reverses a protein sequence. |
void | saveIndex(): Saves the index. |
void | setDecoyInMemory(boolean decoyInMemory): Sets whether decoys should be kept in memory. |
void | setnCache(int nCache): Sets the size of the cache. |
static void | writeIndex(FastaIndex fastaIndex, File directory): Serializes the FASTA file index in a given directory. |
public static final long TIME_OUT
public static int minProteinCount
public static SequenceFactory getInstance()
public static SequenceFactory getInstance(int nCache)
nCache - the new cache size
public boolean hasEnoughSequences()
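
As a rough illustration of how hasEnoughSequences() could relate to the minProteinCount field, the sketch below compares a target sequence count against a minimum. The class `ProteinCountCheckSketch`, the use of `>=`, and the threshold value are assumptions made for the example; this page does not state the actual rule or the default value.

```java
// Hypothetical stand-in, not part of SequenceFactory.
class ProteinCountCheckSketch {

    // Assumed rule: the database is large enough when its number of target
    // sequences reaches the configured minimum.
    static boolean hasEnoughSequences(int nTargetSequences, int minProteinCount) {
        return nTargetSequences >= minProteinCount;
    }

    public static void main(String[] args) {
        int threshold = 1000; // illustrative value, not the library's default
        System.out.println(hasEnoughSequences(250, threshold));
        System.out.println(hasEnoughSequences(20000, threshold));
    }
}
```
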
public void clearFactory() throws IOException, SQLException
IOException - if an IOException occurs
SQLException - if an SQLException occurs
public void emptyCache()
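
The note that getInstance() must be called again after clearFactory() suggests a shared-instance lifecycle. The sketch below shows one way such a lifecycle can work. `CachedFactorySketch` is a hypothetical stand-in, the default cache size is an invented value, and the real class may manage its instance differently.

```java
// Hypothetical stand-in, not part of SequenceFactory.
class CachedFactorySketch {

    private static CachedFactorySketch instance = null;
    private int nCache;

    private CachedFactorySketch(int nCache) {
        this.nCache = nCache;
    }

    // Returns the shared instance, creating it with an assumed default cache size.
    static synchronized CachedFactorySketch getInstance() {
        if (instance == null) {
            instance = new CachedFactorySketch(10_000); // illustrative default
        }
        return instance;
    }

    // Returns the shared instance and applies the requested cache size.
    static synchronized CachedFactorySketch getInstance(int nCache) {
        if (instance == null) {
            instance = new CachedFactorySketch(nCache);
        } else {
            instance.nCache = nCache;
        }
        return instance;
    }

    // Drops the shared instance; references obtained earlier become stale,
    // so callers have to ask getInstance() for a fresh one.
    void clearFactory() {
        synchronized (CachedFactorySketch.class) {
            instance = null;
        }
    }

    int getnCache() {
        return nCache;
    }

    public static void main(String[] args) {
        CachedFactorySketch first = getInstance(500);
        first.clearFactory();
        CachedFactorySketch second = getInstance();
        System.out.println(first == second);
    }
}
```
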
public void reduceNodeCacheSize(double share)
share - the share of the cache to remove; 0.5 means 50%
public int getNodesInCache()
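
To make the meaning of the `share` argument of reduceNodeCacheSize(double) concrete, here is a small sketch that removes a fraction of the entries from an insertion-ordered map. `NodeCacheSketch`, its string-based nodes, and the oldest-first eviction order are assumptions for illustration; the eviction policy of the real protein tree cache is not described on this page.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in, not part of SequenceFactory or ProteinTree.
class NodeCacheSketch {

    // LinkedHashMap keeps insertion order, so the oldest entries come first.
    private final Map<String, String> nodes = new LinkedHashMap<>();

    void put(String key, String node) {
        nodes.put(key, node);
    }

    boolean contains(String key) {
        return nodes.containsKey(key);
    }

    int getNodesInCache() {
        return nodes.size();
    }

    // Removes the given share of the cached nodes, oldest insertions first.
    // A share of 0.5 removes half of the entries (rounded down).
    void reduceNodeCacheSize(double share) {
        if (share < 0.0 || share > 1.0) {
            throw new IllegalArgumentException("share must be between 0 and 1");
        }
        int toRemove = (int) Math.floor(share * nodes.size());
        Iterator<String> it = nodes.keySet().iterator();
        for (int i = 0; i < toRemove && it.hasNext(); i++) {
            it.next();
            it.remove();
        }
    }

    public static void main(String[] args) {
        NodeCacheSketch cache = new NodeCacheSketch();
        for (int i = 0; i < 10; i++) {
            cache.put("node" + i, "data" + i);
        }
        cache.reduceNodeCacheSize(0.5);
        System.out.println(cache.getNodesInCache());
    }
}
```
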
public Protein getProtein(String accession) throws IOException, IllegalArgumentException, InterruptedException, FileNotFoundException
accession - accession of the desired protein
IOException - thrown whenever an error is encountered while reading the FASTA file
IllegalArgumentException - thrown whenever an error is encountered while reading the FASTA file
InterruptedException - if an InterruptedException occurs
FileNotFoundException - if a FileNotFoundException occurs
public Protein getDecoyProteinFromTargetSynchronized(String accession, boolean reindex) throws IOException, IllegalArgumentException, FileNotFoundException
accession - the accession of the decoy protein to look for
reindex - a boolean indicating whether the database should be re-indexed in case the protein is not found
IOException - if an IOException occurs
IllegalArgumentException - if an IllegalArgumentException occurs
FileNotFoundException - if a FileNotFoundException occurs
public Protein getDecoyProteinFromTarget(String accession, boolean reindex) throws IOException, IllegalArgumentException, FileNotFoundException
accession - the accession of the decoy protein to look for
reindex - a boolean indicating whether the database should be re-indexed in case the protein is not found
IOException - if an IOException occurs
IllegalArgumentException - if an IllegalArgumentException occurs
FileNotFoundException - if a FileNotFoundException occurs
public Header getHeader(String accession) throws IOException, InterruptedException
accession - accession of the desired protein
IOException - exception thrown whenever an error occurred while reading the FASTA file
InterruptedException - exception thrown whenever an error occurred while waiting for the connection to the FASTA file to recover
public void loadFastaFile(File fastaFile) throws IOException, ClassNotFoundException, StringIndexOutOfBoundsException
fastaFile - the FASTA file to load
IOException - exception thrown if an error occurred while reading the FASTA file
ClassNotFoundException - exception thrown whenever an error occurred while deserializing the file index
StringIndexOutOfBoundsException - thrown if issues occur during the parsing of the protein headers
public void loadFastaFile(File fastaFile, WaitingHandler waitingHandler) throws IOException, ClassNotFoundException, StringIndexOutOfBoundsException
fastaFile - the FASTA file to load
waitingHandler - a waitingHandler showing the progress
IOException - exception thrown if an error occurred while reading the FASTA file
ClassNotFoundException - exception thrown whenever an error occurred while deserializing the file index
StringIndexOutOfBoundsException - thrown if issues occur during the parsing of the protein headers
public boolean isClosed()
public void resetConnection() throws IOException
IOException - if an IOException occurs
public static FastaIndex getFastaIndex(File fastaFile, boolean overwrite, WaitingHandler waitingHandler) throws IOException, StringIndexOutOfBoundsException
fastaFile - the FASTA file to index
overwrite - boolean indicating whether the index .cui file shall be overwritten if present, even if the file has not been changed
waitingHandler - a waitingHandler showing the progress
IOException - exception thrown if an error occurred while reading the FASTA file
StringIndexOutOfBoundsException - thrown if issues occur during the parsing of the protein headers
IllegalArgumentException - if non unique accession numbers are found
public static void writeIndex(FastaIndex fastaIndex, File directory) throws IOException
fastaIndex - the index of the FASTA file
directory - the directory where to write the file
IOException - exception thrown whenever an error occurred while writing the file
public static String getIndexName(String fastaName)
fastaName - the name of the FASTA file
public void saveIndex() throws IOException
IOException - if an IOException occurs
public void closeFile() throws IOException, SQLException
IOException - exception thrown whenever an error occurred while closing the file
SQLException - if an SQLException occurs
public static boolean isDecoy(String proteinAccession, String decoyFlag)
proteinAccession - the accession of the protein
decoyFlag - the decoy flag
public boolean isDecoyAccession(String proteinAccession)
proteinAccession - the protein accession of interest
public boolean concatenatedTargetDecoy()
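
The static isDecoy(String, String) method decides from the accession and a flag alone. A minimal sketch of such a check follows, assuming the flag appears as a prefix or suffix of the accession. `DecoyFlagSketch` is a hypothetical class, and the matching rule is an assumption; the library may use a different pattern.

```java
// Hypothetical stand-in, not part of SequenceFactory.
class DecoyFlagSketch {

    // Assumed rule: an accession is a decoy when it starts or ends with the flag.
    static boolean isDecoy(String proteinAccession, String decoyFlag) {
        if (decoyFlag == null || decoyFlag.isEmpty()) {
            return false; // without a flag nothing can be recognized as decoy
        }
        return proteinAccession.startsWith(decoyFlag)
                || proteinAccession.endsWith(decoyFlag);
    }

    public static void main(String[] args) {
        System.out.println(isDecoy("P12345_REVERSED", "_REVERSED"));
        System.out.println(isDecoy("P12345", "_REVERSED"));
    }
}
```
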
public boolean isDefaultReversed()
public int getNTargetSequences()
public int getNSequences()
public void appendDecoySequences(File destinationFile) throws IOException, InterruptedException, ClassNotFoundException
destinationFile - the destination file
IOException - exception thrown whenever an error occurred while reading or writing a file
InterruptedException - if an InterruptedException occurs
ClassNotFoundException - if a ClassNotFoundException occurs
public void appendDecoySequences(File destinationFile, WaitingHandler waitingHandler) throws IOException, InterruptedException, ClassNotFoundException
destinationFile - the destination file
waitingHandler - the waiting handler
IOException - exception thrown whenever an error occurred while reading or writing a file
InterruptedException - if an InterruptedException occurs
ClassNotFoundException - if a ClassNotFoundException occurs
public static String reverseSequence(String sequence)
sequence - the protein sequence
public Set<String> getAccessions()
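
For reference, reversing a protein sequence as reverseSequence(String) describes can be sketched in a single call, since amino acid codes are single ASCII letters. `ReverseSequenceSketch` is a hypothetical class written for this example, not the library's implementation.

```java
// Hypothetical stand-in, not part of SequenceFactory.
class ReverseSequenceSketch {

    // Returns the residues of the given sequence in reverse order.
    static String reverseSequence(String sequence) {
        return new StringBuilder(sequence).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(reverseSequence("MKVLA")); // prints ALVKM
    }
}
```
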
public int getnCache()
public void setnCache(int nCache)
nCache - the new size of the cache
public HashMap<String,Integer> getAAOccurrences(JProgressBar progressBar) throws IOException, InterruptedException, ClassNotFoundException
progressBar - a progress bar, can be null
IOException - exception thrown whenever an error occurred while reading the database
InterruptedException - if an InterruptedException occurs
ClassNotFoundException - if a ClassNotFoundException occurs
public double computeMolecularWeight(String accession) throws IOException, InterruptedException, ClassNotFoundException
accession - the protein's accession number
IOException - exception thrown whenever an error occurred while reading the protein sequence
InterruptedException - exception thrown whenever an error occurred while reading the protein sequence
ClassNotFoundException - exception thrown whenever an error occurred while reading the protein sequence
public String getFileName()
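
The map returned by getAAOccurrences(JProgressBar) holds one count per amino acid. The sketch below shows how such a tally can be built from plain sequence strings. `AminoAcidCountSketch` and its list-based input are assumptions for illustration; the real method reads the loaded database and can report progress.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;

// Hypothetical stand-in, not part of SequenceFactory.
class AminoAcidCountSketch {

    // Counts how often each single-letter amino acid code occurs
    // across all given sequences.
    static HashMap<String, Integer> getAAOccurrences(List<String> sequences) {
        HashMap<String, Integer> occurrences = new HashMap<>();
        for (String sequence : sequences) {
            for (int i = 0; i < sequence.length(); i++) {
                String aa = String.valueOf(sequence.charAt(i));
                occurrences.merge(aa, 1, Integer::sum); // insert 1 or add 1
            }
        }
        return occurrences;
    }

    public static void main(String[] args) {
        System.out.println(getAAOccurrences(Arrays.asList("MKK", "KA")));
    }
}
```
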
public File getCurrentFastaFile()
public static String getDefaultDecoyAccessionSuffix()
public static String getDefaultDecoyAccession(String targetAccession)
targetAccession - the target accession
public static String getDefaultDecoyDescription(String targetDescription)
targetDescription - the description of a target protein
public static String getDefaultTargetAccession(String decoyAccession)
decoyAccession - the decoy accession
public FastaIndex getCurrentFastaIndex()
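
getDefaultDecoyAccession and getDefaultTargetAccession read as inverse operations built around the default decoy suffix. The sketch below shows such a round trip. `DecoyNamingSketch` is hypothetical, and the suffix value `_REVERSED` is an assumption chosen for the example; the value returned by getDefaultDecoyAccessionSuffix() is not given on this page.

```java
// Hypothetical stand-in, not part of SequenceFactory.
class DecoyNamingSketch {

    // Assumed suffix; the library's actual default may differ.
    static final String DECOY_SUFFIX = "_REVERSED";

    // Builds a decoy accession by appending the suffix to the target accession.
    static String getDefaultDecoyAccession(String targetAccession) {
        return targetAccession + DECOY_SUFFIX;
    }

    // Recovers the target accession by stripping the suffix again.
    static String getDefaultTargetAccession(String decoyAccession) {
        if (!decoyAccession.endsWith(DECOY_SUFFIX)) {
            throw new IllegalArgumentException("Not a default decoy accession: " + decoyAccession);
        }
        return decoyAccession.substring(0, decoyAccession.length() - DECOY_SUFFIX.length());
    }

    public static void main(String[] args) {
        String decoy = getDefaultDecoyAccession("P12345");
        System.out.println(decoy);
        System.out.println(getDefaultTargetAccession(decoy));
    }
}
```
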
public ProteinTree getDefaultProteinTree()
public ProteinTree getDefaultProteinTree(WaitingHandler waitingHandler, ExceptionHandler exceptionHandler) throws IOException, InterruptedException, ClassNotFoundException, SQLException
waitingHandler - waiting handler displaying progress to the user during the initiation of the tree
exceptionHandler - handler for the exceptions encountered while creating the tree
IOException - exception thrown whenever an error occurs while reading or writing a file
ClassNotFoundException - exception thrown whenever an error occurs while deserializing an object
InterruptedException - exception thrown whenever a threading issue occurred while interacting with the tree
SQLException - exception thrown whenever a problem occurred while interacting with the tree database
public ProteinTree getDefaultProteinTree(int nThreads, WaitingHandler waitingHandler, ExceptionHandler exceptionHandler) throws IOException, InterruptedException, ClassNotFoundException, SQLException
nThreads - the number of threads to use
waitingHandler - waiting handler displaying progress to the user during the initiation of the tree
exceptionHandler - handler for the exceptions encountered while creating the tree
IOException - exception thrown whenever an error occurs while reading or writing a file
ClassNotFoundException - exception thrown whenever an error occurs while deserializing an object
InterruptedException - exception thrown whenever a threading issue occurred while interacting with the tree
SQLException - exception thrown whenever a problem occurred while interacting with the tree database
public ProteinTree getDefaultProteinTree(int nThreads, WaitingHandler waitingHandler, ExceptionHandler exceptionHandler, boolean displayProgress) throws IOException, InterruptedException, ClassNotFoundException, SQLException
nThreads - the number of threads to use
waitingHandler - waiting handler displaying progress to the user during the initiation of the tree
exceptionHandler - handler for the exceptions encountered while creating the tree
displayProgress - display progress
IOException - exception thrown whenever an error occurs while reading or writing a file
ClassNotFoundException - exception thrown whenever an error occurs while deserializing an object
InterruptedException - exception thrown whenever a threading issue occurred while interacting with the tree
SQLException - exception thrown whenever a problem occurred while interacting with the tree database
public boolean deleteProteinTree(ExceptionHandler exceptionHandler)
exceptionHandler - handler for the exceptions encountered while creating the tree
public SequenceFactory.HeaderIterator getHeaderIterator(boolean targetOnly) throws FileNotFoundException
targetOnly - boolean indicating whether only target accessions shall be iterated
FileNotFoundException - if a FileNotFoundException occurs
public SequenceFactory.ProteinIterator getProteinIterator(boolean targetOnly) throws FileNotFoundException
targetOnly - boolean indicating whether only target accessions shall be iterated
FileNotFoundException - if a FileNotFoundException occurs
public boolean isDecoyInMemory()
public void setDecoyInMemory(boolean decoyInMemory)
decoyInMemory - true if decoys should be kept in memory
Copyright © 2016. All rights reserved.