Class ProteinGeneDetailsProvider
java.lang.Object
com.compomics.util.experiment.biology.genes.ProteinGeneDetailsProvider
Class used to map proteins to gene information.
- Author:
- Marc Vaudel, Harald Barsnes
-
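Field Summary
Fields
- SEPARATOR
  The separator used to separate line contents.
- GENE_MAPPING_FILE_SUFFIX
  The suffix to use for files containing gene mappings.
- GO_MAPPING_FILE_SUFFIX
  The suffix to use for files containing GO mappings.
Constructor Summary
Constructors
- ProteinGeneDetailsProvider()
  Constructor.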
Method Summary
Modifier and Type, Method, Description:
- void createDefaultGeneMappingFilesGeneric(File configFolder, File sourceEnsemblVersionsFile, File sourceGoDomainsFile, boolean updateEqualVersion)
  Copies the given gene mapping files to the gene mappings folder.
- boolean downloadGeneMappings(String ensemblType, String ensemblSchemaName, String ensemblDatasetName, String ensemblVersion, WaitingHandler waitingHandler)
  Download the gene mappings.
- boolean downloadGeneSequences(File destinationFile, String ensemblType, String ensemblSchemaName, String ensemblDbName, WaitingHandler waitingHandler)
  Download the gene sequence mappings.
- boolean downloadGoMappings(String ensemblType, String ensemblSchemaName, String ensemblDbName, boolean swissProtMapping, WaitingHandler waitingHandler)
  Download the GO mappings.
- boolean downloadMappings(WaitingHandler waitingHandler, String speciesName, String ensemblDatasetName, EnsemblSpecies.EnsemblDivision ensemblDivision)
  Try to download the gene and GO mappings for the currently selected species.
- HashMap<String,String> getEnsemblSpeciesVersions(File ensemblVersionsFile)
  Gets the information contained in the Ensembl species file.
- getEnsemblVersion(String latinName)
  Returns the Ensembl version for a given species.
- Integer getEnsemblVersionFromFile(File ensemblVersionsFile, String species)
  Gets the Ensembl version of a given species from a file.
- static File getEnsemblVersionsFile()
  Returns the Ensembl version file.
- static File getGeneMappingFile(String ensemblDatasetName)
  Returns the gene mapping file.
- static File getGeneMappingFolder()
  Returns the path to the folder containing the gene mapping files.
- GeneMaps getGeneMaps(GeneParameters genePreferences, FastaSummary fastaSummary, SequenceProvider sequenceProvider, ProteinDetailsProvider proteinDetailsProvider, WaitingHandler waitingHandler)
  Returns the gene maps for the given proteins.
- static File getGoDomainsFile()
  Returns the GO domains file.
- static File getGoMappingFile(String ensemblDatasetName)
  Returns the GO mapping file.
- void initialize(File configFolder)
  Initializes the factory.
- void loadEnsemblSpeciesVersions(File ensemblVersionsFile)
  Loads the given Ensembl species file.
- boolean newVersionExists(String latinName)
  Returns true if a newer version of the species mapping exists in Ensembl.
- boolean queryEnsembl(String requestXml, File destinationFile, String ensemblType)
  Sends an XML query to Ensembl and writes the result in a text file.
- boolean queryEnsembl(String requestXml, File destinationFile, String ensemblType, WaitingHandler waitingHandler)
  Sends an XML query to Ensembl and writes the result in a text file.
- boolean queryEnsembl(String requestXml, String waitingText, File destinationFile, String ensemblType, WaitingHandler waitingHandler)
  Sends an XML query to Ensembl and writes the result in a text file.
- static void setGeneMappingFolder(String geneMappingFolder)
  Sets the folder where gene mappings are saved.
- void updateEnsemblVersion(String ensemblDatasetName, String ensemblVersion)
  Update the Ensembl version for the given species in the local map and in the Ensembl versions file.
-
Field Details
-
SEPARATOR
The separator used to separate line contents.
-
GENE_MAPPING_FILE_SUFFIX
The suffix to use for files containing gene mappings.
-
GO_MAPPING_FILE_SUFFIX
The suffix to use for files containing GO mappings.
-
-
Constructor Details
-
ProteinGeneDetailsProvider
public ProteinGeneDetailsProvider()
Constructor.
-
-
Method Details
-
initialize
public void initialize(File configFolder) throws IOException
Initializes the factory. Note: the species factory must be initialized first.
- Parameters:
configFolder - the config folder
- Throws:
IOException - if an error occurs while reading the species mapping
-
getGeneMaps
public GeneMaps getGeneMaps(GeneParameters genePreferences, FastaSummary fastaSummary, SequenceProvider sequenceProvider, ProteinDetailsProvider proteinDetailsProvider, WaitingHandler waitingHandler)
Returns the gene maps for the given proteins. For every protein, the species and the gene name must be given in the format used in a UniProt FASTA file.
- Parameters:
genePreferences - the gene preferences
fastaSummary - summary information on the FASTA file containing the proteins
sequenceProvider - the protein sequence provider
proteinDetailsProvider - the protein details provider
waitingHandler - waiting handler displaying progress for the download and allowing the process to be canceled
- Returns:
- the gene maps for the FASTA file loaded in the factory
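To illustrate the kind of header information this method relies on, here is a minimal sketch that extracts the species and the gene name from a UniProt-style FASTA header. It assumes the standard UniProt `OS=` (organism) and `GN=` (gene name) fields; the class `UniProtHeaderSketch` and its methods are hypothetical and are not part of this library, whose actual parsing may differ.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of compomics-utilities.
final class UniProtHeaderSketch {

    // OS= runs until the next two-letter "XX=" field or the end of the header.
    private static final Pattern SPECIES = Pattern.compile("OS=(.+?)(?=\\s[A-Z]{2}=|$)");
    // GN= holds a single token without whitespace.
    private static final Pattern GENE_NAME = Pattern.compile("GN=(\\S+)");

    static Optional<String> species(String header) {
        Matcher matcher = SPECIES.matcher(header);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    static Optional<String> geneName(String header) {
        Matcher matcher = GENE_NAME.matcher(header);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String header = ">sp|P04637|P53_HUMAN Cellular tumor antigen p53 "
                + "OS=Homo sapiens OX=9606 GN=TP53 PE=1 SV=4";
        System.out.println(species(header).orElse("unknown"));
        System.out.println(geneName(header).orElse("unknown"));
    }
}
```

Headers without a `GN=` field yield an empty result, so a caller would need to decide how to treat proteins with no gene annotation.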
-
downloadGeneSequences
public boolean downloadGeneSequences(File destinationFile, String ensemblType, String ensemblSchemaName, String ensemblDbName, WaitingHandler waitingHandler) throws MalformedURLException, IOException
Download the gene sequence mappings.
- Parameters:
destinationFile - the destination file where the gene sequences are saved
ensemblType - the Ensembl type, e.g., default or plants
ensemblSchemaName - the Ensembl schema name, e.g., default or plants_mart_18
ensemblDbName - the Ensembl DB name of the selected species
waitingHandler - waiting handler displaying progress and allowing the process to be canceled
- Returns:
- true if downloading went OK
- Throws:
MalformedURLException - if a MalformedURLException occurs
IOException - if an IOException occurs
-
downloadGoMappings
public boolean downloadGoMappings(String ensemblType, String ensemblSchemaName, String ensemblDbName, boolean swissProtMapping, WaitingHandler waitingHandler) throws MalformedURLException, IOException
Download the GO mappings.
- Parameters:
ensemblType - the Ensembl type, e.g., default or plants
ensemblSchemaName - the Ensembl schema name, e.g., default or plants_mart
ensemblDbName - the Ensembl DB name of the selected species
swissProtMapping - if true, use the uniprotswissprot_accession parameter; if false, use the uniprotsptrembl parameter
waitingHandler - waiting handler displaying progress and allowing the process to be canceled
- Returns:
- true if downloading went OK
- Throws:
MalformedURLException - if a MalformedURLException occurs
IOException - if an IOException occurs
-
queryEnsembl
public boolean queryEnsembl(String requestXml, File destinationFile, String ensemblType) throws MalformedURLException, IOException
Sends an XML query to Ensembl and writes the result in a text file.
- Parameters:
requestXml - the XML request
destinationFile - the file where the results are saved
ensemblType - the Ensembl type, e.g., default or plants
- Returns:
- true if downloading went OK
- Throws:
MalformedURLException - if a MalformedURLException occurs
IOException - if an IOException occurs
-
queryEnsembl
public boolean queryEnsembl(String requestXml, File destinationFile, String ensemblType, WaitingHandler waitingHandler) throws MalformedURLException, IOException
Sends an XML query to Ensembl and writes the result in a text file.
- Parameters:
requestXml - the XML request
destinationFile - the file where the results are saved
ensemblType - the Ensembl type, e.g., default or plants
waitingHandler - waiting handler displaying progress and allowing the process to be canceled
- Returns:
- true if downloading went OK
- Throws:
MalformedURLException - if a MalformedURLException occurs
IOException - if an IOException occurs
-
queryEnsembl
public boolean queryEnsembl(String requestXml, String waitingText, File destinationFile, String ensemblType, WaitingHandler waitingHandler) throws MalformedURLException, IOException
Sends an XML query to Ensembl and writes the result in a text file.
- Parameters:
requestXml - the XML request
waitingText - the text to display if a progress dialog is used
destinationFile - the file where the results are saved
ensemblType - the Ensembl type, e.g., default or plants
waitingHandler - waiting handler displaying progress and allowing the process to be canceled
- Returns:
- true if downloading went OK
- Throws:
MalformedURLException - if a MalformedURLException occurs
IOException - if an IOException occurs
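The requestXml argument is an XML query string. As a rough sketch of what such a request could look like, the following builds a query in the BioMart XML query format. The class `BioMartQuerySketch`, the fixed query options, and the attribute names in the usage example are illustrative assumptions; this page does not show the exact XML that the library sends.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of compomics-utilities.
final class BioMartQuerySketch {

    /**
     * Builds a BioMart-style XML query for the given schema, dataset and attributes.
     * Inputs are assumed to be plain identifiers, so no XML escaping is applied.
     */
    static String buildQuery(String schemaName, String datasetName, List<String> attributes) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>")
           .append("<!DOCTYPE Query>")
           .append("<Query virtualSchemaName=\"").append(schemaName)
           .append("\" formatter=\"TSV\" header=\"0\" uniqueRows=\"1\" count=\"\"")
           .append(" datasetConfigVersion=\"0.6\">")
           .append("<Dataset name=\"").append(datasetName).append("\" interface=\"default\">");
        for (String attribute : attributes) {
            xml.append("<Attribute name=\"").append(attribute).append("\"/>");
        }
        xml.append("</Dataset></Query>");
        return xml.toString();
    }

    public static void main(String[] args) {
        // Attribute names are examples only.
        String requestXml = buildQuery("default", "hsapiens_gene_ensembl",
                Arrays.asList("ensembl_gene_id", "external_gene_name"));
        System.out.println(requestXml);
    }
}
```

A string built this way could then be passed as requestXml to one of the queryEnsembl overloads, together with a destination file and the Ensembl type.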
-
downloadGeneMappings
public boolean downloadGeneMappings(String ensemblType, String ensemblSchemaName, String ensemblDatasetName, String ensemblVersion, WaitingHandler waitingHandler) throws MalformedURLException, IOException, IllegalArgumentException
Download the gene mappings.
- Parameters:
ensemblType - the Ensembl type, e.g., default or plants
ensemblSchemaName - the Ensembl schema name, e.g., default or plants_mart
ensemblDatasetName - the Ensembl dataset name of the selected species
ensemblVersion - the Ensembl version
waitingHandler - the waiting handler
- Returns:
- true if downloading went OK
- Throws:
MalformedURLException - if a MalformedURLException occurs
IOException - if an IOException occurs
IllegalArgumentException - if an IllegalArgumentException occurs
-
getGeneMappingFolder
public static File getGeneMappingFolder()
Returns the path to the folder containing the gene mapping files.
- Returns:
- the gene mapping folder
-
setGeneMappingFolder
public static void setGeneMappingFolder(String geneMappingFolder)
Sets the folder where gene mappings are saved.
- Parameters:
geneMappingFolder - the folder where gene mappings are saved
-
createDefaultGeneMappingFilesGeneric
public void createDefaultGeneMappingFilesGeneric(File configFolder, File sourceEnsemblVersionsFile, File sourceGoDomainsFile, boolean updateEqualVersion)
Copies the given gene mapping files to the gene mappings folder. If newer versions of the mapping exist, they will be overwritten according to updateEqualVersion.
- Parameters:
configFolder - the config folder
sourceEnsemblVersionsFile - the Ensembl versions file
sourceGoDomainsFile - the GO domains file
updateEqualVersion - if true, the files are also updated when the version numbers are equal; if false, they are only updated if the new version is newer
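The effect of updateEqualVersion can be pictured as a small decision rule. The sketch below is an interpretation of the parameter description, not the library's code: the class `VersionUpdateSketch` is hypothetical, and treating a missing local version as "always copy" is an added assumption.

```java
// Hypothetical helper, not part of compomics-utilities.
final class VersionUpdateSketch {

    /**
     * Decides whether the local mapping should be replaced by the source mapping.
     *
     * @param localVersion the locally stored version, or null if none exists (assumption)
     * @param sourceVersion the version of the files being copied
     * @param updateEqualVersion if true, equal versions are also overwritten
     */
    static boolean shouldOverwrite(Integer localVersion, int sourceVersion, boolean updateEqualVersion) {
        if (localVersion == null) {
            return true; // nothing stored locally yet
        }
        if (sourceVersion > localVersion) {
            return true; // the source is strictly newer
        }
        return updateEqualVersion && sourceVersion == localVersion;
    }

    public static void main(String[] args) {
        System.out.println(shouldOverwrite(103, 104, false)); // true
        System.out.println(shouldOverwrite(104, 104, false)); // false
        System.out.println(shouldOverwrite(104, 104, true));  // true
        System.out.println(shouldOverwrite(105, 104, true));  // false
    }
}
```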
-
updateEnsemblVersion
public void updateEnsemblVersion(String ensemblDatasetName, String ensemblVersion) throws IOException
Update the Ensembl version for the given species in the local map and in the Ensembl versions file.
- Parameters:
ensemblDatasetName - the dataset name of the species to update, e.g., hsapiens_gene_ensembl
ensemblVersion - the new Ensembl version
- Throws:
IOException - if an IOException occurs
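One way to keep an in-memory map and a versions file in sync is to update the map first and then rewrite the file from it. The sketch below shows this pattern under loud assumptions: the class `EnsemblVersionUpdateSketch` is hypothetical, and the one-entry-per-line, tab-separated layout is a guess, since neither the file layout nor the SEPARATOR value appears on this page.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of compomics-utilities.
final class EnsemblVersionUpdateSketch {

    // Assumed separator; the real SEPARATOR value is not shown here.
    static final String SEPARATOR = "\t";

    /** Serializes the map as one "dataset, separator, version" line per entry, sorted by dataset name. */
    static List<String> toLines(Map<String, String> versions) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> entry : new TreeMap<>(versions).entrySet()) {
            lines.add(entry.getKey() + SEPARATOR + entry.getValue());
        }
        return lines;
    }

    /** Updates the in-memory map, then rewrites the whole versions file from it. */
    static void updateVersion(Map<String, String> versions, Path versionsFile,
                              String datasetName, String version) throws IOException {
        versions.put(datasetName, version);
        Files.write(versionsFile, toLines(versions), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> versions = new TreeMap<>();
        versions.put("hsapiens_gene_ensembl", "103"); // illustrative version numbers
        Path file = Files.createTempFile("ensembl_versions", ".txt");
        updateVersion(versions, file, "hsapiens_gene_ensembl", "104");
        System.out.println(Files.readAllLines(file, StandardCharsets.UTF_8));
        Files.delete(file);
    }
}
```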
-
getEnsemblVersionFromFile
public Integer getEnsemblVersionFromFile(File ensemblVersionsFile, String species) throws IOException
Gets the Ensembl version of a given species from a file.
- Parameters:
ensemblVersionsFile - the Ensembl versions file
species - the species of interest
- Returns:
- the Ensembl version
- Throws:
IOException - if an error occurs while reading the file
-
getEnsemblSpeciesVersions
public HashMap<String,String> getEnsemblSpeciesVersions(File ensemblVersionsFile) throws FileNotFoundException, IOException
Gets the information contained in the Ensembl species file.
- Parameters:
ensemblVersionsFile - the Ensembl species file to read
- Returns:
- the Ensembl versions for each species
- Throws:
FileNotFoundException - if a FileNotFoundException occurs
IOException - if an IOException occurs
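For orientation, reading a species-to-version file into a HashMap<String,String> might look like the sketch below. The file layout is not documented on this page, so the sketch assumes one entry per line with the key and the version separated by a tab; the class `EnsemblVersionsFileSketch` and its methods are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;

// Hypothetical helper, not part of compomics-utilities.
final class EnsemblVersionsFileSketch {

    // Assumed separator; the real SEPARATOR value is not shown here.
    static final String SEPARATOR = "\t";

    /** Parses "key, separator, version" lines, skipping blank lines. */
    static HashMap<String, String> parseVersions(List<String> lines) {
        HashMap<String, String> versions = new HashMap<>();
        for (String rawLine : lines) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] parts = line.split(SEPARATOR);
            if (parts.length < 2) {
                throw new IllegalArgumentException("Unexpected line: " + line);
            }
            versions.put(parts[0], parts[1]);
        }
        return versions;
    }

    /** Reads the whole file and delegates to parseVersions. */
    static HashMap<String, String> readVersions(Path file) throws IOException {
        return parseVersions(Files.readAllLines(file, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("ensembl_versions", ".txt");
        // Illustrative content only.
        Files.write(file, Arrays.asList("hsapiens_gene_ensembl\t104", "", "mmusculus_gene_ensembl\t104"),
                StandardCharsets.UTF_8);
        System.out.println(readVersions(file).get("hsapiens_gene_ensembl"));
        Files.delete(file);
    }
}
```

Keeping the parsing separate from the file access makes the line format easy to check without touching the file system.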
-
loadEnsemblSpeciesVersions
public void loadEnsemblSpeciesVersions(File ensemblVersionsFile) throws IOException
Loads the given Ensembl species file.
- Parameters:
ensemblVersionsFile - the Ensembl species file to load
- Throws:
IOException - if an IOException occurs
-
downloadMappings
public boolean downloadMappings(WaitingHandler waitingHandler, String speciesName, String ensemblDatasetName, EnsemblSpecies.EnsemblDivision ensemblDivision) throws IOException
Try to download the gene and GO mappings for the currently selected species.
- Parameters:
waitingHandler - the waiting handler
speciesName - the name of the species
ensemblDatasetName - the Ensembl dataset name
ensemblDivision - the Ensembl division
- Returns:
- true if the download was successful
- Throws:
IOException - if an error occurs while reading the mapping files
-
getGeneMappingFile
public static File getGeneMappingFile(String ensemblDatasetName)
Returns the gene mapping file.
- Parameters:
ensemblDatasetName - the Ensembl dataset name
- Returns:
- the gene mapping file
-
getGoMappingFile
public static File getGoMappingFile(String ensemblDatasetName)
Returns the GO mapping file.
- Parameters:
ensemblDatasetName - the Ensembl dataset name
- Returns:
- the GO mapping file
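The gene and GO mapping files are both looked up by Ensembl dataset name, and the class defines one file suffix for each. A plausible naming scheme is sketched below. It assumes that the file name is the dataset name followed by the suffix; the class `MappingFileNamesSketch`, the suffix values, and the folder name are all hypothetical, since the actual constant values are not shown on this page.

```java
import java.io.File;

// Hypothetical helper, not part of compomics-utilities.
final class MappingFileNamesSketch {

    // Placeholder values; the real constants are not shown here.
    static final String GENE_MAPPING_FILE_SUFFIX = "_gene_mappings";
    static final String GO_MAPPING_FILE_SUFFIX = "_go_mappings";

    /** Resolves the gene mapping file for a dataset inside the given folder. */
    static File geneMappingFile(File folder, String ensemblDatasetName) {
        return new File(folder, ensemblDatasetName + GENE_MAPPING_FILE_SUFFIX);
    }

    /** Resolves the GO mapping file for a dataset inside the given folder. */
    static File goMappingFile(File folder, String ensemblDatasetName) {
        return new File(folder, ensemblDatasetName + GO_MAPPING_FILE_SUFFIX);
    }

    public static void main(String[] args) {
        File folder = new File("gene_mappings"); // illustrative folder name
        System.out.println(geneMappingFile(folder, "hsapiens_gene_ensembl").getName());
        System.out.println(goMappingFile(folder, "hsapiens_gene_ensembl").getName());
    }
}
```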
-
getEnsemblVersionsFile
public static File getEnsemblVersionsFile()
Returns the Ensembl version file.
- Returns:
- the Ensembl version file
-
getGoDomainsFile
public static File getGoDomainsFile()
Returns the GO domains file.
- Returns:
- the GO domains file
-
getEnsemblVersion
Returns the Ensembl version for a given species.
- Parameters:
latinName - the Latin name of the species
- Returns:
- the Ensembl version for the given species
-
newVersionExists
public boolean newVersionExists(String latinName)
Returns true if a newer version of the species mapping exists in Ensembl.
- Parameters:
latinName - the Latin name of the species
- Returns:
- true if a newer version of the species mapping exists in Ensembl
-