Class ProteinGeneDetailsProvider

java.lang.Object
com.compomics.util.experiment.biology.genes.ProteinGeneDetailsProvider

public class ProteinGeneDetailsProvider
extends Object
Class used to map proteins to gene information.
Author:
Marc Vaudel, Harald Barsnes
  • Method Details

    • initialize

      public void initialize(String jarFilePath) throws IOException
      Initializes the factory. Note: the species factory must be initialized first.
      Parameters:
      jarFilePath - the path to the jar file
      Throws:
      IOException - Exception thrown if an error occurs while reading the species mapping
    • getGeneMaps

      public GeneMaps getGeneMaps(GeneParameters genePreferences, FastaSummary fastaSummary, SequenceProvider sequenceProvider, ProteinDetailsProvider proteinDetailsProvider, WaitingHandler waitingHandler)
      Returns the gene maps for the given proteins. For every protein, the species must be given as well as the gene name, in the format used in a UniProt FASTA file.
      Parameters:
      genePreferences - the gene preferences
      fastaSummary - summary information on the FASTA file containing the proteins
      sequenceProvider - the protein sequence provider
      proteinDetailsProvider - the protein details provider
      waitingHandler - waiting handler displaying progress for the download and allowing the process to be canceled
      Returns:
      the gene maps for the FASTA file loaded in the factory
    • downloadGeneSequences

      public boolean downloadGeneSequences(File destinationFile, String ensemblType, String ensemblSchemaName, String ensemblDbName, WaitingHandler waitingHandler) throws MalformedURLException, IOException
      Downloads the gene sequence mappings.
      Parameters:
      destinationFile - the destination file where to save the gene sequences
      ensemblType - the Ensembl type, e.g., default or plants
      ensemblSchemaName - the Ensembl schema name, e.g., default or plants_mart_18
      ensemblDbName - the Ensembl DB name of the selected species
      waitingHandler - waiting handler displaying progress and allowing the process to be canceled
      Returns:
      true if downloading went OK
      Throws:
      MalformedURLException - if a MalformedURLException occurs
      IOException - if an IOException occurs
    • downloadGoMappings

      public boolean downloadGoMappings(String ensemblType, String ensemblSchemaName, String ensemblDbName, boolean swissProtMapping, WaitingHandler waitingHandler) throws MalformedURLException, IOException
      Downloads the GO mappings.
      Parameters:
      ensemblType - the Ensembl type, e.g., default or plants
      ensemblSchemaName - the Ensembl schema name, e.g., default or plants_mart_18
      ensemblDbName - the Ensembl DB name of the selected species
      swissProtMapping - if true, use the uniprotswissprot_accession parameter; if false, use the uniprotsptrembl parameter
      waitingHandler - waiting handler displaying progress and allowing the process to be canceled
      Returns:
      true if downloading went OK
      Throws:
      MalformedURLException - if a MalformedURLException occurs
      IOException - if an IOException occurs
    • queryEnsembl

      public boolean queryEnsembl(String requestXml, File destinationFile, String ensemblType) throws MalformedURLException, IOException
      Sends an XML query to Ensembl and writes the result in a text file.
      Parameters:
      requestXml - the XML request
      destinationFile - the file where to save the results
      ensemblType - the Ensembl type, e.g., default or plants
      Returns:
      true if downloading went OK
      Throws:
      MalformedURLException - if a MalformedURLException occurs
      IOException - if an IOException occurs
    • queryEnsembl

      public boolean queryEnsembl(String requestXml, File destinationFile, String ensemblType, WaitingHandler waitingHandler) throws MalformedURLException, IOException
      Sends an XML query to Ensembl and writes the result in a text file.
      Parameters:
      requestXml - the XML request
      destinationFile - the file where to save the results
      ensemblType - the Ensembl type, e.g., default or plants
      waitingHandler - waiting handler displaying progress and allowing the process to be canceled
      Returns:
      true if downloading went OK
      Throws:
      MalformedURLException - if a MalformedURLException occurs
      IOException - if an IOException occurs
    • queryEnsembl

      public boolean queryEnsembl(String requestXml, String waitingText, File destinationFile, String ensemblType, WaitingHandler waitingHandler) throws MalformedURLException, IOException
      Sends an XML query to Ensembl and writes the result in a text file.
      Parameters:
      requestXml - the XML request
      waitingText - the text to write in case a progress dialog is used
      destinationFile - the file where to save the results
      ensemblType - the Ensembl type, e.g., default or plants
      waitingHandler - waiting handler displaying progress and allowing the process to be canceled
      Returns:
      true if downloading went OK
      Throws:
      MalformedURLException - if a MalformedURLException occurs
      IOException - if an IOException occurs
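The content of `requestXml` is not specified in this documentation. As a rough illustration only, the sketch below assembles a request in the XML query format commonly accepted by BioMart services. The class `BiomartQuerySketch`, its method name, and the attribute names used in the usage note are assumptions made for this example and are not part of `ProteinGeneDetailsProvider`; the requests the library actually builds may differ.

```java
import java.util.List;

/** Hypothetical helper for illustration; not part of the documented class. */
final class BiomartQuerySketch {

    /**
     * Builds a BioMart-style XML query string.
     * Inputs are assumed to be plain identifiers, so no XML escaping is done.
     *
     * @param schemaName  the virtual schema name, e.g., "default"
     * @param datasetName the dataset name, e.g., "hsapiens_gene_ensembl"
     * @param attributes  the attribute (column) names to request
     * @return the XML request as a single-line string
     */
    static String buildRequestXml(String schemaName, String datasetName, List<String> attributes) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
        xml.append("<!DOCTYPE Query>");
        // Tab-separated output, no header row, duplicate rows removed.
        xml.append("<Query virtualSchemaName=\"").append(schemaName)
           .append("\" formatter=\"TSV\" header=\"0\" uniqueRows=\"1\"")
           .append(" count=\"\" datasetConfigVersion=\"0.6\">");
        xml.append("<Dataset name=\"").append(datasetName).append("\" interface=\"default\">");
        // One Attribute element per requested column, in the given order.
        for (String attribute : attributes) {
            xml.append("<Attribute name=\"").append(attribute).append("\"/>");
        }
        xml.append("</Dataset></Query>");
        return xml.toString();
    }
}
```

Under these assumptions, a call such as `BiomartQuerySketch.buildRequestXml("default", "hsapiens_gene_ensembl", Arrays.asList("ensembl_gene_id", "external_gene_name"))` would produce a string that could be passed as `requestXml` to any of the `queryEnsembl` overloads.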
    • downloadGeneMappings

      public void downloadGeneMappings(String ensemblType, String ensemblSchemaName, String ensemblDatasetName, String ensemblVersion, WaitingHandler waitingHandler) throws MalformedURLException, IOException, IllegalArgumentException
      Downloads the gene mappings.
      Parameters:
      ensemblType - the Ensembl type, e.g., default or plants
      ensemblSchemaName - the Ensembl schema name, e.g., default or plants_mart_18
      ensemblDatasetName - the Ensembl dataset name of the selected species
      ensemblVersion - the Ensembl version
      waitingHandler - the waiting handler
      Throws:
      MalformedURLException - if a MalformedURLException occurs
      IOException - if an IOException occurs
      IllegalArgumentException - if an IllegalArgumentException occurs
    • getGeneMappingFolder

      public static File getGeneMappingFolder()
      Returns the path to the folder containing the gene mapping files.
      Returns:
      the gene mapping folder
    • setGeneMappingFolder

      public static void setGeneMappingFolder(String geneMappingFolder)
      Sets the folder where gene mappings are saved.
      Parameters:
      geneMappingFolder - the folder where gene mappings are saved
    • createDefaultGeneMappingFilesGeneric

      public void createDefaultGeneMappingFilesGeneric(String jarFilePath, File sourceEnsemblVersionsFile, File sourceGoDomainsFile, boolean updateEqualVersion)
      Copies the given gene mapping files to the gene mappings folder. Existing mappings are overwritten when the source version is newer; whether mappings with an equal version number are also overwritten is controlled by updateEqualVersion.
      Parameters:
      jarFilePath - the path to the jar file
      sourceEnsemblVersionsFile - the Ensembl versions file
      sourceGoDomainsFile - the GO domains file
      updateEqualVersion - if true, mappings are also updated when the version numbers are equal; if false, they are updated only when the new version is newer
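The effect of `updateEqualVersion` can be sketched as a small decision function. This is an illustration of the rule described above, not the library's implementation: the class `MappingUpdateSketch`, the method `shouldCopy`, and the treatment of a missing installed version (copy unconditionally) are assumptions.

```java
/** Hypothetical helper for illustration; not part of the documented class. */
final class MappingUpdateSketch {

    /**
     * Decides whether a source mapping should replace the installed one.
     *
     * @param installedVersion   version in the gene mappings folder, or null if none (assumed case)
     * @param sourceVersion      version of the mapping being copied
     * @param updateEqualVersion if true, equal versions are also overwritten
     * @return true if the source mapping should be copied
     */
    static boolean shouldCopy(Integer installedVersion, int sourceVersion, boolean updateEqualVersion) {
        if (installedVersion == null) {
            return true; // nothing installed yet (assumption: always copy)
        }
        if (sourceVersion > installedVersion) {
            return true; // source is strictly newer
        }
        // Equal versions are overwritten only on request; older sources never are.
        return updateEqualVersion && sourceVersion == installedVersion;
    }
}
```

With this rule, an installed version equal to the source version is replaced only when `updateEqualVersion` is true, and a source older than the installed version is never copied.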
    • updateEnsemblVersion

      public void updateEnsemblVersion(String ensemblDatasetName, String ensemblVersion) throws IOException
      Updates the Ensembl version for the given species in the local map and in the Ensembl versions file.
      Parameters:
      ensemblDatasetName - the dataset name of the species to update, e.g., hsapiens_gene_ensembl
      ensemblVersion - the new Ensembl version
      Throws:
      IOException - if an IOException occurs
    • getEnsemblVersionFromFile

      public Integer getEnsemblVersionFromFile(File ensemblVersionsFile, String species) throws IOException
      Gets the Ensembl version of a given species from a file.
      Parameters:
      ensemblVersionsFile - the Ensembl versions file
      species - the species of interest
      Returns:
      the Ensembl version
      Throws:
      IOException - thrown whenever an error occurred while reading the file
    • getEnsemblSpeciesVersions

      public HashMap<String,String> getEnsemblSpeciesVersions(File ensemblVersionsFile) throws FileNotFoundException, IOException
      Gets the information contained in the Ensembl species file.
      Parameters:
      ensemblVersionsFile - the Ensembl species file to read
      Returns:
      the Ensembl version for each species
      Throws:
      FileNotFoundException - if a FileNotFoundException occurs
      IOException - if an IOException occurs
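The layout of the Ensembl versions file is not described here. The following sketch shows how such a file could be read into a species-to-version map, assuming one tab-separated `datasetName<TAB>version` pair per line. The file layout, the class `EnsemblVersionsSketch`, and its methods are assumptions for illustration; the actual file format used by the library may differ.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;

/** Hypothetical helper for illustration; not part of the documented class. */
final class EnsemblVersionsSketch {

    /**
     * Parses lines of the assumed form "datasetName<TAB>version".
     * Blank lines and lines without a tab-separated value are skipped.
     */
    static HashMap<String, String> parse(List<String> lines) {
        HashMap<String, String> versions = new HashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore blank lines
            }
            String[] parts = trimmed.split("\t");
            if (parts.length < 2) {
                continue; // ignore lines that do not hold a key and a value
            }
            versions.put(parts[0].trim(), parts[1].trim());
        }
        return versions;
    }

    /** Reads the file as UTF-8 and parses it with {@link #parse(List)}. */
    static HashMap<String, String> read(Path versionsFile) throws IOException {
        return parse(Files.readAllLines(versionsFile, StandardCharsets.UTF_8));
    }
}
```

Keeping the parsing separate from the file access makes the assumed format easy to check without touching the file system.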
    • loadEnsemblSpeciesVersions

      public void loadEnsemblSpeciesVersions(File ensemblVersionsFile) throws IOException
      Loads the given Ensembl species file.
      Parameters:
      ensemblVersionsFile - the Ensembl species file to load
      Throws:
      IOException - if an IOException occurs
    • downloadMappings

      public boolean downloadMappings(WaitingHandler waitingHandler, Integer taxon) throws IOException
      Tries to download the gene and GO mappings for the currently selected species.
      Parameters:
      waitingHandler - the waiting handler
      taxon - the NCBI taxon of the species
      Returns:
      true if the download was successful
      Throws:
      IOException - exception thrown whenever an error occurred while reading the mapping files
    • getGeneMappingFile

      public static File getGeneMappingFile(String ensemblDatasetName)
      Returns the gene mapping file.
      Parameters:
      ensemblDatasetName - the Ensembl dataset name
      Returns:
      the gene mapping file
    • getGoMappingFile

      public static File getGoMappingFile(String ensemblDatasetName)
      Returns the GO mapping file.
      Parameters:
      ensemblDatasetName - the Ensembl dataset name
      Returns:
      the GO mapping file
    • getEnsemblVersionsFile

      public static File getEnsemblVersionsFile()
      Returns the Ensembl versions file.
      Returns:
      the Ensembl versions file
    • getGoDomainsFile

      public static File getGoDomainsFile()
      Returns the GO domains file.
      Returns:
      the GO domains file
    • getEnsemblVersion

      public String getEnsemblVersion(Integer taxon)
      Returns the Ensembl version for a given species.
      Parameters:
      taxon - the NCBI taxon of the species
      Returns:
      the Ensembl version for the given species
    • newVersionExists

      public boolean newVersionExists(Integer taxon)
      Returns true if a newer version of the species mapping exists in Ensembl.
      Parameters:
      taxon - the NCBI taxon of the species
      Returns:
      true if a newer version of the species mapping exists in Ensembl