Class ProteinGeneDetailsProvider

java.lang.Object
com.compomics.util.experiment.biology.genes.ProteinGeneDetailsProvider

public class ProteinGeneDetailsProvider extends Object
Class used to map proteins to gene information.
Author:
Marc Vaudel, Harald Barsnes
  • Field Details

    • SEPARATOR

      public static final String SEPARATOR
      The separator used to separate line contents.
    • GENE_MAPPING_FILE_SUFFIX

      public static final String GENE_MAPPING_FILE_SUFFIX
      The suffix to use for files containing gene mappings.
    • GO_MAPPING_FILE_SUFFIX

      public static final String GO_MAPPING_FILE_SUFFIX
      The suffix to use for files containing GO mappings.
  • Constructor Details

    • ProteinGeneDetailsProvider

      public ProteinGeneDetailsProvider()
      Constructor.
  • Method Details

    • initialize

      public void initialize(File configFolder) throws IOException
      Initializes the factory. Note: the species factory must be initialized first.
      Parameters:
      configFolder - the config folder
      Throws:
      IOException - Exception thrown if an error occurs while reading the species mapping
    • getGeneMaps

      public GeneMaps getGeneMaps(GeneParameters genePreferences, FastaSummary fastaSummary, SequenceProvider sequenceProvider, ProteinDetailsProvider proteinDetailsProvider, WaitingHandler waitingHandler)
      Returns the gene maps for the given proteins. For every protein, the species and the gene name must be given in the format used in a UniProt FASTA file.
      Parameters:
      genePreferences - the gene preferences
      fastaSummary - summary information on the FASTA file containing the proteins
      sequenceProvider - the protein sequence provider
      proteinDetailsProvider - the protein details provider
      waitingHandler - waiting handler displaying progress for the download and allowing the process to be canceled
      Returns:
      the gene maps for the FASTA file loaded in the factory
    • downloadGeneSequences

      public boolean downloadGeneSequences(File destinationFile, String ensemblType, String ensemblSchemaName, String ensemblDbName, WaitingHandler waitingHandler) throws MalformedURLException, IOException
      Download the gene sequence mappings.
      Parameters:
      destinationFile - the destination file where to save the gene sequences
      ensemblType - the Ensembl type, e.g., default or plants
      ensemblSchemaName - the Ensembl schema name, e.g., default or plants_mart_18
      ensemblDbName - the Ensembl DB name of the selected species
      waitingHandler - waiting handler displaying progress and allowing the process to be canceled
      Returns:
      true if downloading went OK
      Throws:
      MalformedURLException - if a MalformedURLException occurs
      IOException - if an IOException occurs
    • downloadGoMappings

      public boolean downloadGoMappings(String ensemblType, String ensemblSchemaName, String ensemblDbName, boolean swissProtMapping, WaitingHandler waitingHandler) throws MalformedURLException, IOException
      Download the GO mappings.
      Parameters:
      ensemblType - the Ensembl type, e.g., default or plants
      ensemblSchemaName - the Ensembl schema name, e.g., default or plants_mart
      ensemblDbName - the Ensembl DB name of the selected species
      swissProtMapping - if true, use the uniprotswissprot_accession parameter; if false, use the uniprotsptrembl parameter
      waitingHandler - waiting handler displaying progress and allowing the process to be canceled
      Returns:
      true if downloading went OK
      Throws:
      MalformedURLException - if a MalformedURLException occurs
      IOException - if an IOException occurs
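      The swissProtMapping flag described above selects between two accession parameter names. A minimal sketch of that choice, assuming nothing about how the library builds its actual query; the class and method names here are hypothetical and only the two parameter names come from the documentation:

```java
// Hypothetical helper, not part of ProteinGeneDetailsProvider.
final class GoMappingParameterSketch {

    /**
     * Returns the accession parameter name for the requested mapping type,
     * using the two names given in the swissProtMapping description.
     */
    static String accessionParameter(boolean swissProtMapping) {
        return swissProtMapping ? "uniprotswissprot_accession" : "uniprotsptrembl";
    }

    public static void main(String[] args) {
        System.out.println(accessionParameter(true));
        System.out.println(accessionParameter(false));
    }
}
```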
    • queryEnsembl

      public boolean queryEnsembl(String requestXml, File destinationFile, String ensemblType) throws MalformedURLException, IOException
      Sends an XML query to Ensembl and writes the result in a text file.
      Parameters:
      requestXml - the XML request
      destinationFile - the file where to save the results
      ensemblType - the Ensembl type, e.g., default or plants
      Returns:
      true if downloading went OK
      Throws:
      MalformedURLException - if a MalformedURLException occurs
      IOException - if an IOException occurs
    • queryEnsembl

      public boolean queryEnsembl(String requestXml, File destinationFile, String ensemblType, WaitingHandler waitingHandler) throws MalformedURLException, IOException
      Sends an XML query to Ensembl and writes the result in a text file.
      Parameters:
      requestXml - the XML request
      destinationFile - the file where to save the results
      ensemblType - the Ensembl type, e.g., default or plants
      waitingHandler - waiting handler displaying progress and allowing the process to be canceled
      Returns:
      true if downloading went OK
      Throws:
      MalformedURLException - if a MalformedURLException occurs
      IOException - if an IOException occurs
    • queryEnsembl

      public boolean queryEnsembl(String requestXml, String waitingText, File destinationFile, String ensemblType, WaitingHandler waitingHandler) throws MalformedURLException, IOException
      Sends an XML query to Ensembl and writes the result in a text file.
      Parameters:
      requestXml - the XML request
      waitingText - the text to write in case a progress dialog is used
      destinationFile - the file where to save the results
      ensemblType - the Ensembl type, e.g., default or plants
      waitingHandler - waiting handler displaying progress and allowing the process to be canceled
      Returns:
      true if downloading went OK
      Throws:
      MalformedURLException - if a MalformedURLException occurs
      IOException - if an IOException occurs
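      The queryEnsembl overloads take the request as an XML string. How the library transmits it is not documented here; the sketch below assumes a BioMart-style service that expects the XML in a form-encoded `query` field, and shows only how such a request body could be prepared. The class and method names are hypothetical, and `URLEncoder.encode(String, Charset)` requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ProteinGeneDetailsProvider.
final class EnsemblRequestSketch {

    /**
     * Builds a form-encoded request body carrying the XML query.
     * Assumption: the remote service reads the XML from a "query" field.
     */
    static String buildRequestBody(String requestXml) {
        return "query=" + URLEncoder.encode(requestXml, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A placeholder query; a real request would list datasets and attributes.
        System.out.println(buildRequestBody("<Query/>"));
    }
}
```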
    • downloadGeneMappings

      public boolean downloadGeneMappings(String ensemblType, String ensemblSchemaName, String ensemblDatasetName, String ensemblVersion, WaitingHandler waitingHandler) throws MalformedURLException, IOException, IllegalArgumentException
      Download the gene mappings.
      Parameters:
      ensemblType - the Ensembl type, e.g., default or plants
      ensemblSchemaName - the Ensembl schema name, e.g., default or plants_mart
      ensemblDatasetName - the Ensembl dataset name of the selected species
      ensemblVersion - the Ensembl version
      waitingHandler - the waiting handler
      Returns:
      true if downloading went OK
      Throws:
      MalformedURLException - if a MalformedURLException occurs
      IOException - if an IOException occurs
      IllegalArgumentException - if an IllegalArgumentException occurs
    • getGeneMappingFolder

      public static File getGeneMappingFolder()
      Returns the path to the folder containing the gene mapping files.
      Returns:
      the gene mapping folder
    • setGeneMappingFolder

      public static void setGeneMappingFolder(String geneMappingFolder)
      Sets the folder where gene mappings are saved.
      Parameters:
      geneMappingFolder - the folder where gene mappings are saved
    • createDefaultGeneMappingFilesGeneric

      public void createDefaultGeneMappingFilesGeneric(File configFolder, File sourceEnsemblVersionsFile, File sourceGoDomainsFile, boolean updateEqualVersion)
      Copies the given gene mapping files to the gene mappings folder. Existing files are overwritten when the given files are newer; files with an equal version are overwritten only if updateEqualVersion is true.
      Parameters:
      configFolder - the config folder
      sourceEnsemblVersionsFile - the Ensembl versions file
      sourceGoDomainsFile - the GO domains file
      updateEqualVersion - if true, files are also updated when the version numbers are equal; if false, files are updated only when the new version is newer
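      The effect of updateEqualVersion can be read as a small decision rule. The following is a sketch of that rule as documented, assuming versions compare as plain integers; the class and method names are hypothetical and the library's internal comparison may differ.

```java
// Hypothetical helper, not part of ProteinGeneDetailsProvider.
final class MappingUpdatePolicySketch {

    /**
     * Decides whether an installed mapping file should be replaced.
     * A newer source always wins; an equal version wins only when
     * updateEqualVersion is true; an older source never wins.
     */
    static boolean shouldOverwrite(int installedVersion, int sourceVersion, boolean updateEqualVersion) {
        if (sourceVersion > installedVersion) {
            return true;
        }
        return updateEqualVersion && sourceVersion == installedVersion;
    }

    public static void main(String[] args) {
        System.out.println(shouldOverwrite(94, 95, false));
        System.out.println(shouldOverwrite(95, 95, false));
        System.out.println(shouldOverwrite(95, 95, true));
    }
}
```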
    • updateEnsemblVersion

      public void updateEnsemblVersion(String ensemblDatasetName, String ensemblVersion) throws IOException
      Update the Ensembl version for the given species in the local map and in the Ensembl versions file.
      Parameters:
      ensemblDatasetName - the dataset name of the species to update, e.g., hsapiens_gene_ensembl
      ensemblVersion - the new Ensembl version
      Throws:
      IOException - if an IOException occurs
    • getEnsemblVersionFromFile

      public Integer getEnsemblVersionFromFile(File ensemblVersionsFile, String species) throws IOException
      Gets the Ensembl version of a given species from a file.
      Parameters:
      ensemblVersionsFile - the Ensembl versions file
      species - the species of interest
      Returns:
      the Ensembl version
      Throws:
      IOException - thrown whenever an error occurred while reading the file
    • getEnsemblSpeciesVersions

      public HashMap<String,String> getEnsemblSpeciesVersions(File ensemblVersionsFile) throws FileNotFoundException, IOException
      Gets the information contained in the Ensembl species file.
      Parameters:
      ensemblVersionsFile - the Ensembl species file to read
      Returns:
      The Ensembl versions for each species
      Throws:
      FileNotFoundException - if a FileNotFoundException occurs
      IOException - if an IOException occurs
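      getEnsemblSpeciesVersions returns one version string per species. The layout of the versions file is not specified on this page; the sketch below assumes one tab-separated "species, version" pair per line and shows how such lines could be collected into a HashMap<String,String>. The class name, method name, and sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;

// Hypothetical helper, not part of ProteinGeneDetailsProvider.
final class EnsemblVersionsSketch {

    /**
     * Parses lines of the assumed form "species<TAB>version" into a map.
     * Blank lines are skipped; lines without two fields are rejected.
     */
    static HashMap<String, String> parseVersions(List<String> lines) {
        HashMap<String, String> versions = new HashMap<>();
        for (String rawLine : lines) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] parts = line.split("\t");
            if (parts.length < 2) {
                throw new IllegalArgumentException("Unexpected line: " + line);
            }
            versions.put(parts[0].trim(), parts[1].trim());
        }
        return versions;
    }

    public static void main(String[] args) {
        // Sample values are placeholders, not real Ensembl versions.
        List<String> lines = Arrays.asList("hsapiens_gene_ensembl\t95", "", "mmusculus_gene_ensembl\t94");
        System.out.println(parseVersions(lines).get("hsapiens_gene_ensembl"));
    }
}
```

      In practice the lines could come from java.nio.file.Files.readAllLines applied to the versions file.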
    • loadEnsemblSpeciesVersions

      public void loadEnsemblSpeciesVersions(File ensemblVersionsFile) throws IOException
      Loads the given Ensembl species file.
      Parameters:
      ensemblVersionsFile - the Ensembl species file to load
      Throws:
      IOException - if an IOException occurs
    • downloadMappings

      public boolean downloadMappings(WaitingHandler waitingHandler, String speciesName, String ensemblDatasetName, EnsemblSpecies.EnsemblDivision ensemblDivision) throws IOException
      Try to download the gene and GO mappings for the currently selected species.
      Parameters:
      waitingHandler - the waiting handler
      speciesName - the name of the species
      ensemblDatasetName - the Ensembl dataset name
      ensemblDivision - the Ensembl division
      Returns:
      true if the download was successful
      Throws:
      IOException - exception thrown whenever an error occurred while reading the mapping files
    • getGeneMappingFile

      public static File getGeneMappingFile(String ensemblDatasetName)
      Returns the gene mapping file.
      Parameters:
      ensemblDatasetName - the Ensembl dataset name
      Returns:
      the gene mapping file
    • getGoMappingFile

      public static File getGoMappingFile(String ensemblDatasetName)
      Returns the GO mapping file.
      Parameters:
      ensemblDatasetName - the Ensembl dataset name
      Returns:
      the GO mapping file
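      getGeneMappingFile and getGoMappingFile both resolve a file from an Ensembl dataset name. One plausible reading, given the GENE_MAPPING_FILE_SUFFIX and GO_MAPPING_FILE_SUFFIX constants, is that the file name is the dataset name followed by the suffix, located in the gene mapping folder. The sketch below illustrates that reading only; the suffix values, folder name, and class name are placeholders, since the actual constant values are not shown on this page.

```java
import java.io.File;

// Hypothetical helper, not part of ProteinGeneDetailsProvider.
final class MappingFileSketch {

    // Placeholder suffixes; the real constant values are not documented here.
    static final String ASSUMED_GENE_SUFFIX = "_gene_mappings";
    static final String ASSUMED_GO_SUFFIX = "_go_mappings";

    /** Resolves a mapping file as folder/datasetName + suffix (assumed layout). */
    static File mappingFile(File folder, String ensemblDatasetName, String suffix) {
        return new File(folder, ensemblDatasetName + suffix);
    }

    public static void main(String[] args) {
        File folder = new File("gene_mappings"); // placeholder folder
        System.out.println(mappingFile(folder, "hsapiens_gene_ensembl", ASSUMED_GENE_SUFFIX).getName());
        System.out.println(mappingFile(folder, "hsapiens_gene_ensembl", ASSUMED_GO_SUFFIX).getName());
    }
}
```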
    • getEnsemblVersionsFile

      public static File getEnsemblVersionsFile()
      Returns the Ensembl versions file.
      Returns:
      the Ensembl versions file
    • getGoDomainsFile

      public static File getGoDomainsFile()
      Returns the GO domains file.
      Returns:
      the GO domains file
    • getEnsemblVersion

      public String getEnsemblVersion(String latinName)
      Returns the Ensembl version for a given species.
      Parameters:
      latinName - the Latin name of the species
      Returns:
      the Ensembl version for a given species.
    • newVersionExists

      public boolean newVersionExists(String latinName)
      Returns true if a newer version of the species mapping exists in Ensembl.
      Parameters:
      latinName - the Latin name of the species
      Returns:
      true if a newer version of the species mapping exists in Ensembl