Gene Ontology Enrichment Analysis

Introduction

Gene Ontology Enrichment Analysis (GOEA) analyzes the frequencies of Gene Ontology (GO) terms in your dataset and compares them to the frequencies of the same GO terms in the species-specific version of Ensembl. To keep the number of terms manageable, the GOSlim UniProtKB-GOA is used.

GOEA shows whether certain GO terms are found more or less often in your dataset than in Ensembl as a whole. To calculate the significance, PeptideShaker employs a hypergeometric test. This divides the GO terms into three groups:

Note that the GO Enrichment Analysis only supports UniProt accession numbers! Also note that only proteins in your dataset that map to GO-annotated proteins are counted in the analysis. Finally, note that the statistical analysis is only correct as long as the selected protein set is unbiased.
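To illustrate the kind of calculation a hypergeometric test involves, here is a minimal Python sketch. It is not PeptideShaker's own implementation: the function names and all numbers are hypothetical, and it assumes that the reference population is the set of GO-annotated proteins in Ensembl and that the dataset is treated as a random draw from it.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """Probability of seeing exactly k annotated proteins in a sample of n,
    drawn from N reference proteins of which K carry the GO term."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def enrichment_p_values(k, N, K, n):
    """Return (p_over, p_under) for a single GO term.

    p_over  = P(X >= k): chance of at least k annotated proteins.
    p_under = P(X <= k): chance of at most k annotated proteins.
    """
    lowest = max(0, n - (N - K))   # smallest possible count in the sample
    highest = min(n, K)            # largest possible count in the sample
    p_over = sum(hypergeom_pmf(i, N, K, n) for i in range(k, highest + 1))
    p_under = sum(hypergeom_pmf(i, N, K, n) for i in range(lowest, k + 1))
    return p_over, p_under


# Hypothetical numbers: 20000 annotated reference proteins, 500 of them
# carrying the term; 200 annotated proteins in the dataset, 15 with the term.
p_over, p_under = enrichment_p_values(k=15, N=20000, K=500, n=200)
print(f"over-representation p = {p_over:.3g}")
print(f"under-representation p = {p_under:.3g}")
```

In this made-up example about 5 proteins with the term would be expected by chance (200 x 500 / 20000), so observing 15 gives a small over-representation p-value. The significance threshold and any correction for testing many terms are left out of the sketch.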


More on the Gene Ontology Enrichment Analysis


Species

A long list of species is included, and the species is selected when setting up the project. The species can also be changed via the Edit menu: Edit > Species.

If you experience any problems with any of the species, please contact us by sending an e-mail to the PeptideShaker Google Group or by reporting an issue.

By default, only the human GO term to protein mappings are included. The mappings for other species can be downloaded if needed: select the species in the drop-down menu and click the Download button to start downloading the GO term to protein mappings for that species. For species where the mappings have already been downloaded, they can be updated to the current release of Ensembl by clicking the Update button.




Ensembl Version

PeptideShaker always uses the latest version of Ensembl when downloading the GO term to protein mappings. The version used for a given species is shown after the species name in the species drop-down menu. Ensembl is updated regularly with the latest data, and each release is given a number, e.g., Ensembl 69.




GO Plots

Two GO plots are available: a Significance Plot and a Distribution Plot. The Significance Plot highlights the significant GO terms, while the Distribution Plot shows the frequencies in percent for your dataset compared to the whole of Ensembl.

The plots are closely connected to the table with the GO terms. Selecting a row in the table will highlight the same term in the plots and vice versa. GO terms can be excluded from the plots by deselecting them in the final column of the table. Right-clicking in the table will bring up a menu with shortcuts for selecting/deselecting all terms, and for selecting only the significant GO terms.

The order of the terms in the plots is the same as in the table. Click the table column headers to re-order both the table and the plots.
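The percentages compared in the Distribution Plot can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the function name, the GO term counts and the totals are all hypothetical, and the denominator is assumed to be the number of GO-annotated proteins in each set.

```python
def percent_frequencies(term_counts, total_proteins):
    """Convert per-term protein counts into frequencies in percent."""
    return {
        term: 100.0 * count / total_proteins
        for term, count in term_counts.items()
    }


# Hypothetical counts for two GO terms in the dataset and in Ensembl.
dataset = percent_frequencies({"transport": 30, "signal transduction": 10}, 200)
ensembl = percent_frequencies({"transport": 1500, "signal transduction": 2000}, 20000)

# Print the terms side by side, as the Distribution Plot shows them.
for term in dataset:
    print(f"{term}: dataset {dataset[term]:.1f}%, Ensembl {ensembl[term]:.1f}%")
```

A term with a clearly higher percentage in the dataset than in Ensembl is a candidate for over-representation. Whether the difference is significant is a separate question, answered by the statistical test.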




Proteins

The proteins in your project annotated with the currently selected GO term are shown in this table. If the table is empty, make sure that you have selected a row in the Gene Ontology Mappings table at the top. If the table is still empty, this means that no proteins in your project map to the given GO term.

