FragmentationAnalyzer

FragmentationAnalyzer is a tool for analysing MS/MS fragmentation data.

Currently nine different analysis types are supported:

Project home page:
http://compomics.github.io/projects/fragmentation-analyzer.html.




Supported Input Data

FragmentationAnalyzer currently supports three input formats: ms_lims, Mascot dat files and OMSSA omx files.

In addition, any file that can be converted to the supported text file format can be used. (See the Importing Data section for details on how to import data.)





Tutorial

Opening Data Sets
After starting the tool, the data selection dialog is shown. (The dialog can also be opened from the File menu by selecting 'Open'.) You can either import new data sets from one of the three supported formats or use one of the already imported data sets. For this tutorial we will use the example data set provided with the tool. See the Importing Data section for details on how to import data. For now, select 'example data set' in the list of available data sets and click on 'Open Data Set'.

The tool will now open the data set. Note that, depending on the size of the data set, this can take some time. When finished, a dialog confirming that the data has been opened is shown, and the drop down menus in the left part of the screen are updated with the data from the opened data set.

Selecting Search Parameters
To perform a search, select the properties of the identifications you are searching for in the lists. For example: instrument as ESI-QUAD-TOF, N-term as NH2, C-term as COOH and charge as 2. The number after each term is the total number of occurrences of that term in the data set. Note that these numbers apply to the whole data set and not to the subset you are currently selecting. So even if each of your selections has a high occurrence number, there may still be no identifications matching all of your selections at once. Leave the modification selection empty for now.
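The difference between per-term occurrence counts and the number of identifications matching every selection can be illustrated with a small sketch. The records and field names below are hypothetical and are not taken from the tool's data model; the sketch only assumes that a search keeps the identifications matching all selected terms.

```python
from collections import Counter

# Hypothetical identifications; field names are illustrative only.
identifications = [
    {"instrument": "ESI-QUAD-TOF", "charge": 2},
    {"instrument": "ESI-QUAD-TOF", "charge": 3},
    {"instrument": "ESI-QUAD-TOF", "charge": 3},
    {"instrument": "MALDI-TOF", "charge": 2},
    {"instrument": "MALDI-TOF", "charge": 2},
]


def term_counts(records, field):
    """Count how often each value of `field` occurs in the whole data set."""
    return Counter(record[field] for record in records)


def search(records, **selected):
    """Keep only the records that match every selected term."""
    return [
        record
        for record in records
        if all(record[key] == value for key, value in selected.items())
    ]


instrument_counts = term_counts(identifications, "instrument")
charge_counts = term_counts(identifications, "charge")
matches = search(identifications, instrument="ESI-QUAD-TOF", charge=2)
```

In this made-up data set both 'ESI-QUAD-TOF' and charge 2 occur three times, yet only one identification matches both selections.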

General Search
There are two search options available: general search and modification search. We will first look at general search, which returns all identifications matching all the selected parameters. Click on the 'Search' button in the lower left corner to start the search. (Again, note that depending on the size of the data set the process might take some time to complete.)

Search Results
When the search is completed, a dialog presenting the main findings (number of matches etc.) is shown, and the results are inserted into the 'Search Results' table in the upper right part of the screen. The results are sorted by the number of occurrences of each identification, such that the most frequent are at the top of the list.

Individual Spectra
Select a subset of the identifications by clicking in the rightmost column of the table. Then select an analysis type in the 'Select Analysis Type' drop down menu. First try 'List Individual Identifications'. Then click the 'Analyze / Plot' button to the right of the drop down menu. The list of individual spectra is then shown in the 'Individual Spectra' table in the middle of the right part of the screen.

Analyze/Plot
Select a couple of the items in this table and again select an analysis type in the 'Select Analysis Type' drop down menu for the 'Individual Spectra' table, for example 'View Spectra'. Then click the 'Analyze / Plot' button for the 'Individual Spectra' table. The 'Search Results' and 'Individual Spectra' sections will then close and the 'Plot / Analyses' section will be expanded, showing the newly created plot(s). The closed sections can easily be opened again by clicking on the section header. Clicking once more will close the section again. Make sure that the 'Plot / Analyses' section is visible before continuing.

Resizing Plots
In the plots section each plot is located in its own separate frame. The frames can be resized and maximized individually. To maximize a plot, click on the plot's maximize button in the upper right corner (or double click on the title bar).

Plot Types
Currently nine different plot types are supported:

Plot Options
For the scatter and bubble plots two additional options are available. First, the results can either be combined into one plot, or each selected row can result in its own plot. This is chosen by selecting either 'Single' or 'Combine' in the drop down menu next to the 'Select Analysis Type' menus. Second, the distance can be plotted using either absolute (Dalton) or relative (ppm) measurement. This is again selected in a drop down menu next to the 'Select Analysis Type' menus.
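As a rough illustration of the two distance measurements, the sketch below uses the standard definition of a relative error in parts per million. The function names are hypothetical, and the assumption that the tool measures the distance between an observed and a theoretical fragment ion m/z is not confirmed by this document.

```python
def mass_error_da(observed_mz, theoretical_mz):
    """Absolute distance in Dalton (assumed: observed minus theoretical)."""
    return observed_mz - theoretical_mz


def mass_error_ppm(observed_mz, theoretical_mz):
    """Relative distance in parts per million of the theoretical value."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6
```

The same absolute distance gives a larger relative distance for a lighter ion: 0.01 Da is about 10 ppm at m/z 1000 but about 50 ppm at m/z 200.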

The size of the bubbles (the scaling factor) can be altered by selecting 'Preferences' on the 'Edit' menu. Note that any changes only affect future plots. Existing plots are not updated.

Plot Tool Bars
Each plotting type has a set of additional options, which are accessed by right clicking on the title bar of the plot's frame. The options include ways of refining the data shown, for example by turning the different data series on or off.

The non-spectrum plots also have an additional set of options that can be accessed by right clicking on the plot itself. These options include zooming, export/save plot amongst others.

Note that there are two export/save options for most plotting types. For high-detail figures it is recommended to use the export to SVG feature found in the popup menu that appears when right clicking the title bar of the plot.

Spectrum Plot Options
To zoom in a spectrum plot, click and hold the left mouse button where you want to start the zoom, and then drag in the direction you want to zoom, marking the area to be zoomed. Note that all the spectrum plots are linked, so zooming in one will zoom all spectra. To do manual de novo sequencing, click on one peak. The distance to other peaks, and the amino acids matching this distance (if any), will then be shown. To add the sequencing, click on the second peak. Repeat the process to sequence more peaks. An added sequence is removed by holding down the Ctrl button when clicking on the sequence. To "store" a sequence, hold down the Alt button and click on the sequence; the sequence then turns red. To remove such "stored" sequences, hold down Ctrl and Alt and click on the sequence.
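The idea behind matching a peak distance to amino acids can be sketched as follows. This is not the tool's implementation: the function name and the 0.02 Da tolerance are assumptions, and the sketch assumes both peaks are singly charged so that the m/z distance equals the mass difference. The values in the table are the standard monoisotopic residue masses.

```python
# Standard monoisotopic residue masses in Dalton.
RESIDUE_MASSES = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def match_residues(mz_a, mz_b, tolerance_da=0.02):
    """Return the amino acids whose residue mass fits the peak distance.

    Assumes singly charged peaks; the tolerance is an illustrative value.
    """
    distance = abs(mz_a - mz_b)
    return sorted(
        residue
        for residue, mass in RESIDUE_MASSES.items()
        if abs(mass - distance) <= tolerance_da
    )
```

Leucine and isoleucine have identical masses, so a distance of about 113.084 Da always matches both, which is why a distance can map to more than one amino acid.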

Close All Plots
To close all the plots in one operation, right click on the title bar of one of the plots and select the 'Close All' option from the popup menu that appears.

Modification Search
While the general search simply finds all identifications matching the selected parameters, the modification search is a little more advanced. Use the same parameters as for the general search example (instrument as ESI-QUAD-TOF, N-term as NH2, C-term as COOH and charge as 2), but this time select '<Mox>' in the Alt 1 modification parameters drop down menu, select 'Modification Search' and click the 'Search' button.

Identification/Sequence Pairs
For modification searches you are trying to find identification pairs where one of them is modified with the selected modification and the other is unmodified. You therefore have to select the minimum number of such pairs required before a pair is used. Generally you want as many matches as possible, e.g., 30+, but the data set used in this tutorial is not big enough for that, so reduce the number to 2.

Intensity Box Plots
When the search completes you will get one match. Select this match, select the 'Intensity Box Plot' analysis type and create the plot. The created plot presents the difference in relative intensity between the different fragment ion types, for both the modified and the unmodified identifications.

Normalization
To make it possible to compare identifications coming from spectra with varying total intensity, total intensity normalization is used to normalize the intensities of the fragment ions before the comparison is made.
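A minimal sketch of total intensity normalization is given below. It divides each intensity by the sum of the supplied intensities. Which peaks the tool includes in that total is not stated here, so using the sum of the list passed in is an assumption, and the function name is hypothetical.

```python
def normalize_total_intensity(intensities):
    """Scale intensities so that they sum to 1.

    Assumption: the total is the sum of the intensities passed in.
    """
    total = sum(intensities)
    if total == 0:
        # Avoid division by zero for an empty or all-zero list.
        return [0.0 for _ in intensities]
    return [intensity / total for intensity in intensities]
```

After this step, two spectra with the same peak pattern but different overall signal strength give the same normalized values and can be compared directly.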





Importing Data

Data can be imported from three different sources: ms_lims, Mascot dat files and OMSSA omx files.

ms_lims
For ms_lims, one logs on to the ms_lims database via a dialog in the tool using one's normal login details. When connected, all the required details about the identifications are downloaded, while some details, e.g., the fragment ion information, are not downloaded but extracted when needed. The database connection is therefore required while using the tool. Please note that, depending on the size of the database, importing data from ms_lims might take a while. The progress of the import is shown to the user throughout.

Mascot Dat Files
When importing Mascot dat files, one simply selects the set of dat files to import and selects the Mascot confidence level to use for the identifications. Only identifications above the selected confidence level will be imported.

OMSSA OMX Files
Importing OMSSA omx files is done in the same way as for Mascot dat files (except for the setting of the Mascot confidence level, of course). However, the instrument name is not included in the omx file and has to be provided manually by the user for each imported file. Also note that the omx file includes very few details about the amino acid modifications, only a number: <1>, <2> etc. The OMSSA installation folder (containing the mods.xml and usermods.xml files) therefore also has to be provided.





Text File Format

When a data set is imported into FragmentationAnalyzer it is divided into three parts: identifications.txt, fragmentIons.txt and a spectra folder.

identifications.txt
For ms_lims data only the identifications.txt file is created. The remaining information is extracted from the database when needed. However, a file called 'ms_lims.prop' is also created, containing information about the database used.

identifications.txt is a tab separated text file where the first line includes the number of lines in the file, i.e., the number of identifications. The rest of the file consists of one row per identification with the following elements:

Either the spectrum file name or the spectrum id has to be provided; the other can be set to "null".
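Reading a file with this layout can be sketched as follows. The function name is hypothetical, and the sketch assumes that the count is the first field on the first line. Rows are returned as raw lists of strings because the column order is not reproduced here, and the sample values are placeholders only.

```python
import csv
import io


def read_identifications(handle):
    """Read a tab separated identifications file from an open text handle."""
    reader = csv.reader(handle, delimiter="\t")
    first_line = next(reader)
    # Assumption: the count is the first field on the first line.
    expected = int(first_line[0])
    rows = [row for row in reader if row]  # skip empty lines
    if len(rows) != expected:
        raise ValueError(
            f"expected {expected} identifications, found {len(rows)}"
        )
    return rows


# Placeholder columns only; not the real column layout.
sample = "2\nvalue_a\tvalue_b\tnull\nvalue_c\tvalue_d\tnull\n"
rows = read_identifications(io.StringIO(sample))
```

Checking the declared count against the number of rows read is a cheap way to detect a truncated file when converting other formats to this one.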

fragmentIons.txt
fragmentIons.txt is also a tab separated text file, consisting of one row per fragment ion with the following elements:

The following fragment ion type names are recommended and will result in the best integration with the tool:

Spectra Folder
For non-ms_lims data sets the spectra are stored as pkl files in a folder called 'spectra', one file per spectrum. The first line in each file contains the precursor m/z, intensity and charge. Next follows one line per peak in the spectrum with the m/z and intensity values.
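The layout described above can be parsed with a short sketch like the one below. The function name and the sample values are hypothetical, and the sketch assumes that the values on each line are separated by whitespace and that the charge is written as a plain number. Check the example data set for the exact conventions.

```python
def parse_pkl(text):
    """Parse the contents of one pkl spectrum file.

    Assumes whitespace separated values and a numeric charge.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    # First line: precursor m/z, precursor intensity and charge.
    precursor_mz, precursor_intensity, charge = rows[0]
    # Remaining lines: one peak per line as m/z and intensity.
    peaks = [(float(mz), float(intensity)) for mz, intensity in rows[1:]]
    return {
        "precursor_mz": float(precursor_mz),
        "precursor_intensity": float(precursor_intensity),
        "charge": int(float(charge)),
        "peaks": peaks,
    }


# Made-up spectrum for illustration.
sample = "650.33 1200.0 2\n175.119 85.0\n262.151 40.5\n"
spectrum = parse_pkl(sample)
```

A line with the wrong number of values raises a ValueError during unpacking, which is usually the desired behaviour when validating converted files.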

For more details see the example data set or the source code.

