Chapter 8
Exploring Analysis Results

Delta2D displays quantitative data in flexible tabular views (see figure 8.1) that fit your analysis needs. Table rows can be filtered and sorted by numerical and non-numerical columns, making it easy to identify relevant sets of spots. The table display is always synchronized with the spot boundaries on the gel image view, so you can go from image to data and back again with ease.

8.1 The Quantitation Tables


Figure 8.1: The quantitation table

The Quantitation Tables give you access to your data in three basic types of representation:
Single gel tables
show all available quantitative data for a single gel image.
Multi gel tables
show the same data as single gel tables, plus data comparing the spots of the gel images to each other.
Statistics table
shows a selectable subset of the above data as a statistical evaluation for each group, plus data comparing the spots group by group.

The Quantitation Table can be opened manually, provided quantitative data is available, by choosing Spots | Show Table or Window | Quantitation Table. The latter opens the general quantitation table window as described below, whereas the former additionally opens a multi gel table with the two gel images shown in the Gel Pair View. There are several tabs at the bottom of the table:

Each single gel table is labeled with the name of its gel image. Multi gel tables are labeled either with the names of the gel images involved or, if they contain all gel images, simply with All gel images. The statistics table is labeled Statistics.

Tab borders are color coded according to your current color scheme.

Single Gel Tables

Single gel tables show only data for one gel. Each row represents one spot and the columns show the data of this spot. For a more detailed description of this data, please refer to table 8.1.

Multi Gel Tables

Usually, you will see only one table of this kind: the All gel images table. You can also compare two gel images directly with each other. Simply open the two gel images you want to compare in the Gel Pair View, then open the tables in the first way described above: choose the menu item Spots | Show Table in the Gel Pair View. The structure is basically the same as in the single gel tables, except that each row represents a combination of corresponding spots. If no correspondence could be found for a spot, it is placed on a row by itself. The columns, described in more detail in table 8.1, are repeated once for each gel image. The column headers are color coded as well, so that you can easily see which gel the data in a column was taken from.

The Statistics Table

This is the first table you see when opening the Quantitation Table window. It gives you a statistical overview of the data obtained, sorted by groups. As in the other table views, each row represents a spot and its correspondences. Unlike the other tables, however, it has only one column per gel image and seven columns of statistical data per group: minimum, maximum, mean, relative standard deviation, number, ratio, and t-test.
minimum
shows the lowest value of this and corresponding spots in the whole group.
maximum
shows the highest value of the set of corresponding spots in this group.
mean
is the arithmetic mean of this set of spots in this group.
relative standard deviation
indicates the standard deviation within this set of spots in percent of the total of the group.
number
indicates how often this spot is represented in this group.
ratio
shows the ratio, for a chosen parameter, of the min/max/mean of this group to the min/max/mean of the group that contains the leftmost gel image in the project matrix. Choose the parameter and the function (min, max or mean) at the top of the table.
t-test
Based on Student's t-test, the value in this column indicates the error probability for the assumption that this group belongs to the same parent population as the group that contains the leftmost gel image in the project matrix.
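
The per-group columns described above can be illustrated with a small sketch. It is not Delta2D's implementation: the function name, the use of None for a missing correspondence, and the reading of "relative standard deviation" as the sample standard deviation in percent of the group mean are all assumptions made for illustration.

```python
from statistics import mean, stdev

def group_statistics(values):
    """Summarize one spot's values across the gel images of one group.

    `values` holds one entry per gel image; None marks an image on which
    the spot has no correspondence (an assumption of this sketch).
    """
    present = [v for v in values if v is not None]
    if not present:
        return None
    avg = mean(present)
    # Assumed definition: sample standard deviation in percent of the mean.
    rsd = None
    if len(present) > 1 and avg != 0:
        rsd = 100.0 * stdev(present) / avg
    return {
        "minimum": min(present),
        "maximum": max(present),
        "mean": avg,
        "relative standard deviation": rsd,
        "number": len(present),
    }

# Hypothetical %V values of one spot on four gel images of a group.
stats = group_statistics([2.0, 4.0, None, 6.0])
```

In this hypothetical group the spot is present on three of four images, so the number column would read 3.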

As an example of how the ratio columns in the Statistics Table are calculated, imagine the settings at the top of the table are Spot property: %Volume, ratio=sample groups mean / group control mean. The ratios are calculated by the following procedure: The columns in the Statistics Table are sorted by groups, and the groups are sorted according to their order in the project manager from left to right. For every group except the first one in the Statistics Table, the mean of the normalized volume (%Volume) is calculated and divided by the mean of the normalized volume over the first group.
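
The ratio procedure can be sketched in a few lines of Python. The group names and %Volume values below are made up, and the function name is hypothetical; the sketch only mirrors the calculation described in the text (mean of each group divided by the mean of the first group).

```python
from statistics import mean

def ratio_columns(groups):
    """Compute mean-based ratios against the first (leftmost) group.

    `groups` is a list of (group name, [%Volume per gel image]) pairs,
    ordered as in the project manager from left to right.
    """
    _, control_values = groups[0]
    control_mean = mean(control_values)
    ratios = {}
    for name, values in groups[1:]:  # the first group gets no ratio column
        ratios[name] = mean(values) / control_mean
    return ratios

# Hypothetical %Volume values for one spot in two groups.
groups = [
    ("control", [0.10, 0.12, 0.14]),
    ("treated", [0.22, 0.24, 0.26]),
]
ratios = ratio_columns(groups)
```

Here the treated group has a mean %Volume of 0.24 against 0.12 for the control group, so its ratio is about 2.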


Column             Description
mark               Check this box to mark or unmark a row.
hide               Check this box to hide a row (it will be hidden immediately).
cancel             Check this box to cancel the spots in a row. Canceled spots are excluded from further analysis.
Normalization set  Select a subset of the spots that will be used to normalize the quantities of the spots on a gel. By default, all spots are in the normalization set, so relative spot volumes are computed by setting the total spot volume on a gel to 100%.
Symmetric ratio    The symmetric ratio of the relative volumes. Unlike the conventional ratio column, the symmetric ratio column shows "-2" for a 1:2 ratio, meaning a two-fold decrease. For the opposite 2:1 ratio, both columns display 2, meaning a two-fold increase.
% V                The relative quantity of the spot, excluding background. The total quantity of all spots on the gel is 100%.
ratio              The numerical expression ratio (sample spot / master spot). Depending on your settings in the Tables tab in the options dialog (please refer to section 9.6), this column shows the ratio as a mathematical ratio or as fold change. Additionally, it can contain a color-coded representation of the ratio.
V                  Volume, i.e. the absolute quantity of the spot, in gray units, excluding background. One black pixel with no background has absolute quantity 1.
A                  The area of the spot.
bg                 The background volume for the spot.
avg                The average intensity of the spot, including background.
ID                 The numerical ID of the spot.
frq                Frequency, i.e. the number of rows in which this spot is displayed.
label              One or more labels attached to this spot.

Table 8.1: Columns in the quantitation table.
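
To make the % V and symmetric ratio columns concrete, here is a minimal sketch. The function names are hypothetical, and the handling of a restricted normalization set (the selected spots sum to 100%) is an assumption; the table only states the default case in which all spots are in the set.

```python
def percent_volume(volumes, normalization_set=None):
    """Return relative volumes so that the normalization set sums to 100%.

    `volumes` maps spot ID to absolute volume V. If no normalization set
    is given, all spots are used, as in the default described above.
    """
    if normalization_set is None:
        normalization_set = set(volumes)
    total = sum(volumes[spot] for spot in normalization_set)
    return {spot: 100.0 * v / total for spot, v in volumes.items()}

def symmetric_ratio(ratio):
    """Map a conventional ratio to the symmetric form (0.5 becomes -2)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio if ratio >= 1 else -1.0 / ratio

# Hypothetical absolute volumes of three spots on one gel.
relative = percent_volume({1: 50.0, 2: 30.0, 3: 20.0})
decrease = symmetric_ratio(0.5)   # 1:2 ratio, two-fold decrease
increase = symmetric_ratio(2.0)   # 2:1 ratio, two-fold increase
```

The symmetric form treats increases and decreases alike: a two-fold change has magnitude 2 in both directions, and only the sign differs.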

Changing the Table Layout

To change the width of a column, just place the mouse pointer in the table header between two columns. When you see that the mouse pointer changes, click and drag to the left or to the right until the desired column width is reached. A column can be moved by clicking into its header and dragging it to the left or to the right. You can hide a column completely by using the entries in the Column menu.

Table Properties

A quick and effective way to customize the Quantitation Tables is to open their Properties dialog by choosing Column | Table Properties from the Quantitation Tables' menu. This dialog lets you set the layout of the table either quickly or in detail.


Figure 8.2: The properties dialog of the quantitation table

This dialog consists of a table and a set of buttons, which let you change the following options:

The table:

Ratio Master
Here you choose the gel image that serves as the reference for the calculation of ratios.
Visible
Check the gel images you want to see in your multi gel table. Visibility also applies to the Gel Regions View.
Gel
The list of all gel images included in your project.
. . .
All the columns as described in table 8.1. For each gel image, check the columns you want to make visible.

The buttons:

All Columns
Click here to set all columns for all gel images to be visible.
Quantity
Click on this button to show only those columns that are related to quantitative data of spots, plus ID, Label, and coordinates of spots.
Ok
As in any other dialog: Apply changes and close dialog.
Cancel
Discard changes and close dialog.
Apply
Apply changes but leave the dialog open.

8.2 Working with Spots

Sorting and Selecting Spots

Now let us sort the table by the relative volume of the master spots. Just click into the lower part of the column header. A small arrow indicates the sort order; click again to sort in reverse order.
Figure 8.3: Part of quantitation table, sorted on the fifth column

Sorting makes it easy to identify the most intensive spots or those with a high expression ratio: just sort and then select the top rows. The selected spots will be highlighted in the main window.

You can use any column for sorting. Try, for instance, sorting on the color-coded expression ratios.

Select one of the rows by clicking on it. Observe how the corresponding spot segments on the master and sample gel images are highlighted. You can select additional rows by pressing the Control key while clicking on them. Shift-clicking on a row selects all rows up to that row. Dragging the mouse over consecutive rows selects them, too. Use the menu item Edit | Select All to select all rows in the quantitation table, and Edit | Invert Selection to invert the selection.

You can select a spot in the gel window by clicking somewhere within its boundary. The corresponding row in the table will be selected automatically.

Here is how to select the 10 most intensive spots on a certain gel image:

  1. Switch to the single gel table for the gel image: just click on the tab for this image. The tab's border has the same color as the gel image in the project manager.
  2. Click on the header of the column that is labeled %V. The table is now sorted according to master spot volume. In the column's header, you see a little arrow that indicates the sort order.
  3. Click again to reverse the sort order. Rows are now sorted in descending order, i.e. the spot with the greatest quantity is in the first row.
  4. Select the first ten rows in the table: click on the first row and drag down to the tenth row. The title bar shows how many rows you have selected.

Watch how the ten most intensive spots are highlighted on the gel view, as well.
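
The four steps above amount to sorting by %V in descending order and taking the first rows. As a sketch outside Delta2D, for example on exported table data, the same selection could look like this; the row layout (a dictionary per spot with "ID" and "%V" keys) and the function name are assumptions.

```python
def top_spots(rows, column="%V", count=10):
    """Return the `count` rows with the largest value in `column`."""
    ranked = sorted(rows, key=lambda row: row[column], reverse=True)
    return ranked[:count]

# Hypothetical single gel table with five spots.
rows = [
    {"ID": 1, "%V": 0.4},
    {"ID": 2, "%V": 2.1},
    {"ID": 3, "%V": 0.9},
    {"ID": 4, "%V": 3.3},
    {"ID": 5, "%V": 1.5},
]
top_two = top_spots(rows, count=2)
```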

Selecting Spots in the Gel Image Pair View

You can select a spot in the gel image view by clicking on it. Make sure that the spots tool is activated before you select spots. Additionally, you can select spots in a rectangular region by dragging with the mouse.

Hiding Spots

The gel image view will always reflect the contents of the quantitation table, i.e. any spot that is visible or selected in the table will be visible or highlighted respectively in the gel window.

In some situations, it may be useful to hide some spots from the analysis. You can do this by checking the box in the "hide" column. The row will be hidden immediately. Hide a group of rows by selecting them and using View | Hide Selected Rows. Since the quantitation table is synchronized with the main window, spots you hide in the table will also be hidden in the main window.

Check View | Show Hidden Rows to see all hidden rows again. You can now click in the hide column to mark a row as visible or invisible; the display will not change yet. Use View | Hide Selected Rows and View | Don't Hide Selected Rows to control the visibility of whole groups of rows. Unchecking View | Show Hidden Rows will let your changes take effect.

Of course, all these adjustments can be made on any tab of the table.

Canceling Spots

A canceled spot will be excluded from the analysis just as if it had never been detected. Single spots or rows can be canceled by clicking on the check box in the cancel column. You will sometimes want to cancel spots in a region such as the border of the image. To do this, activate the spots tool in the gel image view and select the region by dragging with the mouse. Right-click to open a context menu and select cancel to cancel all spots you have selected.

Marking Spots

You will often want to concentrate on a subset of spots, such as those with a high expression ratio. Sometimes you will select spots individually, based on your own criteria. For this purpose, Delta2D lets you make a "note" on a spot, in the mark column. You can add new marks at any point of your analysis, building an increasing set of interesting spot pairs. Later you will see how to display only marked spots, or how to do other things to spots that are marked.

A single spot can be marked by clicking on the check box in the "mark" column. In a correspondence view of the quantitation table, this will mark all spots in the row. Mark multiple rows by first selecting them and then choosing Mark | Mark Selected Rows. Marks will always be added to what you have already marked. Marks can be cleared using Mark | Unmark Selected Rows.

More advanced operations can be executed by combining selection and marking. Say, you have first identified all the interesting spots by marking them and now you want to hide all other spots:

  1. Use Mark | Select Marked Rows to select all the marked rows.
  2. Use Edit | Invert Selection to select only the rows that are not marked.
  3. Use View | Hide Selected Rows to hide all rows that are not marked.

Similarly, you can clear all marks by choosing Edit | Select All and then Mark | Unmark Selected Rows. To see which rows you have marked, click on the mark column to sort by it; this separates marked from unmarked rows.
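
The combination of marking, selecting and hiding is essentially set arithmetic on row identifiers. The sketch below models the three menu steps with plain Python sets; the function names and row numbers are illustrative and not part of Delta2D.

```python
def select_marked(marked):
    """Mark | Select Marked Rows: the selection becomes the marked rows."""
    return set(marked)

def invert_selection(all_rows, selected):
    """Edit | Invert Selection: select exactly the rows not selected before."""
    return set(all_rows) - set(selected)

def hide_selected(hidden, selected):
    """View | Hide Selected Rows: add the selected rows to the hidden rows."""
    return set(hidden) | set(selected)

# Hypothetical table with five rows, two of them marked.
all_rows = {1, 2, 3, 4, 5}
marked = {2, 5}

selected = select_marked(marked)
selected = invert_selection(all_rows, selected)
hidden = hide_selected(set(), selected)
visible = all_rows - hidden
```

After the three steps, only the marked rows remain visible.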

Counting

Delta2D helps you count how many spots are visible or selected in a table. Counts are displayed in the table's title bar. In a single gel table, it may look like this:

Rows: 1048 / 1048 / 4

These numbers represent the number of total / visible / selected items. In the example, there are 1048 spots in total, of which 1048 are visible and 4 are selected.

Select the master - sample tab to switch to the correspondence table. Select a few rows in the table and watch the table's title bar. Here, a row means a spot correspondence, for example:

Rows: 3598 / 2119 / 12  Master: 2205 / 1409 / 12  Sample: 2265 / 1451 / 8


Figure 8.4: Automatic counting in the title bar of the correspondence table.

Again, these numbers represent the number of total / visible / selected items. In the example, Rows: 3598 / 2119 / 12 means that there are 3598 rows in total, of which 2119 are visible and 12 are selected. Likewise, the title bar shows that there are 2205 master spots, 1409 of them are visible, and the selection includes 12 master spots. All counts are automatically updated when you hide or select rows.
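
The title bar counts can be reproduced with a short sketch. The row layout (a dictionary with the master and sample spot IDs, where None means no correspondence, plus hidden and selected flags) is an assumption made for illustration, as is counting a selected row regardless of its hidden flag.

```python
def count_triple(rows):
    """Format total / visible / selected for a list of rows."""
    total = len(rows)
    visible = sum(1 for row in rows if not row["hidden"])
    selected = sum(1 for row in rows if row["selected"])
    return f"{total} / {visible} / {selected}"

def title_bar(rows):
    """Build a title bar string for a correspondence table."""
    with_master = [row for row in rows if row["master"] is not None]
    with_sample = [row for row in rows if row["sample"] is not None]
    return (
        f"Rows: {count_triple(rows)}  "
        f"Master: {count_triple(with_master)}  "
        f"Sample: {count_triple(with_sample)}"
    )

# Hypothetical correspondence table: one matched pair, two unmatched spots.
rows = [
    {"master": 1, "sample": 7, "hidden": False, "selected": True},
    {"master": 2, "sample": None, "hidden": False, "selected": True},
    {"master": None, "sample": 8, "hidden": True, "selected": False},
]
title = title_bar(rows)
```

Because unmatched spots get a row of their own, the number of rows is the number of master spots plus the number of sample spots minus the number of matched pairs. In the example from the text, 2205 + 2265 - 3598 leaves 872 rows that contain both a master and a sample spot.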

Filtering

Sometimes you want to focus the analysis on spots that meet certain criteria, say those with an expression ratio between 2 and 5. Of course, you could sort according to the expression ratio column and then select those spots manually, but there is a much more convenient way to do this: use a filter. A filter will show only those rows that meet your criterion. Filters can be set on most columns; see the Filter menu for all available filters.

Let's start using filters. First, we want to display only those rows whose expression ratio is between 2 and 5. Choose Filter | Ratios | sample / master to get a filter dialog, or simply click on the button labeled "Filter" on top of the appropriate column.


Figure 8.5: Editing a table row filter. Here, Delta2D will show only spots that have expression ratios between 0.5 and 2.

Click on one of the check boxes labeled Active to activate the filter. Enter 2 into the left Ratio field named First Border. Now enter 5 into the right Ratio field (Second Border). In the upper part of the dialog, the histogram shows which range of spots will be displayed. You can also use the sliders below the histogram to shift the borders of the displayed range up and down. If the sliders cannot be adjusted finely enough for your purposes, you can enlarge the dialog window by dragging its borders like any other window.

Another convenient way of determining borders for your filter is given by the fields above the histogram: the topmost row refers to the total of all absolute values this filter applies to. You can use any value between 0 and this total to indicate how big the ranges of the low, the middle and the high interval should be.

In the second row you can determine the size of these ranges by a relative value, e.g. set the low range of the filter to 20% of the total of all values.

The third and fourth rows give you the count of all values you want to apply this filter to, as absolute and relative numbers respectively. This makes it easy to set the lower border of the filter to, say, the 150 smallest values or, in the fourth row, to type in 10 to set the border to the 10% smallest values.

Press OK to save the changes you have made to the filter and close the filter dialog, or Apply to just apply the changes without closing the dialog. You can switch directly to another filter without having to close and reopen this dialog. The table, as well as the gel image, now contains only those spots whose expression ratios lie between 2 and 5. By looking at the table's title bar, you can tell how many rows meet your criterion. You can continue to work with the filtered table as usual.

The button in the header of the ratio column has changed to a short description of the filter. Leave your mouse pointer over the button for a while to get a tool tip that contains a more detailed description.

Filters can be combined to implement more complex criteria, such as "show all rows with expression ratio between 2 and 5 and master spot volume greater than 0.1".

Example: Showing Spots Whose Quantities Differ by More Than a Factor of Two
Problem:
You want to focus on spots with a "significant" expression ratio, i.e. the expression ratio should be less than 0.5 or greater than 2.
Solution:
Use a negated filter that shows only rows whose expression ratio is not between 0.5 and 2. Choose Filter | ratio <your gel image names> to get a filter dialog and enter the data as shown in Figure 8.6.
Figure 8.6: A filter that hides expression ratios between 0.5 and 2.
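
Range filters, negated filters and their combination can be modeled as simple predicates on table rows. This is a sketch of the concept only: the function names and row layout are hypothetical, and treating the borders as inclusive is an assumption about how the filter dialog handles boundary values.

```python
def range_filter(column, low, high, negate=False):
    """Build a predicate that keeps rows whose value lies in [low, high].

    With negate=True the predicate keeps rows outside that range instead.
    """
    def keep(row):
        inside = low <= row[column] <= high
        return not inside if negate else inside
    return keep

def apply_filters(rows, filters):
    """Show only the rows that pass every active filter (logical AND)."""
    return [row for row in rows if all(f(row) for f in filters)]

# Hypothetical rows with an expression ratio and a master spot volume.
rows = [
    {"ID": 1, "ratio": 0.3, "%V": 0.05},
    {"ID": 2, "ratio": 1.1, "%V": 0.40},
    {"ID": 3, "ratio": 2.5, "%V": 0.20},
    {"ID": 4, "ratio": 4.0, "%V": 0.02},
]

# Negated filter: ratio not between 0.5 and 2.
outside = apply_filters(rows, [range_filter("ratio", 0.5, 2.0, negate=True)])

# Combined filters: ratio between 2 and 5 and master volume above 0.1.
combined = apply_filters(rows, [
    range_filter("ratio", 2.0, 5.0),
    range_filter("%V", 0.1, float("inf")),
])
```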

Scatter Plots

In addition to the numerical ways of identifying distinctive spots, Delta2D offers a visual tool: the scatter plot. Scatter plots show the ratios of the relative volumes in two gel images. You can produce a scatter plot by going to the Project Manager, right-clicking on a gel pair and choosing the menu item Scatter plot. Alternatively, from the gel pair view, you can use the shortcut Ctrl + L or choose the menu item Spots | Show Scatterplot.
Figure 8.7: The scatter plot

Like any other part of Delta2D, the scatter plot interacts with the other views. If you have selected one or more matched spots in the gel pair view, they are selected in the scatter plot too, and vice versa.

You can zoom in to the scatter plot by 'drawing' a rectangle around the region you want to magnify. To do this, click and drag with the left mouse button from the top left to the bottom right. To reset the view simply click and drag in any other direction.

8.3 Gel Image Regions Window


Figure 8.8: Same region of four different gel images

Back to our project in the project manager. The Gel Image Regions view lets you display the same image region of all gel images in the project side by side. Open the Gel Image Regions window by choosing the menu item Window | Gel regions. The view looks similar to Figure 8.8; spots will be displayed (and highlighted) if they are present on a gel. You can use the scroll bars to move the region that is displayed, or simply click in one of the views while holding the Alt key and drag in the desired direction.

When you have opened a Dual View window, its displayed area will determine the segment shown in the Gel Image Regions window.

8.4 Spot Color Coding

Color coding for spots lets Delta2D display a gel image (or a proteome map) with spots colored according to their expression profiles. For an example, see the figure below, in which spots are colored by the following scheme: spots that are increased in sample 1 and in no other sample are shown in red, green is for spots that are increased in sample 2, and so on. Yellow is for spots that are increased in samples 1 and 2, and so on.



Figure 8.9: A Region with Colored Spots. The color of a spot indicates on which sample(s) it is increased.

Start Spot Color Coding by choosing Window | Color Coding from the menu in any window of Delta2D. A new window will open, letting you determine the type and settings for color coding. Color coding can use two basic criteria for coloring the spots:
Subsets
The subset of gel images on which a spot occurs determines the color in which it is shown.
Min/Max
The spots are colored by the gel on which they have their maximum or minimum volume.

Switch to the tab containing the options for the type of color coding you want to achieve.

Color Coding by Subsets

This option gives you an overview of the matches for every spot on a given gel. First, select the gel image which will be used as "background" for the colored spots. Then determine the subsets of matches you want to see: every column in the Color Coding Scheme specifies a combination of matches and a color. If a spot in the master gel image matches spots from the subset of gel images specified in that column, the spot will be shown in the color of that column. To add a new match subset, click on the button for adding a single subset. This will add a new empty subset, which you can configure by clicking on the boxes in the column. Note that if the empty subset already appears in the table, clicking the button will have no effect. To remove a selected subset, click on the button for removing a single subset. You can select a subset for deletion by clicking on its column header. To add all possible subsets, click on the Add All button. Afterwards, the table will contain one column for each possible combination of matches across the gel images. To delete all existing subsets, click on the button for removing all subsets.
Figure 8.10: Choose Colors and Master Image
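
Adding all possible subsets means enumerating every combination of the checked gel images, including the empty one. The sketch below does this with the standard library and assigns one color per subset; the image names and the palette are made up for illustration and are not Delta2D defaults.

```python
from itertools import combinations

def all_subsets(images):
    """Enumerate every subset of `images`, from the empty set to all images."""
    return [
        frozenset(combo)
        for size in range(len(images) + 1)
        for combo in combinations(images, size)
    ]

images = ["sample1", "sample2", "sample3"]
subsets = all_subsets(images)

# Hypothetical palette, one color per subset in enumeration order.
palette = ["gray", "red", "green", "blue", "yellow", "magenta", "cyan", "white"]
scheme = dict(zip(subsets, palette))

def spot_color(visible_on):
    """Look up the color for the set of images on which a spot is visible."""
    return scheme[frozenset(visible_on)]

color = spot_color({"sample1", "sample2"})
```

For n gel images there are 2^n subsets, so three sample images give the eight combinations that one column each would represent in the Color Coding Scheme.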

Color Coding by Min/Max

This option enables you to highlight which group contains the spot for which a given characteristic is most strongly or weakly displayed. Select the characteristic that you want to highlight. Each spot will be assigned the color of the group that most strongly or weakly represents that characteristic.
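
The Min/Max rule reduces to picking the gel image or group with the extreme value and returning its color. A minimal sketch, assuming a dictionary of values per image and a hypothetical color assignment:

```python
def min_max_color(values, colors, use_max=True):
    """Return the color of the image (or group) with the extreme value.

    `values` maps image or group name to the chosen characteristic,
    for example the spot volume; `colors` maps the same names to colors.
    """
    pick = max if use_max else min
    extreme = pick(values, key=values.get)
    return colors[extreme]

# Hypothetical volumes of one spot on three gel images.
values = {"control": 0.8, "sample1": 2.4, "sample2": 1.1}
colors = {"control": "blue", "sample1": "red", "sample2": "green"}

strongest = min_max_color(values, colors, use_max=True)
weakest = min_max_color(values, colors, use_max=False)
```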

Example: Color Coding Spots by Subsets

By combining Spot Color Coding with spot filtering, you can visualize various aspects of your experiment. As one example, let's make a proteome map that shows which spots are increased under which conditions (or combinations of conditions).
Step 1: Detect Spots
First you need to detect spots. We recommend that you do this on a union-fused image and transfer spots to all the images that you want to include in the color coding.
Step 2: Show a Subset of Spots on Each Image
The color code will show on which gel images a spot is visible. We want to see where a spot is increased relative to its "standard" volume on the master (control) image. Therefore we filter out the non-increased spots on each of the sample images. Go to the all gel images table and set a filter for a factor of two or greater on the ratio columns. As a result you will see on every single gel only the spots whose intensity increased relative to the master.
Step 3: Choose Colors and Master Image
Right-click on a gel image in the project manager and select Spot Color Coding / By Subset. A dialog will appear that allows you to configure the color coding: select the Union image as master gel image, i.e. the gel image on which the spots will be overlaid. The table is used to configure which subsets should be displayed in which color. The leftmost check box column controls whether a group is taken into account for color coding; the second check box column controls whether a spot's visibility on a certain gel image is taken into account. In the screenshot, we have checked the three sample images. On these three images there may be eight different subsets for every spot: it can be visible on sample1, or on sample1 and sample2, etc. Press the Add All button to get a list of all possible combinations. A new color is assigned automatically to each combination. You can change colors by right-clicking on the column and choosing Select color. You can change the combination a color stands for by clicking inside the table. Press OK to open the color coding window.
Step 4: Adjust the Display
In the color coding window you can use the View menu to adjust the display, for example you can choose to use the inverted image (white spots on black background).

Color Coding Spots by Intensity

There is another variant of color coding where a spot is colored according to the image on which it has minimal or maximal intensity. Open the color coding dialog by going to the project manager, then right-clicking on a gel and selecting color coding > by Min/Max. The dialog is similar to the dialog for color coding by subset. You can select one color per gel image, as well as the parameter (volume, area etc) to use for color coding.

Exporting the Color Coded Display

Use Export to PowerPoint in the File menu to export the color coded gel image to PowerPoint. You can also make a Snapshot window (using File/Snapshot) which can then be exported to a variety of image formats.

8.5 Expression Profiles Window


Figure 8.11: The expression profiles window

Like the Expression Profile Rollup, the Expression Profiles Window gives you a quick overview of the distribution of selected spots across the gel images of your project, sorted by groups. Unlike the Rollup, however, the Expression Profiles Window can do this for as many spots as you want.

Simply select interesting spots in the Quantitation Table or in the Gel Pair View. Mark the spots in the Gel Pair View by right-clicking on one of the selected spots and checking the box named Mark spot, or, in the Table, by selecting the menu item Mark | Mark Selected Rows. Now you can open the Expression Profiles Window from any of the other windows with the menu item Window | Expression Profiles.

The window shows a number of small graphs, one for each spot you have selected. You can determine their size and arrangement with the controls at the top of the window. If the spots are labeled, each graph shows the label of the spots it represents in its title. If they are not, you can easily identify them by right-clicking on the respective graph. The context menu that appears not only shows you the IDs of the spots involved, but also lets you change their mark status.

Additionally, you can change the design of the graphs and the data shown by using the View menu:

Show Group Bars Collapsed
Combine all single bars of gel images of one group to one single bar for the whole group.
Show Mean Values
Shows the mean values for each group.
Show Standard Deviation
Shows the standard deviation for each group.
Connect Mean Values
Changes the representation of values by bars to a line which connects the mean values, thus signaling the fold change of this spot.
Show Axis
Shows a scale on the left border of each graph, plus axes, to make it easier to read the volume of each spot.

As in any other part of Delta2D, the selection of spots is synchronized between windows.

________________________________________________________________________________

Note:
Please note that in multichannel projects (e.g. DIGE setups; for details on multichannel techniques please refer to section 4.7), some spots may not appear in Expression Profiles. Spots on the standard gel image are used for normalization, which means that matching spots on other gel images are expressed relative to these spots, taken as 100%. As a result, spots on other gel images that have no matching spot on the standard do not appear in any representation of Expression Profiles.

8.6 Statistical Analysis of the Results



Since version 3.6 Delta2D incorporates advanced multivariate statistics in the analysis of 2D gels, including:

The algorithms are adapted from the TIGR Multiple Experiment Viewer (MeV, version 4.0, tm4.org/mev.html, Saeed et al. 2003) and tightly integrated into the image analysis workflow. With Delta2D's Complete Expression Profiles, there are no missing values, and matching problems are virtually eliminated. This makes Delta2D especially well suited for the methods that were originally applied in the context of DNA microarray analysis.

Getting a High Level Overview of Expression Data - Heat Maps


Figure 8.12: A Heat Map

Heat maps are a well-known visualization method for expression data from DNA microarrays. Expression profiles are in the rows, gel images in the columns. The legend across the top shows the color code for spot intensities. Rows are labeled based on the spot labels from the gel images. By default, data is standardized to zero mean and unit variance before being shown in the heat map. Other options for normalization are available in the Analyze menu of the statistics table.
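
Standardizing an expression profile to zero mean and unit variance can be sketched as follows. Whether Delta2D divides by the population or the sample standard deviation is not stated here; the sketch assumes the population form, and the function name is hypothetical.

```python
from statistics import mean, pstdev

def standardize(profile):
    """Shift and scale one expression profile to mean 0 and variance 1.

    Uses the population standard deviation (an assumption of this sketch).
    A constant profile has no variation and is mapped to all zeros.
    """
    center = mean(profile)
    spread = pstdev(profile)
    if spread == 0:
        return [0.0 for _ in profile]
    return [(value - center) / spread for value in profile]

# Hypothetical %V values of one spot across four gel images.
profile = [0.10, 0.20, 0.30, 0.40]
z_scores = standardize(profile)
```

After standardization, the colors in a heat map reflect how far each value lies from the profile's own mean, so that abundant and faint spots become comparable.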

Let us make a heat map:


Figure 8.13: The properties dialog of the quantitation table


Figure 8.14: Start analysis from the Quantitation Table

Clustering Images: What Image Groups or Classes Are There?

Clustering methods can group expression profiles and gel images by similarity. This can be very useful for getting an overview of all expression profiles before proceeding with more detailed analyses. Clustering of gel images can also be used to detect outliers, and to identify structures in the experiment. Ideally, the cluster composition will reflect the structure of the experiment, e.g. replicates and images from the same sample should have similar expression levels and thus end up in the same cluster.


Figure 8.15: In this clustering you see an experiment with control (C1, C2, C4, C5) and treated (T1, T2, T3, T4) samples, made in triplicates. The clustering rediscovers the experimental setup, i.e. gel images with similar samples share a cluster. A sample forming a separate cluster would indicate an outlier for which closer inspection is advisable. Made using Pearson correlation as the similarity measure between images.

Let us make a hierarchical clustering to show more structure in the data:

The hierarchical clustering groups both samples (gel images) and expression profiles. The cluster hierarchy is shown in a tree display. As you can see, replicates are clustered together, indicating higher similarity, as we would expect.
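
Clustering relies on a similarity measure between gel images, such as the Pearson correlation mentioned in the caption of Figure 8.15. The sketch below computes it by hand for made-up intensity vectors and finds the most similar partner of each image. It illustrates the measure only and is not the clustering algorithm used by Delta2D or MeV.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def most_similar(images):
    """For each image, name the other image with the highest correlation."""
    result = {}
    for name, values in images.items():
        others = (other for other in images if other != name)
        result[name] = max(others, key=lambda o: pearson(values, images[o]))
    return result

# Hypothetical spot intensities (same four spots) on four gel images.
images = {
    "C1": [1.0, 2.0, 3.0, 4.0],
    "C2": [1.1, 2.1, 2.9, 4.2],
    "T1": [4.0, 3.0, 2.0, 1.0],
    "T2": [4.2, 2.8, 2.1, 0.9],
}
partners = most_similar(images)
```

In this toy data set each control image is most similar to the other control image, and likewise for the treated images, which is the pattern a clustering of replicates is expected to recover.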

Clustering Expression Profiles: Finding Correlated Proteins

Figure 8.16: Spots with similar expression profiles are clustered together. Support Tree clustering with Euclidean distance.

Clustering of expression profiles is done to identify proteins with similar behavior, implying that they are co-regulated or at least correlated. The global nature of the cluster display allows for a broad overview and the forming of hypotheses that can then be tested (fig. 8.16).

Discovering Patterns in Expression Profiles


Figure 8.17: Cutting a tree by a distance threshold. Use the slider to adjust the threshold.

One can regard the mean (or median) of a cluster as a kind of "typical" expression profile. The clustering displays allow you to split the set of expression profiles into separate subsets:

tmev/cluster_profiles tmev/cluster_cut

Figure 8.18: Combined expression profiles in 12 clusters.

Finding differentially expressed proteins: Statistical Tests

Methods for statistical hypothesis testing in Delta2D are based on state-of-the-art algorithms that are applied in the context of DNA array analysis.


tmev/t_test_cluster
Figure 8.19: Result of applying t-tests (control vs. treated) to expression profiles. Profiles and images were clustered to better visualize differentially expressed proteins. P-values are based on 1000 permutations, false discovery rate is controlled to be 5 elements or less (with overall alpha=1%).

In the simplest case, the experiment is a comparison of two samples, e.g. diseased vs. control tissue, mutant vs. wild type etc. The task then is finding those proteins that show significant differences in expression levels. Certainly the most popular test in this area is Student's t-Test, where the null hypothesis is that the means of expression levels in samples A and B are the same. Rejecting the null hypothesis then means that the protein under test is differentially expressed.

No Normal Distribution of Spot Intensities Required

One has to keep in mind that the classical Student's t-Test assumes that spot quantities within replicates follow a normal distribution, an assumption that should be tested separately. Depending on the staining method you use and other factors, spot quantities within replicate gels may not be normally distributed. Therefore it is advisable to use one of the provided methods that are based on permutations.

In the t-Test options dialog, choose "p-values based on permutation" and either "Use all permutations" or "Randomly group samples" and enter "1000".
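The principle behind permutation-based p-values can be sketched as follows. This is an illustration under stated assumptions, not the code used by the MeV t-test module: it uses Welch's t statistic, enumerates every regrouping of the pooled values (the analogue of using all permutations), and works on invented spot quantities for a single spot.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance

def t_statistic(a, b):
    """Welch's t statistic for two groups of spot quantities."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

def permutation_p_value(a, b):
    """Two-sided p-value: share of regroupings at least as extreme as observed."""
    pooled = a + b
    observed = abs(t_statistic(a, b))
    hits = total = 0
    for chosen in combinations(range(len(pooled)), len(a)):
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance so the observed grouping always counts as a hit.
        if abs(t_statistic(group_a, group_b)) >= observed - 1e-12:
            hits += 1
    return hits / total

# Hypothetical quantities of one spot on four control and four treated gels.
control = [1.0, 1.2, 0.9, 1.1]
treated = [2.0, 2.3, 1.9, 2.2]
p_value = permutation_p_value(control, treated)
print(p_value)
```

With four gels per group there are only 70 possible regroupings, so the smallest attainable two-sided p-value is 2/70. This is why the number of replicates limits how small a permutation-based p-value can become, and why random regrouping is used when full enumeration is too large.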

Controlling the False Discovery Rate

When applying statistical tests to 2-D gel data, one is faced with the so-called multiple hypothesis testing problem: For each expression profile, a separate test is done. Each test has a certain probability of giving a false positive result, i.e. a protein spot is declared to be differentially expressed while the difference was due to pure chance. The large number of tests can produce a high number of false positives. For example, in an experiment with 2000 spots per gel, an accepted false-positive rate alpha of 5% will result in about 100 proteins that are found to be "differentially expressed" although the difference is the result of mere chance.
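The arithmetic behind this example is simple enough to check directly. The sketch below assumes that no spot is truly differentially expressed and, for the second function, that the tests are independent; both function names are hypothetical.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of false positives if every null hypothesis is true."""
    return n_tests * alpha

def chance_of_any_false_positive(n_tests, alpha):
    """Probability of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

spots_per_gel = 2000
alpha = 0.05
expected = expected_false_positives(spots_per_gel, alpha)
any_in_twenty = chance_of_any_false_positive(20, alpha)
print(expected)
print(round(any_in_twenty, 3))
```

Even with only 20 tests, the chance of at least one false positive at alpha = 5% is already well above one half, which motivates the corrections described next.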

The MeV t-test module incorporated in Delta2D provides methods to control the proportion of false positives in the result set (False Discovery Rate - FDR). Overall, the False Discovery Rate approach allows one to strike a balance between the need to find statistically valid proteins of interest and the additional cost that is associated with following up on false positives.

In the t-Test options dialog, make sure you have selected "p-values based on permutations". Select "Stepdown Westfall and Young methods". Choose a bound for the number of false positive spots in the result set using the option "number of false positive genes should not exceed". Alternatively, choose a bound for the proportion of false positive spots in the result set, using the other radio button and text box.

Template Matching

With Template Matching, you can define a template for an expression profile and let Delta2D find spots whose expression profiles match the template. For example, in a time series experiment you might want to look for spots whose expression level increases with time.


a) tmev/template_profiles b) tmev/template_centroid
Figure 8.20: a): Expression profiles matching the template. b): Comparison between template (blue line) and matching expression profiles.

Templates can be entered directly by specifying an expression level for every image. Alternatively, you can select a spot in the list on the top left of the dialog and use its expression profile as a template by pressing Select highlighted gene from above list to use as template. Increasing the p-value threshold will include more spots; decreasing it will result in more stringent matching. Templates can also be derived from existing clusters.

Click on the PTM (Pavlidis Template Matching) button in the toolbar, or choose Analysis |\ Statistics |\ Pavlidis Template Matching from the menu. The Help button (labeled "i" on the bottom left of the dialog) gives more information about the options.
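To make the idea of matching against a template concrete, here is a minimal sketch that ranks expression profiles by their Pearson correlation with a rising template. It is a simplification: the dialog works with a p-value threshold, whereas this example substitutes a plain correlation cutoff (min_r), and the spot names and values are invented.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def match_template(profiles, template, min_r=0.9):
    """Return the spots whose profile correlates with the template (assumed cutoff)."""
    return [spot for spot, profile in profiles.items()
            if pearson(profile, template) >= min_r]

# Template: expression level increases over four time points.
template = [1, 2, 3, 4]

# Hypothetical expression profiles, one value per gel image.
profiles = {
    "spot_101": [10, 12, 15, 19],   # rising
    "spot_102": [20, 18, 15, 11],   # falling
    "spot_103": [5, 9, 4, 8],       # no clear trend
}
matches = match_template(profiles, template)
print(matches)
```

Because correlation ignores scale and offset, only the shape of the template matters here, not its absolute levels.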


tmev/template_dialog_scaled

Figure 8.21: With Pavlidis template matching (PTM) you can specify a typical expression profile, e.g. one that increases with time.

Principal Component Analysis (PCA): Grouping and Visualization

When you do Principal Component Analysis (PCA) on a set of gel images, you get a two- or three-dimensional visualization of the image set that is optimal in a certain sense, i.e. it preserves the variation as much as possible. PCA works by taking spot intensities on every gel image and assembling them into a vector. So an experiment of 24 gel images with 1200 spots each would be represented as a cloud of 24 points in a space with 1200 dimensions. The goal of principal component analysis is then to find a projection of the point cloud into two- or three-dimensional space such that as much as possible of the variation of the point cloud is preserved. One hopes that the gels from different samples will be in separate regions of the resulting diagram. The principal components can then be interpreted as "typical spot patterns" or "eigengels". Their coordinates can be analyzed in order to determine which spots are contributing most to the variance, making them candidates for protein identification and biological interpretation.
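The projection described above can be illustrated on a toy data set. The sketch below is not how MeV computes PCA; it assumes a simple power iteration to find only the first principal axis, uses four invented gel images with three spots each, and omits safeguards such as handling a start vector that happens to be orthogonal to the leading axis.

```python
from math import sqrt

def center(rows):
    """Subtract the per-spot mean so the point cloud is centered at the origin."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def first_component(rows, iterations=200):
    """Leading principal axis via power iteration on X^T X (X = centered data)."""
    x = center(rows)
    dim = len(x[0])
    axis = [1.0 / sqrt(dim)] * dim
    for _ in range(iterations):
        scores = [sum(v * a for v, a in zip(row, axis)) for row in x]
        new_axis = [sum(s * row[j] for s, row in zip(scores, x)) for j in range(dim)]
        norm = sqrt(sum(v * v for v in new_axis))
        axis = [v / norm for v in new_axis]
    return axis

def project(rows, axis):
    """Coordinate of every gel image along the given axis."""
    return [sum(v * a for v, a in zip(row, axis)) for row in center(rows)]

# Hypothetical spot quantities: rows are gel images (C1, C2, T1, T2), columns are spots.
gels = [
    [10.0, 5.0, 1.0],
    [11.0, 5.0, 1.2],
    [2.0, 5.1, 9.0],
    [1.0, 4.9, 10.0],
]
axis = first_component(gels)
scores = project(gels, axis)
print([round(a, 3) for a in axis])
print([round(s, 3) for s in scores])
```

In this example the first and third spots dominate the axis, and the control and treated images end up on opposite sides of the origin. The large entries of the axis point to the spots that contribute most to the variance.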


a) tmev/pca_samples b) tmev/pca_samples_12
Figure 8.22: a): Principal component analysis of 24 gel images in 3 dimensions. Replicates have the same color. The view can be rotated by dragging with the mouse. Again, replicates are placed close together. b): The same principal component analysis of 24 gel images, projected onto the first two principal components. Treated and control samples (reddish vs. greenish colors) can be separated.

When principal component analysis is applied to the expression profiles, in our example we would consider a point cloud of 1200 vectors (one vector for each expression profile) with 24 dimensions (the expression levels on the 24 gels). The result is a display of the proteins where (hopefully) proteins with close positions are biologically related. Consider a time series experiment, where proteins are switched on and off in stages. If there is a "hidden parameter", such as a stage in the cell cycle, it will have a systematic influence on the expression levels, and thus increase the variance for the genes taking part in it. This increased variance will then become part of the directions that are used for the projection (the principal components). The principal components have also been called "eigengenes"; they can be seen as "classes of most prominent expression profiles" (see, for example, Alter et al. 2000 and Holter et al. 2000).

tmev/pca_genes

Figure 8.23: Principal component analysis of expression profiles in three dimensions. Differentially expressed spots were determined by t-test and highlighted orange and blue, respectively. Inset: First principal component.

Working with Sets of Spots

In the terminology of the TIGR Multiple Experiment Viewer (MeV), a cluster can be any set of expression profiles or samples (gel images). You can create new clusters by choosing Store Cluster in many displays of analysis results.


tmev/store_cluster

Storing a cluster of expression profiles:

Storing a sample cluster:

In the Cluster Manager you can change any attribute, e.g. cluster colors, or whether the color should be used in displays. Note that clusters may overlap, but only one cluster's color will be used in displays.


tmev/sample_clusters



tmev/venn_A tmev/venn_B

Cluster A Cluster B


When you have multiple clusters you can create new clusters that are combinations of selected ones:

tmev/venn_intersection tmev/venn_union tmev/venn_xor

Intersection of A and B Union of A and B XOR of A and B


In the Cluster Manager, select the clusters you want to combine. Right click, then select the operation you want to perform from the ClusterOperations submenu.
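The three combinations correspond to ordinary set operations. As a sketch, with clusters represented as sets of invented spot identifiers:

```python
# Hypothetical clusters, each a set of spot identifiers.
cluster_a = {"spot_12", "spot_47", "spot_103"}
cluster_b = {"spot_47", "spot_103", "spot_256"}

intersection = cluster_a & cluster_b   # spots in both A and B
union = cluster_a | cluster_b          # spots in A, in B, or in both
xor = cluster_a ^ cluster_b            # spots in exactly one of A and B

print(sorted(intersection))
print(sorted(union))
print(sorted(xor))
```

A typical use is intersecting a cluster of statistically significant spots with a cluster found by template matching, to keep only the spots that satisfy both criteria.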

Statistical Analysis is Integrated with Image Analysis

When you select one or more spots in a heatmap display, the selection will be immediately visible in other parts of Delta2D, such as the Dual View, or the Gel Image Regions View. You can extend the selection to a range of rows by holding down the Shift key while clicking on the end of the range. You can add or remove a single row by holding down the Ctrl key while clicking on it.

If you have organized spots of interest in the Cluster Manager, you can use these directly in Delta2D. Just right click on a cluster and choose Select in Delta2D. This will select the expression profiles in the cluster throughout all parts of Delta2D.

Getting a Spot Album of Relevant Spots

Using Delta2D's Spot Album Report, it is easy to show snapshots of the statistically significant spots you have found. All you have to do is mark these spots in the Delta2D project:

For more information about Reports see also section 8.7.

Overview of Statistical Methods

The following is a list of methods. For in-depth information, please refer to the MeV manual and the original papers cited below.

Clustering

HCL - Hierarchical Clustering

Eisen, M.B., P.T. Spellman, P.O. Brown, and D. Botstein. 1998. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. USA 95:14863-14868.

ST - Support trees (Bootstrapping)

Graur, D., and W.-H. Li. 2000. Fundamentals of Molecular Evolution. Second Edition. Sinauer Associates, Sunderland, MA. pp 209-210.

KMC - K-Means Clustering

Soukas, A., P. Cohen, N.D. Socci, and J.M. Friedman. 2000. Leptin-specific patterns of gene expression in white adipose tissue. Genes Dev. 14:963-980.

Template Matching

PTM - Template matching

Pavlidis, P., and W.S. Noble. 2001. Analysis of strain and regional variation in gene expression in mouse brain. Genome Biology 2:research0042.1-0042.15.

Principal Component Analysis

Raychaudhuri, S., J.M. Stuart, and R.B. Altman. 2000. Principal components analysis to summarize microarray experiments: application to sporulation time series. Pacific Symposium on Biocomputing 2000, Honolulu, Hawaii, 452-463. Available at http://smi-web.stanford.edu/pubs/SMI_Abstracts/SMI-1999-0804.html

Statistical Hypothesis Testing

TTEST - T-Tests

Pan, W. (2002). A comparative review of statistical methods for discovering differentially expressed genes in replicated microarray experiments. Bioinformatics 18: 546-554.

Dudoit, S., Y.H. Yang, M.J. Callow, and T. Speed (2000). Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. Technical report 2000 Statistics Department, University of California, Berkeley.

Welch, B.L. (1947). The generalization of 'Student's' problem when several different population variances are involved. Biometrika 34: 28-35.

ANOVA - One-way Analysis of Variance

Zar, J.H. 1999. Biostatistical Analysis. 4th ed. Prentice Hall, NJ.

TFA - Two-factor Analysis of Variance

Keppel, G., and S. Zedeck. 1989. Data Analysis for Research Designs. W. H. Freeman and Co., NY.

Manly, B.F.J. 1997. Randomization, Bootstrap and Monte Carlo Methods in Biology. 2nd ed. Chapman and Hall / CRC , FL.

Zar, J.H. 1999. Biostatistical Analysis. 4th ed. Prentice Hall, NJ.

References

Saeed AI, Sharov V, White J, Li J, Liang W, Bhagabati N, Braisted J, Klapa M, Currier T, Thiagarajan M, Sturn A, Snuffin M, Rezantsev A, Popov D, Ryltsov A, Kostukovich E, Borisovsky I, Liu Z, Vinsavich A, Trush V, Quackenbush J. TM4: a free, open-source system for microarray data management and analysis. Biotechniques. 2003 Feb;34(2):374–8.

Alter O, Brown PO, Botstein D (2000) Singular value decomposition for genome-wide expression data processing and modeling. Proc Natl Acad Sci U S A 97:10101–10106

Holter NS, Mitra M, Maritan A, Cieplak M, Banavar JR, Fedoroff NV (2000) Fundamental patterns underlying gene expression profiles: simplicity from complexity. Proc Natl Acad Sci U S A 97:8409–8414

TIGR Multiple Experiment Viewer (MeV): http://www.tm4.org/mev.html

TIGR MeV manual: http://www.decodon.com/Support/Documentation/MeV

8.7 Generating Reports

Delta2D now offers interactive reports on the current project. They make it easy to present data on relevant spots, experimental setup, and quantitative data. The reports are based on HTML, so you can put them on the web easily. Just as easily, you can process all or part of a report in your favorite word processor or presentation program by copying excerpts into it.

All reports can be accessed via the Reports menu in the Project Manager. They are opened in your web browser. If you want to have a closer look at a gel image or a spot, just click on it and it will be opened and focused in Delta2D. You can save the reports in an HTML format that is ready to be published on the web.

Project Summary

The project report shows a summary of your analysis project. It includes an overview of gel images and warpings as well as general data about the gel images, groups, samples, and images.

The dual channel images included in the report give a good indication of the quality of the direct warpings in the project. You can open a dual channel image in Delta2D by clicking on it.

As with all reports, you can click the Save button to save the summary in a form that is ready to be published on the web.

Spot Album

The spot album shows thumbnails of marked spots and the region surrounding them. You can show spots in comparison using dual channel images. The album can be configured using the form in the upper part of the report: You can select marked spots from different gel images, the image they should be compared to, and the width and scale of the gel section that should be displayed.


reports/spotalbum_scaled
Figure 8.24: The spot album report

Next to each spot row you see the expression profile as a chart. Clicking on the expression profile takes you to a detail page that shows additional quantitative data. Click on any spot in the row to select and show it in the dual view.

Spot Quantities

The spot quantities report shows expression profiles numerically, together with group-wise ratios and t-Test values. You select spots for the report by marking them on a gel image. This report is well-suited for documenting a set of relevant spots, and for further statistical analysis.


reports/spotquantities_scaled
Figure 8.25: The spot quantities report

Modifying, Saving, and Printing Reports

All reports are produced in the form of HTML pages that are generated dynamically by Delta2D. This means you can easily integrate them into your current project documentation. Select a part of the page and copy it into a Microsoft Word document, or into PowerPoint. You can save the whole report using the Save button in the top right of the report. Delta2D will then prompt you for a file to which the report should be saved. The report will be saved without the configuration form. The sub-pages (e.g. expression profile details from the spot album) will also be saved and linked properly. The result is a set of HTML files and images that can be put directly on the web.

If you want to make changes to the whole report document it is recommended that you open the saved HTML file in a word processor. Usually, you can print the report directly from your web browser. For more advanced printing needs (e.g. splitting wide pages) we also recommend using a word processing program.

8.8 Exporting Spot Data to Other Applications

All the data you see in the quantitation table can be exported for further use in external programs. In the table window, use File |\ Save... This will save all rows that are visible in the quantitation table, so you can hide rows that you don't want to export.

The data is saved in a common exchange format called "comma separated values" (CSV) that can be imported easily into a spreadsheet or other data analysis programs. For easier reference, the column titles are given in the first line of the file. Saving data in CSV format will take hidden columns and sorting into account, so you can use the quantitation table's sophisticated sorting and filtering to select the rows and columns that should appear in the saved file.

The import procedure depends on the program you use. Generally, you open the data file as a text file, specifying that the data is separated by semicolons.
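As an illustration of the import step, the following sketch reads a semicolon-separated export with Python's standard csv module. The column titles ("Spot ID", "Label", "Volume") and the values are invented for the example, and it assumes that numbers are written with a decimal point; an actual export will contain whichever columns were visible in the quantitation table. A real file would be opened with open(...) instead of the in-memory text used here.

```python
import csv
import io

# Stand-in for an exported file; the first line holds the column titles.
exported = io.StringIO(
    "Spot ID;Label;Volume\n"
    "12;HSP70;0.153\n"
    "47;;0.027\n"
)

# The delimiter must be given explicitly because the fields are separated by semicolons.
reader = csv.DictReader(exported, delimiter=";")
rows = list(reader)

# Example of further processing: map each spot ID to its numeric volume.
volumes = {row["Spot ID"]: float(row["Volume"]) for row in rows}
print(volumes)
```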

Label data, label formats and spot data are saved in XML file formats to allow for easy processing using external applications. Detailed specifications of these formats are available upon request.

Spot Picking

Delta2D can produce output for Genomic Solutions and Molecular Dynamics spot pickers, as well as a generic spot picking format in tabular form. Centers of detected spots as well as arbitrary labeled points on a gel may be selected for picking.

Currently, Delta2D is shipped with support for the following spot pickers:

The Generic File Format
The generic file format is a simple ASCII-text file in tabular format that includes marked spots and labels. The tabular format can be easily transformed to other formats if necessary. See below for what to do if your spot picker is not supported by Delta2D.

Use Spots |\ Export generic pick list to generate a pick list in the generic format. The pick list includes all spots that are marked (using the mark check box in the quantitation table), together with all labels on the selected layer. For a marked spot that has no label, the spot's center will be used to define the pick. If there are one or more labels inside a spot, one pick per label will be produced, and the spot's center will be ignored. For a label that labels a point outside any spot, one pick will be generated, as well.

The generic file format consists of four columns separated by tabs. They contain the following data:

Spot ID
The ID of each spot as used in Delta2D.
Coordinates
The next two columns contain the X and Y coordinates of the exported spot, respectively.
Label
The last column contains the label of each spot.
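A pick list in this layout is straightforward to read with a script, for example when converting it for an unsupported device. The sketch below assumes exactly the four tab-separated columns described above, no header line, and numeric coordinates written with a decimal point; the Pick type, the function name, and the sample values are hypothetical.

```python
import csv
import io
from typing import NamedTuple

class Pick(NamedTuple):
    spot_id: str
    x: float
    y: float
    label: str

def read_pick_list(handle):
    """Parse a generic pick list: Spot ID, X, Y, Label, separated by tabs."""
    picks = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) != 4:
            continue  # skip blank or malformed lines
        spot_id, x, y, label = row
        picks.append(Pick(spot_id, float(x), float(y), label))
    return picks

# Stand-in for an exported pick list; the second pick has an empty label.
sample = io.StringIO("12\t104.5\t220.0\tHSP70\n47\t310.0\t98.5\t\n")
picks = read_pick_list(sample)
for pick in picks:
    print(pick)
```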
The Molecular Dynamics File Format
The file is generated according to the same rules as the generic file format, i.e. when you want to pick a spot, you have to mark it in the table or place a label inside of it. The Molecular Dynamics spot picker needs two special landmarks that are placed on the gel. In order for the robot to register the gel image to the physical gel, you need to provide labels for the two landmarks. They have to be named "IR1" and "IR2", respectively. Be careful that the labels point exactly to the centers of the landmark points.

The layout of the exported text file is slightly different from the generic format. The columns are also separated by tabs, except for the coordinates: they are placed in one column, separated from each other by a comma. The ID in the first column is simply a serial number, not the one used in Delta2D. The ID used in Delta2D is set in square brackets and attached to the labels in the fourth column. All other aspects are identical to the generic format.

The Genomic Solutions File Format
The file for Genomic Solutions includes additional information: the image field is filled with the name of the image and the name of the project. The table consists of six columns separated by commas, of which the last three columns consist of generic data. The first column contains the spot definition in the form Spotn=SpotID - Label, where the n in Spotn is a counter starting from 0 and SpotID is the ID used by Delta2D. The second and third columns contain the X and Y coordinates of the spot center, respectively.
The Ettan Spot Handling WorkStation File Format
This file format has a simple structure: the first of the four tab-separated columns counts the spots starting from 1, the next two contain the X and Y coordinates of the spots, respectively, and the fourth is reserved for comments but is not currently used by Delta2D.
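The column layout just described can be sketched as follows. This only mirrors the four columns named in the text; any header lines, coordinate units, or number formatting the workstation may expect are not specified here and are therefore not modeled. The function name and coordinates are hypothetical.

```python
def ettan_rows(coordinates):
    """Build tab-separated rows: counter from 1, X, Y, empty comment column."""
    return [
        "\t".join([str(number), str(x), str(y), ""])
        for number, (x, y) in enumerate(coordinates, start=1)
    ]

# Hypothetical spot centers as (X, Y) pairs.
spot_centers = [(104.5, 220.0), (310.0, 98.5)]
rows = ettan_rows(spot_centers)
for row in rows:
    print(repr(row))
```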
What if my Picker is not Supported?
We are constantly working on broadening the range of supported spot picking file formats. If your device is not supported, please do not hesitate to contact our technical support – we will be glad to work with you to find a solution.

Instant MS Excel Reports

Use File |\ Generate Report In Excel... to produce an Excel worksheet that contains the currently visible data in the table, plus an extensive set of diagrams and statistics. You need to have Excel installed to use this feature.

________________________________________________________________________________

Note:
Since the Excel report is meant to compare the spot data of different gel images to each other, and the saving and exporting of table data always refers to what you see, this feature is only available from the views in the table showing data of multiple gel images.

Additionally, you can export just the contents of a quantitation table into Excel, using Delta2D to sort and filter data before the worksheet is created. Use File |\ Export into Excel to transfer exactly the data displayed in the active table, in the same arrangement of columns and rows, to a new Excel sheet.

This feature has been tested with MS Excel versions 2000 (9), XP (10), and 2003 (11).

Instant Export to MS PowerPoint

From within the Gel Image Pair View window, you can create a PowerPoint slide that includes everything you see in the gel image view: images, spots, and labels. Open a snapshot window using Edit |\ Snapshot. In the snapshot window, use File |\ Export to PowerPoint to produce a PowerPoint slide that contains all objects which are currently visible in the gel image view. These objects are fully editable inside PowerPoint. You need to have PowerPoint 2000, XP, or 2003 installed to use this feature.