Chapter 8
Exploring Analysis Results
Delta2D displays quantitative data in flexible tabular views (see
figure 8.1) that fit your analysis needs. Table rows can be filtered and sorted by numerical and
non-numerical columns, making it easy to identify relevant sets of spots. The table display is always
synchronized with the spot boundaries on the gel image view, so you can go from image to data and
back again with ease.
8.1 The Quantitation Tables
| Figure 8.1: | The quantitation table |
|
The Quantitation Tables give you access to your data in three basic types of representation:
-
Single gel tables
- show all available quantitative data for a single gel image.
-
Multi gel tables
- show the same data as the above type plus data comparing the spots of the
gel images to each other.
-
Statistics table
- shows one selectable subset of the above data in a statistical evaluation for
each group plus data comparing the spots groupwise.
The Quantitation Table can be opened manually, if there is quantitative data, by choosing
Spots > Show Table or via the menu Window > Quantitation Table. The latter menu item opens the
general quantitation table window as described below, whereas the first menu item additionally invokes
a multi gel table with the two gel images represented in the Gel Pair View. There are several tabs at the
bottom of the table:
The single gel tables are labeled each with the name of the respective gel image. The multi gel tables
are labeled with either the names of the gel images involved, or, if containing all gel images, simply
with All gel images. The statistics table is labeled Statistics.
Tab borders are color coded according to your current color scheme.
Single Gel Tables
Single gel tables show only data for one gel. Each row represents one spot and the columns
show the data of this spot. For a more detailed description of this data, please refer to table
8.1.
Multi Gel Tables
Usually, you will only see one table of this kind: the All gel images table. But you also have the
possibility to compare two gel images directly to each other. Simply open the two gel images you
want to compare in the Gel Pair View. Then open the tables in the first way described
above: choose the menu item Spots > Show Table in the Gel Pair View. Basically, the
structure is the same as in the single gel tables, except that each row in the table represents a
combination of corresponding spots. If no correspondence could be found for a spot, it
is placed on a row by itself. The columns, a more detailed description of which you will
find in table 8.1, are repeatedly represented, once for each gel image. The column headers
are color coded as well to make it easy to see from which gel the data in the column was
taken.
The Statistics Table
This is the first table you get to see when opening the Quantitation
Table window. It gives you a statistical overview of the obtained data, sorted by groups. As in
the other table views, each row represents a spot and its correspondences. But unlike the
other tables you only have one column per gel image and seven columns with statistical
data per group: minimum, maximum, mean, relative standard deviation, number, ratio, and
t-test.
-
minimum
- shows the lowest value of the set of corresponding spots in this group.
-
maximum
- shows the highest value of the set of corresponding spots in this group.
-
mean
- is the arithmetic mean of this set of spots in this group.
-
relative standard deviation
- indicates the standard deviation within this set of spots in
percent of the total of the group.
-
number
- indicates how often this spot is represented in this group.
-
ratio
- shows the ratio for a certain parameter of the min/max/mean of this group to the
min/max/mean of the group to which the leftmost gel image in the project matrix belongs.
Choose the parameter and the function min, max or mean at the top of the table.
-
t-test
- Based on Student's t-test, the data in this column indicates the error
probability for the assumption that this group belongs to the same parent population as
the group to which the leftmost gel image in the project matrix belongs.
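The per-group columns listed above can be illustrated with a minimal Python sketch. It is not Delta2D's implementation: the function name and the sample values are hypothetical, and the relative standard deviation is assumed here to be the sample standard deviation expressed as a percentage of the group mean.

```python
from statistics import mean, stdev

def group_statistics(values):
    """Summarize one spot's values across the gel images of one group.

    `values` holds one number per gel image on which the spot is represented.
    """
    avg = mean(values)
    # Assumption: RSD = sample standard deviation / mean, in percent.
    if len(values) > 1 and avg != 0:
        rsd = 100.0 * stdev(values) / avg
    else:
        rsd = 0.0
    return {
        "minimum": min(values),
        "maximum": max(values),
        "mean": avg,
        "relative standard deviation": rsd,
        "number": len(values),
    }

# Hypothetical %V values of one spot on three replicate gels of a group.
control = [0.20, 0.25, 0.30]
stats = group_statistics(control)
```

With these values, the spot is represented three times in the group and its values scatter by roughly 20% around the mean.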
As an example of how the ratio columns in the Statistics Table are calculated, imagine the settings at
the top of the table are Spot property: %Volume, ratio=sample groups mean / group control mean.
The ratios are calculated by the following procedure: The columns in the Statistics Table are sorted by
groups, while the groups are sorted according to their order in the project manager from left to right.
For every group, except for the first one in the Statistics Table, the mean of the normalized volume
(%Volume) is calculated and divided by the mean of the normalized volume over the first
group.
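The procedure just described can be sketched in a few lines of Python. The group names and %Volume values are made up for illustration, and the function name is hypothetical; the only logic taken from the text is "mean of each group divided by the mean of the first group".

```python
from statistics import mean

# Hypothetical %Volume values of one spot. The dictionary order stands for
# the left-to-right order of the groups in the project manager.
groups = {
    "control": [0.10, 0.12, 0.11],
    "treated A": [0.22, 0.20, 0.24],
    "treated B": [0.05, 0.06, 0.055],
}

def ratios_to_first_group(groups, func=mean):
    """Divide func(group) by func(first group) for every group but the first."""
    names = list(groups)
    reference = func(groups[names[0]])
    return {name: func(groups[name]) / reference for name in names[1:]}

ratios = ratios_to_first_group(groups)
```

Passing `min` or `max` as `func` corresponds to choosing the min or max function at the top of the table instead of the mean.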
| Column | Description |
| mark | Check this box to mark or unmark a row. |
| hide | Check this box to hide a row (it will be hidden immediately). |
| cancel | Check this box to cancel the spots in a row. Canceled spots are excluded
from further analysis. |
| Normalization set | Here you can select a subset of the spots that will be used to normalize
the quantities of the spots on a gel. By default, all spots are in the
normalization set. This results in relative spot volumes being computed
by setting total spot volume on a gel to 100%. |
| Symmetric ratio | This column will display the symmetric ratio of the relative volumes.
The difference between the symmetric ratio and the conventional ratio
column is that for a 1:2 ratio the symmetric ratio column will show "-2",
meaning a two-fold decrease. For the opposite 2:1 ratio, both columns
will display 2, meaning a two-fold increase. |
| % V | The relative quantity of the spot, excluding background. The total
quantity of all spots on the gel is 100%. |
| ratio | The numerical expression ratio (sample spot / master spot). Depending
on your settings in the Tables tab in the options dialog (please refer to
section 9.6) this column shows the ratio as mathematical ratio or as fold
change. Additionally it can contain color coded representation of the
ratio. |
| V | Volume, i.e. the absolute quantity of the spot, in gray units, excluding
background. One black pixel with no background has absolute quantity
1. |
| A | The area of the spot. |
| bg | The background volume for the spot. |
| avg | The average intensity of the spot, including background. |
| ID | The numerical ID of the spot. |
| frq | Frequency: the number of rows in which this spot is displayed. |
| label | One or more labels attached to this spot. |
| Table 8.1: | Columns in the quantitation table. |
|
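Two of the columns in table 8.1 are simple transformations that can be written down directly. The following Python sketch shows one way to compute relative volumes from a normalization set and to turn a conventional ratio into a symmetric ratio. The function names and the spot IDs are hypothetical; the behavior for ratios of 1:2 and 2:1 follows the table description.

```python
def relative_volumes(volumes, normalization_set=None):
    """Return %V per spot ID.

    The total volume of the spots in the normalization set is set to 100%.
    By default, all spots belong to the normalization set.
    """
    ids = set(volumes) if normalization_set is None else set(normalization_set)
    total = sum(volumes[i] for i in ids)
    return {i: 100.0 * v / total for i, v in volumes.items()}

def symmetric_ratio(ratio):
    """Map a conventional ratio to a symmetric one (0.5 -> -2, 2 -> 2)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio if ratio >= 1 else -1.0 / ratio

# Hypothetical absolute volumes keyed by spot ID.
percent_v = relative_volumes({1: 30.0, 2: 70.0})
```

Note that a spot outside a restricted normalization set still gets a %V value in this sketch; it is simply expressed relative to the total of the set.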
Changing the Table Layout
To change the width of a column, just place the mouse pointer in the
table header between two columns. When you see that the mouse pointer changes, click and drag to the
left or to the right until the desired column width is reached. A column can be moved by clicking into
its header and dragging it to the left or to the right. You can hide a column completely by using the
entries in the Column menu.
Table Properties
A quick and effective way to customize the Quantitation Tables is to open their Properties dialog
by choosing Column > Table Properties from the Quantitation Tables' menu. This dialog lets you
set the layout of the table either quickly or in detail.
| Figure 8.2: | The properties dialog of the quantitation table |
|
This dialog consists of a table and four buttons, which let you change the following options:
The table:
-
Ratio Master
- Here you determine the gel image to which the calculation of ratios refers.
-
Visible
- Check the gel images you want to see in your multi gel table. Visibility also applies to
the Gel Regions View.
-
Gel
- The list of all gel images included in your project.
-
. . .
- All the columns as described in table 8.1. For each gel image, check the columns you
want to make visible.
The buttons:
-
All Columns
- Click here to set all columns for all gel images to be visible.
-
Quantity
- Click on this button to show only those columns that are related to
quantitative data of spots, plus ID, Label, and coordinates of spots.
-
Ok
- As in any other dialog: Apply changes and close dialog.
-
Cancel
- Discard changes and close dialog.
-
Apply
- Apply changes but leave the dialog open.
8.2 Working with Spots
Sorting and Selecting Spots
Now let us sort the table by the relative volume of the master spots. Just
click into the lower part of the column header. A small arrow indicates the sort order; click again to sort
in reverse order.
| Figure 8.3: | Part of quantitation table, sorted on the fifth column |
|
Sorting makes it easy to identify the most intensive spots or those with a high expression ratio:
just sort and then select the top rows. The selected spots will be highlighted in the main
window.
You can use any column for sorting. Try, for instance, sorting on the color-coded expression
ratios.
Select one of the rows by clicking on it. Observe how the corresponding spot segments on the
master and sample gel images are highlighted. You can select additional rows by pressing the
Control key while clicking on them. Shift-clicking on a row selects all rows up to that row.
Dragging the mouse over consecutive rows selects them, too. Use the menu item Edit > Select All
to select all rows in the quantitation table, and Edit > Invert Selection to invert the
selection.
You can select a spot in the gel window by clicking somewhere within its boundary. The
corresponding row in the table will be selected automatically.
Here is how to select the 10 most intensive spots on a certain gel image:
- Switch to the single gel table for the gel image: just click on the tab for this image. The tab's
border has the same color as the gel image in the project manager.
- Click on the header of the column that is labeled %V. The table is now sorted according
to master spot volume. In the column's header, you see a little arrow that indicates the sort
order.
- Click again to reverse the sort order. Rows are sorted in descending order now, i.e. the
spot with greatest quantity is in the first row.
- Select the first ten rows in the table: Click on the first row and drag down to the tenth line.
You can watch in the title how many rows you have selected.
Watch how the ten most intensive spots are highlighted on the gel view, as well.
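If you export the table and want to reproduce this selection outside Delta2D, the steps above amount to a descending sort followed by taking the first rows. A minimal Python sketch, with hypothetical row data and a hypothetical function name:

```python
def most_intense_spots(rows, n=10, column="%V"):
    """Sort rows by `column` in descending order and return the first n."""
    return sorted(rows, key=lambda row: row[column], reverse=True)[:n]

# Hypothetical single gel table with five spots.
rows = [
    {"ID": 1, "%V": 0.4},
    {"ID": 2, "%V": 2.1},
    {"ID": 3, "%V": 0.9},
    {"ID": 4, "%V": 3.3},
    {"ID": 5, "%V": 0.1},
]
top = most_intense_spots(rows, n=2)
```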
Selecting Spots in the Gel Image Pair View
You can select a spot in the gel image view by clicking
on it. Make sure that the spots tool is activated before you select spots. Additionally, you can select
spots in a rectangular region by dragging with the mouse.
Hiding Spots
The gel image view will always reflect the contents of the quantitation table, i.e. any
spot that is visible or selected in the table will be visible or highlighted respectively in the gel
window.
In some situations, it may be useful to hide some spots from the analysis. You can do this by checking
the box in the "hide" column. The row will be hidden immediately. Hide a group of rows
by selecting them and using View > Hide Selected Rows. Since the quantitation table is
synchronized with the main window, spots you hide in the table will also be hidden in the main
window.
Check View > Show Hidden Rows to see all hidden rows again. You can now click in the hide
column to mark a row as visible or invisible; the display will not change. Use
View > Hide Selected Rows and View > Don't Hide Selected Rows to control the visibility of
whole groups of rows. Unchecking View > Show Hidden Rows will let your changes take
effect.
Of course, all these adjustments can be made on any tab of the table.
Canceling Spots
A canceled spot will be excluded from the analysis just as if it had never
been detected. Single spots or rows can be canceled by clicking on the check box in the cancel column.
You will sometimes want to cancel spots in a region such as the border of the image. To do this,
activate the spots tool in the gel image view and select the region by dragging with the
mouse. Right-click to open a context menu and select cancel to cancel all spots you have
selected.
Marking Spots
You will often want to concentrate on a subset of spots, such as those with a high
expression ratio. Sometimes you will select spots individually, based on your own criteria. For this
purpose, Delta2D lets you make a "note" on a spot, in the mark column. You can add new
marks at any point of your analysis, building an increasing set of interesting spot pairs. Later
you will see how to display only marked spots, or how to do other things to spots that are
marked.
A single spot can be marked by clicking on the check box in the "mark" column. When you are in a
correspondence view of the quantitation table, this will mark all spots in the row. Mark multiple
rows by first selecting them and then choosing Mark > Mark Selected Rows. Marks will always be
added to what you have already marked. Marks can be cleared using Mark > Unmark Selected
Rows.
More advanced operations can be executed by combining selection and marking. Say, you have
first identified all the interesting spots by marking them and now you want to hide all other
spots:
- use Mark > Select Marked Rows to select all the marked rows
- use Edit > Invert Selection to select only the rows that are not marked
- use View > Hide Selected Rows to hide all rows that are not marked
Similarly, clearing all marks can easily be done by choosing Edit > Select All and then
Mark > Unmark Selected Rows. To see which rows you have marked, click on the mark column to
sort by it; this will separate marked from unmarked rows.
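The combination of marking, selecting, inverting and hiding described above behaves like a sequence of set operations. The following Python sketch models it with plain sets; the row IDs are hypothetical and the variable names are chosen for illustration only.

```python
all_rows = set(range(1, 11))   # hypothetical row IDs 1..10
marked = {2, 5, 7}             # rows marked as interesting
hidden = set()

selected = set(marked)         # Mark > Select Marked Rows
selected = all_rows - selected # Edit > Invert Selection
hidden |= selected             # View > Hide Selected Rows

visible = all_rows - hidden
```

After the three steps, the visible rows are exactly the marked ones.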
Counting
Delta2D helps you count how many spots are visible or selected in a table. Counts are
displayed in the table's title bar. In a single gel table, it may look like this
These numbers represent the number of total / visible / selected items. In the example, there are 1048
spots in total, of which 1048 are visible and 4 are selected.
Select the master - sample tab to switch to the correspondence table. Select a few rows
in the table and watch the table's title bar. Here, a row means a spot correspondence, for
example:
Rows: 3598 / 2119 / 12 Master: 2205 / 1409 / 12 Sample: 2265 / 1451 / 8
| Figure 8.4: | Automatic counting in the title bar of the correspondence table. |
|
Again, these numbers represent the number of total / visible / selected items. In the example, Rows:
3598 / 2119 / 12 means that there are 3598 rows in total, of which 2119 are visible and 12 are
selected. Likewise, the title bar shows that there are 2205 master spots, 1409 of them are visible, and
the selection includes 12 master spots. All counts are automatically updated when you hide or select
rows.
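The total / visible / selected triple is easy to recompute from exported data. The sketch below assumes each row is a dictionary with hypothetical boolean fields `hidden` and `selected`; whether Delta2D counts hidden rows among the selected ones is not stated here, so the sketch simply counts every selected row.

```python
def count_label(name, rows):
    """Format counts as 'name: total / visible / selected'."""
    total = len(rows)
    visible = sum(1 for row in rows if not row["hidden"])
    selected = sum(1 for row in rows if row["selected"])
    return f"{name}: {total} / {visible} / {selected}"

# Hypothetical table state: five rows, one hidden, two selected.
rows = [
    {"hidden": False, "selected": True},
    {"hidden": False, "selected": True},
    {"hidden": False, "selected": False},
    {"hidden": True, "selected": False},
    {"hidden": False, "selected": False},
]
label = count_label("Rows", rows)
```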
Filtering
Sometimes you want to focus the analysis on spots that meet certain criteria, say those with an
expression ratio between 2 and 5. Of course, you could sort according to the expression ratio column
and then select those spots manually, but there is a much more convenient way to do this: use a filter. A
filter will show only those rows that meet your criterion. Filters can be set on most columns, see the
Filter menu for all available filters.
Let's start using filters. First, we want to display only those rows whose expression ratio
is between 2 and 5. Choose Filter > Ratios > sample / master to get a filter dialog, or simply click on
the button labeled "Filter" on top of the appropriate column.
| Figure 8.5: | Editing a table row filter. Here, Delta2D will show only spots that have expression
ratios between 0.5 and 2. |
|
Click on one of the check boxes labeled Active to activate the filter. Enter 2 into the left Ratio
field named First Border. Now enter 5 into the right Ratio field (Second Border). In the upper part
of the dialog you can watch in the histogram which range of spots will be displayed. You
can also use the sliders below the histogram to shift the borders of the displayed range up
and down. If the sliders cannot be adjusted finely enough for your purposes,
you can enlarge the dialog window by dragging its borders like any other
window.
Another convenient way of determining borders for your filter is given by the fields above the
histogram: the topmost row refers to the total of all absolute values this filter refers to. You can use any
value between 0 and this total to indicate how big the ranges of the low, the middle and the high interval
should be.
In the second row you can determine the size of these ranges by a relative value, e.g. set the low
range of the filter to 20% of the total of all values.
The third and fourth rows give you the count of all values you want to apply this filter to, as absolute
and relative numbers respectively. This makes it easy to set the lower border of the filter to, say, the
150 smallest values, or, in the fourth row, you could type in 10 to set the border to the 10% smallest
values.
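Setting a border by count or by percentage can be pictured as looking up a position in the sorted list of values. The following Python sketch is one plausible reading of that behavior, not a description of the dialog's internals; the function name and the rounding rule (round up to a whole number of values) are assumptions.

```python
import math

def border_for_smallest(values, count=None, percent=None):
    """Return the border value that covers the smallest values.

    Give either `count` (e.g. the 150 smallest values) or `percent`
    (e.g. 10 for the 10% smallest values).
    """
    ordered = sorted(values)
    if count is None:
        # Assumption: a fractional number of values is rounded up.
        count = math.ceil(len(ordered) * percent / 100.0)
    count = max(1, min(count, len(ordered)))
    return ordered[count - 1]

# Hypothetical column with ten values.
values = [5, 1, 4, 2, 3, 10, 9, 8, 7, 6]
by_count = border_for_smallest(values, count=3)
by_percent = border_for_smallest(values, percent=20)
```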
Press OK to save the changes you have made to the filter and close the filter dialog, or Apply to just
apply the changes without closing this dialog. You can directly switch to another filter without having
to close and reopen this dialog. The table, as well as the gel image, now contains only those spots
whose expression ratios lie between 2 and 5. By looking at the table's title bar, you can tell
how many rows meet your criterion. You can continue to work with the filtered table as
usual.
The button in the header of the ratio column has changed to a short description of the filter. Leave
your mouse pointer over the button for a while to get a tool tip that contains a more detailed
description.
Filters can be combined to implement more complex criteria, such as "show all rows with expression
ratio between 2 and 5 and master spot volume greater than 0.1".
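Combining filters means that a row is shown only if it passes every active filter. A minimal Python sketch of the example criterion follows; the column names, helper functions and row values are hypothetical, and the range borders are assumed to be inclusive.

```python
def in_range(column, low, high):
    """Filter: keep rows whose value lies between low and high (inclusive)."""
    return lambda row: low <= row[column] <= high

def greater_than(column, threshold):
    """Filter: keep rows whose value exceeds the threshold."""
    return lambda row: row[column] > threshold

def apply_filters(rows, filters):
    """Keep only the rows that pass all filters."""
    return [row for row in rows if all(f(row) for f in filters)]

rows = [
    {"ID": 1, "ratio": 3.0, "%V": 0.25},
    {"ID": 2, "ratio": 3.5, "%V": 0.05},   # fails the volume filter
    {"ID": 3, "ratio": 1.2, "%V": 0.40},   # fails the ratio filter
    {"ID": 4, "ratio": 5.0, "%V": 0.11},
]
visible = apply_filters(rows, [in_range("ratio", 2, 5), greater_than("%V", 0.1)])
```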
Example: Showing Spots Whose Quantities Differ by More Than a Factor of Two
-
Problem:
- You want to focus on spots with a "significant" expression ratio, i.e. the expression
ratio should be less than 0.5 or greater than 2.
-
Solution:
- Use a negated filter that shows only rows whose expression ratio is not between 0.5
and 2. Choose Filter > ratio (your gel image names) to get a filter dialog and enter the
data as shown in Figure 8.6.
| Figure 8.6: | A filter that hides expression ratios between 0.5 and 2. |
|
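In terms of logic, a negated filter simply inverts the range test. The short Python sketch below illustrates this for the example; the function name and the ratio values are made up.

```python
def outside_range(low, high):
    """Negated range filter: keep values that are NOT between low and high."""
    return lambda value: not (low <= value <= high)

keep = outside_range(0.5, 2.0)
ratios = [0.3, 0.5, 1.0, 2.0, 2.5, 4.0]
shown = [r for r in ratios if keep(r)]
```

Ratios of exactly 0.5 and 2 are hidden in this sketch, which matches the problem statement (less than 0.5 or greater than 2).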
Scatter Plots
In addition to the numerical means of identifying distinctive spots, Delta2D offers a
visual tool: the scatter plot. Scatter plots show the ratios of the relative volumes in two gel images. You
can produce a scatter plot by going to the Project Manager, right-clicking on a gel pair and choosing
the menu item Scatter plot. Or, from the gel pair view, you can use the shortcut Ctrl + L, or choose the
menu item Spots > Show Scatterplot.
| Figure 8.7: | The scatter plot |
|
Like any other part of Delta2D, the scatter plot interacts with the other parts. If you have selected one
or more matched spots in the gel pair view, they are also selected in the scatter plot, and vice
versa.
You can zoom in to the scatter plot by 'drawing' a rectangle around the region you want to magnify.
To do this, click and drag with the left mouse button from the top left to the bottom right. To reset the
view simply click and drag in any other direction.
8.3 Gel Image Regions Window
| Figure 8.8: | Same region of four different gel images |
|
Back to our project in the project manager. The Gel Image Regions view lets you display the same
image region of all gel images in the project side by side. Open the Gel Image Regions window by
choosing the menu item Window > Gel regions. The view looks similar to Figure 8.8; spots will be
displayed (and highlighted) if they are present on a gel. You can use the scroll bars to move the region
that is displayed, or simply click in one of the views while holding the Alt key and drag in the desired
direction.
When you have opened a Dual View window, its displayed area will determine the segment shown
in the Gel Image Regions window.
8.4 Spot Color Coding
Color coding for spots lets Delta2D display a gel image (or a proteome map) with spots colored
according to their expression profiles. For an example, see the figure below. Spots are colored by the
following scheme: spots that are increased in sample 1 and in no other sample are shown in red, green
is for spots that are increased in sample 2, etc. Yellow is for spots that are increased in samples 1 and 2,
etc.
| Figure 8.9: | A Region with Colored Spots. The color of a spot indicates on which sample(s) it is
increased. |
|
Start Spot Color Coding by choosing Window > Color Coding from the menu in any window of
Delta2D. A new window will open, letting you determine the type and settings for color coding. Color
coding can use two basic criteria for coloring the spots:
-
Subsets
- The subset of gel images on which a spot occurs determines the color in which it is
displayed.
-
Min/Max
- The spots are colored by the gel on which they have their maximum or minimum
volume.
Switch to the tab containing the options for the type of color coding you want to achieve.
Color Coding by Subsets
This option gives you an overview of the matches for every spot on a given
gel. First, select the gel image which will be used as "background" for the colored spots. Then
determine the subsets of matches you want to see: Every column in the Color Coding Scheme specifies
a combination of matches and a color. If a spot in the master gel image matches spots from the subset of gel
images specified in that column, the spot will be shown in the color of that column. To add a new match
subset, click on the
button. This will add a new empty subset, which you can configure by
clicking on the boxes in the column. Note that if the empty subset already appears in the table,
clicking the button will have no effect. To remove a selected subset, click on the
button.
You can select a subset for deletion by clicking on its column header. To add all possible
subsets, click on the
button. Afterwards, the table will contain one column for each possible
combination of matches across the gel images. To delete all existing subsets, click on the
button.
| Figure 8.10: | Choose Colors and Master Image |
|
Color Coding by Min/Max
This option enables you to highlight which group contains the spot for
which a given characteristic is most strongly or weakly displayed. Select the characteristic that you
want to highlight. Each spot will be assigned the color of the group that most strongly or weakly
represents that characteristic.
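The rule behind Min/Max coloring can be sketched as picking the group with the largest (or smallest) value and returning that group's color. In the Python sketch below, the group names, values and colors are hypothetical.

```python
def color_by_extreme(values_by_group, colors, use_max=True):
    """Return the color of the group with the maximum (or minimum) value."""
    pick = max if use_max else min
    group = pick(values_by_group, key=values_by_group.get)
    return colors[group]

# Hypothetical mean volumes of one spot in three groups.
volumes = {"control": 0.1, "sample 1": 0.4, "sample 2": 0.2}
colors = {"control": "blue", "sample 1": "red", "sample 2": "green"}

strongest = color_by_extreme(volumes, colors)                 # group with max volume
weakest = color_by_extreme(volumes, colors, use_max=False)    # group with min volume
```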
Example: Color Coding Spots by Subsets
By combining Spot Color Coding with spot
filtering, you can visualize various aspects of your experiment. As one example, let's make a
proteome map that shows which spots are increased under which conditions (or combinations of
conditions).
Step 1: Detect Spots
First you need to detect spots. We recommend that you do this on a
union-fused image and transfer spots to all the images that you want to include in the color
coding.
Step 2: Show a Subset of Spots on Each Image
The color code will show on which gel images a spot
is visible. We want to see where a spot is increased relative to its "standard" volume on the master
(control) image. Therefore we filter out the non-increased spots on each of the sample images. Go to the
all gel images table and set a filter for a factor of two or greater on the ratio columns. As a result
you will see on every single gel only the spots whose intensity increased relative to the
master.
Step 3: Choose Colors and Master Image
Right click on a gel image in the project manager and select
Spot Color Coding / By Subset. A dialog will appear that allows you to configure the color
coding: Select the Union image as master gel image, i.e. the gel image on which the spots will
be overlaid. The table is used to configure which subsets should be displayed in which
color. The leftmost check box column controls if a group is taken into account for color
coding, the second check box column controls if a spot's visibility on a certain gel image
will be taken into account. In the screenshot, we have checked the three sample images.
On these three images there may be eight different subsets for every spot: it can be visible
on sample1, or on sample1 and sample2 etc. Press the Add All button to get a list of all
possible combinations. A new color is assigned automatically to each combination. You can
change colors by right-clicking on the column and choosing Select color. You can change the
combination a color stands for by clicking inside the table. Press OK to open the color coding
window.
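The number of combinations mentioned in Step 3 follows from simple counting: three sample images give 2^3 = 8 subsets, including the empty one. The Python sketch below enumerates them and assigns one color per subset, in the spirit of the Add All button. The image names and the palette are hypothetical; Delta2D assigns its own colors.

```python
from itertools import combinations

def all_subsets(images):
    """Return every subset of the given images, from empty to complete."""
    return [
        frozenset(combo)
        for size in range(len(images) + 1)
        for combo in combinations(images, size)
    ]

samples = ["sample1", "sample2", "sample3"]
subsets = all_subsets(samples)

# Hypothetical palette, one color per subset in enumeration order.
palette = ["gray", "red", "green", "blue", "yellow", "magenta", "cyan", "white"]
scheme = dict(zip(subsets, palette))

def spot_color(visible_on):
    """Look up the color for the set of images on which a spot is visible."""
    return scheme[frozenset(visible_on)]
```

With this palette, a spot visible only on sample1 is red, and a spot visible on sample1 and sample2 is yellow.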
Step 4: Adjust the Display
In the color coding window you can use the View menu to adjust
the display; for example, you can choose to use the inverted image (white spots on black
background).
Color Coding Spots by Intensity
There is another variant of color coding where a
spot is colored according to the image on which it has minimal or maximal intensity. Open
the color coding dialog by going to the project manager, then right-clicking on a gel and
selecting color coding > by Min/Max. The dialog is similar to the dialog for color coding by
subset. You can select one color per gel image, as well as the parameter (volume, area, etc.)
to use for color coding.
Exporting the Color Coded Display
Use Export to PowerPoint in
the File menu to export the color coded gel image to PowerPoint. You can also make a
Snapshot window (using File/Snapshot) which can then be exported to a variety of image
formats.
8.5 Expression Profiles Window
| Figure 8.11: | The expression profiles window |
|
Like the Expression Profile Rollup, the Expression Profiles Window gives you a quick
overview of the distribution of selected spots in the gel images of your project, sorted by groups. But
unlike the Rollup, the Expression Profiles Window can do this for as many spots as you
want.
Simply select interesting spots in the Quantitation Table or in the Gel Pair View. Mark the spots in
the Gel Pair View by right-clicking on one of the selected spots and checking the box named Mark spot,
or, in the Table, by selecting the menu item Mark > Mark Selected Rows. Now you can open the
Expression Profiles Window from any of the other windows with the menu item
Window > Expression Profiles.
The window shows a number of small graphs, one for each spot you have selected. You can
determine their size and distribution with the controls on top of the window. If the spots are labeled,
each graph shows the label of the spots it represents in its title. If they are not, you can easily identify
them by right-clicking on the respective graph. The context menu that appears not only shows
you the IDs of the spots involved, but also gives you the opportunity to change their mark
status.
Additionally, you can change the design of the graphs and the data shown using the
View menu:
-
Show Group Bars Collapsed
- Combine all single bars of gel images of one group to one
single bar for the whole group.
-
Show Mean Values
- Shows the mean values for each group.
-
Show Standard Deviation
- Shows the standard deviation for each group.
-
Connect Mean Values
- Changes the representation of values by bars to a line which connects
the mean values, thus signaling the fold change of this spot.
-
Show Axis
- Shows a scale on the left border of each graph plus axis to make it easier to read
the volume of each spot.
As in any other part of Delta2D, the selection of spots is synchronized between windows.
________________________________________________________________________________
-
Note:
- Please note that in multichannel projects (e.g. DIGE setups; for details on
multichannel techniques please refer to section 4.7) some spots may not appear
in Expression Profiles. Spots on the standard gel image are used for
normalization, which means that matching spots on other gel images are expressed
relative to these spots as 100%. As a result, spots on other gel images that have no
matching spot on the standard do not appear in any representation of Expression Profiles.
8.6 Statistical Analysis of the Results
Since version 3.6 Delta2D incorporates advanced multivariate statistics in the analysis of 2D gels,
including:
- Heat map display of expression profiles
- Various methods of clustering
- Principal Components Analysis (PCA)
- T-tests with optional resampling and control of false discovery rate
- Analysis of Variance (ANOVA)
- Template matching for expression profiles
The algorithms are adapted from the TIGR Multiple Experiment Viewer (MeV, version 4.0,
tm4.org/mev.html, Saeed et al. 2003) and tightly integrated into the image analysis workflow. With
Delta2D's Complete Expression Profiles, there are no missing values, and matching problems are
virtually eliminated. This makes Delta2D especially well suited for the methods that were originally
applied in the context of DNA microarray analysis.
Getting a High Level Overview of Expression Data - Heat Maps
Heat maps are a well-known visualization method for expression data from DNA microarrays.
Expression profiles are in the rows, gel images in the columns. The legend across the top shows the
color code for spot intensities. Rows are labeled based on the spot labels from the gel images. By
default, data is standardized to zero mean and unit variance before being shown in the heat
map. Other options for normalization are available in the Analyze menu of the statistics
table.
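Standardizing to zero mean and unit variance can be written in a few lines. The Python sketch below standardizes one expression profile; it assumes the standardization is applied per profile (row) and uses the population standard deviation, neither of which is specified above.

```python
from statistics import mean, pstdev

def standardize(profile):
    """Shift and scale a profile to zero mean and unit variance."""
    m = mean(profile)
    s = pstdev(profile)   # assumption: population standard deviation
    if s == 0:
        # A constant profile carries no variation to scale.
        return [0.0 for _ in profile]
    return [(x - m) / s for x in profile]

# Hypothetical %V values of one spot across three gel images.
z = standardize([1.0, 2.0, 3.0])
```

After standardization, profiles with very different absolute volumes become comparable in shape, which is what the heat map colors are meant to show.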
Let us make a heat map:
- Open the Demonstration project in Delta2D.
- Open the quantitation table (Window > Quantitation Table), and make sure the Statistics
Table is selected.
- Hide the quantitative data for the fused image: Choose Column > Column Properties,
uncheck the checkbox next to Fused Image, and press OK (fig. 8.13).
- Press the Analyze button in the top left of the statistics table (fig. 8.14). A new analysis
window is opened, containing the current expression profiles in a heat map display.
- If you want to see more rows at once, you can use Display > Set Element Size and select
20 by 5.
| Figure 8.13: | The properties dialog of the quantitation table |
|
| Figure 8.14: | Start analysis from the Quantitation Table |
|
Clustering Images: What Image Groups or Classes Are There?
Clustering methods can group expression profiles and gel images by similarity. This can be very useful
for getting an overview of all expression profiles before proceeding with more detailed analyses.
Clustering of gel images can also be used to detect outliers, and to identify structures in the experiment.
Ideally, the cluster composition will reflect the structure of the experiment, e.g. replicates and images
from the same sample should have similar expression levels and thus end up in the same
cluster.
| Figure 8.15: | In this clustering you see an experiment with control (C1, C2, C4, C5) and treated
(T1, T2, T3, T4) samples, made in triplicates. The clustering rediscovers the experimental setup,
i.e. gel images with similar samples share a cluster. A sample forming a separate cluster would
indicate an outlier for which closer inspection is advisable. Made using Pearson correlation as
the similarity measure between images. |
|
Let us make a hierarchical clustering to show more structure in the data:
- Press the HCL button in the toolbar.
- Accept the default settings and press OK.
The hierarchical clustering groups both samples (gel images) and expression profiles. The cluster
hierarchy is shown in a tree display. As you can see, replicates are clustered together, indicating higher
similarity, as we would expect.
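To see why replicates end up close together, it helps to look at a similarity measure directly. The Python sketch below computes the Pearson correlation between gel images, the measure named in the caption of Figure 8.15. The image names and spot values are made up, and the sketch covers only the similarity measure, not the clustering itself.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

# Hypothetical spot quantities (same four spots) on three gel images.
c1 = [1.0, 2.0, 3.0, 4.0]   # control replicate
c2 = [1.1, 2.1, 2.9, 4.2]   # control replicate
t1 = [4.0, 3.0, 2.0, 1.0]   # treated sample

similar = pearson(c1, c2)     # close to +1
dissimilar = pearson(c1, t1)  # exactly opposite trend
```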
Clustering Expression Profiles: Finding Correlated Proteins
| Figure 8.16: | Spots with similar expression profiles are clustered together. Support Tree clustering
with Euclidean distance. |
|
Clustering of expression profiles is done to identify proteins with similar behavior, implying
that they are co-regulated or at least correlated. The global nature of the cluster display
allows for a broad overview and the forming of hypotheses that can then be tested (fig.
8.16).
Discovering Patterns in Expression Profiles
| Figure 8.17: | Cutting a tree by a distance threshold. Use the slider to adjust the threshold. |
|
One can regard the mean (or median) of a cluster as a kind of "typical" expression profile. The
clustering displays allow you to split the set of expression profiles into separate subsets:
- Right click and select Gene tree properties from the context menu.
- Use the slider to cut the tree at a certain distance from the root (fig. 8.17).
- Then check the Create Cluster Viewers checkbox and press OK.
- A new section called Gene Tree Cut is created in the left hand side of the display (fig.
8.18).
|
| Figure 8.18: | Combined expression profiles in 12 clusters. |
|
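To make the idea of cutting a tree by a distance threshold more concrete, here is a minimal Python sketch. It assumes average linkage with Euclidean distance and stops merging once the closest clusters are farther apart than the threshold, which for this linkage gives the same subsets as cutting the finished tree at that height. The function names and the example profiles are hypothetical; the slider in Delta2D may parametrize the cut differently.

```python
from math import dist  # Euclidean distance, available from Python 3.8

def average_linkage(c1, c2, profiles):
    """Mean pairwise Euclidean distance between the members of two clusters."""
    pairs = [(a, b) for a in c1 for b in c2]
    return sum(dist(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

def cut_clusters(profiles, threshold):
    """Merge the closest clusters until all remaining ones are farther apart than `threshold`."""
    clusters = [[name] for name in profiles]
    while len(clusters) > 1:
        # Find the closest pair of clusters (i < j).
        distance, i, j = min(
            ((average_linkage(clusters[i], clusters[j], profiles), i, j)
             for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda item: item[0],
        )
        if distance > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical expression profiles over four gel images.
profiles = {
    "spot 1": [1.0, 1.1, 3.0, 3.1],
    "spot 2": [1.2, 1.0, 2.9, 3.2],
    "spot 3": [3.0, 3.1, 1.0, 0.9],
    "spot 4": [3.1, 2.9, 1.1, 1.0],
}

for cluster in cut_clusters(profiles, threshold=1.0):
    print(cluster)
```

A small threshold yields many small clusters, a large threshold yields few large ones.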
Finding Differentially Expressed Proteins: Statistical Tests
Methods for statistical hypothesis testing in Delta2D are based on state-of-the-art algorithms that are
applied in the context of DNA array analysis.
| Figure 8.19: | Result of applying t-tests (control vs. treated) to expression profiles. Profiles and
images were clustered to better visualize differentially expressed proteins. P-values are based
on 1000 permutations, false discovery rate is controlled to be 5 elements or less (with overall
alpha=1%). |
|
In the simplest case, the experiment is a comparison of two samples, e.g. diseased vs. control
tissue, mutant vs. wild type etc. The task then is finding those proteins that show significant
differences in expression levels. Certainly the most popular test in this area is Student's t-Test,
where the null hypothesis is that the means of expression levels in samples A and B are the
same. Rejecting the null hypothesis then means that the protein under test is differentially
expressed.
No normal distribution of spot intensities required
One has to keep in mind that the classical Student's t-Test assumes that spot quantities
within replicates follow a normal distribution, an assumption that should be tested separately. Depending on the
staining method you use and other factors, spot quantities within replicate gels may not be normally
distributed. Therefore it is advisable to use one of the provided methods that are based on
permutations.
In the t-Test options dialog, choose "p-values based on permutation" and either "Use all
permutations" or "Randomly group samples" and enter "1000".
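The principle behind permutation-based p-values can be sketched in a few lines of Python. This is an illustration only, not the MeV implementation: it enumerates all ways of splitting the pooled values into two groups (comparable in spirit to "Use all permutations"), uses a Welch-type t statistic, and works on made-up spot quantities. All function names are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance

def t_statistic(a, b):
    """Welch t statistic for two groups (unequal variances allowed)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

def permutation_p_value(a, b):
    """Two-sided p-value from all ways of splitting the pooled values into two groups."""
    pooled = a + b
    observed = abs(t_statistic(a, b))
    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance so that rounding does not hide the observed split itself.
        if abs(t_statistic(group_a, group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Hypothetical quantities of one spot on four control and four treated gels.
control = [10.1, 9.8, 10.3, 10.0]
treated = [12.2, 11.9, 12.5, 12.1]
print(permutation_p_value(control, treated))
```

With four replicates per group there are only 70 distinct splits, so the attainable p-values are coarse. For larger experiments the number of splits grows quickly, which is why a random subset (for example 1000) is used instead of the full enumeration.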
Controlling the False Discovery Rate
When applying statistical tests to 2-D gel data, one is faced with the so-called multiple
hypothesis testing problem: For each expression profile, a separate test is done. Each test has
a certain probability of giving a false positive result, i.e. a protein spot is declared to be
differentially expressed while the difference was due to pure chance. The large number of
tests can produce a high number of false positives. For example, in an experiment with
2000 spots per gel, an accepted false-positive rate alpha of 5% will result in about 100 proteins
that are found to be "differentially expressed" although the difference is the result of mere
chance.
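The arithmetic of this example can be checked with a short Python sketch. It assumes the worst case in which no spot is truly differentially expressed, and it adds the per-test level of a plain Bonferroni correction for comparison. The function names are hypothetical.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of false positives if no spot is truly regulated."""
    return n_tests * alpha

def bonferroni_alpha(n_tests, alpha):
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests

print(expected_false_positives(2000, 0.05))  # spots expected to pass by chance alone
print(bonferroni_alpha(2000, 0.05))          # much stricter level for each single test
```

The Bonferroni level is very strict for thousands of spots, which is one reason why less conservative approaches such as controlling the false discovery rate are attractive.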
The MeV t-test module incorporated in Delta2D provides methods to control the proportion of
false positives in the result set (False Discovery Rate - FDR). Overall, the False Discovery
Rate approach allows one to strike a balance between the need to find statistically valid
proteins of interest and the additional cost that is associated with following up on false
positives.
In the t-Test options dialog, make sure you have selected "p-values based on permutations". Select
"Stepdown Westfall and Young methods". Choose a bound for the number of false positive spots in
the result set using the option "number of false positive genes should not exceed". Alternatively, choose a
bound for the proportion of false positive spots in the result set, using the other radio button and text
box.
Template Matching
With Template Matching, you can define a template for an expression profile and let Delta2D find spots
whose expression profiles match the template. For example, in a time series experiment you might want
to look for spots whose expression level increases with time.
a) b)
| Figure 8.20: | a): Expression profiles matching the template. b): Comparison between template
(blue line) and matching expression profiles. |
|
Templates can be entered directly by specifying an expression level for every image. Alternatively
you can select a spot in the list on the top left of the dialog and use its expression profile as a template
by pressing Select highlighted gene from above list to use as template. Increasing the p-value
will include more spots; decreasing the p-value will result in more stringent matching. Templates can also
be derived from existing clusters.
Click on the PTM (Pavlidis Template Matching) button in the toolbar, or choose Analysis / Statistics /
Pavlidis Template Matching from the menu. The Help button (labeled "i" on the bottom
left of the dialog) gives more information about the options.
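The following Python sketch illustrates the basic idea of template matching: each expression profile is compared to the template by Pearson correlation, and well-correlated profiles are kept. As a simplification it uses a fixed correlation threshold (`min_r`, a hypothetical parameter) instead of the p-value used in the dialog, and the spot names and quantities are invented.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation between two equally long lists of values."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def match_template(profiles, template, min_r=0.9):
    """Names of all profiles whose correlation with the template is at least `min_r`."""
    return [name for name, values in profiles.items()
            if pearson(values, template) >= min_r]

# Template: expression level increases with time (four time points).
template = [1.0, 2.0, 3.0, 4.0]

# Hypothetical expression profiles.
profiles = {
    "spot 17": [0.5, 1.1, 1.4, 2.2],  # rising
    "spot 42": [2.0, 1.9, 2.1, 2.0],  # roughly flat
    "spot 63": [3.0, 2.2, 1.5, 0.4],  # falling
}

print(match_template(profiles, template))
```

Because correlation ignores scale, a profile matches a rising template whenever its shape rises in a similar way, regardless of its absolute expression level.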
|
| Figure 8.21: | With Pavlidis template matching (PTM) you can specify a typical expression
profile, e.g. one that increases with time. |
|
Principal Component Analysis (PCA): Grouping and Visualization
When you do Principal Component Analysis (PCA) on a set of gel images, you get a two- or
three-dimensional visualization of the image set that is optimal in a certain sense, i.e. it preserves the
variation as much as possible. PCA works by taking spot intensities on every gel image and
assembling them into a vector. So an experiment of 24 gel images with 1200 spots each
would be represented as a cloud of 24 points in a space with 1200 dimensions. The goal
of principal component analysis is then to find a projection of the point cloud in two or
three-dimensional space such that as much as possible of the variation of the point cloud is
preserved. One hopes that the gels from different samples will be in separate regions of the
resulting diagram. The principal components can then be interpreted as "typical spot patterns"
or "eigengels". Their coordinates can be analyzed in order to determine which spots are
contributing most to the variance, making them candidates for protein identification and biological
interpretation.
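A minimal Python sketch of this projection is given below. It computes only the first principal component, using power iteration on the centered data, and works on a tiny invented data set of six gel images with four spots each. The function name, the data, and the choice of power iteration are assumptions made for illustration; they say nothing about how Delta2D computes the PCA.

```python
from math import sqrt

def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) of the first principal component via power iteration.

    rows: one list of spot quantities per gel image.
    """
    n, p = len(rows), len(rows[0])
    # Center every spot (column) across all gel images.
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]
    # Arbitrary non-uniform start vector.
    v = [float(j + 1) for j in range(p)]
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]          # X v
        v = [sum(centered[i][j] * scores[i] for i in range(n)) for j in range(p)]  # X^T (X v)
        norm = sqrt(sum(w * w for w in v))
        v = [w / norm for w in v]
    scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
    return v, scores

# Hypothetical spot quantities: three control and three treated gel images.
gels = {
    "C1": [10.0, 8.0, 2.0, 1.0],
    "C2": [11.0, 9.0, 1.0, 2.0],
    "C3": [10.5, 8.5, 1.5, 1.5],
    "T1": [2.0, 1.0, 9.0, 10.0],
    "T2": [1.0, 2.0, 10.0, 11.0],
    "T3": [1.5, 1.5, 9.5, 10.5],
}

loadings, scores = first_principal_component(list(gels.values()))
for name, score in zip(gels, scores):
    print(name, round(score, 2))
```

The scores place each gel image along the first principal component, and the loadings show how strongly each spot contributes to that direction.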
a) b)
| Figure 8.22: | a): Principal component analysis of 24 gel images in 3 dimensions. Replicates have
the same color. The view can be rotated by dragging with the mouse. Again, replicates are placed
close together. b): The same principal component analysis of 24 gel images, projected onto the
first two principal components. Treated and control samples (reddish vs greenish colors) can be
separated. |
|
When principal component analysis is applied to the expression profiles, in our example we would
consider a point cloud of 1200 vectors (one vector for each expression profile) with 24 dimensions (the
expression levels on the 24 gels). The result is a display of the proteins where (hopefully)
proteins with close positions are biologically related. Consider a time series experiment,
where proteins are switched on and off in stages. If there is a "hidden parameter", such as
a stage in the cell cycle, it will have a systematic influence on the expression levels, and
thus increase the variance for the genes taking part in it. This increased variance will then
become part of the directions that are used for the projection (the principal components).
The principal components have also been called "eigengenes"; they can be seen as "classes of
most prominent expression profiles" (see, for example, Alter et al. 2000 and Holter et al.
2000).
|
| Figure 8.23: | Principal component analysis of expression profiles in three dimensions.
Differentially expressed spots were determined by t-test and highlighted orange and blue,
respectively. Inset: First principal component. |
|
Working with Sets of Spots
In the terminology of the TIGR Multiple Experiment Viewer (MeV), a cluster can be any set of
expression profiles or samples (gel images). You can create new clusters by choosing Store Cluster in
many displays of analysis results.
Storing a cluster of expression profiles:
- In a clustering display, select the expression profiles of interest. In a hierarchical clustering,
you can select a whole branch of the dendrogram by clicking it in the tree. The
corresponding expression profiles will be selected.
- Now right-click and select Store Cluster. The new cluster will be shown in the Cluster
Manager under Gene Clusters.
Storing a sample cluster:
- In a hierarchical clustering, click on a part of the dendrogram for samples (column
dendrogram), for example to select a set of replicate gel images.
- Note how columns are selected in the heatmap display. Now right-click and select Store
Cluster.
- A dialog opens that lets you define a name, comment and color of the cluster. You will
have to select at least a color. Click the OK button.
- The new sample cluster should now be visible in the Cluster Manager. By default, the
color of the cluster will now be shown on top of the heatmap column, and in other displays
such as PCA (for samples).
In the Cluster Manager you can change any attribute, e.g. cluster colors, or whether the color should
be used in displays. Note that clusters may overlap, but only one cluster's color will be used in
displays.
Cluster A Cluster B
|
When you have multiple clusters you can create new clusters that are combinations of selected
ones:
- Intersection: The new cluster contains only expression profiles that are present in each of
the selected clusters.
- Union: The new cluster contains all expression profiles that were present in any of the
selected clusters.
- XOR: The new cluster contains only expression profiles that are found exclusively in one
of the selected clusters.
|
Intersection of A and B Union of A and B XOR of A and B
|
In the Cluster Manager, select the clusters you want to combine. Right click, then select the
operation you want to perform from the ClusterOperations submenu.
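The three operations correspond to ordinary set operations, as the following Python sketch shows. Clusters are modeled as sets of spot IDs; the IDs and function names are invented for illustration. For XOR the sketch follows the description above and keeps profiles that occur in exactly one of the selected clusters.

```python
from collections import Counter

def intersection(clusters):
    """Profiles present in each of the selected clusters."""
    return set.intersection(*clusters)

def union(clusters):
    """Profiles present in any of the selected clusters."""
    return set.union(*clusters)

def xor(clusters):
    """Profiles found in exactly one of the selected clusters."""
    counts = Counter(spot for cluster in clusters for spot in cluster)
    return {spot for spot, n in counts.items() if n == 1}

# Hypothetical clusters of spot IDs.
cluster_a = {101, 102, 103, 104}
cluster_b = {103, 104, 105}

print(sorted(intersection([cluster_a, cluster_b])))
print(sorted(union([cluster_a, cluster_b])))
print(sorted(xor([cluster_a, cluster_b])))
```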
Statistical Analysis is Integrated with Image Analysis
When you select one or more spots in a heatmap display, the selection will be immediately visible in
other parts of Delta2D, such as the Dual View, or the Gel Image Regions View. You can
extend the selection to a range of rows by holding down the Shift key while clicking on the end of
the range. You can add or remove a single row by holding down the Ctrl key while clicking on
it.
If you have organized spots of interest in the Cluster Manager, you can use these directly in
Delta2D. Just right click on a cluster and choose Select in Delta2D; this will select the expression
profiles in the cluster throughout all parts of Delta2D.
Getting a Spot Album of Relevant Spots
Using Delta2D's Spot Album Report, it is easy to show snapshots of the statistically significant spots
you have found. All you have to do is mark these spots in the Delta2D project:
- Make sure you have selected the spots of interest.
- Switch to the Statistics tab of the Quantitation Table and choose Mark / Unmark all
spots to unmark all spots that you might have marked previously.
- Then choose Mark / Mark selected spots.
- Then switch to the Project Manager and choose Reports / Spot Album. Note that the
spot album may be quite large, as there is one snapshot for each spot on each gel image. You
can restrict the album to a single group by clicking on the "hide others" link in the group
caption.
For more information about Reports see also section 8.7.
Overview of Statistical Methods
The following is a list of methods. For in-depth information please refer to the MeV manual and the
original papers cited below.
Clustering
- Clustering can be applied to samples and / or expression profiles
- Hierarchical clustering and k-Means / k-Medians clustering
- Supports average linkage, complete linkage, and single linkage for determining
cluster-to-cluster distances
- Supported distance metrics: Euclidean distance, Manhattan distance, Pearson correlation,
Pearson uncentered correlation, Pearson squared correlation, Average dot product, Cosine
correlation, Covariance, Spearman's rank correlation, Kendall's tau.
- Construction of support trees by resampling methods: bootstrapping (resampling with
replacement), and jackknifing (resampling by leaving out one observation).
HCL - Hierarchical Clustering
Eisen, M.B., P.T. Spellman, P.O. Brown, and D. Botstein. 1998. Cluster analysis and display of
genome-wide expression patterns. Proc. Natl. Acad. Sci. USA 95:14863-14868.
ST - Support trees (Bootstrapping)
Graur, D., and W.-H. Li. 2000. Fundamentals of Molecular Evolution. Second Edition. Sinauer
Associates, Sunderland, MA. pp 209-210.
KMC - K-Means Clustering
Soukas, A., P. Cohen, N.D. Socci, and J.M. Friedman. 2000. Leptin-specific patterns of gene
expression in white adipose tissue. Genes Dev. 14:963-980.
Template Matching
- Templates can be defined for expression profiles and samples.
- Templates can be defined interactively, from a given expression profile, or from a cluster.
PTM - Template matching
Pavlidis, P., and W.S. Noble 2001. Analysis of strain and regional variation in gene expression in
mouse brain. Genome Biology 2:research0042.1-0042.15.
Principal Component Analysis
- Principal component analysis is available for both samples and expression profiles.
- Three-dimensional and two-dimensional displays are available
- New clusters can be defined by dragging in a two-dimensional display.
Raychaudhuri, S., J. M. Stuart, & R. B. Altman 2000. Principal components analysis to summarize microarray
experiments: application to sporulation time series. Pacific Symposium on Biocomputing 2000, Honolulu,
Hawaii, 452-463. Available at http://smi-web.stanford.edu/pubs/SMI_Abstracts/SMI-1999-0804.html
Statistical Hypothesis Testing
TTEST - T-Tests
- T-tests: one-sample, between samples, paired t-test
- Assuming equal or different group variances
- P-values can be computed based on normal distribution or using randomization.
- Corrections for multiple testing: Bonferroni, adjusted Bonferroni, Westfall-Young
- Control of false discovery rate
- Volcano Plot
Pan, W. (2002). A comparative review of statistical methods for discovering differentially expressed
genes in replicated microarray experiments. Bioinformatics 18: 546-554.
Dudoit, S., Y.H. Yang, M.J. Callow, and T. Speed (2000). Statistical methods for identifying
differentially expressed genes in replicated cDNA microarray experiments. Technical report 2000
Statistics Department, University of California, Berkeley.
Welch B.L. (1947). The generalization of 'Student's' problem when several different population
variances are involved. Biometrika 34: 28-35.
ANOVA - One-way Analysis of Variance
- P-values can be computed based on F-distribution or using randomization.
- Corrections for multiple testing: Bonferroni, adjusted Bonferroni, Westfall-Young
- Control of false discovery rate
Zar, J.H. 1999. Biostatistical Analysis. 4th ed. Prentice Hall, NJ.
TFA - Two-factor Analysis of Variance
Keppel, G., and S. Zedeck. 1989. Data Analysis for Research Designs. W. H. Freeman and Co.,
NY.
Manly, B.F.J. 1997. Randomization, Bootstrap and Monte Carlo Methods in Biology. 2nd ed.
Chapman and Hall / CRC , FL.
Zar, J.H. 1999. Biostatistical Analysis. 4th ed. Prentice Hall, NJ.
References
Saeed AI, Sharov V, White J, Li J, Liang W, Bhagabati N, Braisted J, Klapa M, Currier T, Thiagarajan
M, Sturn A, Snuffin M, Rezantsev A, Popov D, Ryltsov A, Kostukovich E, Borisovsky I, Liu Z,
Vinsavich A, Trush V, Quackenbush J. TM4: a free, open-source system for microarray data
management and analysis. Biotechniques. 2003 Feb;34(2):374–8.
Alter O, Brown PO, Botstein D (2000) Singular value decomposition for genome-wide expression
data processing and modeling. Proc Natl Acad Sci U S A 97:10101–10106
Holter NS, Mitra M, Maritan A, Cieplak M, Banavar JR, Fedoroff NV (2000) Fundamental patterns
underlying gene expression profiles: simplicity from complexity. Proc Natl Acad Sci U S A
97:8409–8414
TIGR Multiple Experiment Viewer (MeV): http://www.tm4.org/mev.html
TIGR MeV manual: http://www.decodon.com/Support/Documentation/MeV
8.7 Generating Reports
Delta2D now offers interactive reports on the current project. They make it easy to present
data on relevant spots, experimental setup, and quantitative data. The reports are based on
HTML, so you can put them on the web easily. Just as easily, you can process all or part of a
report in your favorite word processor or presentation program by copying excerpts into
it.
All reports can be accessed via the Reports menu in the Project Manager. They are opened in your
web browser. If you want to have a closer look at a gel image or a spot, just click on it and it will be
opened and focused in Delta2D. You can save the reports in HTML format that is ready to be published
on the web.
Project Summary
The project report shows a summary of your analysis project. It includes an overview of gel
images and warpings as well as general data about the gel images, groups, samples, and
images.
The dual channel images included in the report give a good indication of the quality of the
direct warpings in the project. You can open a dual channel image in Delta2D by clicking on
it.
As with all reports, you can click the Save button to save it in a form that is ready to be published on
the web.
Spot Album
The spot album shows thumbnails of marked spots and the region surrounding them. You can show
spots in comparison using dual channel images. The album can be configured using the form in
the upper part of the report: You can select marked spots from different gel images, the
image they should be compared to, and the width and scale of the gel section that should be
displayed.
| Figure 8.24: | The spot album report |
|
Next to each spot row you see the expression profile as a chart. Clicking on the expression profile
takes you to a detail page that shows additional quantitative data. Click on any spot in the row to select
and show it in the dual view.
Spot Quantities
The spot quantities report shows expression profiles numerically, together with group-wise
ratios and t-Test values. You select spots for the report by marking them on a gel image.
This report is well-suited for documenting a set of relevant spots, and for further statistical
analysis.
| Figure 8.25: | The spot quantities report |
|
Modifying, Saving, and Printing Reports
All reports are produced in the form of HTML pages that are generated dynamically by Delta2D. This
means you can easily integrate them into your current project documentation. Select a part of the page
and copy it into a Microsoft Word document, or into PowerPoint. You can save the whole report using
the Save button in the top right of the report. Delta2D will then prompt you for a file to
which the report should be saved. The report will be saved without the configuration form.
The sub-pages (e.g. expression profile details from the spot album) will also be saved and
linked properly. The result is a set of HTML files and images that can be put directly on the
web.
If you want to make changes to the whole report document it is recommended that you open the
saved HTML file in a word processor. Usually, you can print the report directly from your web browser.
For more advanced printing needs (e.g. splitting wide pages) we also recommend using a word
processing program.
8.8 Exporting Spot Data to Other Applications
All the data you see in the quantitation table can be exported for further use in external programs. In the
table window, use File / Save... to save all rows that are visible in the quantitation table. You
can hide rows that you don't want to export beforehand.
The data is saved in a common exchange format called "comma separated values" (CSV)
that can be imported easily into a spreadsheet or other data analysis programs. For easier
reference, the column titles are given in the first line of the file. Saving data in CSV format
will take hidden columns and sorting into account, so you can use the quantitation table's
sophisticated sorting and filtering to select the rows and columns that should appear in the saved
file.
The import procedure depends on the program you use. Generally, you open the data file as a text
file, specifying that the data is separated by semicolons.
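As an example of further processing, the following Python sketch reads such a file with the standard `csv` module, using the semicolon as separator and the first line as column titles. The column names and values in the sample are invented; a real export contains whatever columns are visible in your quantitation table.

```python
import csv
import io

# Hypothetical excerpt of an exported file; the real column titles depend on your table.
sample = "Spot ID;Label;Volume\n12;actin;0.153\n47;;0.021\n"

def read_quantitation_csv(handle):
    """Read semicolon-separated rows into dictionaries keyed by the column titles."""
    reader = csv.DictReader(handle, delimiter=";")
    return list(reader)

rows = read_quantitation_csv(io.StringIO(sample))
for row in rows:
    print(row["Spot ID"], row["Label"], row["Volume"])
```

To read a saved file instead of the in-memory sample, pass an open file object, for example `open("export.csv", newline="")`, where the file name is a placeholder.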
Label data, label formats and spot data are saved in XML file formats to allow for easy processing
using external applications. Detailed specifications of these formats are available upon request.
Spot Picking
Delta2D can produce output for Genomic Solutions and Molecular Dynamics spot
pickers, as well as a generic spot picking format in tabular form. Centers of detected spots as well as
arbitrary labeled points on a gel may be selected for picking.
Currently, Delta2D is shipped with support for the following spot pickers:
- Genomic Solutions ProPic
- PerkinElmer ProXCISION
- Molecular Dynamics
- Ettan Spot Handling Workstation
- Bruker Proteineer (this entry is disabled because the interface needs additional setup)
- Generic pick list format
The Generic File Format
The generic file format is a simple ASCII-text file in tabular format that
includes marked spots and labels. The tabular format can be easily transformed to other formats
if necessary. However, see below for what to do if your spot picker is not supported by
Delta2D.
Use Spots / Export generic pick list to generate a pick list in the generic format. The pick list
includes all spots that are marked (using the mark check box in the quantitation table), together with all
labels on the selected layer. For a marked spot that has no label, the spot's center will be used to define
the pick. If there are one or more labels inside a spot, one pick per label will be produced, and the spot's
center will be ignored. For a label that labels a point outside any spot, one pick will be generated, as
well.
The generic file format consists of four columns separated by tabs. They contain the following
data:
-
Spot ID
- The ID of each spot as used in Delta2D.
-
Coordinates
- The next two columns contain the X and Y coordinates of the exported spot.
-
Label
- The last column contains the label of each spot.
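Transforming the generic format into another format usually starts with reading the four columns. The Python sketch below does this under a few assumptions that are not stated above: the file has no header line, coordinates are plain numbers, and a missing label appears as an empty fourth field. The names `Pick` and `parse_pick_list` and the sample values are hypothetical.

```python
from typing import NamedTuple

class Pick(NamedTuple):
    spot_id: str
    x: float
    y: float
    label: str

def parse_pick_list(text):
    """Parse tab-separated lines of the form: spot ID, X, Y, label."""
    picks = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        spot_id, x, y, label = line.split("\t")
        picks.append(Pick(spot_id, float(x), float(y), label))
    return picks

# Hypothetical pick list with two entries, the second one without a label.
sample = "12\t104.5\t220.0\tactin\n47\t310.0\t95.5\t\n"

for pick in parse_pick_list(sample):
    print(pick)
```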
The Molecular Dynamics File Format
The file is generated according to the same rules as the
generic file format, i.e. when you want to pick a spot, you have to mark it in the table or
place a label inside of it. The Molecular Dynamics spot picker needs two special landmarks
that are placed on the gel. In order for the robot to register the gel image to the physical
gel, you need to provide labels for the two landmarks. They have to be named "IR1" and
"IR2", respectively. Be careful that the labels point exactly to the centers of the landmark
points.
The layout of the exported text file is slightly different from the generic format. The columns are also
separated by tabs except the coordinates; they are placed in one column, separated from each other by a
comma. The ID in the first column is simply a serial number, not the one used in Delta2D. The ID used
in Delta2D is set in square brackets and attached to the labels in the fourth column. All other aspects
are identical to the generic format.
The Genomic Solutions File Format
The file for Genomic Solutions includes additional information:
the image field is filled with the name of the image and the name of the project. The table consists of
six columns separated by commas, out of which the last three columns consist of generic
data. The first column contains the spot definition in the form Spotn=SpotID - Label,
where n in Spotn is a running count starting from 0 and SpotID is the ID used
by Delta2D. The second and third columns contain the X and Y coordinates of the spot
center.
The Ettan Spot Handling WorkStation File Format
This file format has a quite simple structure: the
first of the four tab-separated columns counts the spots starting from 1, the next two contain the X and
Y coordinates of the spots, and the fourth one is reserved for comments but is not currently used by
Delta2D.
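The column layout described above can be sketched in Python as follows. This is only an illustration of the four-column structure: the number formatting of the coordinates, the line endings, and the empty comment field are assumptions, and the function name and coordinates are hypothetical.

```python
def ettan_pick_list(centers):
    """Build tab-separated lines: running number (from 1), X, Y, empty comment."""
    lines = []
    for number, (x, y) in enumerate(centers, start=1):
        lines.append(f"{number}\t{x}\t{y}\t")
    return "\n".join(lines) + "\n"

# Hypothetical spot centers.
print(ettan_pick_list([(104.5, 220.0), (310.0, 95.5)]), end="")
```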
What if my Picker is not Supported?
We are constantly working on broadening the range of
supported spot picking file formats. If your device is not supported, please do not hesitate to contact our
technical support – we will be glad to work with you to find a solution.
Instant MS Excel Reports
Use File / Generate Report In Excel... to produce an Excel worksheet
that contains the currently visible data in the table, plus an extensive set of diagrams and statistics. You
need to have Excel installed to use this feature.
________________________________________________________________________________
-
Note:
- Since the Excel report is meant to compare the spot data of different gel images to
each other, and the saving and exporting of table data always refers to what you see,
this feature is only available from the views in the table showing data of multiple gel
images.
Additionally, you can export just the contents of a quantitation table into Excel, using Delta2D to sort
and filter data before the worksheet is created. Use File / Export into Excel to retain exactly the data
displayed in the active table, with the same arrangement of columns and rows, in a new
Excel sheet.
This feature has been tested with MS Excel versions 2000 (9), XP (10), and 2003 (11).
Instant Export to MS PowerPoint
From within the Gel Image Pair View window, you can create a
PowerPoint slide that includes everything you see in the gel image view: images, spots, and labels. Open
a snapshot window using Edit / Snapshot. In the snapshot window, use File / Export to PowerPoint
to produce a PowerPoint slide that contains all objects which are currently visible in the gel image view.
These objects are fully editable inside PowerPoint. You need to have PowerPoint 2000, XP, or 2003
installed to use this feature.