Finding differentially expressed proteins: Statistical Tests
Methods for statistical hypothesis testing in Delta2D are
based on state-of-the-art algorithms that are also applied in the context of DNA array analysis.
Result of applying t-tests (control vs. treated) to expression profiles. Profiles and images were clustered to better visualize differentially expressed proteins. P-values are based on 1000 permutations, false discovery rate is controlled to be 5 elements or less (with overall alpha=1%).
In the simplest case, the experiment is a comparison of two samples,
e.g. diseased vs. control tissue, mutant vs. wild type etc. The task
then is finding those proteins that show significant differences in
expression levels. The most widely used test for this purpose is
Student's t-test, whose null hypothesis is that the mean expression
levels in samples A and B are the same. Rejecting the null hypothesis
means that the protein under test is considered differentially
expressed.
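As an illustration of the test statistic described above, here is a minimal sketch of the classical two-sample Student's t statistic with pooled variance. The function name and the spot quantities are made up for this example; this is not Delta2D's internal code.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Two-sample Student's t statistic (pooled variance, equal variances assumed)."""
    na, nb = len(a), len(b)
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


# Hypothetical normalized spot quantities for one protein on four replicate gels each.
control = [1.0, 1.2, 0.9, 1.1]
treated = [1.8, 2.1, 1.9, 2.2]

t = student_t(control, treated)
print(f"t = {t:.3f}")
```

A large absolute value of t indicates that the difference between the group means is large relative to the variation within the replicates.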
No normal distribution of spot intensities required
Keep in mind that the classical Student's t-test assumes that spot
quantities within replicates follow a normal distribution, an assumption
that should be tested separately. Depending on the staining method you use and
other factors, spot quantities within replicate gels may not be normally
distributed. It is therefore advisable to use one of the provided
permutation-based methods, which do not rely on this assumption.
In the t-Test options dialog, click on "Between subjects", choose
"p-values based on permutation", and then select either
"Use all permutations" or "Randomly group samples" and enter
1000.
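The idea behind permutation-based p-values can be sketched as follows: the group labels are reassigned again and again, the t statistic is recomputed for each regrouping, and the p-value is the fraction of regroupings that are at least as extreme as the observed one. The sketch below is an assumption-laden illustration, not the MeV implementation; the function names, the `n_random` parameter and the data are hypothetical. Leaving `n_random` unset corresponds in spirit to "Use all permutations", and `n_random=1000` to random regrouping with 1000 permutations.

```python
import random
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def t_stat(a, b):
    """Two-sample Student's t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def permutation_p_value(a, b, n_random=None, seed=0):
    """Two-sided permutation p-value for the t statistic.

    n_random=None enumerates every way of splitting the pooled values
    into groups of the original sizes; otherwise n_random random
    regroupings are drawn.
    """
    values = list(a) + list(b)
    na = len(a)
    indices = range(len(values))
    observed = abs(t_stat(a, b))
    if n_random is None:
        groupings = combinations(indices, na)
    else:
        rng = random.Random(seed)
        groupings = (rng.sample(indices, na) for _ in range(n_random))
    hits = total = 0
    for chosen in groupings:
        chosen = set(chosen)
        group_a = [values[i] for i in indices if i in chosen]
        group_b = [values[i] for i in indices if i not in chosen]
        # Small tolerance so that floating-point ties count as "as extreme".
        hits += abs(t_stat(group_a, group_b)) >= observed - 1e-12
        total += 1
    return hits / total


# Hypothetical spot quantities for one protein.
control = [1.0, 1.2, 0.9, 1.1]
treated = [1.8, 2.1, 1.9, 2.2]

p_all = permutation_p_value(control, treated)                    # all 70 regroupings
p_random = permutation_p_value(control, treated, n_random=1000)  # 1000 random regroupings
print(p_all, p_random)
```

With only four replicates per group there are just 70 distinct regroupings, so the smallest achievable two-sided p-value is 2/70. The number of replicates therefore limits how small a permutation p-value can become.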
Controlling the False Discovery Rate
When applying statistical tests to 2D gel data, one is faced with the
so-called multiple hypothesis testing problem: For each expression profile, a
separate test is done. Each test has a certain probability of giving a false
positive result, i.e. a protein spot is declared to be differentially expressed
while the difference was due to pure chance. The large number of tests can
produce a high number of false positives. For example, in an experiment with
2000 spots per gel, an accepted false-positive rate alpha of 5% is expected to
yield about 100 proteins that are found to be "differentially expressed" even
if none of the differences is real.
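The arithmetic behind this example can be checked with a short sketch. It assumes that no protein is truly differentially expressed and that the tests are independent, so that every p-value is uniformly distributed between 0 and 1. The variable names and the random seed are arbitrary.

```python
import random

n_spots = 2000   # expression profiles tested
alpha = 0.05     # accepted per-test false-positive rate

# Expected number of false positives if every null hypothesis is true.
expected_false_positives = n_spots * alpha

# Probability of at least one false positive among independent tests.
p_at_least_one = 1 - (1 - alpha) ** n_spots

# Simulate one such experiment: under the null hypothesis p-values are uniform.
rng = random.Random(1)
simulated = sum(rng.random() < alpha for _ in range(n_spots))

print(expected_false_positives, p_at_least_one, simulated)
```

The simulated count varies from run to run around the expected value, which is why an uncorrected significance threshold is not a reliable filter at this scale.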
The MeV t-test module incorporated in Delta2D provides methods to control
the proportion of false positives in the result set (False Discovery Rate -
FDR). Overall, the False Discovery Rate approach allows one to strike a balance
between the need to find statistically valid proteins of interest and the
additional cost that is associated with following up on false positives.
In the t-Test options dialog, make sure that
"p-values based on permutation" is selected. Select "Stepdown Westfall and Young
methods". Choose a bound for the number of false positive spots in the
result set using the option "number of false positive genes should not
exceed". Alternatively, choose a bound for the proportion of false
positive spots in the result set, using the other radio button and text
box.
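To give a feel for how a bound on the proportion of false positives translates into a selection rule, the sketch below uses the Benjamini-Hochberg step-up procedure. This is a different and simpler technique than the permutation-based methods offered in the dialog, and it is shown only as an illustration; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return the indices of the tests rejected at FDR level q (BH step-up)."""
    m = len(p_values)
    # Rank the tests by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        # Remember the largest rank whose p-value lies under its threshold.
        if p_values[i] <= rank * q / m:
            cutoff = rank
    return sorted(order[:cutoff])


# Hypothetical p-values for ten protein spots.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.36]
selected = benjamini_hochberg(p, q=0.05)
print(selected)  # [0, 1]
```

Five of these spots have p-values below 0.05, but only two remain once the proportion of false positives is bounded at 5%.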