DEVELOPMENTAL STABILITY: THE SEEMING SIMPLICITY
OF THE METHODOLOGY
The recent study "The Health of the Environment: Assessment Methodology.
Assessment of Natural Populations on Their Developmental Stability: Methodical
Manual for Reserves" (V.M. Zakharov, A.S. Baranov, V.I. Borisov, A.V.
Valetsky, N.G. Kryazheva, E.K. Chistyakova, A.T. Chubinishvili. Moscow:
Edition of Russian Ecological Policy Center, 2000. 66 pp.) focuses on the use
of developmental stability to assess environmental quality. This study will
no doubt attract the attention of many researchers. Most appealing is the simplicity
of the methods for measuring and calculating fluctuating asymmetry: the reader
gets the impression that even a schoolchild could do it. Yet, is the matter
as simple as that? We think not.
The study presents an optimistic view: that an unfavorable impact on an
organism is universally accompanied by a reduction of developmental stability,
expressed as an increase in fluctuating asymmetry (FA). Unfortunately, this view
reflects the situation that existed a decade ago; in recent years a more cautious,
even skeptical attitude has prevailed toward using fluctuating asymmetry to detect
stress in animals and particularly in plants. The titles of some of the discussion
papers speak for themselves: "Waltzing with asymmetry" (Palmer,
1996) and "What does sexual trait FA tell us about stress?" (Bjorksten et al.,
2000). Critics of studies based on measuring fluctuating asymmetry
have found a significant number of methodological flaws (Merila, Bjorklund,
1995; Bjorklund, Merila, 1997; Van Dongen et al., 1999), which
cast doubt on some previously published conclusions. On the other hand, the absence
of changes in fluctuating asymmetry does not necessarily mean an absence of stress
(Anne et al., 1998): some species show an unchanged level of asymmetry
even where industrial pollution is very high (Zvereva et al., 1997;
Valkama, Kozlov, 2001). Published negative results (i.e.
results that do not fit the universal concept) constitute approximately one
third of all publications (Bjorksten et al., 2000). So, we can say that
the method (in its present state) is far from being universal.
The key methodological requirements of any research are that the results be
reproducible and that the reliability of the observed phenomena be assessed.
With regard to developmental stability, these problems have been much discussed
in recent years, but they are not even mentioned in the study under review.
Given the attention that reviewers for international journals traditionally pay
to methods of statistical analysis, I consider such simplification very
dangerous: it may result in a number of new publications in Russian journals
which the world scientific community would regard as informational noise. And
international journals would certainly refuse to publish articles using this
methodological approach.
First of all, any measurement of a character contains a certain error.
Thus, even if we measure an ideally symmetrical organism, we can get an FA value
not equal to zero. Obviously, measurement error should be taken into account in
the calculations (Merila, Bjorklund, 1995; Bjorklund, Merila, 1997). This is
possible if each measurement is repeated (two or three times). The repeated
values are not averaged; instead, they are entered directly into an analysis of
variance (see: Van Dongen et al., 1999). From this analysis we learn whether the
measured FA really differs from zero or, in other words, whether we are
observing a true signal or merely noise.
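The idea of keeping the repeated measurements separate can be sketched as a
classical two-way "sides x individuals" analysis of variance in the spirit of
Palmer, Strobeck (1986). This is only a minimal illustration, not the REML
procedure of Van Dongen et al. (1999); the function name, the data layout and
the leaf values below are assumptions made for the example.

```python
from statistics import mean

def side_by_individual_anova(data):
    """data[i][s][r]: individual i, side s (0 = left, 1 = right), replicate r.

    Returns F ratios only; compare them with tabulated F values for the
    reported degrees of freedom. Needs at least two replicates per side.
    """
    n = len(data)
    sides = len(data[0])
    reps = len(data[0][0])

    grand = mean(x for ind in data for side in ind for x in side)
    ind_means = [mean(x for side in ind for x in side) for ind in data]
    side_means = [mean(x for ind in data for x in ind[s]) for s in range(sides)]
    cell_means = [[mean(side) for side in ind] for ind in data]

    # Sides: systematic left-right difference (directional asymmetry).
    ss_side = n * reps * sum((m - grand) ** 2 for m in side_means)
    # Sides x individuals: non-directional asymmetry (FA or antisymmetry).
    ss_int = reps * sum(
        (cell_means[i][s] - ind_means[i] - side_means[s] + grand) ** 2
        for i in range(n) for s in range(sides)
    )
    # Replicates within each side of each individual: measurement error.
    ss_err = sum(
        (x - cell_means[i][s]) ** 2
        for i in range(n) for s in range(sides) for x in data[i][s]
    )

    df_side = sides - 1
    df_int = (n - 1) * (sides - 1)
    df_err = n * sides * (reps - 1)
    ms_side, ms_int, ms_err = ss_side / df_side, ss_int / df_int, ss_err / df_err

    return {
        "F_asymmetry": ms_int / ms_err, "df_asymmetry": (df_int, df_err),
        "F_directional": ms_side / ms_int, "df_directional": (df_side, df_int),
        "ms_error": ms_err,
    }

# Hypothetical half-widths (mm) of four leaves, each side measured twice.
leaves = [
    [[20.0, 20.5], [22.0, 21.5]],
    [[18.0, 18.5], [17.0, 17.0]],
    [[25.0, 25.0], [24.5, 25.5]],
    [[21.0, 20.5], [23.0, 23.5]],
]
result = side_by_individual_anova(leaves)
print(result)
```

A large F_asymmetry means that the between-side variation clearly exceeds what
repeated measurement of the same side would produce; a value near 1 means that
the apparent asymmetry cannot be distinguished from measurement error.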
Underestimating measurement error may easily lead to wrong conclusions.
As an example I can refer to my own research, which was based on now outdated
but then generally accepted methods (Kozlov et al., 1996). In that
study we found a significant increase in the FA of birch (Betula pubescens
subsp. czerepanovii) leaves with increasing proximity to the North Nickel Plant
(Murmansk Region). Later, however, we found that birch leaf asymmetry does not
change along this pollution gradient (Valkama, Kozlov, 2001).
The error resulted from the insufficient accuracy of the measurements (rounding
to 1 mm, as recommended in the manual under review), combined
with the decrease in leaf size as we approached the plant. We mistook
the increase in relative measurement error for an increase in asymmetry. Modeling
showed that measuring each leaf twice with an accuracy of 0.5 mm would have
allowed us to avoid the mistake even with the outdated methods of data analysis,
while today's methods of analysis (Van Dongen et al., 1999) protect
studies from such mistakes.
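The mechanism described here (constant absolute error, shrinking leaves) can be
illustrated with a small simulation. It is a sketch under stated assumptions and
does not reproduce the modeling mentioned above: the leaves are perfectly
symmetrical, the reading error of 0.3 mm and the leaf sizes are invented, and
all function names are hypothetical.

```python
import math
import random
from statistics import mean

def round_to(value, step):
    """Round to the nearest multiple of `step` (halves round up)."""
    return math.floor(value / step + 0.5) * step

def apparent_relative_fa(mean_width_mm, step_mm=1.0, reading_sd_mm=0.3,
                         n_leaves=5000, seed=1):
    """Mean |L - R| / ((L + R) / 2) for leaves whose true asymmetry is zero."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_leaves):
        # Both halves of the leaf have exactly the same true width.
        true_half = rng.uniform(mean_width_mm - 2.0, mean_width_mm + 2.0)
        # Each side is read with a small random error, then rounded to the ruler step.
        left = round_to(true_half + rng.gauss(0.0, reading_sd_mm), step_mm)
        right = round_to(true_half + rng.gauss(0.0, reading_sd_mm), step_mm)
        scores.append(abs(left - right) / ((left + right) / 2.0))
    return mean(scores)

fa_large = apparent_relative_fa(30.0)  # large leaves, e.g. far from the source
fa_small = apparent_relative_fa(15.0)  # small leaves, e.g. close to the source
print(fa_large, fa_small)
```

Although no leaf in this simulation is asymmetrical, the size-scaled index
comes out higher for the smaller leaves, simply because the same absolute error
is divided by a smaller width.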
Secondly, there are three types of asymmetry: directional asymmetry, fluctuating
asymmetry and antisymmetry (Palmer, Strobeck, 1986). Of the three, only
fluctuating asymmetry can tell us something (and not always) about the stress
on an organism (Moller, Swaddle, 1997). Thus, the first stage of any analysis
should be to demonstrate that the observed deviations from ideal symmetry are
indeed fluctuating asymmetry (Palmer, Strobeck, 1986; Van Dongen et al., 1999).
This stage, described in detail in all 20 randomly selected English-language
publications from 1996 to 2001, is absent not only from the study but also from
the publication of the original results to which the authors refer (see:
Zakharov et al., 2000).
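A minimal sketch of this first stage, assuming the usual diagnostics on the
signed differences (left minus right): a mean that departs from zero points to
a systematic bias toward one side, while a flat or two-peaked distribution
(strongly negative excess kurtosis) points to individuals being consistently
one-sided without a preferred side. The function name and the example values
are hypothetical, and the statistics still have to be compared with the
appropriate critical values.

```python
import math
from statistics import mean, stdev

def asymmetry_diagnostics(differences):
    """Summary statistics of signed (left - right) differences."""
    n = len(differences)
    m = mean(differences)
    s = stdev(differences)
    # One-sample t statistic for the hypothesis "mean difference = 0".
    t_stat = m / (s / math.sqrt(n))
    # Moment-based excess kurtosis (0 for a normal distribution).
    m2 = sum((d - m) ** 2 for d in differences) / n
    m4 = sum((d - m) ** 4 for d in differences) / n
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return {"mean": m, "t_vs_zero": t_stat, "excess_kurtosis": excess_kurtosis}

# Invented examples: a consistent bias toward one side, and a two-peaked sample.
biased_sample = [0.9, 1.2, 0.7, 1.1, 1.0, 0.8, 1.3, 1.0]
two_peaked_sample = [-2.1, 1.9, -2.0, 2.2, 1.8, -1.9, 2.0, -2.2]

biased = asymmetry_diagnostics(biased_sample)
two_peaked = asymmetry_diagnostics(two_peaked_sample)
print(biased)
print(two_peaked)
```

Only when the mean difference is indistinguishable from zero and the
distribution shows no such departure from normality is it reasonable to treat
the deviations as fluctuating asymmetry.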
The authors' proposal to add up (in the first stages of data analysis!) the
FA values for several traits of one and the same object raises serious
objections. If the traits under analysis are correlated (as, for instance,
various measurements of a birch leaf blade), measuring several of them
gives no additional information compared with measuring just one.
If the traits vary independently of one another, summing them can
result in missing very important information or drawing erroneous conclusions:
if one of the traits responds clearly to the impact while none of the rest
do, averaging does the researcher a disservice. So summation as a means
of "condensing information", if it cannot be avoided, should
be applied only in the final stages of the analysis, taking into consideration
not only the average value but also the asymmetry level of
each trait separately.
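The second case can be made concrete with synthetic numbers. In the sketch
below, which assumes five independent traits of which only the first responds
to the impact, the standardized difference between sites is compared for that
trait alone and for the sum over all traits. Every value and name here is
invented for illustration.

```python
import random
from statistics import mean, stdev

N_TRAITS, N_LEAVES = 5, 200
rng = random.Random(7)

def standardized_difference(control, impact):
    """Difference of means divided by the pooled standard deviation."""
    pooled_sd = ((stdev(control) ** 2 + stdev(impact) ** 2) / 2.0) ** 0.5
    return (mean(impact) - mean(control)) / pooled_sd

def simulate_site(shift_first_trait):
    """Per-leaf asymmetry scores; only the first trait is shifted by the impact."""
    leaves = []
    for _ in range(N_LEAVES):
        scores = [rng.gauss(1.0, 0.3) for _ in range(N_TRAITS)]
        scores[0] += shift_first_trait
        leaves.append(scores)
    return leaves

control = simulate_site(0.0)
impact = simulate_site(0.5)

effect_trait_1 = standardized_difference(
    [leaf[0] for leaf in control], [leaf[0] for leaf in impact])
effect_summed = standardized_difference(
    [sum(leaf) for leaf in control], [sum(leaf) for leaf in impact])
print(effect_trait_1, effect_summed)
```

The absolute shift is the same in both comparisons, but the summed score also
carries the random variation of the four unresponsive traits, so the response
stands out far less clearly against the background.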
It is hard (almost impossible) to agree with the authors' proposal
to assess the significance of differences between samples using Student's t-test.
This method, of which Russian biologists are so fond, is no longer used for this
purpose in the West, where sample estimates of FA are compared by
analysis of variance (ANOVA) (Moller, Swaddle,
1997; Van Dongen et al., 1999).
And finally, in assessing anthropogenic impact on ecosystems, it is important
to choose the sampling sites correctly. Most studies
on this topic by Russian scientists are based on a comparison of samples
from only 3-5 sites located along a single gradient from one source
of pollution. The authors of the study use the same approach. Unfortunately,
this design does not allow one to distinguish the presumed impact of emissions
from the influence of other environmental factors not taken into account
by the researcher. For example, most research done near North Nickel is based
on 4-8 sampling sites located at different distances to the south of the plant.
In this case there is no logical basis for attributing changes (for instance,
a reduction in needle size) to pollution rather than to changes in local climate
from north to south, or to the impact of any other environmental factor. To
estimate the impact of any factor, one must compare data from at least two
independent samples from the presumably impacted area with two control samples.
For a point source of pollution, it is recommended to place sampling sites along
oppositely directed transects. Increasing the number of samples considerably
improves the reliability of the results. In other words, differences between
Impact and Control should be compared with the variability within each of these
groups.
The ten trees analyzed within one sampling site (as the authors propose)
cannot, with respect to the objective of the analysis, be considered independent
replicates; they are "pseudoreplicates", to use the term of S. H. Hurlbert
(1984), and there is only one genuine replicate in this case.
Unfortunately, the problem of pseudoreplication in ecological research,
which Western scientists have learned to deal with since S. H. Hurlbert's
publication (see: Heffner et al., 1996), remains unknown to Russian ecologists.
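One simple way to respect this in the analysis, sketched here on the assumption
that the same number of trees is measured at every site, is to reduce each site
to its mean and to compare Impact with Control against the variation among
sites. The function name and the FA values are hypothetical.

```python
from statistics import mean

def compare_groups_by_site_means(groups):
    """groups: {"control": [[tree values at site 1], ...], "impact": [...]}.

    Each site contributes one value (its mean), so the sites, not the trees,
    are the replicates. Returns the F ratio and its degrees of freedom.
    """
    site_means = {g: [mean(site) for site in sites] for g, sites in groups.items()}
    all_means = [m for means in site_means.values() for m in means]
    n_groups, n_sites = len(site_means), len(all_means)
    if n_sites - n_groups < 1:
        raise ValueError("with one site per group there are no genuine replicates")

    grand = mean(all_means)
    ss_between = sum(len(ms) * (mean(ms) - grand) ** 2 for ms in site_means.values())
    ss_within = sum((m - mean(ms)) ** 2 for ms in site_means.values() for m in ms)
    df_between, df_within = n_groups - 1, n_sites - n_groups
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, (df_between, df_within)

# Invented FA indices: two sites per group, three trees per site.
data = {
    "control": [[0.040, 0.050, 0.045], [0.060, 0.055, 0.050]],
    "impact": [[0.070, 0.080, 0.075], [0.065, 0.060, 0.070]],
}
f_stat, df = compare_groups_by_site_means(data)
print(f_stat, df)
```

With twelve trees but only four sites, the test has 1 and 2 degrees of freedom;
treating every tree as an independent replicate would claim 1 and 10, which is
exactly the overstatement that pseudoreplication produces.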
The study's list of literature on the developmental stability of different
organisms is a great puzzle: it includes only works by the authors of the study.
An unsuspecting reader could get the impression that the method had never been
used by anyone else, which is not true. Since we cannot suspect the authors
of not knowing the English-language literature, the reasons for not mentioning it
remain incomprehensible.
From my point of view, withholding my criticisms of this study would do
harm rather than good. Its relatively large print run (1,000 copies) and the
support of the Reserves Department of the Russian State Ecological Committee
(referred to in the Introduction) make me fear that the efforts of many reserve
specialists to whom the study is addressed may be wasted. Moreover, the results
of their work could lead to fallacious conclusions and become the basis for
unfounded decisions. For example, using the proposed methodology I could easily
"prove" that any given source of pollution has had no negative impact on the
environment.
In conclusion I would like to emphasize that my criticisms should not be viewed
as a call to stop using fluctuating asymmetry for estimating environmental quality.
On the contrary, I believe that this direction is quite promising, but only
given careful collection of the primary data and comprehensive analysis
of the data collected (Kozlov, Niemela, 1999; Kozlov et al., 2001;
Valkama, Kozlov, 2001). Unlike the authors of the study, I recommend
that you:
- be very careful when choosing sampling sites, planning at least two
  independent replicates for each level (or type) of impact under comparison;
- insist on maximum accuracy of measurement (at least 0.5 mm for objects with
  linear dimensions of 15-50 mm, such as birch leaves, and 0.1 mm for objects
  with linear dimensions of 3-15 mm, such as leaves of bilberry or dwarf birch);
- measure each object at least twice, and estimate the repeatability of the
  results and the measurement error from these independent measurements;
- analyze each trait of one and the same object separately;
- use modern methods of statistical analysis (mixed-model ANOVA) to distinguish
  between the three types of asymmetry and to demonstrate the statistical
  significance of the measured FA value;
- use analysis of variance (ANOVA) when comparing samples, with multiple
  comparisons where required (for example, Duncan's multiple range test);
- remember that a negative result (absence of FA changes) very often does
  not imply the absence of stress.
This critical analysis of the methodological guidelines was carried out within
the framework of the Vulnerability of Northern Ecosystems to Pollution and
Climate Change Project, supported by NorFA (Nordic Academy of Advanced Studies).
Literature
V.M. Zakharov, A.T. Chubinishvili, S.G. Dmitriev, A.S. Baranov, V.I. Borisov,
A.V. Valetsky, E. Y. Krysanov, N.G. Kryazheva, A.V. Pronin, E.K. Chistyakova.
The Health of the Environment: Assessment in Practice - Moscow Russian Ecological
Policy Center Edition, 2000. - 318 p.
Anne P., Mawri F., Gladstone S., Freeman C. D. Is fluctuating asymmetry
a reliable biomonitor of stress? A test using life history parameters in the
soybean // Int. J. of Plant Sci. - 1998. - Vol. 159. - P. 559-565.
Bjorksten T. A., Fowler K., Pomiankowski A. What does sexual trait FA
tell us about stress? // Trends Ecol. Evol. - 2000. - Vol. 15. - P. 163-166.
Bjorklund M., Merila J. Why some measures of fluctuating asymmetry are
so sensitive to measurement error? // Ann. Zool. Fen. - 1997. - Vol. 34. - P.
133-137.
Heffner R. A., Butler M. J.-IV, Reilly C. K. Pseudo-replication revisited
// Ecology. - 1996. - Vol. 77. - P. 2558-2562.
Hurlbert S. H. Pseudoreplication and the design of ecological field
experiments // Ecol. Monogr. - 1984. - Vol. 54. - P. 187-211.
Kozlov M. V., Niemela P. Difference in needle length - a new and objective
indicator of pollution impact on Scots pine (Pinus sylvestris) //
Water, Air, and Soil Pollution. - 1999. - Vol. 116. - P. 365-370.
Kozlov M. V., Wilsey B. J., Koricheva J., Haukioja E. Fluctuating asymmetry
of birch leaves increases under pollution impact // J. of Appl. Ecol. - 1996.
- Vol. 33. - P. 1489-1495.
Kozlov M. V., Zvereva E. L., Niemela P. Shoot fluctuating asymmetry
- a new and objective stress index in the Norway spruce (Picea abies)
// Can. J. of Forest Res. - 2001. - Vol. 31. - P. 1289-1291.
Merila J., Bjorklund M. Fluctuating asymmetry and measurement error
// Systematic Biol. - 1995. - Vol. 44. - P. 97-101.
Moller A. P., Swaddle J. P. Asymmetry, developmental stability, and
evolution. - Oxford: Oxford Univ. Press, 1997. - 291 p.
Palmer A. R. Waltzing with asymmetry // BioScience. - 1996. - Vol. 46.
- P. 518-532.
Palmer A. R., Strobeck C. Fluctuating asymmetry: measurement, analysis,
patterns // Ann. Rev. of Ecol. and Systematics. - 1986. - Vol. 17. - P. 391-421.
Valkama J., Kozlov M. V. Impact of climatic factors on the developmental
stability of the mountain birch growing in a contaminated area // J. of Appl.
Ecol. - 2001. - Vol. 38. - P. 665-673.
Van Dongen S., Molenberghs G., Matthysen E. A statistical analysis of
fluctuating asymmetry: REML estimation of a mixed regression model // J. of
Evol. Biol. - 1999. - Vol. 12. - P. 94-102.
Zvereva E. L., Kozlov M. V., Haukioja E. Stress responses of Salix
borealis to pollution and defoliation // J. of Appl. Ecol. - 1997. -
Vol. 34. - P. 1387-1396.
M. Kozlov,
Ecology Laboratory, Turku University, Finland