An e-publication by the World Agroforestry Centre

METEOROLOGY AND AGROFORESTRY

Section 4: Measurement and analysis of agroforestry experiments

Sampling procedures

S. Langton

Statistics Department
Rothamsted Experimental Station
Harpenden, Herts AL5 2JQ, England

 

Abstract

Where large plantations are being studied, it is necessary to use some form of sampling. The various methods available (e.g., simple random sampling, stratified random sampling and systematic sampling) are discussed. Particular attention is paid to the situation where climatic variables can be used as a basis for stratification.


Why sample?

In any scientific investigation there will be situations where it is advantageous to use some form of sampling. There are three basic reasons why this may be the case:

  1. Complete enumeration might take too long or be too expensive due to the large number of measurements needed; e.g., we might wish to find the average height of the trees in a 10,000 ha forest;

  2. Destructive measurements: for example, the true germination rate of a batch of seeds could be found by growing them in a laboratory, but this would leave none to sow in the field;

  3. In some circumstances sampling may actually give more reliable results than complete enumeration. This may arise if the smaller number of measurements allows more care to be taken over each or permits the use of more sophisticated equipment, leading to more accurate measurements.

Definitions

In everyday speech the words 'accuracy' and 'precision' are nearly synonymous, but in the context of sampling they have different meanings. Accuracy refers to the success in estimating the true value, whereas precision describes how closely clustered the values of successive estimates are. To make this clearer, consider the measurement of wind speed. If we asked four people to guess the speed and took an average of their guesses, the result might, by chance, be very accurate (i.e. close to the true speed), but would probably be imprecise (the four estimates are likely to be very different). On the other hand, if we took four readings using an anemometer which always under-recorded the speed by 20 per cent, the results might be precise, but not very accurate. The latter result could also be described as biased; that is, it contains a systematic error. Whether precision or accuracy is more important depends on the situation being investigated, although both properties are obviously desirable.
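The distinction can be illustrated numerically. The following is a minimal sketch in Python, assuming a hypothetical true wind speed of 10 m/s and invented values for the four guesses and the four anemometer readings; none of these figures comes from a real experiment. Bias is taken as the mean error, and spread as the sample standard deviation.

```python
import statistics

TRUE_SPEED = 10.0  # hypothetical true wind speed in m/s

# Four hypothetical guesses: widely scattered, but centred on the truth.
guesses = [6.0, 9.0, 11.0, 14.0]

# Four hypothetical readings from an anemometer that under-records
# the speed by about 20 per cent.
readings = [7.9, 8.0, 8.0, 8.1]


def summarize(values, true_value):
    """Return (bias, spread): the mean error and the sample standard deviation."""
    bias = statistics.mean(values) - true_value
    spread = statistics.stdev(values)
    return bias, spread


guess_bias, guess_spread = summarize(guesses, TRUE_SPEED)
meter_bias, meter_spread = summarize(readings, TRUE_SPEED)

print(f"guesses:    bias = {guess_bias:+.2f}, spread = {guess_spread:.2f}")
print(f"anemometer: bias = {meter_bias:+.2f}, spread = {meter_spread:.2f}")
```

With these invented numbers the guesses are accurate on average but imprecise, while the anemometer readings are precise but biased.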

In a sampling procedure the objects or sites on which individual measurements are made are termed sampling units, and the population consists of all such sampling units. A sample is a group of sampling units selected from the population, on which measurements are made.


Simple random sampling

In simple random sampling not only does each sampling unit in the population have the same chance of inclusion in the sample, but each combination of n units (where n is the sample size) is equally likely to be included. This is achieved by randomly selecting each unit in the sample from amongst the whole population.

The method is best considered by taking an example. Suppose there are forty shade trees in a section of a coffee plantation and we wish to make a variety of measurements at the tree-crop interfaces of a sample of these trees. (Thus in this case the sampling units are the trees or, more specifically, the forty interfaces.) The stages in taking a simple random sample are:

  1. Decide on the population. This might be all forty trees, or we might wish to exclude trees on the edge of the plantation, for example.

  2. Decide on the sample size. This will depend on the equipment available and the precision required for the final estimates.

  3. Select the sample by a random method (usually by using tables of random numbers). Beware of semi-random methods, such as throwing quadrats; these lead to bias.

  4. Take the measurements and calculate means and standard errors.
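The four stages above can be sketched in Python as follows. This is an illustration only: the tree measurements are placeholder values generated inside the script, the sample size of 8 is an arbitrary choice, and a pseudo-random number generator stands in for tables of random numbers. The standard error includes the usual finite population correction, 1 - n/N, which matters when the sample is a sizeable fraction of a small population.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable; omit in real use

# Step 1: the population. Trees are numbered 1 to 40; the measurements are
# made-up placeholder values standing in for real field data.
population = {tree: 20.0 + random.gauss(0, 3) for tree in range(1, 41)}

# Step 2: the sample size (a hypothetical choice).
n = 8

# Step 3: select the sample at random, without replacement, so that every
# combination of n trees is equally likely.
sample_ids = random.sample(sorted(population), n)

# Step 4: the mean and its standard error, with the finite population correction.
values = [population[tree] for tree in sample_ids]
mean = statistics.mean(values)
N = len(population)
se = math.sqrt((1 - n / N) * statistics.variance(values) / n)

print("trees sampled:", sorted(sample_ids))
print(f"mean = {mean:.2f}, standard error = {se:.2f}")
```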

Systematic sampling

One problem with simple random sampling is that coverage can be very uneven. Thus in the example of the last section all the trees selected to form the sample might come from the same corner of the plantation. An obvious way to avoid this is to select a regular pattern of sampling units (e.g., every fifth tree); this is systematic sampling. To give another example, in a soil survey of an area of land a grid of squares would be drawn on the map and a soil sample taken from the same position within each square. To avoid all risk of bias, the particular systematic sample chosen must be selected at random. The simplest way of doing this is to choose one sampling unit at random from amongst the whole population, and then to select the other units in the sample relative to it.
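A minimal Python sketch of this selection, assuming the sampling units are numbered 1 to N along a line and using a hypothetical helper name, `systematic_sample`: one unit is drawn at random from the whole population, and every unit a whole number of intervals before or after it is then included.

```python
import random


def systematic_sample(population_size, interval, rng=random):
    """Return the unit numbers (1-based) of a systematic sample.

    One unit is drawn at random from the whole population; every unit that
    lies a whole number of intervals before or after it is then included.
    """
    start = rng.randint(1, population_size)
    first = (start - 1) % interval + 1  # earliest unit in the same pattern
    return list(range(first, population_size + 1, interval))


# Every fifth tree out of forty, with a seeded generator for repeatability.
sample = systematic_sample(population_size=40, interval=5, rng=random.Random(42))
print(sample)
```

When the population size is not a multiple of the interval, the sample size in this sketch varies by one unit depending on the starting point.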

The advantages of systematic sampling are as follows:

  1. Its even coverage will frequently lead to greater precision than a simple random sample;

  2. Once the random starting point has been chosen the process is very simple to operate;

  3. The information obtained can easily be used to construct a map, if desired;

  4. The population size need not be exactly known at the start.

The chief disadvantage of systematic sampling is that the precision cannot be estimated reliably from the sample itself, and so no reliable standard error can be quoted for the mean. Approximate formulae, which give some idea of the precision subject to certain conditions, are to be found in Yates (1981) and Cochran (1977). The importance of knowing the precision depends on the purpose of the experiment and the nature of the variable being measured; the experimenter must weigh this against the likely advantages before deciding to adopt systematic sampling.

One situation where systematic sampling should be used with care is when there is some form of periodicity or regular variation amongst the sampling units. For example, it is obviously unwise when sampling soil on terracing 10 m wide to adopt a systematic sample at 10 m intervals (or 20 m or 30 m, etc.), since all samples will be at the same position relative to the terraces. Even when the sampling interval is not particularly close to the scale of periodicity (or a multiple of it) there is some risk of bias, especially in small samples. Unaligned systematic sampling may help in this situation (see Cochran 1977).
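The terracing example can be made concrete with a small Python sketch. The soil-depth function below is entirely invented: it assumes a depth that varies smoothly across each 10 m terrace about an average of 50 cm. The function names and the 210 m transect length are likewise assumptions made for illustration.

```python
import math
import statistics

TERRACE_WIDTH = 10.0  # metres, as in the example above


def soil_depth(x):
    """Hypothetical soil depth (cm) at distance x metres across the terraces.

    The depth is assumed to repeat every TERRACE_WIDTH metres, with a true
    average of 50 cm over any whole number of terraces.
    """
    return 50.0 + 20.0 * math.sin(2 * math.pi * x / TERRACE_WIDTH)


def systematic_mean(start, interval, length=210.0):
    """Mean depth from samples taken every `interval` metres from `start`."""
    positions = []
    x = start
    while x < length:
        positions.append(x)
        x += interval
    return statistics.mean(soil_depth(p) for p in positions)


# A 10 m interval hits the same position on every terrace.
print(f"10 m interval: {systematic_mean(2.5, 10.0):.1f} cm")
# A 7 m interval visits different positions on successive terraces.
print(f" 7 m interval: {systematic_mean(2.5, 7.0):.1f} cm")
```

In this contrived case the 10 m interval overestimates the average depth by the full amplitude of the variation, while the 7 m interval happens to average out exactly. With real data and small samples, an interval that is not a multiple of the period still carries some risk of bias, as noted above.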


Stratified random sampling

Neither of the methods discussed so far exploits the structure of the population. Stratified random sampling divides the units into two or more strata, grouping similar units into the same stratum. Separate estimates are then obtained for each stratum, usually using simple random sampling, and these are finally combined to give an overall estimate. Provided the sampling units within a stratum really are less variable than units in different strata, this method provides a more precise estimate than does simple random sampling. In addition, the method is very useful where separate estimates are in any case needed for different sections of the population, or where it is desirable to divide the population for administrative reasons. One slight disadvantage is that it is necessary to know the sizes of the strata.
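The way the stratum estimates are combined can be sketched in Python. The sketch assumes simple random sampling within each stratum and uses the standard textbook estimator: each stratum mean is weighted by the stratum's share of the population, and the variance terms are weighted by the square of that share. The function name `stratified_estimate` and the tree heights are hypothetical.

```python
import math
import statistics


def stratified_estimate(strata):
    """Combine per-stratum samples into an overall mean and standard error.

    `strata` is a list of (stratum_size, sample_values) pairs, where
    stratum_size is the number of sampling units in the stratum.
    """
    total_units = sum(size for size, _ in strata)
    mean = 0.0
    variance = 0.0
    for size, values in strata:
        weight = size / total_units  # the stratum's share of the population
        n = len(values)
        mean += weight * statistics.mean(values)
        fpc = 1 - n / size  # finite population correction
        variance += weight ** 2 * fpc * statistics.variance(values) / n
    return mean, math.sqrt(variance)


# Hypothetical tree heights (m) sampled from two strata of 30 and 10 trees.
strata = [
    (30, [4.0, 5.0, 6.0]),
    (10, [9.0, 10.0, 11.0]),
]
overall_mean, overall_se = stratified_estimate(strata)
print(f"mean = {overall_mean:.2f} m, standard error = {overall_se:.2f} m")
```

Note that the stratum sizes enter the calculation directly, which is why they must be known.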

Many different factors can be used to stratify the population; for a large survey (e.g., over a whole region) it might be possible to use climatic data, where these exist. Otherwise factors such as geographic area, soil type, height above sea level, or land ownership can be used. Satellite photographs have formed the basis for stratification in some studies. On a smaller scale, any past or present knowledge of the site can be used. Where the edge of a plantation or plot is different from the centre, this can possibly be treated as a separate stratum.

Formulae for the estimates and their standard errors can be found in any of the books recommended below. These books also contain advice and formulae relating to the number of units that should be sampled from each stratum.
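As one illustration of such a rule, the Python sketch below implements proportional allocation, in which each stratum receives a share of the sample proportional to its size. This is only one common choice and is not necessarily the one the books recommend for a given survey; allocations that also take account of stratum variability or cost are alternatives. The function name and the stratum sizes are hypothetical.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Split total_sample across strata in proportion to stratum size.

    Largest-remainder rounding is used so that the allocations are whole
    numbers that sum to total_sample.
    """
    population = sum(stratum_sizes)
    exact = [total_sample * size / population for size in stratum_sizes]
    allocation = [int(share) for share in exact]
    shortfall = total_sample - sum(allocation)
    # Hand the leftover units to the strata with the largest fractional parts.
    by_remainder = sorted(
        range(len(exact)), key=lambda i: exact[i] - allocation[i], reverse=True
    )
    for i in by_remainder[:shortfall]:
        allocation[i] += 1
    return allocation


# Hypothetical strata of 460, 330 and 210 units, with 20 units to be sampled.
sizes = [460, 330, 210]
print(proportional_allocation(sizes, 20))
```

The sketch does not guarantee that every stratum receives at least two units, which are needed if a variance is to be calculated within each stratum; very small strata may need their allocation adjusted by hand.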


Discussion

In conclusion, the relative merits of the methods discussed can be summarized as follows: Simple random sampling is adequate for populations that are fairly homogeneous, particularly where the total number of units is small. Stratified random sampling, on the other hand, should be used when the population is heterogeneous, but can be divided into more homogeneous sub-populations. Systematic sampling is a good procedure to adopt when even coverage is important, but is not appropriate if exact knowledge of the precision is required, and should be used with care when there is any regular variation amongst the sampling units.

This review of sampling procedures is by no means exhaustive, and the following topics would repay some study for those doing a lot of sampling: multistage sampling, cluster sampling, and ratio and regression estimators (including double sampling as a means of updating surveys, which is described in Freese (1962)).


Acknowledgement

This contribution was prepared while the author was funded by the U.K. Overseas Development Administration.


References and further reading

The following books are readable, and do not demand a high level of mathematical knowledge:

Freese, F. 1962. Elementary forest sampling. U.S. Department of Agriculture Handbook No. 232.

Freese, F. 1984. Statistics for land managers: an introduction to sampling methods and statistical analysis for foresters, farmers and environmental biologists. Paeony Press. (This is the same as the 1962 Freese book.)

Sampford, M.R. 1962. An introduction to sampling theory. Edinburgh and London: Oliver & Boyd. (This book was written for agriculturists, and is highly recommended.)

Webster, R. 1977. Quantitative and numerical methods in soil classification. Oxford: Clarendon Press. (Chapters 4-6 cover sampling, with particular reference to soil sampling.)

When further details are required the books below should be consulted. However, they are not nearly as easy to read as the above and are therefore of much less use to the beginner.

Cochran, W.G. 1977. Sampling techniques. New York and London: John Wiley and Sons.

Yates, F. 1981. Sampling methods for censuses and surveys. London: Griffin.

Finally, the following guide, which is not yet published, should be a very useful source of advice on the practical details of sampling in a forest:

Adlard, P.G. (in preparation). Guide to the establishment, measurement and analysis of permanent sample plots. Oxford: Forestry Institute.