An Introduction to Mathematical Taxonomy

Overview

Students of mathematical biology discover modern methods of taxonomy with this text, which introduces taxonomic characters, the measurement of similarity, and the analysis of principal components. Other topics include multidimensional scaling, cluster analysis, identification and assignment techniques, and the construction of evolutionary trees. Familiarity with matrix algebra and elementary statistics is the sole prerequisite.

Product Details

ISBN-13: 9780486151366
Publisher: Dover Publications
Publication date: 04/30/2012
Series: Dover Books on Biology
Sold by: Barnes & Noble
Format: eBook
Pages: 160
File size: 5 MB

Read an Excerpt

An Introduction to Mathematical Taxonomy


By G. Dunn, B. S. Everitt

Dover Publications, Inc.

Copyright © 1982 Cambridge University Press
All rights reserved.
ISBN: 978-0-486-15136-6



CHAPTER 1

An introduction to the philosophy and aims of numerical taxonomy

1.1 Introduction

Classification of organisms has been a preoccupation of biologists since the very first biological investigations. Aristotle, for example, built up an elaborate system for classifying the species of the animal kingdom, which began by dividing animals into two main groups: those having red blood, corresponding roughly to our own vertebrates, and those lacking it, the invertebrates. He further subdivided these two groups according to the way in which the young are produced, whether alive, in eggs, as pupae and so on. Such classification has always been an essential component of man's knowledge of the living world. If nothing else, early man must have been able to realize that many individual objects (whether or not they would now be classified as 'living') shared certain properties such as being edible, or poisonous, or ferocious, and so on. A modern biologist might be tempted to remark that the earliest methods of classifying animals and plants by biologists were, like those of prehistoric man, based upon what may now be considered superficially similar features, and that although the resulting classifications could have been useful, for example for communication, they did not in general imply any 'natural' or 'real' affinity. However, one might then be tempted to ask what are 'superficially similar features' and what are 'real' or 'natural' classifications? Such questions and the attempts to answer them will be discussed in later parts of this chapter.


1.2 Systematics, classification and taxonomy

Before proceeding further it is necessary to introduce a number of terms which will be met frequently throughout the rest of the book. The definitions given here are intentionally brief; the full extent of the meaning of each term will become apparent during the remaining chapters.

Systematics: the scientific study of the kinds and diversity of organisms and of any and all relationships among them (Simpson, 1961).

Classification: the ordering of organisms into groups on the basis of their relationships. The relationships may be genetic, evolutionary (phylogenetic) or may simply refer to similarities of phenotype (phenetic).

Taxonomy: the theory and practice of classifying organisms (Mayr, 1969). (In the last two definitions it is important to distinguish classification, meaning the construction of classificatory systems, from the process of placing an individual into a given group, or the act of classifying, which is more properly referred to as identification; see Chapter 7.)

Once an ordering of organisms has been achieved one of course needs a means of referring to the classified groups; that is, one needs a convenient and informative method of nomenclature. The last word to be defined here is taxon. When one speaks of robins, lions, orchids or yeasts one is referring to the members of distinct groups of organisms, called taxa. A taxon is a taxonomic group of any rank that is sufficiently distinct to be worthy of being assigned to a definite category (Mayr, 1969). This definition implies that the delimitation of a taxon against other taxa of the same rank is virtually always subject to the judgement of the taxonomist.


1.3 The construction of taxonomic hierarchies by traditional and numerical taxonomy: comparison of methods

In order to summarize and make sense of the diversity of organisms the taxonomist customarily constructs a taxonomic hierarchy in which a taxon occupies a position in a nested scheme such as that given in Table 1.1, involving the classification of wolves, honeybees and common wasps. The hierarchy is intended to illustrate that different species within a given genus are more similar to one another than to species of other genera. Similarly, genera of one family are more similar to one another than they are to those of different families, and so on. Wolves are clearly not very similar to bees and wasps, but they are all classified as being members of the animal kingdom (Animalia), implying that they share some properties that are not characteristic of, say, members of the plant kingdom (Plantae). In addition, wasps and bees share properties not characteristic of wolves; that is, they are both Hymenoptera ('having membranous wings').
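
The nesting described above can be sketched as a small data structure. The following Python fragment is an illustration only: the rank names and lineages are the standard Linnaean placements of the three organisms and are an assumption here, not a reproduction of Table 1.1, and the function name lowest_shared_rank is hypothetical.

```python
# Ranks ordered from most inclusive to least inclusive
# (assumed standard Linnaean ranks, not copied from Table 1.1).
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]

# Illustrative lineages, one entry per rank in RANKS.
LINEAGES = {
    "wolf": ["Animalia", "Chordata", "Mammalia", "Carnivora",
             "Canidae", "Canis", "Canis lupus"],
    "honeybee": ["Animalia", "Arthropoda", "Insecta", "Hymenoptera",
                 "Apidae", "Apis", "Apis mellifera"],
    "common wasp": ["Animalia", "Arthropoda", "Insecta", "Hymenoptera",
                    "Vespidae", "Vespula", "Vespula vulgaris"],
}


def lowest_shared_rank(a, b):
    """Return (rank, taxon) for the least inclusive taxon containing both
    organisms, or None if they share no taxon at all."""
    shared = None
    for rank, taxon_a, taxon_b in zip(RANKS, LINEAGES[a], LINEAGES[b]):
        if taxon_a != taxon_b:
            break  # lineages diverge below this point
        shared = (rank, taxon_a)
    return shared


print(lowest_shared_rank("honeybee", "common wasp"))
print(lowest_shared_rank("wolf", "honeybee"))
```

Under these assumed lineages the bee and the wasp first meet at the order Hymenoptera, whereas the wolf meets either of them only at the kingdom Animalia, which is the pattern of nested similarity the hierarchy is meant to express.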

Taxonomy as a quantitative science is concerned with the problems of constructing such (usually) hierarchical structures, and in operation consists of essentially four separate stages. First one has to decide on what one wishes to classify. On the assumption that one is able to distinguish living from inanimate material (by studying the history of science one can see that the distinction is by no means trivial), one could select, for example, deoxyribonucleic acid (DNA) sequences, proteins, organisms, species, or some more complex groups. In order to do this one has to have previous knowledge, or a previous system of classification, or else one would not be able to distinguish animate from inanimate objects, animals from plants, or daisies from orchids. No modern classification occurs in the absence of such previously formed classifications; one's knowledge is always built on previous experience, whether one ultimately rejects the previous ideas or merely adds to them.

Next one decides on the choice of characters on which to base comparisons between the taxonomic units (referred to as operational taxonomic units, OTUs, by numerical taxonomists). Now, despite the fact that numerical taxonomists sometimes claim that they choose as many characters as possible (Sokal & Sneath, 1963), this clearly cannot be true. Both the traditional taxonomist and the numerical taxonomist are forced to make subjective decisions on what sort of characters to select for comparison, but while there may, in practice, be differences in the way they choose these characters, the real difference between the two approaches lies in what they then do with the resulting observations; that is, in the assessment of similarity between units and in the use made of these similarities to construct the final classification.

The traditional taxonomist makes intuitive or subjective decisions concerning similarity, which, he claims, are based upon experience, skill and perhaps insight. The numerical taxonomist, on the other hand, bases his comparisons on an estimate of a defined measure of similarity (see Chapter 3), which is objective in the sense that the measure can be re-estimated by a second taxonomist using a different set of observations; such a procedure has a further advantage in being open to criticism in a way that an intuitive, subjective decision cannot be.
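
What a 'defined measure of similarity' might look like can be sketched in a few lines of Python. The measure used below is the simple matching coefficient (the proportion of characters on which two units agree); it is chosen here only as a familiar example and is not necessarily the measure developed in Chapter 3. The OTU labels and the presence/absence scores are hypothetical.

```python
def simple_matching(x, y):
    """Proportion of characters on which two OTUs agree
    (both scored present or both scored absent)."""
    if len(x) != len(y) or not x:
        raise ValueError("character vectors must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(x, y) if a == b)
    return matches / len(x)


# Hypothetical presence (1) / absence (0) scores on five characters.
otus = {
    "A": [1, 1, 0, 1, 0],
    "B": [1, 1, 0, 0, 0],
    "C": [0, 0, 1, 0, 1],
}

for first, second in [("A", "B"), ("A", "C"), ("B", "C")]:
    print(first, second, simple_matching(otus[first], otus[second]))
```

Because the rule is written down, a second taxonomist can recompute the same quantity from new observations, or criticize the choice of rule itself.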

The final stage of the four is to make decisions concerning the classification of units on the basis of their previously assessed similarities. Again the traditional taxonomist will base such decisions on intuition, experience, and skill (he hopes!), whilst the numerical taxonomist resorts to a defined set of rules within one of the many cluster analysis techniques available (see Chapter 6). Which is the better method? A priori one cannot tell. However, there are situations where one can quite easily decide which is the easier, or more economical in terms of intellectual effort. For example, how does one effectively judge similarity between amino acid sequences of proteins without referring to a set of rules? Again, how does one assess a gradient or gradients of properties of characters across, for example, the British Isles, without resorting to some sort of defined quantitative measurements? It is in such situations and in many others that we feel that the methods of numerical taxonomy will be more applicable or more useful than the traditional approaches associated with the names of Linnaeus, Darwin or Mayr.
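
A 'defined set of rules' for this final stage can be illustrated with single-linkage agglomerative clustering, one of many possible techniques and not necessarily the one favoured in Chapter 6. In this Python sketch the unit labels, the similarity values and the function name single_linkage are all hypothetical.

```python
from itertools import combinations


def single_linkage(names, sim):
    """Agglomerative single-linkage clustering.

    sim maps frozenset({a, b}) to the similarity between units a and b.
    Returns the merges in order, each as (cluster_1, cluster_2, level).
    """
    def link(c1, c2):
        # Single linkage: clusters are as similar as their closest pair.
        return max(sim[frozenset((a, b))] for a in c1 for b in c2)

    clusters = [frozenset([n]) for n in names]
    merges = []
    while len(clusters) > 1:
        # Rule: always fuse the two most similar clusters.
        c1, c2 = max(combinations(clusters, 2), key=lambda pair: link(*pair))
        level = link(c1, c2)
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        merges.append((set(c1), set(c2), level))
    return merges


# Hypothetical pairwise similarities between three units.
similarity = {
    frozenset(("A", "B")): 0.8,
    frozenset(("A", "C")): 0.0,
    frozenset(("B", "C")): 0.2,
}

merges = single_linkage(["A", "B", "C"], similarity)
for step in merges:
    print(step)
```

The sequence of merges and the levels at which they occur define a hierarchy; anyone applying the same rule to the same similarities obtains the same result.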


1.4 The philosophy of taxonomy

Consider a hypothetical situation in which one is asked to classify individuals within each of the following groups: warblers, hawkweeds, enteric bacteria, viruses, neolithic ceramics and rocks. Are the methods of classifying rocks and ceramics applicable to the classification of living organisms? Are the methods of classifying warblers applicable to bacteria? Most biologists would answer 'no' to the first question, and many would give the same answer to the second. Why? Why do biologists often regard the classification of living material to be something special, needing its own particular logic or philosophy? It is not the purpose of this book to give final answers to these questions, but some discussion is needed since the numerical taxonomist explicitly denies that there are, or should be, any particular methodologies specifically applicable to the classification of the living world.

One does not have to read many textbooks on taxonomy to realize that there is no single underlying philosophy for this field, and one is tempted to conclude that 'anything goes' (Feyerabend, 1975). Much of the controversy appears to centre around the biologist's concept of a species. The typical view of a 'traditional' taxonomist (Simpson, 1961; Mayr, 1969) is that species (and often genera and higher taxa) are real entities that have to be discovered or revealed by the methods of classification. '... individuals do not belong in the same taxon because they are similar, but they are similar because they belong to the same taxon' (Simpson, 1961). The implication of such a view is that a particular classification is equivalent to a scientific theory, and so could be shown to be wrong. One particular difficulty of this belief is the necessity of producing a definition of species which is applicable to animals, plants and microorganisms. Most of the traditional views concerning the definition of a species are irrelevant when one considers bacteria and viruses. Surely, even if one accepts the view that classification of organisms is, or should be, different from the classification of rocks, one needs to have a philosophy of taxonomy that will apply to all of the living world, and not just to, say, animals.

An alternative view is that

Nature produces individuals and nothing more ... species have no actual existence in nature. They are mental concepts and nothing more ... species have been invented in order that we may refer to great numbers of individuals collectively. (Bessey, 1908)


Gilmour (1940) has summarized this alternative view of classification as follows:

The classifier experiences a vast number of sense data which he clips together into classes ... thus a class of blue things may be made for sense data exhibiting a certain range of colour, and so on ... the important point to emphasize is that the construction of these classes is an activity of reason, and hence, provided they are based on experienced data, such classes can be manipulated at will to serve the purpose of the classifier ... The classification of animals and plants ... is essentially similar in principle to the classification of inanimate objects.


This is the philosophy of the numerical taxonomist. The implication of the numerical taxonomist's approach is that the resulting classification can be neither right nor wrong. It is not a theory, but merely a way of summarizing information in an intelligible form. One assesses its value by consideration of its usefulness to other biologists. If one accepts this view one can quite easily accept that traditional (evolutionary) and numerical (phenetic) taxonomies can exist side by side. One does not judge the classificatory method on the a priori beliefs of the taxonomist, but on the usefulness of the results, a view endorsed by Ruse (1973):

A classification is a division based on a set of rules and, for this reason, is neither true nor false (which is what a theory is). This is not to deny that if, for example, evolutionary taxonomists can show that phenetic taxonomy is inferior to evolutionary taxonomy in its ability to enable taxonomists to summarize material or to predict things, then in this respect phenetic taxonomy is fair game. The proof of the pudding is in the eating, and if phenetic taxonomists cannot deliver what they claim to be able to deliver, then they are rightly open to criticism.


When assessing the utility of a particular approach to classification, one always has to bear in mind the reasons for which the classification was made. The first important role of any system of classification is as an aid to memory, particularly if the classification is hierarchic. Knowing where a particular taxon comes in a hierarchical scheme enables one to remember many of its characteristics (particularly if the characteristics are those which were originally used to construct the taxon concerned). The second role, very closely associated with the first, is as an aid to prediction of properties that have not been used to make the original classification. If, for example, one knows that orchids have a characteristic association with saprophytic fungi (a characteristic unlikely to have been used in the construction of the taxon Orchidaceae), it can be predicted with reasonable confidence that a plant identified as an orchid from its flower structure will also be growing in association with a fungus. Finally, an important function of any classification of the living world is its explanatory power, particularly with respect to the pathways of evolution. (This will be dealt with in greater detail in the next section.)

One argument that classifications produced by a numerical taxonomist fulfil these three roles at least as efficiently as those produced using traditional methods lies in the amount of information utilized by each approach: the numerical taxonomist tends to use more, and more diverse, characters on which to base his classification. (This argument will be developed in the next chapter; see section 2.2.)


1.5 Classification and inferences concerning patterns of evolution

Virtually all present-day biologists believe in two fundamental concepts pertaining to the scientific study of the living world. The first is that of evolution through natural selection. The second is that of a universal genetic code; that is, the concept that all of the information required for the development of an organism is contained in coded sequences of nucleotide bases in deoxyribonucleic acid (DNA), or occasionally, as in some viruses, in ribonucleic acid (RNA). Evolution can be thought of as either the evolution of populations of organisms or of populations of nucleotide sequences, or both. Individuals clearly do not evolve in the above sense since they do not survive for more than a few years, at the most. What can the results of taxonomy tell one about the patterns of evolution? Attempting to produce answers to this question is, intellectually, one of the most interesting uses to which a classification can be put, and it is here, perhaps, that one assesses the value of any particular method of classification.

It is vital that the student of evolution distinguishes phenetic relationships, which are based on the properties of organisms as they are observed now, from phylogenetic relationships, which describe the evolutionary pathways that have given rise to these organisms and their properties. The most important phylogenetic relationship is that expressed by a genealogy, and this is called a cladistic relationship. One can also define a genomic relationship between organisms based on the similarity of their DNA (or RNA) sequences. Now, one can use the methods of numerical taxonomy to classify organisms either on the basis of their phenetic relationships, or on the basis of their genomic relationships (or both). The latter can be obtained by the study of nucleotide sequences, or indirectly from the amino acid sequences of proteins. Finally, one hopes to infer phylogenetic or cladistic relationships from the resulting classification. It makes no difference to the argument whether the phylogenetic relationships are inferred from the classification itself or from the original distances or similarities; what is important is the fact that they are always inferred from phenetic or genomic relationships. Details of how this is done will be discussed later.
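
As a minimal illustration of a genomic relationship, one can compute the proportion of aligned sites at which two nucleotide sequences differ. The Python sketch below assumes that the sequences are already aligned and of equal length; the sequences, the labels and the function name p_distance are hypothetical, and the book's own treatment may use a different measure.

```python
def p_distance(seq1, seq2):
    """Proportion of aligned sites at which two equal-length sequences differ."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal length")
    differences = sum(1 for a, b in zip(seq1, seq2) if a != b)
    return differences / len(seq1)


# Hypothetical aligned nucleotide sequences for three organisms.
sequences = {
    "otu1": "ACGTACGTAC",
    "otu2": "ACGTACGTTC",
    "otu3": "TCGAACGATC",
}

for first, second in [("otu1", "otu2"), ("otu1", "otu3"), ("otu2", "otu3")]:
    print(first, second, p_distance(sequences[first], sequences[second]))
```

Distances of this kind describe only how the sequences compare now; any phylogenetic or cladistic relationship still has to be inferred from them.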

But how do the views of a zoologist such as Simpson differ from this? He claims that, since one knows that populations have evolved through a process of natural selection, one should assess one's system of classification from what is known about the past pathways of evolution, these being inferred from the study of, say, fossil evidence. The following statement taken from Simpson (1961) summarizes this point of view:

It is preferable to consider evolutionary classification not as expressing phylogeny, not even as based on it (although in a sufficiently broad sense that is true), but as consistent with it. A consistent evolutionary classification is one whose implications, drawn according to stated criteria of such classification, do not contradict the classifier's view as to the phylogeny of the group.


Hence the term 'evolutionary taxonomist'. The difficulty of this approach is the problem of assessing phylogeny independently of a system of classification, and the argument has been rejected by numerical taxonomists as circular (Sokal & Sneath, 1963). How does one use fossil evidence to infer pathways of evolution without first classifying the fossils and in some way assessing their similarity to living organisms?


(Continues...)

Excerpted from An Introduction to Mathematical Taxonomy by G. Dunn, B. S. Everitt. Copyright © 1982 Cambridge University Press. Excerpted by permission of Dover Publications, Inc.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.

Table of Contents

1. An introduction to the philosophy and aims of numerical taxonomy.
2. Taxonomic characters.
3. The measurement of similarity.
4. Principal components analysis.
5. Multidimensional scaling.
6. Cluster analysis.
7. Identification and assignment techniques.
8. The construction of evolutionary trees.
References.
Indexes.