Taxonomic names are the only point of reference for most users of taxonomy, but are in fact pointers or labels for underlying taxonomic concepts. Names are determined by complex rules of nomenclature (including priority and typification) and reflect the taxonomic classification of the taxon of that name (i.e. the Taxon Concept). A consequence of these rules is that an identical name can be shared between different taxonomic concepts (for example, if a species is subdivided in a revised classification, one new concept may retain the old name). Furthermore, different names can apply to the same or overlapping sets of specimens according to different classifications. Such Taxon Concepts can be linked by a variety of 'synonymy' relationships, which connect a use of one name (i.e. a concept) to a use of another name. Such relationships between concepts are often expressed and interpreted as relationships between names. Use of species names can never be truly separated from a taxonomic classification because the rules of binomial nomenclature obscure the boundary between classification and nomenclature for taxon names below the hierarchical level of 'genus' (see for example Berendsohn (1998)).

Full taxonomic names should be modelled to include relevant author and citation information, although users will frequently omit or abbreviate part or all of the full name. Far richer models of names (see Figure 1) can include details on taxonomic rank and parent taxa (i.e. classification hierarchy), synonymy relationships, nomenclatural status, protonym/basionym, vernacular names, orthographical and typographical notes, and full citation (author and reference) information for all of this expanded information associated with a name. The LinneanCore schema provides one such extremely rich model of taxonomic name information which, although name based, clearly encapsulates full taxon concept information, where the uniqueness of a record (i.e. a unique taxon concept) depends on more than just the full name fields (see Figure 2). If only one 'hierarchy' is permitted for a name record, the Linnean Core NameRecord can record the essence of a Taxon Concept but differs from the TCML Taxon Concept by lacking the notion of circumscription (the included specimens or taxa (or possibly characters) that define a taxon). In fact, name-based representations of taxonomic hierarchies are typically modelled by classification according to parental inclusion, a taxon having a single parent, rather than by defining a taxon by its (many) circumscribing members.
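The contrast between parental inclusion and circumscription can be sketched in a few lines of Python. This is only an illustrative sketch: the class names, field names and specimen identifiers below are hypothetical and are not taken from LinneanCore or any TDWG schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NameRecord:
    """Name-based record: placed in a hierarchy through a single parent."""
    full_name: str
    rank: str
    parent: Optional["NameRecord"] = None


@dataclass
class CircumscribedTaxon:
    """Taxon defined by its members (specimens, taxa or characters)."""
    full_name: str
    members: frozenset = frozenset()


def lineage(record):
    """Follow the single-parent chain from a record up to the root."""
    names = []
    while record is not None:
        names.append(record.full_name)
        record = record.parent
    return names


genus = NameRecord("Ranunculus L.", "genus")
species = NameRecord("Ranunculus bulbosus L.", "species", parent=genus)

# Hypothetical specimen identifiers: two circumscriptions under one name.
broad = CircumscribedTaxon(
    "Ranunculus bulbosus L.", frozenset({"spec-1", "spec-2", "spec-3"})
)
narrow = CircumscribedTaxon(
    "Ranunculus bulbosus L.", frozenset({"spec-1", "spec-2"})
)

print(lineage(species))
print(narrow.members < broad.members)
```

The parent pointer says only where a name sits in one hierarchy; the member sets make it possible to see that the two usages of the same name differ in extent.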

The model of Nomenclature implemented by uBio as the basis of the NameBank layer of their NameServer is similar to (a simplified representation of) LinneanCore, with the 'subjective' layer in the model ('ClassificationBank') encapsulating the taxonomic classifications/opinions in LinneanCore. Thus NameServer currently provides purely nomenclatural resolution, which could be expanded to include analysis of classification and synonymy opinions. However, if name objects can be linked to multiple alternative classifications, the uniqueness of Taxon Concepts is less readily represented than in LinneanCore, and the name on its own is still a 'Weak Entity'.

Taxon Concepts

A Name can be modelled as simply an attribute of a Concept, a Name on its own being a weak entity (it is not inherently unique, even when fully represented with name, author, year and citation; see Figure 3). The TDWG Concept Schema model of Taxon Concepts represents names as required attributes of a Concept (Figure 4). Minimally, a 'Potential' Taxon Concept (sec Berendsohn) possesses a name and a 'usage' or source reference; the definition of this usage can be captured in the circumscription of the taxon.
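As a minimal sketch of this idea, the Python below treats the name as a required attribute and the usage reference as what turns a bare name into a 'Potential' Taxon Concept. The class, its field names and the source references are hypothetical illustrations, not elements of the TDWG Concept Schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaxonConcept:
    name: str                            # required attribute of every concept
    according_to: Optional[str] = None   # the 'usage' or source reference

    def is_potential_taxon(self) -> bool:
        """A potential taxon needs both a name and a usage reference."""
        return bool(self.name) and bool(self.according_to)

    def label(self) -> str:
        """Render the concept in 'name sec. reference' form when possible."""
        if self.according_to:
            return f"{self.name} sec. {self.according_to}"
        return self.name


# Hypothetical source references, used only for illustration.
a = TaxonConcept("Ranunculus bulbosus L.", "Flora A (hypothetical)")
b = TaxonConcept("Ranunculus bulbosus L.", "Flora B (hypothetical)")
bare = TaxonConcept("Ranunculus bulbosus L.")

print(a.label())
print(a == b, a.name == b.name)
```

Because identity here depends on the name together with its usage, `a` and `b` are distinct concepts even though they share an identical name, which is the sense in which the name alone is a weak entity.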

The TDWG Concept Schema (Figure 5, overview; Figure 6, detailed) includes both simple and detailed representations of names. The detailed representations of full names are included from the ABCD schema and vary according to Kingdom (i.e. Botanical, Bacterial, Zoological and Viral). This is not deemed necessary in LinneanCore which is designed to handle better quality names from 'Name Providers' rather than those included with specimen records.

Comparing the models of Names and Taxon Concepts (Figure 7), it is clear that the attributes of a minimal model of a name (i.e. based purely on the rules of nomenclature) are required in a Concept model, and if a name model includes 'opinions' on classification, it is then equivalent to a 'Potential Taxon' Concept. Name-based models, however, do not typically represent definition of taxa by circumscription, but rather by inclusion in name-based taxonomic hierarchies.

Are there Different 'Types' of Taxon Concepts?

Different databases or schemas will be able to supply variable numbers of the attributes of a Taxonomic Concept as defined in the TDWG transfer schema. It may be possible to measure the 'Quality' of a Taxon Concept by the number of attributes specified. For example, if only a name were supplied, even if this were a full 'scientific' name, the concept could be considered 'poorly' defined, as it would not capture any notion of meaning or usage attached to the bare name. If a usage of the name is captured by reference, this would be a better quality representation of a concept, as the citation detailing the usage could be examined. However, the best quality representations of a concept would explicitly define the concept by recording its circumscription by other concepts, character descriptions or biological specimens.
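One way a consumer might grade records along these lines is sketched below in Python. The dictionary keys and the three level names are assumptions made for this example; the transfer schema itself does not define them.

```python
def concept_quality(record: dict) -> str:
    """Grade a concept record by which attributes it supplies.

    Keys and level names are hypothetical, chosen for illustration only.
    """
    if not record.get("name"):
        return "invalid"  # a name is a required attribute of a concept

    # Any explicit circumscription counts as the best representation.
    circumscription_keys = (
        "included_concepts",
        "character_descriptions",
        "specimens",
    )
    if any(record.get(key) for key in circumscription_keys):
        return "circumscribed"

    # A usage captured by reference can at least be looked up and examined.
    if record.get("according_to"):
        return "referenced"

    return "name-only"


print(concept_quality({"name": "Ranunculus bulbosus L."}))
print(concept_quality({"name": "Ranunculus bulbosus L.",
                       "according_to": "Flora A (hypothetical)"}))
print(concept_quality({"name": "Ranunculus bulbosus L.",
                       "specimens": ["spec-1", "spec-2"]}))
```

The grading is ordinal rather than a raw count of attributes; a provider could equally score records by the number of populated fields.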

Different user groups may wish, or be constrained, to use a certain quality of Concept representation. It may not always be possible or valuable to record data identified by a high quality taxonomic concept. For example, a biologist may wish to collect or mark up data with a general notion of a 'buttercup', or the 'AIDS virus', without constraining identification to a specifically defined Taxonomic Concept. In these cases it may be useful to use only weak quality Concepts. The TDWG schema can accommodate these different 'types' of Concepts, according to which attributes are defined, without explicitly recording/distinguishing the different types of concepts in the transfer schema. (Rather, the types would be distinguished by users on the basis of the attributes recorded.)

The variety of 'Types' that are possible might include:

It should be possible to relate the weaker quality concepts to good quality concepts, in effect forming aggregations or lists of concepts (for example the Name Concept for Ranunculus bulbosus might link to all Concepts that contain a version of this name). However querying resources on the basis of such weak quality concepts would be equivalent to string searches on the indexed name attributes of Concepts.
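The aggregation described above can be sketched as a simple index from a normalised name string to the concepts that carry a version of it. The normalisation rule here (lower-cased genus plus epithet) is a deliberately crude assumption, and the source references are hypothetical.

```python
from collections import defaultdict


def name_key(full_name: str) -> str:
    """Crude normalisation: lower-cased first two tokens (genus + epithet)."""
    return " ".join(full_name.lower().split()[:2])


def aggregate_by_name(concepts):
    """Group (full_name, according_to) pairs under a shared name key."""
    groups = defaultdict(list)
    for full_name, according_to in concepts:
        groups[name_key(full_name)].append((full_name, according_to))
    return dict(groups)


# Hypothetical concepts, each a name plus the source that uses it.
concepts = [
    ("Ranunculus bulbosus L.", "Flora A (hypothetical)"),
    ("Ranunculus bulbosus", "Checklist B (hypothetical)"),
    ("Ranunculus acris L.", "Flora A (hypothetical)"),
]

groups = aggregate_by_name(concepts)
for key in sorted(groups):
    print(key, len(groups[key]))
```

Looking up `groups["ranunculus bulbosus"]` returns every concept whose name reduces to that string, which shows why such a query is no more discriminating than a string search on the name attribute.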

Users of data supplied in TDWG transfer schema format could develop business rules that allow them to distinguish different 'types' or 'qualities' of Taxon Concepts, and hence use the resources/data accordingly.

Nico Franz has set out a detailed list of possible Taxon Concept types, based on information content or on intellectual credit.

Nico's Concept types based on information content:

Again, by examining the attributes provided in data conforming to the TDWG transfer schema, a user could provide business rules to recognize these types. For example, a provider of a Name Resolution Service might wish to distinguish these types in order to control the quality of data retrieved by a query.

Figure 1: Modelling Names

A simple model for taxonomic names may represent little more than the concatenated formal name as a string, or this may be atomized into the constituents of a complex name, with more fields giving increasing complexity. Richer models, such as that of uBio's Taxonomic Name Service, include classification and synonymy relationships and may record the source of a particular usage of a name - i.e. possibly giving the name the status of a potential taxonomic concept.
(Key: unbroken lines to attribute denote common name fields used in all models of names; broken lines to attributes provide a much fuller representation of a name concept)

Figure 2: Linnean Core representation of a NameRecord (unexpanded)

Figure 3: An abstract model of a Taxon Concept

Figure 4: Taxon Concept Model

Figure 5: (Unexpanded) Representation of a Taxon Concept in the TDWG Concept Schema.

Figure 6: Representation of a Name in the TDWG Concept Schema (expanded to show inclusion of ABCD Name representation)

Figure 7: The overlap between Names and Concepts
