Taxonomic Concept Standard Workshop, Edinburgh 12/05/2004

Key points

  1. General agreement that the overall structure of the schema seems appropriate and suitable for different groups.
  2. Some concern over the ultimate meaningfulness of comparing widely disparate treatments using concept models based on bibliography, specimens and descriptions.
  3. Recognition that Metadata elements should be unified with current SDD/ABCD metadata coordination, Voucher elements should reuse ABCD elements, and CharacterCircumscription elements should reuse SDD elements.
  4. Protocol issues need separate consideration prior to TDWG. There are three main transfer modes. The first is simple point-to-point transfer of data sets between applications (export/import); the second is a browse mode similar to the SPICE interface or the current SEEK test implementation; the third (probably) is a DiGIR-style search interface to locate concepts using any of the included elements (with some additional complexity for DiGIR in that the document structure includes three top-level containers). GBIF to liaise with SPICE and SEEK to try to establish a common model for the second. More work is needed to understand what a DiGIR-style interface may require.
  5. Discussion of GUIDs. SEEK currently prototyping a system based on Handle system, particularly because of its acceptability to journal community. LSIDs remain a clear alternative candidate.
  6. Continuing discussion of what events and judgments lead to new concepts. Unclear whether this needs any standardisation to make it useful.
  7. Main point in schema requiring immediate general input is determination of appropriate vocabulary of relationship types between concepts to ensure that different groups can map their synonymy (and other) relationships to the standard while still guaranteeing that it is possible to perform set logic on concepts in all suitable situations.
  8. GUIDs should be associated with concepts to ensure that common concepts can be related to one another. The grand challenge here is probably in managing identifiers for publications. If we can control identifiers for these, concept identifiers become rather trivial (a similar issue to controlling identifiers for collections as a means to simplify specimen identifiers).
  9. Important to get as much testing of transformation of individual data sets into the schema (and importing of data from schema) as is possible now. Recommendation for any projects represented to investigate prototyping use of the standard prior to TDWG 2004 to ensure that the standard is convincing.
  10. Next steps must include assimilation of immediate comments on schema and then circulation in wider community for comments and buy-in.

JessieKennedy, “Why do we need a taxonomic concept transfer standard and for whom?”
· Most taxonomy databases are name based. How to relate names not on synonymy lists? They are a major source of on-line taxonomic information.
· Some databases model taxonomic concepts, but the concepts differ and there is not much data.
· Taxonomic concepts needed for serious communication about taxa and to match names from disparate sources
· What a taxonomic concept is depends on perspective and usage. Need a common definition.
· GBIF/SEEK funding
· Consultation with major taxonomic database developers. Determine similarities and differences. Amend/extend abstract model into a transfer schema. Follow-on consultation and final version TDWG October 2004.
· Berlin Model, GBIF, IPNI/APNI, ITIS, Nomencurator, Prometheus, SEEK, Species 2000/BDWorld, Taxonomer, VegBank

· Three distinct but related areas in taxonomy: Classification, Nomenclature, Identification/determination. These are not kept distinct: names and taxa get confused, defining new taxa gets confused with data on identification or description.
· Names versus Concepts

· All names can be treated as concepts. · How do we define a concept?

· Aims of workshop

WalterBerendsohn · Why do we need these systems? The usability of the results is the key issue. The aim is to enable non-taxonomists to use the names.

Frank Bisby · For aggregation, need to be able to cover whole domains with a taxonomic treatment (know whether we have included all entities within a genus once and only once). ILDIS would regard itself as performing its own revision. Jessie: e.g. ITIS is taking other people’s concepts but it is not necessarily clear what their concept is.

RobertKukla, “Taxonomic Concept Transfer Schema”
· Transfer schema: taxonomic entities of interest, relationships between them
· Metadata element for human consumption
· Publications element based on simple EndNote-style structure
· Concept defined here as an opinion about a group of organisms, by a person/group of people, having a name/label, a definition, an available record, “time stamped”
· TaxonConcepts have @type attribute. May be own concept (original or revision). May be referenced (related to other authors’ concepts). May be vernacular.
· NameDetailed using ABCD element.
· Relationship @type attribute allows for (rarely found) direct assertion of relationship (boolean), synonymy via typification, lineage if derived from other concepts, vernacular.
· SpecimenCircumscription @type specimen (for holotype, etc.)
· TaxonConceptCircumscription has @type (one level higher in hierarchy than other @type attributes).
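The structure described in this talk can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the concept type values follow the notes, but the element names `Name` and `AccordingTo`, the attribute names `id`, `ref` and `toConcept`, and the overall XML layout are hypothetical and are not taken from the actual schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# Concept types named in the talk; the exact spelling of the values is assumed.
CONCEPT_TYPES = {"original", "revision", "referenced", "vernacular"}


@dataclass
class Relationship:
    type: str        # e.g. "synonymy", "lineage", "vernacular" (illustrative)
    to_concept: str  # id of the concept the relationship points at


@dataclass
class TaxonConcept:
    id: str
    type: str
    name: str
    according_to: str  # id of the publication that holds the opinion
    relationships: list = field(default_factory=list)

    def __post_init__(self):
        if self.type not in CONCEPT_TYPES:
            raise ValueError(f"unknown concept type: {self.type!r}")


def to_xml(concepts):
    """Serialise concepts into a small XML document (hypothetical layout)."""
    root = ET.Element("TaxonConcepts")
    for c in concepts:
        el = ET.SubElement(root, "TaxonConcept", {"id": c.id, "type": c.type})
        ET.SubElement(el, "Name").text = c.name
        ET.SubElement(el, "AccordingTo", {"ref": c.according_to})
        for r in c.relationships:
            ET.SubElement(el, "Relationship",
                          {"type": r.type, "toConcept": r.to_concept})
    return ET.tostring(root, encoding="unicode")


# Placeholder data: "Aus bus" is a made-up name, "pub1"/"pub2" made-up publications.
original = TaxonConcept("c1", "original", "Aus bus", "pub1")
revision = TaxonConcept("c2", "revision", "Aus bus", "pub2",
                        [Relationship("lineage", "c1")])
xml_text = to_xml([original, revision])
```

Here the revision carries a directed lineage relationship back to the original concept of the same name, which is one way to read the "lineage if derived from other concepts" case.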

Is schema available? From SEEK web site. Will be on NeSC site.

WalterBerendsohn: · Metadata standardisation will be handled in SDD workshop next week. Let’s defer until then. Specimens (“Units”) could use ABCD elements. SDD has description and circumscription issues. · You stated that Relationships are always directed; this assumes that there is always an expert who references earlier concepts.

JessieKennedy: Author of revision can just identify e.g. that his concept is congruent with several other people’s concepts. This is not an implementation model. Walter: Can allow relationships to be related to the author of the relationship.

JamesYtow · What of inter-regnal organisms with names under two codes? Jessie: treat these as two congruent concepts.

FrankBisby · Should the vernacular name have a structure (and hence support NameDetailed)? · Should there be a congruenceOrInclusion (or synonymy) relationship to cover the majority of cases? Bob Peet: the existing values are candidates to which many others could be added.

WalterBerendsohn · Need to include all nomenclatural relationships. Some of this information might be best placed in name part rather than relationship part.

DonaldHobern · Are all of these relationships suitable for inclusion in a single attribute? Jessie: use multiple relationships

JamesYtow: · Need locale for vernacular names

WalterBerendsohn · Each revised concept has two author strings.

FrankBisby · How do Relationships and TaxonConceptCircumscription relate to each other? Don’t they blur into each other? Both are set relationships. Jessie: can find e.g. that two genera are stated to be equivalent, but that included species represent different sets.
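A minimal sketch of the distinction Jessie draws, assuming each circumscription can be reduced to a set of member identifiers. The relationship labels and the function names are illustrative only; the notes leave the actual vocabulary of relationship types open.

```python
def set_relationship(a, b):
    """Relationship of circumscription a to circumscription b, as sets of member ids."""
    a, b = frozenset(a), frozenset(b)
    if a == b:
        return "congruent"
    if a > b:
        return "includes"
    if a < b:
        return "is included in"
    if a & b:
        return "overlaps"
    return "excludes"


def check_assertion(asserted, a, b):
    """Compare an author's asserted relationship with the one implied by the members."""
    implied = set_relationship(a, b)
    return asserted == implied, implied


# Two genus concepts stated to be equivalent, with placeholder species ids.
genus_x = {"sp1", "sp2", "sp3"}
genus_y = {"sp1", "sp2"}
agrees, implied = check_assertion("congruent", genus_x, genus_y)
print(agrees, implied)  # prints: False includes
```

A disagreement of this kind need not be an error in the data. It records that the asserted relationship and the circumscription carry different information, which is one reason to keep both in the schema.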

Will this be mapped to OWL or some other semantic language? Jessie: This would be an enormous job for which we don’t have the necessary information. Dave Thau: We have looked into doing this for the schema (without the data), but the usefulness of such a representation is unclear. OWL representations are hard to query.

DonaldHobern · Any thoughts on bidirectional links to simplify processing these large documents (moving up hierarchy)? Jessie: Intended purely as a transfer schema.

DaveThau, “Globally Unique Identifiers, why, where and how?”

· What?

· Why?

· When?

· Which?

· What now?

· Discuss

AlexGray · Note that uniqueness is only within context defined for given identifier type.
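One way to picture this point, together with the key point that controlled publication identifiers make concept identifiers almost trivial, is the sketch below. Everything in it is an assumption made for illustration: the `Identifier` class, the `concept_identifier` helper, and the identifier strings, which are placeholders and not valid Handle or LSID syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    scheme: str  # identifier type, e.g. "handle" or "lsid"
    value: str   # string that is unique only within that scheme


def concept_identifier(name, publication):
    """Derive a concept identifier from a name and a publication Identifier.

    Assumption for illustration: a concept is pinned down by a name as used in
    one publication, so a controlled publication identifier does most of the work.
    """
    return Identifier("concept", f"{publication.scheme}:{publication.value}#{name}")


# Same local string under two identifier types: these are different identifiers.
pub_a = Identifier("handle", "local-0001")
pub_b = Identifier("lsid", "local-0001")

concept_a = concept_identifier("Aus bus", pub_a)
concept_b = concept_identifier("Aus bus", pub_b)
```

Because the scheme is part of the key, the two publications and the two concepts derived from them stay distinct even though the local strings coincide.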

RobertKukla, “Experience from Mapping Existing Models to the Transfer Schema”
· ITIS plants, Berlin mosses (both text files), Taxonomer fishes (Access database)
· Imported into MySQL
· Java program to generate XML
· 3 main aspects: identifying concepts, extracting relationships, concept details
· No CharacterCircumscription or SpecimenCircumscription information
· No hybrids as implications are not fully understood
· ITIS: 97741 plants, 206649 concepts, ITIS’ own concepts (usage="accepted" -> type="revision"), synonyms (usage="not accepted" -> type="referenced"), vernaculars (type="vernacular"), concept circumscription (parent_tsn field), synonymy (explicit + vernaculars), lineage relationships (to concept of same name according to different publication), NameSimple calculated
· Berlin: 24368 concepts, explicit concept relationships and name-synonymy, many different relationship types (some very rare)
· Taxonomer: Parent links, but no relationships

Protocol questions

· Need an interface like the SPICE interface to allow users to find concepts (treewalking, etc.). SPICE itself is a candidate.
· SEEK has an early implementation of an API using the Napier schema.
· Bring these together to see what can be done to resolve them
· May have three different kinds of use for schema:
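Returning to the mapping experience reported in this talk, the ITIS rules can be made concrete with a short sketch. The usage-to-type mapping and the `parent_tsn` field come from the notes; the row layout, the `tsn` and `name` keys, and the function name are assumptions, and the rows are made-up placeholders rather than real ITIS records. The actual conversion was done with a Java program, which is not reproduced here.

```python
# Mapping of the ITIS usage value to the schema's concept @type, as reported in the talk.
USAGE_TO_TYPE = {"accepted": "revision", "not accepted": "referenced"}


def map_itis_rows(rows):
    """Turn ITIS-like rows into concepts plus parent -> children circumscriptions."""
    concepts = {}
    circumscriptions = {}
    for row in rows:
        concepts[row["tsn"]] = {
            "name": row["name"],
            "type": USAGE_TO_TYPE[row["usage"]],
        }
        parent = row.get("parent_tsn")
        if parent is not None:
            # The parent concept is circumscribed by its child concepts.
            circumscriptions.setdefault(parent, []).append(row["tsn"])
    return concepts, circumscriptions


# Placeholder rows; the numbers and names are invented for the example.
rows = [
    {"tsn": 1, "name": "Aus", "usage": "accepted", "parent_tsn": None},
    {"tsn": 2, "name": "Aus bus", "usage": "accepted", "parent_tsn": 1},
    {"tsn": 3, "name": "Aus cus", "usage": "not accepted", "parent_tsn": None},
]
concepts, circumscriptions = map_itis_rows(rows)
```

Synonymy and lineage relationships, which the talk also extracted, are left out of this sketch to keep it short.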

Question and Answer

FrankBisby · Healthy level of unanimity in acceptance of basic model (with possible exception of what a concept is)

DonaldHobern · The acceptability of the standard may be precisely because it does not seek to overdefine this.

SallyHinchcliffe · Important question is how compatible the concepts may be at the end of the day and hence how useful it may be to bring the datasets together.

FrankBisby · Lots of work still in getting a complete list of relationship types