I think we should also consider whether we need anything beyond the existing id attribute as an identifier for the record. IPNI clearly has identifiers for each name in the index, and it would be simple to use those for the id in the schema. However, I really hope we get to develop a GUID model for concepts in the near future, so it may be helpful to rename the attribute to "key" and include an <Identifiers> element inside <TaxonConcept>, containing <Identifier> subelements. I'd suggest that we then need an attribute on the <Identifier> element to classify the identifier, something like "source" or "type". In the long run we will need a better way to control the content of this attribute, but for now it gives us a way to include a label such as "IPNI" or "LSID" to help us know what we are dealing with.
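
As a rough sketch of what this suggestion might look like in a document: the element and attribute names follow the proposal above ("source" is used here, though "type" would work equally well), and the identifier values are invented placeholders, not real IPNI or LSID identifiers.

```xml
<!-- Sketch only: "key" replaces "id" as the file-local identifier. -->
<TaxonConcept key="tc1">
  <Identifiers>
    <!-- "source" labels the kind of identifier; both values are hypothetical. -->
    <Identifier source="IPNI">EXAMPLE-IPNI-ID</Identifier>
    <Identifier source="LSID">urn:lsid:example.org:concepts:EXAMPLE</Identifier>
  </Identifiers>
</TaxonConcept>
```

The local key only has to be unique within the file, while any number of external identifiers can sit alongside it.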

RobertKukla: I am not 100% sure I am with you here. The id field only needs to be unique within the file; its origin is not relevant. I don't understand why it would need to be renamed to key.

If you want to store a back reference to where the data originally came from, using the source's internal ID (e.g. IPNI's), that might be useful (e.g. for merging data files). The source is currently part of the metadata.

It is not guaranteed that such an ID exists, nor what form it takes. So again, I would add it as and when required, in a different namespace.

But do you anticipate linking back to multiple data sources? I am not sure what the scenario for such a requirement would be.


I was trying to cast this requirement in a general form that does not prejudge what we (SEEK, GBIF and others) end up doing about globally unique identifiers. My point is that we need to plan now for a place in which we can include a GUID for each concept, where that GUID may very well not be the same as the local id for the concept (or whatever internal keys get generated to build the XML documents). The only reason I suggested renaming id to key was to avoid unnecessary confusion between such local identifiers and anything we place in a GUID element. I believe it is essential for us to include such an element in the base schema rather than leaving it to be handled ad hoc, in different ways, in other namespaces.


Could you give an example scenario "where that GUID may very well not be the same as the local id for the concept"? In my opinion if the (local) file contains the record of a concept that is also available via a GUID it should use this GUID as its (local) ID within the file.

I can probably be convinced that there is a need to record what kind of ID it is (maybe by an additional attribute), although I feel the format of the GUID should itself convey that information (URI vs number).