SallyHinchcliffe 5/8/2004 :

Quick question - I'm just trying to generate some dummy data for the seek/napier part of the schema. From what I can see, the way that relationships between taxa are modelled is via the ReferenceType, which the annotation describes as being 'an entity defined elsewhere'. Does that mean elsewhere in the file, or somewhere else entirely? If it's in the file, you will end up getting a list like this:

TaxonConcept A (refers to TaxonConcept F and G)
TaxonConcept B (refers to TaxonConcept H)
TaxonConcept C (refers to TaxonConcept I)
TaxonConcept D
TaxonConcept E (refers to TaxonConcept J & K)
TaxonConcept F (here for completeness)
TaxonConcept G (ditto)
TaxonConcept H etc ...

TaxonConcepts A-E are the taxon concepts that were actually asked for. The rest of them are just there to make the references work. (And presumably there's nothing to stop them recursing into other concepts until ultimately the entire data set is included ... ) I can't see how your calling software is going to distinguish between the 'real' taxon concepts, i.e. the families or genera or species that you asked for, and the rest of them...

I'm sure this can't be right, but I'm struggling with reading these schemas. Are there any example XML documents anywhere that show how these schemas should be used, and how these onward references are supposed to be handled? I couldn't find anything on the wiki.

DonaldHobern 7/8/2004:

I'm so glad you're doing this... This is the reason why I was wondering whether or not to try including any nomenclatural relationships in the "slimline" response.

I think that all of the documents I have seen so far have been representations of complete databases, so this has not been an issue.

We have three basic possibilities:

1. Include all referenced names as extra TaxonConcept elements.
2. Include no references to such names at all.
3. Include only references to such names and require the user to retrieve them separately if they want more information.

Number 1 may become very verbose (and could in some situations lead to massive amounts of drag-along data). It is also a pity that there is no way to tell which TaxonConcepts are returned because they match a request and which are just there for completeness.

Number 2 seems unsatisfactory since it does not give any indication of the status of a name.

Number 3 may not result in a valid XML document. I just took a quick look at the schema and I think that the attribute concerned is not really a reference to another element in the document, so it may be OK. It seems unfortunate however that this would only give the identifier for the related name without the name itself.

I think that number 1 may be the least of all evils, but would be interested to know if Jessie has thought about this issue and has any suggestions.


RobertKukla:

The schema supports any of these methods. In my opinion it is part of the query to state what kind of information it expects as a return and in what format.

Number 1 may become very verbose (and could in some situations lead to massive amounts of drag-along data). It is also a pity that there is no way to tell which TaxonConcepts are returned because they match a request and which are just there for completeness.

I agree. Maybe it would be possible to limit the depth in the query. E.g. primary results and concepts they link to would be 2 levels deep. In order to distinguish between actual results and concepts that are 'just' being referenced from those results we propose an attribute called primary.
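The depth limit and the primary flag can be sketched in Python as follows. This is only an illustration, not the schema's actual mechanism: the `REFERENCES` table, the `collect_concepts` function and the `max_depth` parameter are hypothetical names, and the onward reference from F to L is invented to show the cut-off. Only the idea of a `primary` flag comes from the proposal above; the sample data mirrors Sally's list.

```python
from collections import deque

# Hypothetical data set: concept id -> ids of the concepts it refers to.
REFERENCES = {
    "A": ["F", "G"],
    "B": ["H"],
    "C": ["I"],
    "D": [],
    "E": ["J", "K"],
    "F": ["L"],  # invented onward reference, to show where the limit cuts off
    "G": [], "H": [], "I": [], "J": [], "K": [], "L": [],
}


def collect_concepts(requested, references, max_depth=2):
    """Return {concept_id: primary_flag} for a response of at most max_depth levels.

    Level 1 holds the requested (primary) concepts, level 2 the concepts
    they refer to, and so on. Concepts beyond max_depth are left out.
    """
    result = {}
    queue = deque((cid, 1) for cid in requested)
    while queue:
        cid, depth = queue.popleft()
        if cid in result:
            continue  # already included, possibly as a primary concept
        result[cid] = (depth == 1)
        if depth < max_depth:
            for target in references.get(cid, []):
                queue.append((target, depth + 1))
    return result


response = collect_concepts(["A", "B", "C", "D", "E"], REFERENCES)
for cid, primary in sorted(response.items()):
    print(f"TaxonConcept {cid} primary={str(primary).lower()}")
```

With `max_depth=2`, concepts A to E come back flagged as primary, F to K are included but not flagged, and L is dropped because it sits three levels deep.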

Number 2 seems unsatisfactory since it does not give any indication of the status of a name.

Yes, but there might be other contexts in which this is not required.

Number 3 may not result in a valid XML document. I just took a quick look at the schema and I think that the attribute concerned is not really a reference to another element in the document, so it may be OK. It seems unfortunate however that this would only give the identifier for the related name without the name itself.

The latest version (to be submitted tomorrow) makes an explicit distinction between internal and external references. This would only work if a GUID mechanism is in place.
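How the new schema version encodes this distinction is not shown in this thread, so the Python below only sketches the idea: a reference is treated as internal when its target is present in the same document, and as external otherwise, in which case it would have to be resolved through some GUID mechanism. The `classify_references` function and the `urn:example:...` identifiers are hypothetical.

```python
def classify_references(document_ids, referenced_ids):
    """Split referenced ids into (internal, external) lists.

    A reference is internal if its target is one of the concepts in the
    document; otherwise it is external and needs a separate lookup.
    """
    present = set(document_ids)
    internal = [ref for ref in referenced_ids if ref in present]
    external = [ref for ref in referenced_ids if ref not in present]
    return internal, external


# Hypothetical identifiers, standing in for whatever GUID scheme is adopted.
document_ids = ["urn:example:concept:A", "urn:example:concept:F"]
referenced_ids = ["urn:example:concept:F", "urn:example:concept:G"]

internal, external = classify_references(document_ids, referenced_ids)
print("internal:", internal)
print("external:", external)
```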

I think that number 1 may be the least of all evils, but would be interested to know if Jessie has thought about this issue and has any suggestions.

Which option to choose depends, IMO, on what you want to do with the data. I would agree that option 1, maybe limited to two levels, would be most suitable for most cases.


See PrimaryConcepts for a proposed mechanism to flag the 'primary' or first level Concepts in a response document.