Related to the Name Usage discussion, but coming back to the question of whether we have names as entities in their own right, which, as Walter suggests, the paper he mentions in WalterBerendsohn Fri 04-06-2004 09:45 supports in some way.


JessieKennedy Tue 08-06-2004 15:45

I do not see this article as clarifying the discussion that we have been having between nomenclature and concepts.

After reading the article, the main message I got was that taxonomists should get their act together and sort out a proper registration process for new names (and I agree ;-) ), which implicitly means new concepts. It is clear to me that the article is referring to concepts, as he talks about the descriptions of the name, i.e. the description of the taxon concept with that name.

This is in agreement with my view that we need good registration of "original concepts", i.e. the description of a taxon associated with the first usage of a name. These act as the building blocks upon which a proper concept system could be built. And in fact all original concepts could be the legal entities.


GregorHagedorn Tue 08-06-2004 16:35

Nobody argues against this; I and others only argue that this very high requirement should not be the only level at which it is possible to collaborate and integrate data.


JessieKennedy Tue 08-06-2004 15:45

Regarding nomenclature-specific changes, i.e. those caused by spelling or changes to the codes etc., I still believe that if these changes result in a new name i.e. a first usage of that name then what we have done is in fact created a new concept - even if that concept is congruent with an existing one. People (perhaps except those only interested in nomenclature) will use that new name as a concept. So there is no point in thinking simply about names, as people will always use names as concepts, i.e. they will mean something when they use them.


GregorHagedorn Tue 08-06-2004 16:35

Similar to what I said in another mail: Why do people have social security IDs without exactly defining their concept? Multiple concepts of "Gregor Hagedorn" exist - a scientist, a husband, a dancer, etc. Also, certainly several people of that name exist. These are two different levels of concepts.


JessieKennedy 17/6/2004

I do not believe several concepts of Gregor Hagedorn exist. What you have described here is that we have one instance (specimen!) Gregor Hagedorn, identified uniquely by his social security number (GUID), with a non-unique name string "Gregor Hagedorn", who has been classified as being an instance of the concepts scientist, husband and dancer. The fact that several Gregor Hagedorns exist is simply evidence that the name string is not unique and so cannot act as a key to identify the individual Gregor Hagedorn - you.
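
A minimal sketch of the distinction drawn here, assuming a simple in-memory model. The class names, the GUID values "xxx" and "yyy", and the second person are hypothetical and only serve to show that a name string is not a key while a GUID is.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Concept:
    """A class that individuals can be classified under, e.g. 'dancer'."""
    label: str


@dataclass
class Individual:
    """One instance (a 'specimen'), keyed by a unique identifier."""
    guid: str          # unique key, e.g. a social security number
    name_string: str   # non-unique label
    classified_as: set = field(default_factory=set)


scientist = Concept("scientist")
husband = Concept("husband")
dancer = Concept("dancer")

# Two different people sharing one name string (GUIDs are made up).
people = [
    Individual("xxx", "Gregor Hagedorn", {scientist, husband, dancer}),
    Individual("yyy", "Gregor Hagedorn", {husband}),
]


def find_by_name(name):
    """Name lookup may return several individuals: the string is not a key."""
    return [p for p in people if p.name_string == name]


def find_by_guid(guid):
    """GUID lookup returns at most one individual, or None."""
    matches = [p for p in people if p.guid == guid]
    return matches[0] if matches else None
```

In this reading "dancer" is a concept and "Gregor Hagedorn" is only a label attached to an individual; whether that reading carries over to taxon names is exactly what the thread disputes.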


GregorHagedorn Tue 08-06-2004 16:35

Now multiple names exist: "Gregor Hagedorn", "Hagedorn, Gregor", "Gregor M. Hagedorn", "G. Hagedorn", "G. Hagedom". The first two can be recognized as identical if we know the cultural concept (western european) and its conventions. The third can be matched only in an asymmetric way (Gregor M. Hagedorn -> Gregor Hagedorn, but not the reverse), and the last two are highly dependent on knowledge about scope, context and available homonyms. It may be possible to identify them as representations of the same object, but they are not always. I call them name variants and would like to use processing combined with expertise to refer them, in this instance, to a person object. Your model would require me to refer them to a person concept (i.e. I have to decide whether husband or dancer is meant).
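
The first three cases can be sketched as follows, assuming only the two western conventions "Given Family" and "Family, Given". The function names are hypothetical, and the last two variants ("G. Hagedorn", "G. Hagedom") are deliberately not resolved because they need context and expertise rather than string rules.

```python
def parse_western_name(raw):
    """Split a name into (given_parts, family) under western conventions.

    Handles 'Given [Middle] Family' and 'Family, Given [Middle]' only.
    """
    raw = raw.strip()
    if "," in raw:
        family, given = [part.strip() for part in raw.split(",", 1)]
    else:
        *given_tokens, family = raw.split()
        given = " ".join(given_tokens)
    # Drop trailing dots so that 'M.' and 'M' compare equal.
    given_parts = [g.rstrip(".") for g in given.split()]
    return given_parts, family


def same_name(a, b):
    """True if both strings carry exactly the same name parts."""
    return parse_western_name(a) == parse_western_name(b)


def implies(specific, general):
    """True if `specific` reduces to `general` by dropping trailing given parts.

    The relation is asymmetric: a fuller name implies a shorter one,
    but not the other way round.
    """
    s_given, s_family = parse_western_name(specific)
    g_given, g_family = parse_western_name(general)
    return (
        s_family == g_family
        and len(g_given) <= len(s_given)
        and s_given[: len(g_given)] == g_given
    )
```

Here `same_name("Gregor Hagedorn", "Hagedorn, Gregor")` holds, `implies("Gregor M. Hagedorn", "Gregor Hagedorn")` holds while the reverse does not, and "G. Hagedorn" matches none of the fuller forms without further knowledge.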


JessieKennedy 17/7/2004

No Gregor, I don't agree. You do have different name strings being used to refer to you, the specimen with Soc. Sec. no., let's say, xxx. But these are not different concepts in my mind - see above: Gregor Hagedorn isn't a concept, but rather dancer is. Even so, this is a case where labels have been badly used when identifying a specimen, not when defining a new concept. If someone identifies something by using a label that doesn't exist as a defined concept, then yes, we need to try and resolve what they mean - but that's a different issue, and one that, although related, should be tackled separately from talking about what is a concept or not. If I have some algorithm I want to use (programmatic or human) to decide what the label means, then I can use that to decide what actual concepts it might be referring to.


GregorHagedorn Tue 08-06-2004 16:35

The use of any of these name variants does not allow one to deduce the concept. Thus "if these changes result in a new name i.e. a first usage of that name then what we have done is in fact created a new concept - even if that concept is congruent with an existing one." I think is false.


JessieKennedy 17/6/2004

I'm not saying that we create concepts for every misspelling/use of a name by anyone referring to concepts - so I wouldn't do what you suggest. I would only create new concepts with new names if the authors were intending to create a new concept - not simply mis-labelling an identification. Now I'm not saying this information isn't important, but it is an identification labelling issue, not a concept creation/definition issue.


GregorHagedorn Tue 08-06-2004 16:35

Note that only a limited amount can be recognized automatically by algorithm, which, as I read Robert, seems to be a hidden assumption in your model. However, the dependency on expertise makes it necessary to store this knowledge. If I don't get a handle to store the relation with a name object, and relate this with GBIF data, I fail to see how we can achieve data integration of specimens, descriptions, molecular data, pathology, etc.


JessieKennedy 17/6/2004

The fact that the label matches to some degree on the name string gives you a handle as good as you know. If the information necessary isn't explicit then you will be reliant on your expertise to do the matching, and I would expect a system to let you specify the accuracy of string matching to help you get a handle on the potential concepts that might be relevant.
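
A minimal sketch of string matching with a user-specified accuracy, using `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure. The function name `candidate_names`, the default threshold and the short list of registered names are assumptions made for illustration; the thread does not say which measure a real system would use.

```python
from difflib import SequenceMatcher


def candidate_names(label, registered, threshold=0.8):
    """Return (name, score) pairs scoring at or above `threshold`, best first.

    `threshold` is the accuracy the user asks for: 1.0 accepts exact
    (case-insensitive) matches only, lower values admit looser matches.
    """
    scored = [
        (name, SequenceMatcher(None, label.lower(), name.lower()).ratio())
        for name in registered
    ]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical register of names; the label on a specimen is misspelled.
registered_names = ["Quercus robur", "Quercus rubra", "Quercus petraea"]
hits = candidate_names("Quercus robor", registered_names, threshold=0.8)
```

The result is only a ranked list of candidates. Deciding which concept, if any, was meant still depends on the expertise discussed above.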


GregorHagedorn Tue 08-06-2004 16:35

Also, I hope the example clarifies that the knowledge about name equivalence is culture and context dependent, but independent from the knowledge of whether the researcher or the dancer concept is meant. To me it is an independent dimension. You seem to treat it as hierarchically nested knowledge within each concept, which explodes the number of statements you have to make. For each name in biology you have perhaps a few dozen taxonomic concepts, 100s of identification concepts, and possibly millions of name usage concepts (e.g. a DNA has been sequenced).
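
The size of that explosion can be sketched with simple counting, assuming that "nested" means one equivalence statement per name variant per concept, while "independent" means each variant and each concept is linked once to a shared name object. Both function names and the figures used (5 variants, 24 concepts) are illustrative assumptions, not measurements.

```python
def statements_nested(n_variants, n_concepts):
    """Every variant has to be asserted separately inside every concept."""
    return n_variants * n_concepts


def statements_independent(n_variants, n_concepts):
    """Each variant and each concept is linked once to a shared name object."""
    return n_variants + n_concepts


# Illustrative figures: five name variants, two dozen taxonomic concepts.
nested = statements_nested(5, 24)
independent = statements_independent(5, 24)
```

With these figures the nested model needs 120 statements and the independent one 29, and the gap widens as identification and name usage concepts are added.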


JessieKennedy 17/6/2004

I know that names in the most general sense are culture dependent, but for concepts we should, I believe, restrict ourselves to concepts that have been created - not treating labels used by anyone anywhere as a concept - I don't think this is feasible or sensible. If we sort out the valid concepts first (and I would propose sorting out the original concepts first, i.e. the first usages of names with their associated concepts, on which to build a revisionary concept database) then we might get somewhere.


GregorHagedorn Tue 08-06-2004 16:35

The cultures in biological names equivalent to the western-culture assumption in judging person name equivalence are the Bot./Zoo. etc. codes. The nomenclatural information defines by its publication data a unique object, governed by uniqueness rules. Knowing the nomenclature and having sufficient name lists to make judgements about homonymy, I can, with limited experience, associate many name variants with a name object. I know for my data that I cannot do this in regard to concepts, because I am lacking any clues as to which concept may have been meant.


JessieKennedy 17/06/2004

If you can do this then you should relate your data to the original concept (first usage of the name) with a relationship that is called something like "of this name" - I don't see this as a big issue. We have agreed that we need to decide on what types of relationships are required - this is an example of an "identification" type of relationship as opposed to a concept relationship or name relationship - although this has to be fully worked out yet.
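
One possible shape for such typed relationships, as a sketch only. The three kinds follow the distinction made above, but the class names, the fields and the identifiers in the example are hypothetical, since the set of relationship types has not been worked out.

```python
from dataclasses import dataclass
from enum import Enum


class RelationKind(Enum):
    IDENTIFICATION = "identification"  # data item -> concept, e.g. "of this name"
    CONCEPT = "concept"                # concept -> concept, e.g. congruence
    NAME = "name"                      # name -> name, e.g. spelling variant


@dataclass(frozen=True)
class Relation:
    kind: RelationKind
    label: str
    source: str  # identifier of the record, concept or name the relation starts from
    target: str  # identifier of the concept or name it points to


def of_this_name(record_id, original_concept_id):
    """Link a data record to the original concept (first usage of the name)."""
    return Relation(
        RelationKind.IDENTIFICATION, "of this name", record_id, original_concept_id
    )


# Hypothetical identifiers for a specimen record and an original concept.
link = of_this_name("specimen-1", "original-concept-A")
```

Keeping the kind explicit lets a query separate weak "of this name" links from statements that relate one concept to another.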


GregorHagedorn Tue 08-06-2004 16:35

Similar to the SSID example, another example from the real world: ISBNs for books work fine without requiring abstracts or text checksums. They yield only limited information, and may actually sometimes get in the way, because multiple ISBNs are issued for the same publication and it is impossible to decide whether pagination or content is different or not. But ISBN/ISSN are the foundation of much e-commerce and library management. See http://www.arl.org/newsltr/194/identifier.html for a very interesting article...


JessieKennedy Tue 08-06-2004 15:45

Now whether or not they have the same definition of the concept in their mind as the one originally (or later) defined is another issue - and to me it is one of identification. Regardless of how good our definitions are, there will always be some degree of inaccuracy in identification - and we just have to accept this. But we do know that people try to identify things to some concept, and getting them to refer to a concept (even the one nearest to their idea) is better than getting simply a name.

Of course in the article he talks about registration of names/concepts and the big issue for taxonomy IMO is that we need to sort out a retrospective registration for legacy (existing) names/concepts. Too many organisations are semi-registering legacy names and concepts independently without cross checking what others are doing or reusing existing "registrations", whereas if this was co-ordinated then we might actually start getting some agreement on what names/concepts exist and start ironing out the problems. This of course is more political as we have to look at "ownership" of legacy concepts/names - but this is for another discussion.

Re the legacy names/concepts, we have heard that some of these concepts are so badly described you can't even reason about what they might mean (well a human might be able to guess), but that kind of information could be handled with some quality measure - which could be discussed further.