Posted in Technology on 2006-08-17 21:23
I've been doing some thinking about identity lately, but to explain myself I need to first get across the concept of an equivalence class. Equivalence classes for me are one of the wonders of modern mathemathics, in the sense that it's a concept that's brilliantly simple, obvious, and elegant, and at the same time incredibly useful.
A relation in mathemathics is a relationship between two objects that may or may not hold true for any given pair of objects. One example would be "is taller than". Geir Ove is taller than me, and I'm taller than my girlfriend. Other examples might be "is the same as", "is a part of", and so on. The interesting thing about this is that if you have a relation a ~ b you can start saying something about the properties the relation has. And once you know something about those you can really start doing stuff.
On example of this is what is called an equivalence relation. It's actually easy to understand what makes a relation an equivalence relation. It has to satisfy three simple conditions:
- The relation has to consider everything to be equivalent to itself. That is, a ~ a must be true for all a.
- The relation also cannot be directional (in the way that e.g. "is taller than" is), so if a ~ b then b ~ a must also be true.
- Finally, it must be what is called transitive. "Is taller than" also has this property, since you can tell from the two examples above that Geir Ove is taller than my girlfriend. In other words, if a ~ b and b ~ c then it must also be true that a ~ c.
So what's an example of an equivalence relation? A imaginary relation between decimal numbers that says they are equivalent when the part before the decimal point is the same (ie: 10.1 and 10.2 are equivalent, but 10.1 and 9.1 are not) would be an equivalence relation, since it would satisfy all three rules. However, a relation that says decimal numbers are equivalent if the difference between them is 1 or less would not, because although 9.5 is equivalent to 10.5 and 10.5 equivalent to 11.5, 9.5 would not be equivalent to 11.5.
This is easy to accept, but probably makes no sense at this point, because there seems to be no reason why anyone should care whether a relation has these three specific properties or not. What's special about equivalence relations is that if you apply them to a bunch of things they'll divide that bunch of things into a set of equivalence classes. Which of course needs to be explained.
An equivalence class is a set of objects that are considered to be equivalent by some equivalence relation. If you have a starter set of objects (let's call it A), applying an equivalence relation to that set will divide it into one or more equivalence classes, where each object will belong to exactly one equivalence class, and all objects in each class are equivalent to each other.
It's actually quite easy to see why this will happen. Let's say we're doing the partitioning of A into classes, and we're doing this by picking objects out of A one by one, and adding it into our growing set of classes. Let's look at what happens to one of those classes (call it C). When we add the first object (a) to it nothing much happens. When we find another object (b) that's equivalent to a we can lump the two together into the class, and still know that all objects in the class are equivalent to each other. Now let's say we find a third (c), which we find is equivalent to a. Now, we've found that c ~ a and a ~ b, and by the third rule above this means that c ~ b. Also, by the second rule, the reversal of all these three must also be true. This means that any two objects that you pick out of C will have to be equivalent. It's pretty obvious by now that this will continue to hold as you add more objects.
One thing that we haven't covered, though, is what happens if we've got C with its three members, and a new object comes along that is equivalent to something from C but also to something from another class D. This cannot happen, because if this were the case C and D would have to merge, and so we would have already turned C and D into a single class if we were careful when building the classes. (Which, presumably, we were.)
If we return to our two numeric example relations above it's easy to see that the first will form one equivalence class for each integer of all the decimal numbers between that integer and the next. It's also easy to see that the second will fail to do this.
Again, I suspect the reader will feel that this is all good and well, but probably also totally useless. So let's try to use it for something real.
What is a language?
A common definition of a language in linguistics is that dialects of the same language are mutually intelligible whereas different languages are not. In other words, if you are trying to decide whether Flemish (a variant of Dutch spoken in Belgium) is the same as Dutch you should look at whether speakers of the two languages can understand each other or not.
From the definition it's clear that each language is expected to consist of a set of dialects, with mutual intelligibility as an equivalence relation dividing the world's dialects into equivalence classes (that is, languages). This seems fine, as far as it goes, but does it really work? Well, let's see. I'll ignore the issue of how much intelligibility is required, since I would imagine that there are ways to define this in order to avoid ambiguity. But is mutual intelligibility really an equivalence relation?
- The first rule is fine. Speakers of the same dialect can understand each other just fine.
- Is it symmetric? There is usually a degree of asymmetry here, in that speakers of one dialect may find it easier to understand the other dialect than the other way around. It's not clear that these assymetries are large enough to really be a problem, so let's accept this for the time being.
- Is it transitive? Actually, no. To take a simple example I recently found myself in a dinner discussion with a couple from Sweden and another from western Norway. I could understand both couples perfectly (I'm from eastern Norway, which is between Sweden and western Norway), but neither couple could understand the other. This is pretty much exactly equivalent to the "two decimal numbers are equivalent if their difference is 1 or less"-relation, in that you can make small equivalence jumps that take you further and further away from where you started, until you are suddenly not equivalent with the starting point any more. (That is, I'm like the 10.5 between 9.5 and 11.5 above.)
So what does this mean? It's hard to escape the conclusion that the "mutual intelligibility" definition of language is fundamentally flawed, because it will fail to produce consistent groups of dialects. The relation violates the third rule, which means it fails to break dialects into the nice, consistent groups that we'd like to call languages. It's a nice idea, but it doesn't work.
What is a species?
I originally thought a species was a group of animals that could mate and produce fertile offspring, and this is a definition that's widely used in biology. However, this always seemed weird to me, because how could you then get new species through evolution? Let's say a set of (beneficial) mutations happen, and a new child is born, belonging to a new species never seen before. That seems pretty unlikely, but then evolution has some serious time spans (and numbers of births) to work with, so why not. But who does this child mate with? By definition, any attempt to mate with another animal will at best produce infertile offspring, unless evolution has happened to by accident produce another animal of the same species at the same time in roughly the same place. The chances of this happening are so exceedingly remote that we can obviously just forget about it.
Much later I read that no less of an authority on the subject than Charles Darwin himself actually entertained doubts about the concept of species, which made me wonder. Reading more, I realized what was going on, and it's pretty similar to what you've already read above. Clearly, a species (under the interbreeding definition) is an equivalence class of individual animals where the equivalence relation is "can these two produce fertile offspring?". (Yes, I know animals of the same sex will fail this test miserably, but let's say the test is really whether their genes are similar enough to allow fertile offspring in theory to get around this difficulty.)
But is this really an equivalence relation? It meets the first two tests easily, but again, for the same reason as before, it fails the third. Imagine a species that lives all over Africa, where individuals from western Africa can mate with those from central Africa, which again can mate with those from southern Africa. You can still find, however, that those from western Africa cannot mate with the ones from southern Africa because the accumulation of genetic differences simply becomes too big, and so the definition is once again flawed.
Of course, what happens in reality is that mutations happen all the time, and over time the changes accumulate until eventually individuals are very different from what they used to be. Enough, for example, to no longer be the same species by the interbreeding definition. This, it turns out, was one of the initial criticisms of Darwin's theory of evolution, in that it would invalidate the classification of animals into fixed species with a fixed typology. Of course, if evolution conforms with reality (as we all know it does) then it's the fixed concept of species that's in trouble, and not evolution.
So am I really saying that there is no such thing as language, and no such thing as a species? That, for example, the concepts "English" and "dog" are invalid? No, not really. I'm just saying that these are concepts with very fuzzy edges, and that any attempt to create definitions that will produce razor-sharp edges around these concepts is doomed to failure.
Or, to put it another way, you actually can use mathematics for something real... :)
The concept of identity is at the heart of both Topic Maps and RDF, and consequently it has been much discussed over the last few years, but I've seen very little writing on what identity itself actually is
Read | 2006-12-31 16:26
Lots of people think that the hierarchical association type used in taxonomies is the supertype-subtype association, but this is, unfortunately, wrong
Read | 2006-03-31 16:17
grace hwingwiri - 2007-10-02 13:44:07
ok,i understand your view on mutual inteligibility as a concept but you failed to hoighlight the major weaknesses of it actually .you tent to include too many mathematical issues as an escape god. thank you
David - 2008-04-15 13:11:53
I'm a math student trying to understand equivalence relations and classes. This short post was more helpful that the library of books I looked through. Thanks.
Jakob - 2009-09-11 17:34:49
I agree with David. None of the other definitions I read made it clear to me that an element can only belong to a single equivalence class. Thanks!
Jim - 2010-06-07 18:38:44
I completely agree with you about biological species (defined by interbreeding) not being equivalence classes. I also agree that Darwin was the one that kicked this idea into touch.
In my view, a great deal of misunderstanding about species and speciation comes from trying to force species into equivalence classes, to make them "real" objects for study in a sense.
The misunderstanding (again, in my view) has been strongly perpetuated by certain biologists who sincerely believe that species need to be "real" (by which I think they mean something like equivalence sets) in order to be valid objects for study. And then everyone else since then (1930s and 1940s) has been educated in that tradition.
Keep up the good work!
gaurav - 2011-03-07 14:52:42
Thanks for your post. You have written it so well that my concepts about equivalence relation & classes got strengthened. Very simple style and nicely organized. Thanks again.
s - 2011-10-24 02:13:51
give some example for the equivalence class
Noleen - 2012-03-20 09:07:32
thank you very much for your explanation, made me look silly though, so easy capeasy..!
Sylvia Klimicek - 2012-04-18 22:13:34
Thank you so much for putting this concept in simple words. New knowledge is always processed through the concepts we already understand. No matter how much I read the definitions and tried to "learn" them, they were never as clear to me as when you brought them to life by tying them to concepts I already know and understand.
Bruce - 2014-02-20 21:53:52
I am a developer in Shen Zhen of China, a city near Hong Kong... And I want to say this is definitely the best article about equivalence class I have read.
Looking forward your new posts on computer mathematics...