This seems to agree with your intuitions: in a Turing test you can probably distinguish females and males, in which case most transsexuals hopefully come out as the gender they consider themselves to be.
Distinguish based on what attributes, exactly? Can you suggest contents for such a test?
Take 1000 typical males, 1000 typical females, 1000 transsexual males, 1000 transsexual females, 1000 typical males tasked to pretend they are female and 1000 typical females tasked to pretend they are male. Then you let each of these talk anonymously over text chat with 100 randomly chosen others and assign probabilities of them being in each of these categories. Then you run statistics to determine the general ability to distinguish each of the categories from each of the others.
I’d expect that {typical!male, trans!male, and troll!male} would be almost completely distinguishable from {typical!female, trans!female, and troll!female}, that it often would be possible to distinguish typical!X from trans!X, but that trans!X are very rarely mistaken for troll!X… this matrix of possibilities is kinda huge so I won’t bother filling it out more unless you specifically request it, since you probably get my point by now.
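To make the "run statistics" step concrete, here is a minimal sketch of one way the probability assignments could be averaged into a category-by-category matrix. The function names, the data layout, and the `confusability` score are all assumptions made for illustration. Nothing here comes from an actual study, and other summary statistics would be equally reasonable.

```python
from collections import defaultdict

# Category labels follow the thread's typical!/trans!/troll! notation.
CATEGORIES = [
    "typical!male", "typical!female",
    "trans!male", "trans!female",
    "troll!male", "troll!female",
]


def confusion_matrix(judgments):
    """Average the probabilities raters assigned, grouped by true category.

    `judgments` is an iterable of (true_category, probabilities) pairs, where
    `probabilities` maps categories to the probability one rater gave after
    one chat. Rows are true categories, columns are guessed categories.
    Categories with no judgments are left out of the result.
    """
    sums = {c: defaultdict(float) for c in CATEGORIES}
    counts = defaultdict(int)
    for true_category, probabilities in judgments:
        counts[true_category] += 1
        for guessed, p in probabilities.items():
            sums[true_category][guessed] += p
    return {
        true: {g: sums[true][g] / counts[true] for g in CATEGORIES}
        for true in CATEGORIES
        if counts[true] > 0
    }


def confusability(matrix, a, b):
    """Hypothetical score: mean probability mass a and b lose to each other."""
    return (matrix[a][b] + matrix[b][a]) / 2
```

A score near zero for a pair such as trans!X and troll!X would correspond to "very rarely mistaken"; a score near what random guessing gives would correspond to "essentially random".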
I suspect that you’re vastly underestimating how similar people are.
My guess is that people’s guesses will be essentially random, except possibly for the trolls (because they’re trying, and so will be portraying caricatures of the opposite sex instead of actual people).
I know that I personally have never so far been able to tell men from women over a purely text channel without having been told explicitly, which I assume would be off limits. Though now that I think of it, that’s not entirely true; I would guess from LessWrong demographics that you, Armok, are male. (’course, if you happened to be female that would prove my point nicely.)
I have. There are text analyzers which give statistical likelihoods on the gender of the author of a given piece of writing. They generally give fairly wide confidence margins, but their algorithms are pretty simple and they don’t apply a lot of heuristics that humans can use. Even the best gender analyzer can only guess with limited confidence, but a person’s writing style offers considerably more than zero information about their gender.
It wouldn’t be off limits, and you’re supposed to specifically be fishing for their gender and they’re supposed to be cooperative except for the trolls.
Ignoring for a minute that such a test would be infeasible to realistically implement (good luck getting so many trans volunteers), it is loaded with cultural assumptions, a vague definition of “typical”, and it ignores such issues as experience in the target gender role, skill in the language of the test, and culture-specific stereotypes and presuppositions.
Presumably all those things should be as randomized as possible.
There is an expression in Russian net folklore: “average temperature per hospital”. This is, in effect, what you’d be measuring here.
Well, no, you’d be measuring how people come across to other people, which is an important aspect of gender but far from the most important. Still, I’d find the results of such an experiment quite interesting and informative.
I’m not sure what that means and Google isn’t being helpful.
It means taking averages over such an extremely diverse sample that the results end up having no real meaning, like the literal average temperature per hospital, which includes sampling over corpses in the morgue and severe fever sufferers. So if the average temperature in hospital 1 turns out to be 0.1 degrees higher than in hospital 2, it tells us nothing about the relative distribution of patient traits in each hospital.
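A small numeric sketch of the point, with temperatures invented purely for illustration (they are not data from any hospital, and the fever cutoff is an assumption): the two means differ by only 0.1 degrees, yet the two populations look nothing alike.

```python
from statistics import mean, pstdev

FEVER_THRESHOLD_C = 38.0  # assumed cutoff, used only for this illustration

# Invented body temperatures in degrees Celsius.
hospital_1 = [36.6] * 10                        # ten unremarkable patients
hospital_2 = [40.0] * 5 + [21.0] + [36.5] * 4   # fevers, one morgue reading, a few healthy


def summarize(temperatures):
    """Return the mean plus two numbers that the mean alone hides."""
    return {
        "mean": mean(temperatures),
        "spread": pstdev(temperatures),
        "fever_share": sum(t >= FEVER_THRESHOLD_C for t in temperatures)
        / len(temperatures),
    }


for name, temps in [("hospital 1", hospital_1), ("hospital 2", hospital_2)]:
    print(name, summarize(temps))
```

The spread and the share of fever patients separate the two hospitals immediately, while the means are nearly identical.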
That’s your hypothesis about the results, not something inherent in the testing procedure. If that is the case, it would show up as a specific result, not be mistaken for something else.
I’d say there being very clear trends is orders of magnitude more probable.
The way I described the experiment means the raw data would be very rich, and you should be able to see very clear things like some people being better at distinguishing than others, people being better at distinguishing among those whose culture is similar to their own, some people being better at pretending than others, some of the “typicals” being a lot more or less typical than others, etc. There’s lots of redundancy.
That expression would require less explanation if it were “average body temperature in a hospital”.
Somehow the Russian version is more suggestive of that, without explicitly saying “body temperature”. Languages are funny that way.