The problem is that you don’t understand the purpose of the studies at all and you’re violating several important principles which need to be kept in mind when applying logic to the real world.
Our primary goal is to determine net harm or benefit. If I do a study as to whether or not something causes harm or benefit, and see no change in underlying rates, then it is non-harmful. If it is making some people slightly more likely to get cancer, and others slightly less likely to get cancer, then there’s no net harm—there are just as many cancers as there were before. I may have changed the distribution of cancers in the population, but I have certainly not caused any net harm to the population.
This study’s purpose is to look at the net effect of the treatment. If we see the same amount of hyperactivity in the population prior to and after the study, then we cannot say that the dye causes hyperactivity in the general population.
“But,” you complain, “Clearly some people are being harmed!” Well yes, some people are worse off after the treatment in such a theoretical case. But here’s the key: for the effect NOT to show up in the general population, there are only three major possibilities:
1) The people who are harmed are such a small portion of the population as to be statistically irrelevant.
2) The number of people who benefit from the treatment, and as such do NOT suffer from the metric in question when they otherwise would, is almost exactly equal to the number of people who would not have suffered from the metric without the treatment but do as a result of it. (This is extremely unlikely, as the magnitudes of the two effects would have to be extremely close to cancel out in this manner.)
3) There is no effect.
If our purpose is to make [b]the best possible decision with the least possible amount of money spent[/b] (as it should always be), then a study on the net effect is the most efficient way of doing so. Testing every single possible SNP substitution is not possible, ergo, it is an irrational way to perform a study on the effects of anything. The only reason you would do such a study is if you had good reason to believe that a specific substitution had an effect either way.
Another major problem you run into when you try to run studies “your way” (more commonly known as “the wrong way”) is the blue M&M problem. You see, if you take even 10 things, and test them for an effect, you have a 40% chance of finding at least one false correlation. This means that in order to have a high degree of confidence in the results of your study, you must increase the threshold for detection—massively. Not only do you have to account for the fact that you’re testing more things, you also have to account for all the studies that don’t get published which would contradict your findings (publication bias—people are far more likely to report positive effects than non-effects).
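As a rough illustration of the blue M&M arithmetic (a minimal sketch assuming independent tests at the conventional 0.05 significance level; the numbers of tests are just examples), the chance of at least one spurious “finding” grows quickly with the number of things tested:
[code]
# Chance of at least one false positive when testing k independent
# hypotheses at significance level alpha, with no real effects present.
def familywise_false_positive_rate(k: int, alpha: float = 0.05) -> float:
    return 1 - (1 - alpha) ** k

for k in (1, 10, 20):
    rate = familywise_false_positive_rate(k)
    print(f"{k:2d} independent tests -> {rate:.0%} chance of a spurious finding")
# 10 tests -> ~40%, 20 tests -> ~64%
[/code]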
In other words, you are not actually making a rational criticism of these studies. In fact, you can see exactly where you go wrong:
[quote]If 10% of kids become more hyperactive and 10% become less hyperactive after eating food coloring, such a methodology will never, ever detect it.[/quote]
While possible, how [b]likely[/b] is this? The answer is “Not very.” And given Occam’s Razor, we can mostly discard this barring evidence to the contrary. And no, moronic parents are not evidence to the contrary; you will find all sorts of idiots who claim that all sorts of things that don’t do anything do something. Anecdotes are not evidence.
This is a good example of someone trying to apply logic without actually trying to understand what the underlying problem is. Without understanding what is going on in the first place, you’re in real trouble.
I will note that your specific example is flawed in any case; the idea that these people are in fact being affected is deeply controversial, and unfortunately a lot of it seems to involve the eternal crazy train (choo choo!) that somehow, magically, artificially produced things are more harmful than “naturally” produced things. Unfortunately this is largely based on the (obviously false and irrational) premise that things which are natural are somehow good for you, or things which are “artificial” are bad for you—something which has utterly failed to be substantiated by and large. You should always automatically be deeply suspicious of any such people, especially when you see “parents claim”.
The reason that the FDA says that food dyes are okay is because there is no evidence to the contrary. Food dye does not cause hyperactivity according to numerous studies, and in fact the studies that fail to show the effect are massively more convincing than those which do due to publication bias and the weakness of the studies which claim positive effects.
[quote]If we see the same amount of hyperactivity in the population prior to and after the study, then we cannot say that the dye causes hyperactivity in the general population.[/quote]
Correct. But neither can we say that the dye does not cause hyperactivity in anyone.
[quote]The reason that the FDA says that food dyes are okay is because there is no evidence to the contrary. Food dye does not cause hyperactivity according to numerous studies,[/quote]
Like that. That’s what we can’t say from the result of this study, and some other similar studies. For the reasons I explained in detail above.
Your making the claim “no evidence to the contrary” shows that you have not read the literature, have not done a PubMed search on “ADHD, food dye”, and have no familiarity with toxicity studies in general. There is always evidence to the contrary. An evaluation weighs the evidence on both sides. You can take any case where the FDA has said “There is no evidence that X”, and look up the notes from the panel they held where they considered the evidence for X and decided that the evidence against X outweighed it.
If you believe that there is no evidence that food dyes cause hyperactivity, fine. That is not the point of this post. This post analyzes the use of a statistical test in one study, and shows that it was used incorrectly to justify a conclusion which the data does not justify.
[quote][quote]If 10% of kids become more hyperactive and 10% become less hyperactive after eating food coloring, such a methodology will never, ever detect it.[/quote]
While possible, how [b]likely[/b] is this? The answer is “Not very.”[/quote]
(A) I analyzed their use of math and logic in an attempt to prove a conclusion, and showed that they used them incorrectly and their conclusions are therefore not logically correct. They have not proven what they claim to have proven.
(B) The answer is, “This is very likely.” This is how studies turn out all the time, partly due to genetics. Different people have different genetics, different bacteria in their gut, different lifestyles, etc. This makes them metabolize food differently. It makes their brain chemistry different. Different people are different.
[quote]This means that in order to have a high degree of confidence in the results of your study, you must increase the threshold for detection—massively.[/quote]
That’s one of the problems I was pointing out! The F-test did not pass the threshold for detection. The threshold is set so that things that pass it are considered to be proven, NOT so that things that don’t pass it are considered disproven. Because of the peculiar nature of an F-test, not passing the threshold is not even weak evidence that the hypothesis being tested is false.
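To make the earlier 10%-up/10%-down scenario concrete, here is a minimal simulation sketch (the effect size, sample size, and random seed are made up for illustration, and it uses scipy.stats.f_oneway rather than the study’s actual analysis): the group means barely move, so the between-groups F-test is blind to a perfectly real effect.
[code]
# Hypothetical simulation: 10% of treated subjects become more hyperactive,
# 10% become less, the rest are unaffected. Group means stay roughly equal,
# so a between-groups F-test on the means detects essentially nothing.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 400
control = rng.normal(0.0, 1.0, n)
treated = rng.normal(0.0, 1.0, n)
treated[: n // 10] += 1.5          # 10% pushed up
treated[n // 10 : n // 5] -= 1.5   # 10% pushed down by the same amount

f_stat, p_value = stats.f_oneway(control, treated)
print(f"F = {f_stat:.2f}, p = {p_value:.2f}")   # usually nowhere near 0.05
[/code]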
People aren’t that different. I really doubt that, for example, there are people whose driving skills improve after drinking the amount of alcohol contained in six cans of beer.
You haven’t searched hard:
Consider the negative effects of high nervousness on driving skills, the nervousness-reducing effects of alcohol, the side effects of alcohol withdrawal on alcoholics, and the mediating effects of high body mass on the effects of alcohol:
A severely obese alcoholic who is nervous enough about driving and suffering from the shakes might perform worse stone-cold sober than he does with the moderate BAC that he has after drinking a six-pack.
What are the odds that there exists at least one sufficiently obese alcoholic who is nervous about driving?
That data point would not provide notable evidence that alcohol improves driving in the general population.
[quote]There is always evidence to the contrary. An evaluation weighs the evidence on both sides. You can take any case where the FDA has said “There is no evidence that X”, and look up the notes from the panel they held where they considered the evidence for X and decided that the evidence against X outweighed it.[/quote]
The phrase “There is no evidence that X” is the single best indicator of someone statistically deluded or dishonest.
I’d normally take “evidence that [clause]” or “evidence for [noun phrase]” to mean ‘(non-negligible) positive net evidence’. (But of course that can still be a lie, or the result of motivated cognition.) If I’m talking about evidence of either sign, I’d say “evidence whether [clause]” or “evidence about [noun phrase]”.
I think your usage is idiosyncratic. People routinely talk about evidence for and against, and evidence for is not the net, but the evidence in favor.
[quote]where they considered the evidence for X and decided that the evidence against X outweighed it.[/quote]
It’s quite standard to talk about evidence for and against a proposition in exactly this way, as he reports the FDA did. Having talked about “the evidence for” and weighed it against the “evidence against”, you don’t then deny the existence of the “evidence for” just because, on balance, you find the evidence against more convincing.
You’re slicing the language so thinly, and in such a nonstandard way, that it seems like rationalization and motivated reasoning. No evidence means no evidence. No means no. It can mean “very, very little” too. Fine. But it doesn’t mean “an appreciable amount that has a greater countervailing amount”.
But here the FDA has taken “The balance of the evidence is not enough for us to be sure enough” and said “There is no evidence for”. The evidence cited as “no evidence” should move the estimate towards 84% certainty that there is an effect in the general population.
Very good point.
In this case, honest eyeballing of the data would lead one to conclude that there is an effect.
There actually isn’t any evidence against an effect hypothesis, because they’re not testing an effect hypothesis for falsification at all. There just isn’t enough evidence against the null by their arbitrarily too high standard.
And this is the standard statistical test in medicine, whereby people think they’re being rigorously scientific. Still just 2 chromosomes away from chimpanzees.
This is why you never eyeball data. Humans are terrible at understanding randomness. This is why statistical analysis is so important.
Something that is at 84% is not at 95%, and 95% is a low level of confidence to begin with - it is a nice rule of thumb, but really, if you’re doing studies like this, you want to crank it up even further to deal with problems like publication bias. Publish regardless of whether you find an effect or not, and encourage others to do the same.
Publication bias (positive results are much more likely to be reported than negative results) further hurts your ability to draw conclusions.
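A minimal simulation sketch of that point (study count, sample size, and seed are made up; the true effect is fixed at exactly zero): if only the studies that clear p < 0.05 make it into print, the published record still ends up describing an effect that does not exist.
[code]
# Hypothetical simulation of publication bias: the true effect is exactly
# zero, yet every study that clears p < 0.05 (the kind most likely to be
# published) reports a sizable effect.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_studies, n_subjects = 1000, 30
published = []
for _ in range(n_studies):
    sample = rng.normal(0.0, 1.0, n_subjects)         # no real effect
    if stats.ttest_1samp(sample, 0.0).pvalue < 0.05:  # only "positive" results get written up
        published.append(abs(sample.mean()))

print(f"{len(published)} of {n_studies} studies 'published'")
print(f"mean reported effect size (absolute): {np.mean(published):.2f}")
[/code]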
The reason that the FDA said what they did is that there isn’t evidence to suggest that it does anything. If you don’t have statistical significance, then you don’t really have anything, even if your eyes tell you otherwise.
Some are more terrible than others. A little bit of learning is a dangerous thing. Grown-ups eyeball their data and know the limits of standard hypothesis testing.
[quote]The reason that the FDA said what they did is that there isn’t evidence to suggest that it does anything.[/quote]
Yeah, evidence that the FDA doesn’t accept doesn’t exist.
The people who believe that they are grown-ups who can eyeball their data and claim results which fly in the face of statistical rigor are almost invariably the people who are least able to do so. I have seen this time and again, and Dunning-Kruger suggests the same: the least able are the most likely to try it, on the assumption that they are better at it than most, whereas the most able will look at a surprising result, first try to figure out why they are wrong, and only then consider redoing the study if they suspect a hidden effect that their present data pool is too small to detect. Repeating your experiment is always dangerous if you are looking for a particular outcome, though: repeating it until you get the result you want is bad practice, especially if you do not raise the level of statistical rigor required to compensate for the fact that you are doing it over again. So you have to keep this carefully in mind, control your experiment, and set your expectations accordingly.
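A minimal sketch of the “repeat until you get the result you want” problem (batch size, number of peeks, and seed are arbitrary; the true effect is fixed at zero): peeking at the data after every batch and stopping at the first p < 0.05, without raising the required rigor, inflates the false-positive rate well past the nominal 5%.
[code]
# Hypothetical simulation of optional stopping: test after every batch of 20
# subjects and stop as soon as p < 0.05. There is no real effect, yet the
# false-positive rate climbs well above 5%.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_experiments, batch, max_batches = 2000, 20, 10
false_positives = 0
for _ in range(n_experiments):
    data = np.empty(0)
    for _ in range(max_batches):
        data = np.concatenate([data, rng.normal(0.0, 1.0, batch)])  # true effect is zero
        if stats.ttest_1samp(data, 0.0).pvalue < 0.05:
            false_positives += 1
            break

print(f"false-positive rate with optional stopping: {false_positives / n_experiments:.0%}")
[/code]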
The problem we started with was that “statistical rigor” is generally not rigorous. Those employing it don’t know what it would mean under the assumptions of the test, and fewer still know that the assumptions make little sense.
[quote]Correct. But neither can we say that the dye does not cause hyperactivity in anyone.[/quote]
No, but that is not our goal in the first place. Doing a test on every single possible trait is economically infeasible and unreasonable; ergo, net impact is our best metric.
The benefit is “we get a new food additive to use”.
The net cost is zero in terms of health impact (no more hyperactivity in the general population).
Ergo, the net benefit is a new food additive. This is very simple math here. Net benefit is what we care about in this case, as it is what we are studying. If it redistributes ailments amongst the population, then there may be even more optimal uses, but we’re still looking at a benefit.
If you want to delve deeper, that’s going to be a separate experiment.
[quote]Your making the claim “no evidence to the contrary” shows that you have not read the literature, have not done a PubMed search on “ADHD, food dye”, and have no familiarity with toxicity studies in general. There is always evidence to the contrary. An evaluation weighs the evidence on both sides. You can take any case where the FDA has said “There is no evidence that X”, and look up the notes from the panel they held where they considered the evidence for X and decided that the evidence against X outweighed it.[/quote]
Your making the claim “evidence to the contrary” presumes that any of this evidence is worth anything. The problem is that, unfortunately, it isn’t.
If someone does a study on 20 different colors of M&Ms, then they will, on average, find that one of the M&Ms will change someone’s cancer risk. The fact that their study showed that, with 95% confidence, blue M&Ms increased your odds of getting cancer, [b]is not evidence for the idea that blue M&M’s cause cancer[/b].
Worse, the odds of negative-finding studies being published are considerably lower than the odds of positive-finding studies being published. This is known as “publication bias”. Additionally, people are more likely to be biased against artificial additives than towards them, particularly “independent researchers”, who very likely are researching it precisely because they harbor the belief that it does in fact have an effect.
This is very basic and is absolutely essential to understanding any sort of data of this sort. When I say that there is no evidence for it, I am saying precisely that—just because someone studied 20 colors of M&M’s and found that one has a 95% chance of causing more cancer tells me nothing. It isn’t evidence for anything. It is entirely possible that it DOES cause cancer, but the study has failed to provide me with evidence of that fact.
You are thinking in terms of formal logic, but that is not how science works. If you lack sufficient evidence to invalidate the null hypothesis, then you don’t have evidence. And the problem is that a mere study is often insufficient to actually demonstrate it unless the effects are extremely blatant.
[quote]The answer is, “This is very likely.” This is how studies turn out all the time, partly due to genetics. Different people have different genetics, different bacteria in their gut, different lifestyles, etc. This makes them metabolize food differently. It makes their brain chemistry different. Different people are different.[/quote]
For this to happen, you would require the two groups - those helped and those harmed - to be very similar in size.
Is it possible for things to help one person and harm another? Absolutely.
Is it probable that something will help almost exactly as many people as it harms? No. Especially not some random genetic trait (there are genetic traits, such as sex, where this IS likely because it is an even split in the population, so you do have to be careful for that, but sex-dependence of results is pretty obvious).
The probability of equal distribution of the traits is vastly outweighed by the probability of it not being equally distributed. Ergo the result you are espousing is in fact extremely unlikely.
[quote]This is very basic and is absolutely essential to understanding any sort of data of this sort. When I say that there is no evidence for it, I am saying precisely that—just because someone studied 20 colors of M&M’s and found that one has a 95% chance of causing more cancer tells me nothing. It isn’t evidence for anything. It is entirely possible that it DOES cause cancer, but the study has failed to provide me with evidence of that fact.[/quote]
When I said that “making the claim “no evidence to the contrary” shows that you have not read the literature, have not done a PubMed search on “ADHD, food dye”, and have no familiarity with toxicity studies in general,” I meant that literally. I’m well-aware of what 95% means and what publication bias means. If you had read the literature on ADHD and food dye, you would see that it is closer to a 50-50 split between studies concluding that there is or is not an effect on hyperactivity. You would know that some particular food dyes, e.g., tartrazine, are more controversial than others. You would also find that over the past 40 years, the list of food dyes claimed not to be toxic by the FDA and their European counterparts has been shrinking.
If you were familiar with toxicity studies in general, you would know that this is usually the case for any controversial substance. For instance, the FDA says there is “no evidence” that aspartame is toxic, and yet something like 75% of independent studies of aspartame concluded that it was toxic. The phrase “no evidence of toxicity”, when used by the FDA, is shorthand for something like “meta-analysis does not provide us with a single consistent toxicity narrative that conforms to our prior expectations”. You would also know that toxicity studies are frequently funded by the companies trying to sell the product being tested, and so publication bias works strongly against findings of toxicity.
Suppose there exists a medication that kills 10% of the rationalists who take it (but kills nobody with other thought patterns), and saves the lives of 10% of the people who take it, but only by preventing a specific type of heart disease that is equally prevalent in rationalists as in the general population.
A study on the general population would show benefits, while a study on rationalists would show no effects, and a study on people at high risk for a specific type of heart disease would show greater benefits.
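A back-of-the-envelope version of this hypothetical (the total population and the 2% rationalist share are assumptions added purely for illustration):
[code]
# Worked numbers for the hypothetical medication above. The population size
# and the 2% rationalist share are made-up illustration values.
population = 10_000
rationalists = int(0.02 * population)

saved_overall = 0.10 * population               # 10% of all takers saved from the heart disease
saved_among_rationalists = 0.10 * rationalists  # disease equally prevalent, so same 10% there
killed = 0.10 * rationalists                    # 10% of rationalist takers killed

print(f"general population study: net lives saved = {saved_overall - killed:+.0f}")
print(f"rationalists-only study : net lives saved = {saved_among_rationalists - killed:+.0f}")
[/code]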
It is alleged that food dye has not been shown, at the 95% confidence level, to cause hyperactivity in the general population. It is also alleged that it has been shown, at the 95% confidence level, to cause hyperactivity in specific subgroups. It is possible for both allegations to be true.
Yes, but it is not a probable outcome. For it to be true, it would require either a counterbalancing group of people who benefit from it, or for the subgroups to be extremely small; however, the allegation is that the subgroups are NOT small enough for the effect to have been hidden in this manner, suggesting that there is no effect on said subgroups, as the other possibility is unlikely.
Strictly speaking, the subgroup in question only has to be one person smaller than everybody for those two statements to be compatible.
Suppose that there is no effect on 10% of the population, and a consistent effect in 90% of the population that just barely meets the p<.05 standard when measured using that subgroup. If that measurement is made using the whole population, p>.05.
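A rough expected-value sketch of that dilution (equal variances and idealized averages assumed; the 900/1000 split mirrors the 90%/10% example): folding the no-effect slice into the sample scales the t statistic by about sqrt(0.9), which is enough to push a barely significant subgroup result back over the 0.05 line.
[code]
# Idealized calculation: a subgroup result just barely past p < .05 falls
# short once the 10% no-effect slice is included, because the mean effect is
# diluted faster than the extra sample size can compensate.
import math
from scipy import stats

n_sub, n_full, alpha = 900, 1000, 0.05
t_sub = stats.t.ppf(1 - alpha / 2, df=n_sub - 1) * 1.01  # "just barely" significant in the subgroup
t_full = t_sub * math.sqrt(n_sub / n_full)               # expected t after dilution
p_full = 2 * stats.t.sf(t_full, df=n_full - 1)
print(f"subgroup t = {t_sub:.2f}  ->  whole-sample t = {t_full:.2f}, p = {p_full:.3f}")
[/code]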
95% is an arbitrarily chosen number which is a rule of thumb. Very frequently you will see people doing further investigation into things where p>0.10, or if they simply feel like there was something interesting worth monitoring. This is, of course, a major cause of publication bias, but it is not unreasonable or irrational behavior.
If the effect is really so minor it is going to be extremely difficult to measure in the first place, especially if there is background noise.
It’s not a rule of thumb; it’s used as the primary factor in making policy decisions incorrectly. In this specific example, the regulatory agency made the statement “There is no evidence that artificial colorings are linked to hyperactivity” based on the data that artificial colorings are linked to hyperactivity with p~.13
There are many other cases in medicine where 0.05 < p < 0.5 is used as evidence against the proposition being tested.