These are probably useful categories in many cases, but I really don’t like the labels. Garbage is mildly annoying, as it implies that there’s no useful signal, not just difficult-to-identify signal. It’s also putting the attribute on the wrong thing—it’s not garbage data, it’s data that’s useful for other purposes than the one at hand. “verbose” or “unfiltered” data, or just “irrelevant” data might be better.
Blessed and cursed are much worse as descriptors. In most cases there’s nobody doing the blessing or cursing, and it focuses the mind on the perception/sanctity of the data, not the use of it. “How do I bless this data” is a question that shows a misunderstanding of what is needed. I’d call this “useful” or “relevant” data, and “misleading” or “wrongly-applied” data.
To repeat, though, the categories are useful—actively thinking about what you know, and what you could know, about data in a dataset, and how you could extract value for understanding the system, is a VERY important skill and habit.
It’s also putting the attribute on the wrong thing—it’s not garbage data, it’s data that’s useful for other purposes than the one at hand.
Mostly it’s not useful for anything. The logs contain lots of different types of information, and nearly all of it is useless for nearly every purpose, but each type of information has a small number of purposes for which a very small fraction of that information is useful.
Blessed and cursed are much worse as descriptors. In most cases there’s nobody doing the blessing or cursing, and it focuses the mind on the perception/sanctity of the data, not the use of it.
This is somewhat intentional. One thing one can do with information is give it to others who would not have seen it. Here one sometimes needs to be careful to preserve and highlight the blessed information and eliminate the cursed information.