Say I have 10 coins, each of which can be either red or blue, and either heads or tails. This is 20 bits of information: 2 bits per coin.
My friend is interested in knowing which of them are heads or tails, but doesn’t care about the color. So I decide to tell my friend only the heads/tails information: 10 bits.
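To make the arithmetic explicit (a minimal sketch of the counting argument above):

```python
import math

n_coins = 10
states_per_coin = 4  # {red, blue} x {heads, tails}

# Full description: 10 coins x log2(4) = 20 bits
full_bits = n_coins * math.log2(states_per_coin)

# Only heads/tails: 10 coins x log2(2) = 10 bits
relevant_bits = n_coins * math.log2(2)

print(full_bits, relevant_bits)  # 20.0 10.0
```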
Another example: With image compression, there’s a big difference between “Accurately reconstruct this image, pixel by pixel” vs. “Accurately reconstruct what the viewer would remember, which is basically just ‘Lion on grass’”.
I’d feel uncomfortable calling this translation “compression”, because there was definitely intentional information loss. Most of the literature on compression is about optimally maintaining information, not about optimally losing it.
Are there other good terms or literature for this?
The field of information theory that studies how much data can be thrown away while minimising distortion is called rate-distortion theory, and the term for compression with some intentional data loss is lossy compression. This article on the JPEG format is an interesting starting point on lossy compression techniques, in particular its explanation of Discrete Cosine Transforms: https://parametric.press/issue-01/unraveling-the-jpeg/
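To make the DCT idea concrete, here is a minimal Python sketch (not real JPEG: actual JPEG applies quantisation tables to 8×8 blocks, while this just zeroes small coefficients with an arbitrary threshold):

```python
import numpy as np
from scipy.fft import dctn, idctn

# One 8x8 grayscale block (0-255); random values as a stand-in for image data.
rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(8, 8)).astype(float)

# 2D Discrete Cosine Transform. For natural images, most of the energy
# lands in the low-frequency coefficients.
coeffs = dctn(block, norm="ortho")

# Intentional information loss: zero out small coefficients.
# The threshold is an arbitrary "quality" knob, not a JPEG parameter.
threshold = 20.0
kept = np.where(np.abs(coeffs) >= threshold, coeffs, 0.0)

# Reconstruction is close to the original, but no longer pixel-exact.
approx = idctn(kept, norm="ortho")
print(f"coefficients kept: {np.count_nonzero(kept)}/64")
print(f"max pixel error: {np.abs(block - approx).max():.1f}")
```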
Progressive Summarization by Tiago Forte is a note-taking technique that treats compression as the primary knowledge work you do on information (books/articles/lectures). In this technique, losing information by summarizing further and further is a feature of knowledge work. It’s called “progressive” summarization because you do not compress all sources as much as possible. Instead, you begin by marking up your source; then, if the information is useful, you summarize it further in a separate document, and so on.
This is one use of information loss as something to be embraced. Filtering information is another way of losing information intentionally, for example when you curate information.
What you describe is how I understand pattern recognition theories of mind and Categories/Concepts in neurological prediction models to work. I first read about this in How Emotions Are Made by Lisa Feldman Barrett. Look into Google Scholar or into that book’s reference section to go down that rabbit hole if you please.
Link to overview article: https://praxis.fortelabs.co/progressive-summarization-a-practical-technique-for-designing-discoverable-notes-3459b257d3eb/ His site also has something about prediction models.
EDIT: also look into conceptual hierarchies; I don’t know if that’s the direction you’re looking for, though.
“partial information”
Not sure there’s a general term for it, but “psychoacoustic compression” is the term for modeling the importance of information in lossy audio encoding such as MP3.
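As a crude caricature of that idea (nothing like a real psychoacoustic model, which uses masking curves and critical bands; the tones and the 5% cutoff here are invented for illustration):

```python
import numpy as np

# Toy signal: a loud 440 Hz tone plus a quiet 450 Hz tone. A real
# psychoacoustic model would predict the quiet tone is masked by the
# loud one; here we fake that with a simple magnitude cutoff.
sr = 8000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440 * t) + 0.01 * np.sin(2 * np.pi * 450 * t)

spectrum = np.fft.rfft(signal)
audible = np.abs(spectrum) >= 0.05 * np.abs(spectrum).max()
compressed = np.where(audible, spectrum, 0)  # drop "inaudible" components

reconstructed = np.fft.irfft(compressed, n=len(signal))
print(f"frequency bins kept: {audible.sum()}/{len(spectrum)}")
```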
It may be related to statistical mechanics, with the concepts of microstates, macrostates and entropy. In your first example each heads/tails macrostate of a coin has 2 color microstates, so the entropy of the system, as far as your friend is concerned, is log 2 = 1 bit per coin, or 10 bits in total. In your second example there are, say, 2^20 pixels of 2^5 bits each, i.e. 2^25 bits of pixel-level information; if there are, say, 2^13 different possible distinct pictures that can still be reasonably called “Lion on grass” (a macrostate), then the entropy of “Lion on grass” is log 2^13 = 13 bits.
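Working those numbers out (a minimal sketch; the 2^20 × 2^5 and 2^13 figures are the assumptions above):

```python
import math

# First example: each heads/tails macrostate hides 2 color microstates
# per coin, so the discarded entropy is log2(2) = 1 bit per coin.
per_coin = math.log2(2)   # 1.0 bit
total = 10 * per_coin     # 10.0 bits across all 10 coins

# Second example: a pixel-exact description takes 2**20 pixels times
# 2**5 bits each = 2**25 bits, but if 2**13 distinct images all count
# as "Lion on grass", that macrostate's entropy is log2(2**13) = 13 bits.
raw_bits = 2**20 * 2**5          # 33554432 bits
lion_entropy = math.log2(2**13)  # 13.0 bits
print(per_coin, total, raw_bits, lion_entropy)
```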