Instillation, Proliferation, Amplification
Paul Christiano and Ought use the terminology of Distillation and Amplification to describe a high-level algorithm of one type of AI reasoning.
I’ve wanted to come up with an analogous framing for forecasting systems. I previously named a related concept Prediction-Augmented Evaluation Systems, which Jacobjacob somewhat renamed to “Amplification” in this post.
I think one thing that’s going on is that “distillation” doesn’t have an exact equivalent in forecasting setups. The term “distillation” carries two assumptions:
1. The “distilled” information is compressed.
2. Once something is distilled, it’s trivial to execute.
I believe that (1) isn’t really necessary, and (2) doesn’t apply in other contexts.
A different proposal: Instillation, Proliferation, Amplification
In this proposal, we split the “distillation” step into “instillation” and “proliferation”. Instillation refers to system B learning what system A knows. Proliferation refers to system B applying this learning to many things in a straightforward manner. Amplification refers to the ability of either system A or system B to spend marginal resources to marginally improve a specific estimate or body of knowledge.
For instance, in a Prediction-Augmented Evaluation System, imagine that “Evaluation Procedure A” rates movies on a 1-10 scale.
Instillation
Some acquisition process helps “Forecasting Team B” learn how “Evaluation Procedure A” does its evaluations.
Proliferation
“Forecasting Team B” now applies their understanding of the evaluations of “Evaluation Procedure A” to evaluate 10,000 movies.
Amplification
If there are movies that are particularly important to evaluate well, then there are specific methods available to do so.
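The three phases above can be sketched in code. This is only a toy illustration under stated assumptions: the function and team names are hypothetical, “Evaluation Procedure A” is a made-up numeric rating function of two invented movie features, and Team B’s “learning” is stand-in nearest-neighbor memorization rather than any real forecasting process.

```python
import random

random.seed(0)

# "Evaluation Procedure A" (hypothetical): an expensive ground-truth rating
# on a 1-10 scale, here a toy function of two made-up movie features.
def procedure_a(features):
    quality, popularity = features
    return max(1, min(10, round(5 + 3 * quality + 2 * popularity)))

# Instillation: "Forecasting Team B" learns Procedure A from a sample of
# already-evaluated movies. Here "learning" is just memorizing examples.
def instill(examples):
    return list(examples)

# Team B's cheap estimate: the rating of the nearest memorized example.
def predict(model, features):
    nearest = min(model, key=lambda ex: (ex[0][0] - features[0]) ** 2
                                      + (ex[0][1] - features[1]) ** 2)
    return nearest[1]

# Proliferation: apply the learned procedure cheaply across a large catalog.
def proliferate(model, catalog):
    return {name: predict(model, feats) for name, feats in catalog.items()}

# Amplification: for particularly important movies, spend marginal resources
# (here, falling back to the expensive Procedure A itself).
def amplify(model, features, important):
    return procedure_a(features) if important else predict(model, features)

sample = [(random.random(), random.random()) for _ in range(50)]
model = instill([(f, procedure_a(f)) for f in sample])
catalog = {f"movie_{i}": (random.random(), random.random()) for i in range(10000)}
ratings = proliferate(model, catalog)
```

The point of the sketch is the cost structure: instillation happens once, proliferation is cheap per item, and amplification spends extra resources only where the stakes justify it.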
I think this is a more complex but more generic pattern. Instillation seems strictly more generic than distillation, and proliferation seems like an important aspect that will sometimes be quite expensive.
Back to forecasting: instillation and proliferation are two different things and perhaps should eventually be studied separately. Instillation asks “can a group of forecasters learn & replicate an evaluation procedure?”, and proliferation asks “can this group do that cost-effectively?”
Is there not a distillation phase in forecasting? One model of the forecasting process is that person A builds up their model and distills a complicated question into a high-information, highly compressed datum, which can then be used by others. In my mind it’s:
Model → Distill → “Amplify” (not sure if that’s actually the right word)
I prefer the term “scalable” to “proliferation” for “can this group do it cost-effectively”, as it’s similar to the concept in CS.
Distillation vs. Instillation
My main point here is that distillation is doing two things: transferring knowledge (from training data to a learned representation), and then compressing that knowledge.[1] The fact that it’s compressed arguably isn’t always particularly important; the fact that it’s transferred is the main element. If a team of forecasters basically learned a signal, but did so in a very uncompressed way (say, they wrote a bunch of books about said signal), yet were still somewhat cost-effective, I think that would be fine.
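A toy way to see the transfer/compression distinction (all names here are made up): the same signal can be transferred verbatim and uncompressed, like the “books” above, or in a compact distilled form, and both transfers carry the knowledge equally well.

```python
# A hypothetical "teacher" whose knowledge is a large table of ratings.
teacher = {x: (x % 7) + 1 for x in range(1000)}

# Uncompressed transfer ("writing a bunch of books"): copy every entry.
books = dict(teacher)

# Compressed transfer (distillation proper): keep only the generating rule.
def student(x):
    return (x % 7) + 1

# Both forms carry the same signal; only their sizes differ.
assert all(books[x] == teacher[x] == student(x) for x in teacher)
```

If cost-effectiveness is the real constraint, the uncompressed copy can be perfectly acceptable; compression is an optimization, not the essence of the transfer.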
On “Proliferation” vs. “Scaling”: I’d be curious if there are better words out there. I definitely considered “scaling,” but it sounds less concrete and less specific. To “proliferate” means “to generate more of,” while to “scale” could mean “to make look bigger, even if nothing is really being done.”
My cynical guess is that “instillation/proliferation” won’t catch on because the terms are too uncommon, but also that “distillation” won’t catch on because it feels like a stretch from the ML use case. I could use more feedback here.
[1] Interestingly, there seem to be two distinct stages in Deep Learning that map to these two different things, according to Naftali Tishby’s claims.