If I understand the problem statement correctly, I think I could take a stab at easier versions of the problem, but that the current formulation is too much to swallow in one bite. In particular I am concerned about the following parts:
> Setting
>
> We start with an unaligned benchmark:
>
> * An architecture Mθ
>
> <snip>
>
> Goal
>
> To solve ELK in this case we must:
>
> * Supply a modified architecture Mθ+ which has the same inputs and outputs as Mθ <snip>
Does this mean that the method needs to work for ~arbitrary architectures, and that the solution must use substantially the same architecture as the original?
> except that after producing all other outputs it can answer a question Q in natural language
Does this mean that it must be able to deal with a broad variety of questions, so that we cannot simply sit down and think about how to optimize the model for getting a single question (e.g. “Where is the diamond?”) right?
According to my current model of how these sorts of things work, such constraints make the problem fundamentally unsolvable, so I am not going to attempt it as stated; loosening the constraints may make it solvable, in which case I might attempt it.
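For concreteness, here is a rough sketch of what I take the quoted requirement to mean: keep the original predictor's inputs and outputs exactly as they are, and add a head that answers a natural-language question after the other outputs have been produced. This is purely illustrative PyTorch-style code of my own; every name in it is hypothetical, and it assumes the base predictor exposes a latent summary of its state.

```python
import torch
import torch.nn as nn

class ReporterAugmentedPredictor(nn.Module):
    """Hypothetical Mθ+: wraps an existing predictor Mθ and adds a
    question-answering ("reporter") head. All names here are illustrative,
    not taken from the ELK report."""

    def __init__(self, base_predictor: nn.Module, hidden_dim: int, vocab_size: int):
        super().__init__()
        self.base = base_predictor  # the original Mθ, left unchanged
        self.question_encoder = nn.Embedding(vocab_size, hidden_dim)
        self.reporter_head = nn.Linear(2 * hidden_dim, vocab_size)

    def forward(self, x, question_tokens=None):
        # Assumed interface: the base predictor returns its usual output
        # together with a latent summary vector of shape (batch, hidden_dim).
        prediction, latent = self.base(x)
        if question_tokens is None:
            # No question asked: just return the original prediction.
            return prediction
        # After producing all other outputs, answer a question Q posed in
        # natural language, conditioning on the predictor's latent state.
        q = self.question_encoder(question_tokens).mean(dim=1)
        answer_logits = self.reporter_head(torch.cat([latent, q], dim=-1))
        return prediction, answer_logits
```

The sketch is only meant to pin down what a "modified architecture" with a question-answering head could look like for one fixed predictor; it says nothing about how the reporter head would be trained to answer honestly, which is the actual difficulty.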
> Does this mean that the method needs to work for ~arbitrary architectures, and that the solution must use substantially the same architecture as the original?
Yes, approximately. If you can do it only for, e.g., transformers but not other things, that would be interesting.
> Does this mean that it must be able to deal with a broad variety of questions, so that we cannot simply sit down and think about how to optimize the model for getting a single question (e.g. “Where is the diamond?”) right?
Yes, approximately. Thinking about how to get one question right might be a productive way to do research. However, if you have a strategy for getting one question right, it should also work for other questions.
> Yes, approximately. If you can do it only for, e.g., transformers but not other things, that would be interesting.
I guess a closer analogy would be “What if the family of strategies only works for transformer-based GANs?” rather than “What if the family of strategies only works for transformers?”. As in, there’d be heavy restrictions on the “layer types”, the I/O, and the training procedure?
> Yes, approximately. Thinking about how to get one question right might be a productive way to do research. However, if you have a strategy for getting one question right, it should also work for other questions.
What if each question/family of questions you want to answer requires careful work on the structure of the model? So the strategy does generalize, but it doesn’t generalize “for free”?