We’re analyzing the mech-int-ungodly Impala architecture from the paper. Basically:
=== Impala block
conv
maxpool2D
---- residual block (x2):
  relu
  conv
  relu
  conv
  residual add from this residual block's input
=== /Impala block
(repeat for 2 more Impala blocks)
---
relu
flatten
fully connected
relu
---
linear policy and value heads
So this mess has fifteen conv layers (five per Impala block, three blocks) and was trained on pixels. We're not doing CoinRun for this MATS sprint, although a good amount of tooling should cross over.
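For reference, here's a minimal PyTorch sketch of that stack. The channel widths (16, 32, 32), 3x3 kernels, stride-2 maxpool, and 256-unit hidden layer are the IMPALA-paper defaults; treat them as assumptions about this particular checkpoint rather than confirmed values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """relu -> conv -> relu -> conv, then add the block's input back in."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv0 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        out = self.conv0(F.relu(x))
        out = self.conv1(F.relu(out))
        return x + out  # residual add from the block's input

class ImpalaBlock(nn.Module):
    """conv -> maxpool -> two residual blocks."""
    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.res0 = ResidualBlock(out_channels)
        self.res1 = ResidualBlock(out_channels)

    def forward(self, x):
        x = self.pool(self.conv(x))
        return self.res1(self.res0(x))

class ImpalaCNN(nn.Module):
    """Three Impala blocks, then relu -> flatten -> fc -> relu -> heads.
    5 convs per block x 3 blocks = 15 conv layers total."""
    def __init__(self, num_actions: int):
        super().__init__()
        self.blocks = nn.Sequential(
            ImpalaBlock(3, 16), ImpalaBlock(16, 32), ImpalaBlock(32, 32)
        )
        self.fc = nn.LazyLinear(256)  # lazy, so the flattened size isn't hardcoded
        self.policy = nn.Linear(256, num_actions)
        self.value = nn.Linear(256, 1)

    def forward(self, x):
        x = self.blocks(x)
        x = F.relu(torch.flatten(x, start_dim=1))
        x = F.relu(self.fc(x))
        return self.policy(x), self.value(x)

if __name__ == "__main__":
    net = ImpalaCNN(num_actions=15)  # procgen obs are 64x64 RGB, 15 actions
    logits, value = net(torch.randn(1, 3, 64, 64))
    print(logits.shape, value.shape)  # torch.Size([1, 15]) torch.Size([1, 1])
```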
This has presented some challenges: there's no linearity to exploit by decomposing an ongoing residual stream into per-head contributions, the way you can in a transformer, because ReLUs sit in the path between residual adds.
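A toy illustration of what's missing (my framing, not from our codebase): a purely linear readout distributes over the sum of component writes, so per-component attribution is exact, but one ReLU in the path breaks that additivity.

```python
import torch

torch.manual_seed(0)
W = torch.randn(4, 4)
write_a = torch.randn(4)  # pretend "head A" writes this into the stream
write_b = torch.randn(4)  # pretend "head B" writes this

# Linear readout: each write's contribution is well defined, because the
# map distributes over the sum.
assert torch.allclose(W @ (write_a + write_b), W @ write_a + W @ write_b)

# Put a ReLU in the path (as the conv stack does between residual adds)
# and additivity fails, so exact per-component attribution is gone.
lhs = torch.relu(write_a + write_b)
rhs = torch.relu(write_a) + torch.relu(write_b)
print(torch.allclose(lhs, rhs))  # False in general
```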
Are you using decision transformers or other RL agents on Procgen environments? Also, do you plan to work on CoinRun?