Paul Bricman

Karma: 189

Oversight Leagues: The Training Game as a Feature

Paul Bricman Sep 9, 2022, 10:08 AM

20 points

Paul Bricman Aug 20, 2022, 10:01 AM

25 points

Paul Bricman Aug 13, 2022, 9:59 AM

24 points

(paulbricman.com)

Paul Bricman May 7, 2022, 11:01 AM

1 point

(paulbricman.com)

Paul Bricman May 2, 2022, 11:30 AM
3 points
in reply to: Charlie Steiner’s comment on: [Linkpost] Value extraction via language model abduction
We tried two heuristics to help us sift through years of tweets in search of bold claims: (1) subjectivity scoring (based on a simple bag-of-words approach), and (2) zero-shot text classification (NLI-based). (1) seemed a pretty poor heuristic overall, and (2) was still super noisy (e.g. it would flag “that’s awesome” as a bold claim, which is not particularly useful). The second problem was that even when tweets were identified as containing bold claims, those claims were often heavily contextualized in a reply thread, so we tried decontextualizing them manually to increase the signal-to-noise ratio. Also, we were initially really confident that we’d use our automatic negation pipeline (i.e. few-shot prompt + DALL-E-like reranking of generations based on detected contradictions and minimal token edit distance), though in reality it would have taken far longer than manual labeling given our non-existent infra.
I agree that all those manual steps are huge sources of experimenter bias, though. Doing it the way you suggested would improve replicability, but also increase noise and compute demands.
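For readers curious what the reranking step of the negation pipeline mentioned above might look like, here is a minimal sketch, not the code we actually ran. It assumes a contradiction scorer is available as a plain function; `toy_contradiction_score` is a hypothetical placeholder standing in for an NLI model, and the names `rerank_negations` and `threshold`, as well as the choice to filter by contradiction first and then sort by token edit distance, are illustrative assumptions.

```python
from typing import Callable, List, Sequence, Tuple


def token_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance over tokens (insert, delete, substitute all cost 1)."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def rerank_negations(
    claim: str,
    candidates: List[str],
    contradiction_score: Callable[[str, str], float],
    threshold: float = 0.5,  # assumed cutoff, purely illustrative
) -> List[Tuple[str, float, int]]:
    """Keep candidates that contradict the claim, then prefer minimal edits."""
    claim_tokens = claim.lower().split()
    scored = []
    for cand in candidates:
        score = contradiction_score(claim, cand)
        if score < threshold:
            continue  # not detected as a contradiction, so not a usable negation
        dist = token_edit_distance(claim_tokens, cand.lower().split())
        scored.append((cand, score, dist))
    # Fewest token edits first; ties broken by higher contradiction score.
    return sorted(scored, key=lambda item: (item[2], -item[1]))


def toy_contradiction_score(premise: str, hypothesis: str) -> float:
    """Hypothetical placeholder, NOT a real NLI model: flags a negation-word mismatch."""
    negators = {"not", "never", "no"}
    p = bool(negators & set(premise.lower().split()))
    h = bool(negators & set(hypothesis.lower().split()))
    return 1.0 if p != h else 0.0


if __name__ == "__main__":
    claim = "language models can extract values"
    generations = [
        "it is never the case that models extract anything",
        "language models can not extract values",
        "language models can extract values",
    ]
    for cand, score, dist in rerank_negations(claim, generations, toy_contradiction_score):
        print(dist, score, cand)
```

In practice the placeholder scorer would be swapped for the contradiction probability of an NLI model, and the few-shot prompt would supply the candidate generations.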

Paul Bricman May 1, 2022, 7:11 PM

5 points

(paulbricman.com)