AI Capabilities

Tag. Last edit: 29 Aug 2021 12:57 UTC by plex

AI Capabilities are the growing abilities of AIs to act effectively in increasingly complex environments. The term is often compared to AI Alignment, which refers to efforts to ensure that the effective actions AIs take are also intended by their creators and beneficial to humanity.

EfficientZero: human ALE sample-efficiency w/MuZero+self-supervised

gwern2 Nov 2021 2:32 UTC

137 points

52 comments1 min readLW link

(arxiv.org)

A small update to the Sparse Coding interim research report

Lee Sharkey, Dan Braun and beren

30 Apr 2023 19:54 UTC

61 points

5 comments1 min readLW link

[Paper] Stress-testing capability elicitation with password-locked models

Fabien Roger and ryan_greenblatt

4 Jun 2024 14:52 UTC

84 points

10 comments12 min readLW link

(arxiv.org)

Memorizing weak examples can elicit strong behavior out of password-locked models

Fabien Roger and ryan_greenblatt

6 Jun 2024 23:54 UTC

58 points

5 comments7 min readLW link

Getting 50% (SoTA) on ARC-AGI with GPT-4o

ryan_greenblatt17 Jun 2024 18:44 UTC

262 points

49 comments13 min readLW link

Competitive programming with AlphaCode

Algon2 Feb 2022 16:49 UTC

58 points

36 comments15 min readLW link

(deepmind.com)

EfficientZero: How It Works

1a3orn26 Nov 2021 15:17 UTC

297 points

50 comments29 min readLW link 1 review

Meta AI announces Cicero: Human-Level Diplomacy play (with dialogue)

Jacy Reese Anthis22 Nov 2022 16:50 UTC

93 points

64 comments1 min readLW link

(www.science.org)

DeepMind on Stratego, an imperfect information game

sanxiyn24 Oct 2022 5:57 UTC

15 points

9 comments1 min readLW link

(arxiv.org)

What will the scaled up GATO look like? (Updated with questions)

Amal 25 Oct 2022 12:44 UTC

34 points

22 comments1 min readLW link

[Question] The thing I don’t understand about AGI

Jeremy Kalfus18 Jun 2024 4:25 UTC

7 points

12 comments1 min readLW link

Devil’s Advocate: Adverse Selection Against Conscientiousness

lionhearted (Sebastian Marshall)28 May 2023 17:53 UTC

10 points

2 comments1 min readLW link

Is AI Progress Impossible To Predict?

alyssavance15 May 2022 18:30 UTC

277 points

39 comments2 min readLW link

[Crosspost] AlphaTensor, Taste, and the Scalability of AI

jamierumbelow9 Oct 2022 19:42 UTC

16 points

4 comments1 min readLW link

(jamieonsoftware.com)

What DALL-E 2 can and cannot do

Swimmer963 (Miranda Dixon-Luinenburg) 1 May 2022 23:51 UTC

353 points

303 comments9 min readLW link

The case for a negative alignment tax

Cameron Berg, Judd Rosenblatt, Diogo de Lucena and AE Studio

18 Sep 2024 18:33 UTC

79 points

20 comments7 min readLW link

[linkpost] The final AI benchmark: BIG-bench

RomanS10 Jun 2022 8:53 UTC

25 points

21 comments1 min readLW link

Timelines to Transformative AI: an investigation

Zershaaneh Qureshi26 Mar 2024 18:28 UTC

20 points

2 comments50 min readLW link

AlphaGeometry: An Olympiad-level AI system for geometry

alyssavance17 Jan 2024 17:17 UTC

45 points

9 comments1 min readLW link

(deepmind.google)

Capabilities and alignment of LLM cognitive architectures

Seth Herd18 Apr 2023 16:29 UTC

86 points

18 comments20 min readLW link

Principles of Privacy for Alignment Research

johnswentworth27 Jul 2022 19:53 UTC

72 points

31 comments7 min readLW link

The longest training run

Jsevillamol, Tamay, Owen D and anson.ho

17 Aug 2022 17:18 UTC

71 points

12 comments9 min readLW link

(epochai.org)

[Question] Are language models close to the superhuman level in philosophy?

Roman Leventov19 Aug 2022 4:43 UTC

6 points

2 comments2 min readLW link

What’s the Most Impressive Thing That GPT-4 Could Plausibly Do?

bayesed26 Aug 2022 15:34 UTC

24 points

22 comments1 min readLW link

[Question] What would you expect a massive multimodal online federated learner to be capable of?

Aryeh Englander27 Aug 2022 17:31 UTC

13 points

4 comments1 min readLW link

Readability is mostly a waste of characters

vlad.proex21 Apr 2023 22:05 UTC

21 points

7 comments3 min readLW link

No, human brains are not (much) more efficient than computers

Jesse Hoogland6 Sep 2022 13:53 UTC

22 points

21 comments3 min readLW link

(www.jessehoogland.com)

AlexaTM − 20 Billion Parameter Model With Impressive Performance

MrThink9 Sep 2022 21:46 UTC

5 points

0 comments1 min readLW link

Evaluations project @ ARC is hiring a researcher and a webdev/engineer

Beth Barnes9 Sep 2022 22:46 UTC

99 points

7 comments10 min readLW link

[Question] Are Speed Superintelligences Feasible for Modern ML Techniques?

DragonGod14 Sep 2022 12:59 UTC

9 points

7 comments1 min readLW link

Steering subsystems: capabilities, agency, and alignment

Seth Herd29 Sep 2023 13:45 UTC

26 points

0 comments8 min readLW link

ACT-1: Transformer for Actions

Daniel Kokotajlo14 Sep 2022 19:09 UTC

52 points

4 comments1 min readLW link

(www.adept.ai)

[Question] Could transformer network models learn motor planning like they can learn language and image generation?

mu_(negative)23 Apr 2023 17:24 UTC

2 points

4 comments1 min readLW link

Molecular dynamics data will be essential for the next generation of ML protein models

Abhishaike Mahajan26 Aug 2024 14:50 UTC

9 points

0 comments11 min readLW link

(www.owlposting.com)

The Stochastic Parrot Hypothesis is debatable for the last generation of LLMs

Quentin FEUILLADE--MONTIXI and Pierre Peigné

7 Nov 2023 16:12 UTC

52 points

20 comments6 min readLW link

Will we run out of ML data? Evidence from projecting dataset size trends

Pablo Villalobos14 Nov 2022 16:42 UTC

75 points

12 comments2 min readLW link

(epochai.org)

Mastering Stratego (Deepmind)

svemirski2 Dec 2022 2:21 UTC

6 points

0 comments1 min readLW link

(www.deepmind.com)

Can GPT-3 Write Contra Dances?

jefftk4 Dec 2022 3:00 UTC

6 points

4 comments10 min readLW link

(www.jefftk.com)

A Year of AI Increasing AI Progress

TW123 30 Dec 2022 2:09 UTC

148 points

3 comments2 min readLW link

[Question] Is “Recursive Self-Improvement” Relevant in the Deep Learning Paradigm?

DragonGod6 Apr 2023 7:13 UTC

32 points

36 comments7 min readLW link

Language models can generate superior text compared to their input

ChristianKl17 Jan 2023 10:57 UTC

48 points

28 comments1 min readLW link

Google announces ‘Bard’ powered by LaMDA

M. Y. Zuo6 Feb 2023 19:40 UTC

31 points

3 comments2 min readLW link

Sydney can play chess and kind of keep track of the board state

Erik Jenner3 Mar 2023 9:39 UTC

64 points

19 comments6 min readLW link

Google’s PaLM-E: An Embodied Multimodal Language Model

SandXbox7 Mar 2023 4:11 UTC

87 points

7 comments1 min readLW link

(palm-e.github.io)

Squeezing foundations research assistance out of formal logic narrow AI.

Donald Hobson8 Mar 2023 9:38 UTC

16 points

1 comment2 min readLW link

A chess game against GPT-4

Rafael Harth16 Mar 2023 14:05 UTC

24 points

23 comments1 min readLW link

Why the technological singularity by AGI may never happen

hippke3 Sep 2021 14:19 UTC

5 points

14 comments1 min readLW link

Epistemic Strategies of Safety-Capabilities Tradeoffs

adamShimi22 Oct 2021 8:22 UTC

5 points

0 comments6 min readLW link

The alignment problem in different capability regimes

Buck9 Sep 2021 19:46 UTC

88 points

12 comments5 min readLW link

Google announces Pathways: new generation multitask AI Architecture

Ozyrus29 Oct 2021 11:55 UTC

6 points

1 comment1 min readLW link

(blog.google)

Benchmarking LLM Agents on Kaggle Competitions

aogara22 Mar 2024 13:09 UTC

15 points

4 comments5 min readLW link

“AI achieves silver-medal standard solving International Mathematical Olympiad problems”

gjm25 Jul 2024 15:58 UTC

133 points

38 comments2 min readLW link

(deepmind.google)

Diffusion Guided NLP: better steering, mostly a good thing

Nathan Helm-Burger10 Aug 2024 19:49 UTC

13 points

0 comments1 min readLW link

(arxiv.org)

Interpreting Yudkowsky on Deep vs Shallow Knowledge

adamShimi5 Dec 2021 17:32 UTC

100 points

32 comments24 min readLW link

Request: stop advancing AI capabilities

So8res26 May 2023 17:42 UTC

153 points

24 comments1 min readLW link

AI doing philosophy = AI generating hands?

Wei Dai15 Jan 2024 9:04 UTC

46 points

22 comments1 min readLW link

OpenAI Solves (Some) Formal Math Olympiad Problems

Michaël Trazzi2 Feb 2022 21:49 UTC

78 points

27 comments2 min readLW link

[Question] Killing Recurrent Memory Over Self Attention?

Del Nobolo6 Jun 2023 23:02 UTC

3 points

0 comments1 min readLW link

Personal imitation software

Flaglandbase7 Mar 2022 7:55 UTC

6 points

6 comments1 min readLW link

Elon Musk announces xAI

Jan_Kulveit13 Jul 2023 9:01 UTC

75 points

35 comments1 min readLW link

(www.ft.com)

ChatGPT and Bing Chat can’t play Botticelli

Asha Saavoss29 Mar 2023 17:39 UTC

11 points

0 comments6 min readLW link

PaLM in “Extrapolating GPT-N performance”

Lukas Finnveden6 Apr 2022 13:05 UTC

83 points

19 comments2 min readLW link

Dual-Useness is a Ratio

jimrandomh6 Apr 2023 5:46 UTC

35 points

2 comments1 min readLW link

We have achieved Noob Gains in AI

phdead18 May 2022 20:56 UTC

117 points

20 comments7 min readLW link

Uncompetitive programming with GPT-3

Bezzi6 Feb 2022 10:19 UTC

7 points

8 comments3 min readLW link

$300 for the best sci-fi prompt: the results

RomanS3 Jan 2024 19:10 UTC

16 points

19 comments7 min readLW link

Questions I’d Want to Ask an AGI+ to Test Its Understanding of Ethics

sweenesm26 Jan 2024 23:40 UTC

14 points

6 comments4 min readLW link

On agentic generalist models: we’re essentially using existing technology the weakest and worst way you can use it

Yuli_Ban28 Aug 2024 1:57 UTC

10 points

2 comments9 min readLW link

An Introduction to AI Sandbagging

Teun van der Weij, Felix Hofstätter and Francis Rhys Ward

26 Apr 2024 13:40 UTC

44 points

10 comments8 min readLW link

[Paper] AI Sandbagging: Language Models can Strategically Underperform on Evaluations

Teun van der Weij, Felix Hofstätter, Ollie J, Sam F. Brown and Francis Rhys Ward

13 Jun 2024 10:04 UTC

84 points

10 comments2 min readLW link

(arxiv.org)

What’s the future of AI hardware?

Itay Dreyfus17 Jun 2024 13:05 UTC

2 points

0 comments8 min readLW link

(productidentity.co)

A short project on Mamba: grokking & interpretability

Alejandro Tlaie18 Oct 2024 16:59 UTC

21 points

0 comments6 min readLW link

Agentized LLMs will change the alignment landscape

Seth Herd9 Apr 2023 2:29 UTC

157 points

97 comments3 min readLW link

Stability AI releases StableLM, an open-source ChatGPT counterpart

Ozyrus20 Apr 2023 6:04 UTC

11 points

3 comments1 min readLW link

(github.com)

[Thought Experiment] Tomorrow’s Echo—The future of synthetic companionship.

Vimal Naran26 Oct 2023 17:54 UTC

−7 points

2 comments2 min readLW link

AI as Super-Demagogue

RationalDino5 Nov 2023 21:21 UTC

0 points

11 comments9 min readLW link

A call for a quantitative report card for AI bioterrorism threat models

Juno4 Dec 2023 6:35 UTC

12 points

0 comments10 min readLW link

GPT4 is capable of writing decent long-form science fiction (with the right prompts)

RomanS23 May 2023 13:41 UTC

22 points

28 comments65 min readLW link

AGI-Automated Interpretability is Suicide

__RicG__10 May 2023 14:20 UTC

23 points

33 comments7 min readLW link

GPT-4 implicitly values identity preservation: a study of LMCA identity management

Ozyrus17 May 2023 14:13 UTC

21 points

4 comments13 min readLW link

TinyStories: Small Language Models That Still Speak Coherent English

Ulisse Mini28 May 2023 22:23 UTC

66 points

8 comments2 min readLW link

(arxiv.org)

[Question] Hypothetical: what would you do?

JNS3 Aug 2023 22:39 UTC

4 points

2 comments1 min readLW link

LLMs are (mostly) not helped by filler tokens

Kshitij Sachan10 Aug 2023 0:48 UTC

66 points

35 comments6 min readLW link

Inflection.ai is a major AGI lab

nikola9 Aug 2023 1:05 UTC

137 points

13 comments2 min readLW link

Google DeepMind’s RT-2

SandXbox11 Aug 2023 11:26 UTC

9 points

1 comment1 min readLW link

(robotics-transformer2.github.io)

Stupidity is also hard

walkthroughwalls12 Sep 2023 2:45 UTC

−8 points

4 comments2 min readLW link

Basic Mathematics of Predictive Coding

Adam Shai29 Sep 2023 14:38 UTC

49 points

6 comments9 min readLW link

Towards Better Milestones for Monitoring AI Capabilities

snewman27 Sep 2023 21:18 UTC

11 points

0 comments14 min readLW link

[Question] Is there a publicly available list of examples of frontier model capabilities?

Max Kearney19 Sep 2023 17:45 UTC

1 point

0 comments1 min readLW link

Interpretability Externalities Case Study—Hungry Hungry Hippos

Magdalena Wache20 Sep 2023 14:42 UTC

64 points

22 comments2 min readLW link

This anime storyboard doesn’t exist: a graphic novel written and illustrated by GPT4

RomanS5 Oct 2023 14:01 UTC

12 points

7 comments55 min readLW link

I Would Have Solved Alignment, But I Was Worried That Would Advance Timelines

307th20 Oct 2023 16:37 UTC

118 points

33 comments9 min readLW link

Eleuther releases Llemma: An Open Language Model For Mathematics

mako yass17 Oct 2023 20:03 UTC

22 points

0 comments1 min readLW link

(blog.eleuther.ai)

[Question] What are the relative speeds of AI capabilities and AI safety?

NunoSempere24 Apr 2020 18:21 UTC

8 points

2 comments1 min readLW link

DeepMind: Generally capable agents emerge from open-ended play

Daniel Kokotajlo27 Jul 2021 14:19 UTC

247 points

53 comments2 min readLW link

(deepmind.com)

OpenAI Codex: First Impressions

specbug13 Aug 2021 16:52 UTC

49 points

8 comments4 min readLW link

(sixeleven.in)

To contribute to AI safety, consider doing AI research

Vika16 Jan 2016 20:42 UTC

39 points

39 comments2 min readLW link

[Question] What’s the difference between newer Atari-playing AI and the older Deepmind one (from 2014)?

Raemon2 Nov 2021 23:36 UTC

27 points

8 comments1 min readLW link

AI Tracker: monitoring current and near-future risks from superscale models

Edouard Harris and Jeremie Harris

23 Nov 2021 19:16 UTC

67 points

13 comments3 min readLW link

(aitracker.org)

HIRING: Inform and shape a new project on AI safety at Partnership on AI

Madhulika Srikumar24 Nov 2021 8:27 UTC

6 points

0 comments1 min readLW link

How to measure FLOP/s for Neural Networks empirically?

Marius Hobbhahn29 Nov 2021 15:18 UTC

16 points

5 comments7 min readLW link

What’s the backward-forward FLOP ratio for Neural Networks?

Marius Hobbhahn and Jsevillamol

13 Dec 2021 8:54 UTC

20 points

12 comments10 min readLW link

How I’m thinking about GPT-N

delton137 17 Jan 2022 17:11 UTC

54 points

21 comments18 min readLW link

Estimating training compute of Deep Learning models

lennart, Jsevillamol, Marius Hobbhahn, Tamay Besiroglu and anson.ho

20 Jan 2022 16:12 UTC

37 points

4 comments1 min readLW link

Lifelogging for Alignment & Immortality

Dev.Errata17 Aug 2024 23:42 UTC

13 points

3 comments7 min readLW link

Testing PaLM prompts on GPT3

Yitz6 Apr 2022 5:21 UTC

103 points

14 comments8 min readLW link

Gato’s Generalisation: Predictions and Experiments I’d Like to See

Oliver Sourbut18 May 2022 7:15 UTC

43 points

3 comments10 min readLW link

[Question] What is the most probable AI?

Zeruel017 20 Jun 2022 23:26 UTC

−2 points

0 comments3 min readLW link

AI Forecasting: One Year In

jsteinhardt4 Jul 2022 5:10 UTC

132 points

12 comments6 min readLW link

(bounded-regret.ghost.io)

A Critique of AI Alignment Pessimism

ExCeph19 Jul 2022 2:28 UTC

9 points

1 comment9 min readLW link

Alignment being impossible might be better than it being really difficult

Martín Soto25 Jul 2022 23:57 UTC

13 points

2 comments2 min readLW link

[Question] How might we make better use of AI capabilities research for alignment purposes?

Jemal Young31 Aug 2022 4:19 UTC

11 points

4 comments1 min readLW link

How should DeepMind’s Chinchilla revise our AI forecasts?

Cleo Nardo15 Sep 2022 17:54 UTC

35 points

12 comments13 min readLW link

It matters when the first sharp left turn happens

Adam Jermyn29 Sep 2022 20:12 UTC

44 points

9 comments4 min readLW link

Anonymous advice: If you want to reduce AI risk, should you take roles that advance AI capabilities?

Benjamin Hilton11 Oct 2022 14:16 UTC

54 points

9 comments1 min readLW link

Is GPT-N bounded by human capabilities? No.

Cleo Nardo17 Oct 2022 23:26 UTC

48 points

8 comments2 min readLW link

They gave LLMs access to physics simulators

ryan_b17 Oct 2022 21:21 UTC

50 points

18 comments1 min readLW link

(arxiv.org)

Article Review: Google’s AlphaTensor

Robert_AIZI12 Oct 2022 18:04 UTC

8 points

4 comments10 min readLW link

Paper: Discovering novel algorithms with AlphaTensor [Deepmind]

LawrenceC5 Oct 2022 16:20 UTC

82 points

18 comments1 min readLW link

(www.deepmind.com)

[Question] Is the speed of training large models going to increase significantly in the near future due to Cerebras Andromeda?

Amal 15 Nov 2022 22:50 UTC

13 points

11 comments1 min readLW link

When AI solves a game, focus on the game’s mechanics, not its theme.

Cleo Nardo23 Nov 2022 19:16 UTC

88 points

7 comments2 min readLW link

Notes on Meta’s Diplomacy-Playing AI

Erich_Grunewald22 Dec 2022 11:34 UTC

14 points

2 comments14 min readLW link

(www.erichgrunewald.com)

A case for capabilities work on AI as net positive

Noosphere89 27 Feb 2023 21:12 UTC

10 points

37 comments1 min readLW link

plex 29 Aug 2021 15:56 UTC
1 point
I think this should be in the AI category, likely under Engineering.