
Cole Wyeth

Karma: 2,204

I am a PhD student in computer science at the University of Waterloo, supervised by Professor Ming Li and advised by Professor Marcus Hutter.

My current research is related to applications of algorithmic probability to sequential decision theory (universal artificial intelligence). Recently I have been trying to start a dialogue between the computational cognitive science and UAI communities. Sometimes I build robots, professionally or otherwise. Another hobby (and a personal favorite of my posts here) is the Sherlockian abduction master list, which is a crowdsourced project seeking to make “Sherlock Holmes” style inference feasible by compiling observational cues. Give it a read and see if you can contribute!

See my personal website colewyeth.com for an overview of my interests and work.

I do roughly two types of writing: academic publications and (LessWrong) posts. With the former I try to be careful enough that I can stand by ~all (strong/central) claims in 10 years, usually by presenting theorems with rigorous proofs alongside only more conservative intuitive speculation. With the latter, I try to learn enough by writing that I have changed my mind by the time I’m finished; and though I usually include an “epistemic status” to suggest my (final) degree of confidence before posting, the ensuing discussion often changes my mind again.

Formalizing Embeddedness Failures in Universal Artificial Intelligence

Cole Wyeth · May 26, 2025, 12:36 PM
30 points
0 comments · 1 min read · LW link
(arxiv.org)

Alignment Proposal: Adversarially Robust Augmentation and Distillation

May 25, 2025, 12:58 PM
53 points
40 comments · 13 min read · LW link

Modeling versus Implementation

Cole Wyeth · May 18, 2025, 1:38 PM
27 points
10 comments · 3 min read · LW link

Glass box learners want to be black box

Cole Wyeth · May 10, 2025, 11:05 AM
46 points
10 comments · 4 min read · LW link

Why does METR score o3 as effective for such a long time duration despite overall poor scores?

Cole Wyeth · May 2, 2025, 10:58 PM
19 points
3 comments · 1 min read · LW link

Judging types of consequentialism by influence and normativity

Cole Wyeth · Apr 29, 2025, 11:25 PM
20 points
1 comment · 2 min read · LW link

Is alignment reducible to becoming more coherent?

Cole Wyeth · Apr 22, 2025, 11:47 PM
19 points
0 comments · 3 min read · LW link

Reactions to METR task length paper are insane

Cole Wyeth · Apr 10, 2025, 5:13 PM
58 points
43 comments · 4 min read · LW link

Changing my mind about Christiano’s malign prior argument

Cole Wyeth · Apr 4, 2025, 12:54 AM
27 points
34 comments · 7 min read · LW link

I “invented” semimeasure theory and all I got was imprecise probability theory

Cole Wyeth · Apr 3, 2025, 4:33 PM
14 points
1 comment · 6 min read · LW link

Existing UDTs test the limits of Bayesianism (and consistency)

Cole Wyeth · Mar 12, 2025, 4:09 AM
28 points
21 comments · 7 min read · LW link

Levels of analysis for thinking about agency

Cole Wyeth · Feb 26, 2025, 4:24 AM
11 points
0 comments · 7 min read · LW link

Intelligence as Privilege Escalation

Cole Wyeth · Feb 23, 2025, 7:31 PM
28 points
0 comments · 5 min read · LW link

[Question] Have LLMs Generated Novel Insights?

Feb 23, 2025, 6:22 PM
158 points
38 comments · 2 min read · LW link

What makes a theory of intelligence useful?

Cole Wyeth · Feb 20, 2025, 7:22 PM
16 points
0 comments · 11 min read · LW link

[Question] Take over my project: do computable agents plan against the universal distribution pessimistically?

Cole Wyeth · Feb 19, 2025, 8:17 PM
25 points
3 comments · 3 min read · LW link

My model of what is going on with LLMs

Cole Wyeth · Feb 13, 2025, 3:43 AM
104 points
49 comments · 7 min read · LW link

[Question] What is the most impressive game LLMs can play well?

Cole Wyeth · Jan 8, 2025, 7:38 PM
19 points
20 comments · 1 min read · LW link

Rebuttals for ~all criticisms of AIXI

Cole Wyeth · Jan 7, 2025, 5:41 PM
25 points
17 comments · 14 min read · LW link

Heresies in the Shadow of the Sequences

Cole Wyeth · Nov 14, 2024, 5:01 AM
19 points
12 comments · 2 min read · LW link