I found this failure interesting, unexpected (to me), and honestly frustrating to watch Claude get wrong over and over again. It deserves to be seen by people smarter and more important than me.
I found your writing style off-putting and confusing, which seems counterproductive given how much work you appear to have put into this benchmark.
I sincerely recommend using Claude to rewrite this post and presenting the actual results of the benchmark in the form of a longer post or research paper.
It’s not worth much but I’ll commit to strong upvoting it and posting it on my twitter if you do so.
Off-putting: Why four em dashes in your title? Why do the tone, word choice, and style switch between fancy and plain so often? Why the typos? Claiming something is 50 times lower than commonly believed, redefining "times", and then barely supporting that redefinition seems fishy. And you don't actually give the results in an understandable format (in this post, that is; in the benchmark itself you seem to have done a really good job backing this up).
Confusing: What is the numbered list of ways you could have come up with these questions? It seems like you are describing increasingly malfeasant ways of doing so, but I can't tell. Why not show some example responses from the LLMs and/or explain their error modes? Tell us how you made these questions. What was your method for arriving at the formula you are using? Etc.
Claude would genuinely fix most of these problems, so run the post past him! He may not be as good at reasoning as I thought, but he is really good at writing.
The failures often seem to involve the model getting stuck reasoning about your problem in a way that pattern-matches too strongly to similar problems, and that is why it fails. Did you notice this as well?