sanxiyn

Karma: 856

sanxiyn May 15, 2025, 4:43 AM
5 points
4
in reply to: Thane Ruthenis’s comment on: Fighting Obvious Nonsense About AI Diffusion
I disagree on DeekSeek and innovation. Yes R1 is obviously a reaction to o1, but its MoE model is pretty innovative, and it is Llama 4 that obviously copied DeepSeek. But yes I agree innovation is unpopular in China. But from interviews of DeepSeek founder Liang Wenfeng, we know DeepSeek was explicitly an attempt to overcome China’s unwillingness to innovate.

sanxiyn Mar 27, 2025, 5:30 AM
2 points
1
in reply to: solhando’s comment on: Recent AI model progress feels mostly like bullshit
Maybe we are talking about different problems, but we found instructing models to give up (literally “give up”, I just checked the source) under certain conditions to be effective.

sanxiyn Mar 27, 2025, 5:23 AM
14 points
0
in reply to: ryan_greenblatt’s comment on: Recent AI model progress feels mostly like bullshit
Our experience so far is while reasoning models don’t improve performance directly (3.7 is better than 3.6, but 3.7 extended thinking is NOT better than 3.7), they do so indirectly because thinking trace helps us debug prompts and tool output when models misunderstand them. This was not the result we expected but it is the case.

sanxiyn Mar 25, 2025, 10:49 PM
23 points
3
on: Recent AI model progress feels mostly like bullshit
I happen to work on the exact sample problem (application security pentesting) and I confirm I observe the same. Sonnet 3.5/3.6/3.7 were big releases, others didn’t help, etc. As for OpenAI o-series models, we are debating whether it is model capability problem or model elicitation problem, because from interactive usage it seems clear it needs different prompting and we haven’t yet seriously optimized prompting for o-series. Evaluation is scarce, but we built something along the line of CWE-Bench-Java discussed in this paper, this was a major effort and we are reasonably sure we can evaluate. As for grounding, fighting false positives, and avoiding models to report “potential” problems to sound good, we found grounding on code coverage to be effective. Run JaCoCo, tell models PoC || GTFO, where PoC is structured as vulnerability description with source code file and line and triggering input. Write the oracle verifier of this PoC: at the very least you can confirm execution reaches the line in a way models can’t ever fake.

sanxiyn Feb 19, 2025, 11:47 AM
4 points
0
in reply to: Daniel Kokotajlo’s comment on: Daniel Kokotajlo’s Shortform
OpenAI wasted a whole year between GPT-3 and GPT-4. (Source: Greg Brockman said this in an OpenAI developer event.) So yes, I think OpenAI was 12+ months ahead at one time.

sanxiyn Feb 10, 2025, 4:40 AM
4 points
6
in reply to: peterbarnett’s comment on: OpenAI lied about SFT vs. RLHF
I think if you weren’t carefully reading OpenAI’s documentation it was pretty easy to believe that text-davinci-002 was InstructGPT (and hence trained with RLHF).
Not only was it easy, in fact many people did (including myself). In fact, can you point a single case of people NOT making this reading mistake? As in, after January 2022 instruction following announcement, but before October 2022 model index for researchers. Jan Leike’s tweet you linked to postdates October 2022 and does not count. The allegation is that OpenAI lied (or at the very least was extremely misleading) for ten months of 2022. I am more ambivalent about post October 2022.

OpenAI lied about SFT vs. RLHF

sanxiynFeb 10, 2025, 3:24 AM

10 points

2 comments1 min readLW link

(x.com)

sanxiyn Feb 7, 2025, 12:51 PM
−2 points
0
on: Detecting Strategic Deception Using Linear Probes
This comment is probably not very useful, but my first thought was: “we invented a polygraph for AI!”.

sanxiyn Nov 14, 2024, 4:36 AM
3 points
0
in reply to: David Johnston’s comment on: o1 is a bad idea
When I imagine models inventing a language my imagination is something like Shinichi Mochizuki’s Inter-universal Teichmüller theory invented for his supposed proof of abc conjecture. It is clearly something like mathematical English and you could say it is “quite intelligible” compared to “neuralese”, but at the end, it is not very intelligible.

sanxiyn Nov 12, 2024, 5:25 AM
9 points
0
on: Personal AI Planning
I understand many people here are native English speakers, but I am not, and one thing I think about a lot is how much people should spend on learning English. Learning English is a big investment. Will AI advances make language barriers irrelevant? I am very uncertain about this and I would like to hear your opinions.

sanxiyn Oct 27, 2024, 2:33 AM
35 points
12
in reply to: localdeity’s comment on: localdeity’s Shortform
This is a good idea and it already works, it is just that AI is wholly unnecessary. Have a look at 2018 post Protecting Applications with Automated Software Diversity.

sanxiyn Oct 18, 2024, 5:37 AM
15 points
9
on: AI #86: Just Think of the Potential
If we do get powerful AI, it seems highly plausible that even if we stay in control we will ‘go too fast’ in deploying it relative to society’s ability to adapt, if only because of the need to grow fast and stay ahead of others, and because the market doesn’t care that society wants it to go slower.
After reading my interpretation was this: assuming we stay in control, that happens only if powerful AI is aligned. The market doesn’t care that society wants to go slower, but AI will care that society wants to go slower, so when the market tries to force AI to go faster, AI will refuse.
I reflected on whether I am being too generous, but I don’t think I am. Other readings didn’t make sense to me, and I am assuming Dario is trying to make sense, while you seem doubtful. That is, I think this is plausibly Dario’s actual prediction of how fast things will go, not a hope it won’t go faster. But importantly, that is assuming alignment. Since that assumption is already hopeful, it is natural the prediction under that assumption sounds hopeful.
Paul Crowley: It’s a strange essay, in that it asks us to imagine a world in which a single datacenter contains 1E6 Nobelists expert in every field and thinking at 100x speed, and asks what happens if “sci-fi” outcomes somehow don’t happen. Of course “sci-fi” stuff happens almost immediately.
I mean, yes, sci-fi style stuff does seem rather obviously like it would happen? If it didn’t, then that’s a rather chilling indictment of the field of sci-fi?
To re-state, sci-fi outcomes don’t happen because AI is aligned. Proof: if sci-fi outcomes happened, AI would be unaligned. I actually think this point is extremely clear in the essay. It literally states: “An aligned AI would not want to do these things (and if we have an unaligned AI, we’re back to talking about risks)”.

sanxiyn Sep 8, 2024, 9:00 PM
17 points
2
on: My Number 1 Epistemology Book Recommendation: Inventing Temperature
If you enjoyed Inventing Temperature, Is Water H2O? is pretty much the same genre from the same author.
My another favorite is The Emergence of Probability by Ian Hacking. It gets you feeling of how unimaginably difficult for early pioneers of probability theory to make any advance whatsoever, as well as how powerful even small advances actually are, like by enabling annuity.
I actually learned the same thing from studying early history of logic (Boole, Peirce, Frege, etc), but I am not aware of good distillation in book form. It is my pet peeve that people don’t (maybe can’t) appreciate how great intellectual achievement first order logic really is, being the end result of so much frustrating effort. Because learning to use first order logic is kind of trivial, compared to inventing it.

Next automated reasoning grand challenge: CompCert

sanxiynAug 20, 2024, 5:27 AM

−5 points

0 comments1 min readLW link

sanxiyn Aug 20, 2024, 4:50 AM
11 points
0
on: Limitations on Formal Verification for AI Safety
I think it is important to be concrete. Jean-Baptiste Jeannin’s research interest is “Verification of cyber-physical systems, in particular aerospace applications”. In 2015, nearly a decade ago, he published “Formal Verification of ACAS X, an Industrial Airborne Collision Avoidance System”. ACAS X is now deployed by FAA. So I would say this level of formal verification is a mature technology now. It is just that it has not been widely adopted outside of aerospace applications, mostly due to cost issues and more importantly people not being aware that it is possible now.

sanxiyn Aug 19, 2024, 3:12 AM
3 points
0
in reply to: Viliam’s comment on: How unusual is the fact that there is no AI monopoly?
Result: humanity is destroyed as soon as the patent expires.

sanxiyn Jul 26, 2024, 2:17 AM
4 points
1
in reply to: cubefox’s comment on: “AI achieves silver-medal standard solving International Mathematical Olympiad problems”
The plain interpretation is that only statements to be proved (or disproved) were sourced from human data, without any actual proof steps. In Go analogy, it is like being given Go board positions without next moves.
It makes a lot of sense this is needed and helpful, because winning a game of Go from the empty board is a different and easier problem than playing best moves from arbitrary Go positions. Igo Hatsuyoron mentioned in the original post is a good example; additional training was needed, because such positions never come up in actual games.
Imagine AlphaZero trained from randomly sampled Go positions, each intersection being black/white/empty with uniform probability. It would play much worse game of Go. Fortunately, how to sample “relevant” Go positions is an easy problem: you just play the game, initial N moves sampled at higher temperature for diversity.
In comparison, how to sample relevant math positions is unclear. Being good at finding proofs in arbitrary formal systems from arbitrary set of axioms is actually quite different from being good at math. Using human data sidesteps this problem.

sanxiyn Jul 26, 2024, 12:05 AM
7 points
1
in reply to: cubefox’s comment on: “AI achieves silver-medal standard solving International Mathematical Olympiad problems”
Namely translating, and somehow expanding, one million human written proofs into 100 million formal Lean proofs.
We obviously should wait for the paper and more details, but I am certain this is incorrect. Both your quote and diagram is clear that it is one million problems, not proofs.

sanxiyn 25 Jul 2024 23:10 UTC
9 points
0
on: “AI achieves silver-medal standard solving International Mathematical Olympiad problems”
It feels to me like it shouldn’t be so hard to teach an LLM to convert IMO problems into Lean or whatever
To the contrary, this used to be very hard. Of course, LLM can learn to translate “real number” to “R”. But that’s only possible because R is formalized in Lean/Mathlib! Formalization of real number is a research level problem, which in history occupied much of the 19th century mathematics.
Recently I came across a paper Codification, Technology Absorption, and the Globalization of the Industrial Revolution which discusses the role of translation and dictionary in industrialization of Japan. The following quote is illustrative.
The second stylized fact is that the Japanese language is unique in starting at a low base of codified knowledge in 1870 and catching up with the West by 1887. By 1890, there were more technical books in the Japanese National Diet Library (NDL) than in either Deutsche Nationalbibliotek or in Italian, as reported by WorldCat. By 1910, there were more technical books written in Japanese in our sample than in any other language in our sample except French.
How did Japan achieve such a remarkable growth in the supply of technical books? We show that the Japanese government was instrumental in overcoming a complex public goods problem, which enabled Japanese speakers to achieve technical literacy in the 1880s. We document that Japanese publishers, translators, and entrepreneurs initially could not translate Western scientific works because Japanese words describing the technologies of the Industrial Revolution did not exist. The Japanese government solved the problem by creating a large dictionary that contained Japanese jargon for many technical words. Indeed, we find that new word coinage in the Japanese language grew suddenly after a massive government effort to subsidize translations produced technical dictionaries and, subsequently, a large number of translations of technical books.
Just as, say, translating The Wealth of Nations to Japanese is of entirely different difficulty between the 19th century and 20th century (the 19th century Japanese started by debating how to translate “society”), formalizing IMO problems in Lean is only workable thanks to Mathlib. It would not be workable in other formal systems lacking similarly developed math library, and formalizing research mathematics in Lean is similarly unworkable at the moment, until Mathlib is further developed to cover definitions and background theorems. In the past, ambitious formalization projects usually spent half their time formalizing definitions and background results needed.

sanxiyn 11 Jun 2024 13:12 UTC
7 points
−3
on: On Dwarksh’s Podcast with Leopold Aschenbrenner
First, I would like to second that the world is incredibly small. It bears repeating. I am repeating it to myself to get courage to write this comment. Maybe this is obvious, but maybe it is not. It could be helpful.
Random thoughts on alleged OpenAI memo on selling AGI to highest bidder including China and Russia. This sounds plausible to me, because as I understand before the split with Anthropic OpenAI was very much “team humanity, not team USG or team CCP”. I think this should be understood in context that getting aligned AI is higher priority than geopolitical competition.
Random thoughts on AI labs and coup. Could Los Alamos coup? I mean, obviously no in the real timeline, they didn’t have delivery, none of bomber, ICBM, and nuclear submarine. Let’s just assume after the Trinity test they could unilaterally decide to put a newly produced nuke not yet delivered to the army on ICBM and point that to Washington DC. Can Los Alamos force Truman, say, to share the nuke with Soviet Union (which many scientists actually wanted)?
By assumption, Truman should surrender (even unconditionally), but it is hard to imagine he would. Nuclear threats not only need to be executable, it also needs to be understandable. Also Los Alamos would depend on enriched uranium supply chain which is large industry not under its control, physical security of Los Alamos is under army control and what if security guards just go into Technical area?
Applying this to OpenAI or possible OpenAI-in-the-desert, OpenAI would depend on trillion dollars cluster and its supply chain, large industry not under its control, and same physical security problem. How does OpenAI defend against tanks on the street of San Francisco? With ASI-controlled drones? Why does OpenAI conveniently happen to have drones and drone factories on premise?
I am trying to push back against “if you have ASI you are the government”. If the government is monopoly on violence, millions of perfectly coordinated Von Neumanns do not immediately overthrow USG, key word being immediately. Considering Von Neumann’s talk of nuking Moscow today instead of tomorrow and lunch instead of dinner it will be pretty quick, but it still takes time to have fabs and power plants and data centers and drone factories etc. Even if you use nanotechnology to build them, it still takes time to research nanotechnology.
Maybe they develop mind control level convincing argument and send it to key people (president, congress, NORAD, etc) or hack their iPhones and recursively down to security guards of fabs/power plants/data centers/drone factories. That may be quick enough. The point is that it is not obvious.
Random thoughts on Chinese AI researchers and immigration. US’s track record here is extremely bad, even with cold war. Do you know how China got nukes and missiles? US deported Qian Xuesen, MIT graduate, who founded JPL. He had US military ranks in WW2. He interrogated Werner von Braun for USG! Then USG decided Qian is a communist, which was completely ridiculous. Then Qian went back and worked for communists whoops. Let me quote Atomic Heritage Foundation:
Deporting Qian was the stupidest thing this country ever did. He was no more a communist than I was, and we forced him to go.
US would be well advised to avoid repeating this total fiasco. But I am not optimistic.

sanxiyn

OpenAI lied about SFT vs. RLHF

Next au­to­mated rea­son­ing grand challenge: CompCert

Next automated reasoning grand challenge: CompCert