Well, this assumes that we have control of most of the world’s GPUs, and that we have “Math-Proven Safe GPUs” which can block the execution of bad AI models and only run safe AIs (how this is achieved is not really explained in the text). If we grant all that, then AI safety already gets a lot easier.
This is a solution, but a solution in the same vein as “nuke all the datacenters”, and I don’t see how it outlines any steps that get us closer to achieving it.
A helpful page to see and subscribe to all 31 Substack writers (out of 122 total) who were invited to LessOnline: https://lessonline2025invitedlist.substack.com/recommendations
I guess this is another case of ‘Universal’ Human Experiences That Not Everyone Has
Made a small, quick website showing GPQA benchmark scores plotted against LLM inference cost, at https://ai-benchmark-price.glitch.me/. See how much you get for your buck:
Most benchmark data is from Epoch AI, except for those marked “not verified”, which I got from the model developer. Pricing data is from OpenRouter.
All the LLMs on this graph which are on the Pareto frontier of performance vs price were released December 2024 or later...
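For anyone curious how a Pareto frontier like that is computed, here’s a minimal sketch in Python; the model names, prices, and scores below are placeholders, not the site’s actual data. A model is on the frontier if no other model is both cheaper and higher-scoring.

```python
# Minimal sketch: Pareto frontier of (price, score) points.
# Names and numbers are illustrative placeholders, not real data.

models = [
    ("model-a", 0.50, 62.0),  # (name, $ per 1M tokens, GPQA %)
    ("model-b", 3.00, 71.0),
    ("model-c", 1.20, 58.0),  # dominated: model-a is cheaper AND better
    ("model-d", 10.0, 78.0),
]

def pareto_frontier(points):
    """Keep a point only if no other point is strictly cheaper
    and strictly higher-scoring at the same time."""
    frontier = []
    for name, price, score in points:
        dominated = any(
            p < price and s > score
            for n, p, s in points
            if n != name
        )
        if not dominated:
            frontier.append(name)
    return frontier

print(pareto_frontier(models))  # ['model-a', 'model-b', 'model-d']
```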
Zvi has a Substack; there are usually more comments on his posts there than on his LessWrong versions: https://thezvi.substack.com/p/levels-of-friction/comments
This particular post has 30+ comments at that link.
https://www.nytimes.com/interactive/2025/03/03/world/europe/ukraine-russia-war-drones-deaths.html Here’s a link to an NYT article about that.
Here are some quotes:
Drones, not the big, heavy artillery that the war was once known for, inflict about 70 percent of all Russian and Ukrainian casualties, said Roman Kostenko, the chairman of the defense and intelligence committee in Ukraine’s Parliament. In some battles, they cause even more — up to 80 percent of deaths and injuries, commanders say.
The conflict now bears little resemblance to the war’s early battles, when Russian columns lumbered into towns and small bands of Ukrainian infantry moved quickly, using hit-and-run tactics to slow the larger enemy.
Today most soldiers die or lose limbs to remote-controlled aircraft rigged with explosives, many of them lightly modified hobby models. Drone pilots, in the safety of bunkers or hidden positions in tree lines, attack with joysticks and video screens, often miles from the fighting.
Ukrainian officials said they had made more than one million first-person-view, or FPV, drones in 2024. Russia claims it can churn out 4,000 every day. Both countries say they are still scaling up production, with each aiming to make three to four million drones in 2025.
They’re being deployed far more often, too. With each year of the war, Ukraine’s military has reported huge increases in drone attacks by Russian forces.
2022: 2,600+ reported attacks
2023: 4,700+ reported attacks
2024: 13,800+ reported attacks
People are better defined by smaller categories.
If someone is part of both a large category and a small category that usually don’t overlap, it’s likely that they are an outlier in the large category, not a representative member. For example, if someone is both a rationalist and a Muslim, you shouldn’t expect them to be very similar to a typical Muslim; they’ll likely be much more similar to a typical rationalist, and they may not be very good at representing Muslims in general to a rationalist audience.
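One way to make the intuition concrete is information-theoretic: learning that someone belongs to a rare category tells you far more bits about them than learning they belong to a common one. A toy calculation, with both base rates invented purely for illustration:

```python
# Toy illustration: rarer category membership carries more information.
# Both base rates are invented purely for the example.
import math

p_muslim = 0.25          # made-up share of the population
p_rationalist = 0.0001   # made-up share of the population

# Self-information (surprisal) of learning each fact, in bits:
print(f"Learning 'Muslim':      {-math.log2(p_muslim):.1f} bits")       # ~2.0
print(f"Learning 'rationalist': {-math.log2(p_rationalist):.1f} bits")  # ~13.3
```

On these made-up numbers, the rare label constrains your picture of the person over six times as much as the common one.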
The category of “social construction” is a social construct.
Good job correctly guessing that it was Google! Here’s the arXiv preprint.
UBI would probably work better in Kenya than in the US. When people are in extreme poverty, funds get used to meet pressing needs, but once those basic needs are met, extra funds are more likely to go toward leisure. People in Kenya generally have more low-hanging fruit, like buying a motorcycle or renovating their home, whereas in the US the major life-changing purchases probably cost too much for UBI to cover, so the income might be spent on something like alcohol instead.
Why are comments on older posts sorted by date, but comments on newer posts are sorted by top scoring?
What about a goal that isn’t competitive, such as “get grade 8 on the ABRSM music exam for <instrument>”? Plenty of Asian parents have that particular goal and yet they usually ask/force their children to practice daily. Is this irrational, or is it good at achieving this goal? Would we be able to improve efficiency by using spaced repetition in this scenario as opposed to daily practice?
If spaced repetition is the most efficient way of remembering information, why do people who learn a music instrument practice every day instead of adhering to a spaced repetition schedule?
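For contrast, here’s roughly what a spaced-repetition schedule looks like: a simplified SM-2-style sketch, not any particular app’s actual algorithm.

```python
# Simplified SM-2-style interval scheduler (a sketch, not Anki's
# exact algorithm). Intervals grow multiplicatively after each
# successful review; a failure resets the interval.

def next_interval(prev_days: int, ease: float = 2.5, success: bool = True) -> int:
    """Days until the next review."""
    if not success:
        return 1
    return round(prev_days * ease)

interval, schedule = 1, []
for _ in range(6):
    schedule.append(interval)
    interval = next_interval(interval)

print(schedule)  # [1, 2, 5, 12, 30, 75] -- reviews thin out quickly,
                 # very unlike practicing every day
```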
hare/tortoise takeoff
This seems like the sort of R&D that China is good at: research that doesn’t need superstar researchers and that mostly consists of incremental improvements. Yet they don’t seem to be producing top LLMs. Why is that?
Google Gemini uses a watermarking system called SynthID that claims to be able to watermark text by skewing its probability distribution. Do you think it’ll be effective? Do you think it’s useful to have this?
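For intuition about what “skewing the probability distribution” can mean, here’s a minimal sketch of the generic green-list scheme from the watermarking literature. This is not SynthID’s actual algorithm (Google’s published method differs in its details); it just illustrates the family of techniques.

```python
# Sketch of distribution-skewing text watermarking (generic
# "green list" scheme; NOT SynthID's actual algorithm).
import hashlib
import numpy as np

VOCAB_SIZE = 50_000
DELTA = 2.0  # logit bonus given to green-listed tokens

def green_mask(prev_token: int, frac: float = 0.5) -> np.ndarray:
    """Pseudorandomly partition the vocab, seeded by the previous token."""
    digest = hashlib.sha256(str(prev_token).encode()).hexdigest()
    rng = np.random.default_rng(int(digest, 16) % 2**32)
    return rng.random(VOCAB_SIZE) < frac  # boolean mask of "green" tokens

def watermarked_sample(logits: np.ndarray, prev_token: int) -> int:
    """Sample the next token after nudging green tokens upward."""
    biased = logits + DELTA * green_mask(prev_token)
    probs = np.exp(biased - biased.max())
    probs /= probs.sum()
    return int(np.random.default_rng().choice(VOCAB_SIZE, p=probs))

# A detector that knows the hashing scheme counts how many emitted
# tokens were green; a fraction far above 50% flags the text.
```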
The digital version of the SAT actually uses adaptive testing now, where you get harder questions in the second module if you do well on the first section, but it’s still approximately as difficult as the paper test, so tons of people still tie at 1600.
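Roughly how that routing works, as a toy sketch of two-stage (“multistage adaptive”) testing; the threshold and the top-score rule here are my assumptions for illustration, not the College Board’s published scoring model:

```python
# Toy sketch of two-stage adaptive testing. The routing threshold
# and scoring rule are invented for illustration.

def route_second_module(correct: int, total: int) -> str:
    """Route to a harder or easier second module based on stage 1."""
    return "hard" if correct / total >= 0.7 else "easy"

def top_score_reachable(module: str) -> bool:
    # Assumption: only the harder module's scale reaches the maximum,
    # so all top scorers funnel through the same questions and tie.
    return module == "hard"

print(route_second_module(24, 27))  # 'hard'
print(top_score_reachable("hard"))  # True
```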
We call on our knowledge when something related triggers it, so for a lesson to be useful, you need to build those connections and triggers in the student’s mind.
Seems related to trigger-action plans...
Such as this one!
Okay, so you propose a mechanistic interpretability program where you create a virtual gallery of AI concepts extracted from Stable Diffusion, represented as images. I am slightly skeptical that this would move the needle on AI safety significantly: we already have open-source databases of scraped images used to train AI models, like LAION, and I don’t see that much outrage over them. I mean, there is some outrage, but not nearly enough for this to be the cornerstone of an AI safety plan.
What exactly do you envision that is being hidden inside these Stable Diffusion concepts? What “crazy stuff” is in it? I’m currently not aware of anything about their inner representations that is especially concerning.
It would probably be a lot more efficient to show that by modifying the LAION database and slapping some sort of image search on it, so people can see that their pictures were used to train the model.
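Something like this could plausibly be bolted on, using CLIP embeddings and cosine similarity; the index files and paths below are hypothetical placeholders, not an existing LAION artifact:

```python
# Sketch: "was my picture in the training data?" search over a
# LAION-style image set via CLIP embeddings + cosine similarity.
# The .npy index and URL list are hypothetical placeholders.
import numpy as np
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)

def embed(path: str) -> np.ndarray:
    """L2-normalized CLIP embedding of one image."""
    img = preprocess(Image.open(path)).unsqueeze(0)
    with torch.no_grad():
        v = model.encode_image(img)
    return (v / v.norm(dim=-1, keepdim=True)).squeeze(0).numpy()

# Hypothetical precomputed index: one normalized vector per dataset
# image, plus the matching source URLs.
index = np.load("laion_embeddings.npy")           # shape (N, 512)
urls = open("laion_urls.txt").read().splitlines()

query = embed("my_photo.jpg")
scores = index @ query                            # cosine similarity
for i in scores.argsort()[::-1][:5]:
    print(f"{scores[i]:.3f}  {urls[i]}")
```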