.CLI

Karma: 11

Alignment normie. MS CS&AI @ Oregon State University.

.CLI Dec 4, 2024, 7:15 AM
1 point
0
on: Changbai Li’s Shortform
How did safety engineering get invented for different disciplines, and how does its invention relate to engineering and theory?
Inspired by davidad’s tweets: 1, 2, 3
It seems like common sense that a deeper (theoretical) understanding helps both engineering and safety engineering. Which of the two do you think theory helps more? And which of them did more to grow theory research?
My intuition is that:
1. First we started building something by trial-and-error, empirical results.
2. We formulated some safety best practices. But they are all heuristics from the trial-and-error.
3. Then we started gaining theoretical understanding of what we are doing.
4. Only then do we become able to advance “safety engineering”.
5. At the same time, we also get much better at building that thing—much better at engineering.
How well does this mesh with real life? In the bridges’ case, safety engineering was invented separately, well after we understood how to build bridges, and well after we built a lot of bridges. The pioneers in safety engineering often had a formal math background. This seems to match the intuition above.
That said:
- We did build a lot of bridges, and a lot of them failed, before safety engineering came about. And how much did theories for safety engineering help with bridge capability?
- Did the field advance through novel theoretical work? Or was it more about the application of existing theories?
- Related to that question: did safety engineering require an entirely different set of theories that have little to do with bridge capability? (This seems obviously true to me: for example, environmental wear-and-tear and the process of metal rusting do not affect capability, but we need to understand them for safety.)

.CLI Sep 5, 2024, 9:43 PM
3 points
0
in reply to: mako yass’s comment on: Changbai Li’s Shortform
is this part of the reason so many AI researchers think it’s cool and enlightened to not believe in highly general architectures
I do hear the No Free Lunch theorem get thrown around when an architecture fails to solve some problem that its inductive bias doesn’t fit. But I think it’s just thrown around as a vibe.

.CLI Sep 4, 2024, 11:57 PM
1 point
0
on: A Bird’s Eye View of the ML Field [Pragmatic AI Safety #2]
Love the post! http://pragmaticaisafety.com/ is down for me right now though. Do the authors still endorse this sequence?

.CLI Sep 4, 2024, 11:56 PM
1 point
0
in reply to: joshc’s comment on: A Bird’s Eye View of the ML Field [Pragmatic AI Safety #2]
Just spent one year in academia; my experience trying to talk to researchers about AGI matches what Dan wrote about.

.CLI Sep 1, 2024, 4:39 AM
1 point
−3
on: Changbai Li’s Shortform
(ramblingly) Does the No Free Lunch Theorem imply that there’s no one single technique that would always work for AGI alignment? My initial thought is probably not, because the theorem states that the performance of all optimization algorithms is identical when averaged across all possible problems. However, AGI alignment is only a subset of these problems, so the theorem does not rule out one technique doing better than others on that subset.
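To make the "averaged over all problems vs. a subset" point concrete, here is a minimal toy sketch in Python. Everything in it is an assumption for illustration: the three-point search space, the binary objective values, the two fixed search orders `ORDER_A` and `ORDER_B`, and the subset condition `f[2] == 1`. It only covers fixed-order searchers, not the adaptive algorithms the actual theorem also covers, so treat it as an intuition pump and not a proof.

```python
from itertools import product

# Toy setting (all names and sizes are illustrative assumptions).
DOMAIN = (0, 1, 2)   # candidate solutions a searcher can evaluate
VALUES = (0, 1)      # possible objective values; higher is better


def all_functions():
    """Return every objective function DOMAIN -> VALUES as a dict."""
    return [dict(zip(DOMAIN, outputs))
            for outputs in product(VALUES, repeat=len(DOMAIN))]


def best_after_k(f, order, k):
    """Best objective value seen after evaluating the first k points of order."""
    return max(f[x] for x in order[:k])


def average_performance(order, k, funcs):
    """Mean of best_after_k over a collection of objective functions."""
    return sum(best_after_k(f, order, k) for f in funcs) / len(funcs)


EVERY_PROBLEM = all_functions()                        # all 8 functions
SUBSET = [f for f in EVERY_PROBLEM if f[2] == 1]       # a structured subset

ORDER_A = (0, 1, 2)  # searcher A evaluates points left to right
ORDER_B = (2, 0, 1)  # searcher B starts at point 2

if __name__ == "__main__":
    for k in (1, 2, 3):
        print("k =", k,
              "| all problems:",
              average_performance(ORDER_A, k, EVERY_PROBLEM),
              average_performance(ORDER_B, k, EVERY_PROBLEM),
              "| subset:",
              average_performance(ORDER_A, k, SUBSET),
              average_performance(ORDER_B, k, SUBSET))
```

Averaged over all eight functions the two searchers tie at every budget `k`, but on the subset where point 2 is always good, searcher B is ahead after a single evaluation. That is the sense in which a restricted problem class can favor one technique.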

.CLI Jul 29, 2024, 6:28 AM
1 point
0
on: Language and Capabilities: Testing LLM Mathematical Abilities Across Languages
GPT-4 can do math because it has learned particular patterns associated with tokens, including heuristics for certain digits, without fully learning the abstract generalized pattern.
This finding seems consistent with some of the literature, such as this, where they found that performance deteriorates rapidly if the multiplication task has an unseen computational graph. Perhaps check out the keyword “shortcut learning” too.

.CLI Jul 14, 2024, 7:42 PM
2 points
1
on: The Best Tacit Knowledge Videos on Every Subject
Game Design
The videos under this category better fit the label “game development”. Game design is more focused on designing rules, mechanics, and sometimes narratives than on programming.

.CLI Jul 5, 2024, 4:43 PM
1 point
0
on: North Oakland: Projects, June 11th
Is the event happening on June 11th or July 9th?

.CLI Jun 6, 2024, 10:02 PM
1 point
2
on: Changbai Li’s Shortform
I think there should be more effort put into researching the limits of controllability for self-improving machines. That aspect of rapid self-improvement seems pretty important to me since it’s there regardless of which architecture we use to get to the singularity. If the singularity is dangerous no matter how we get there, or how aligned our first try is, then, [clears throat and raises sign] don’t build AGI?

.CLI’s Shortform

.CLI Jun 6, 2024, 10:02 PM

1 point

7 comments

.CLI Feb 13, 2024, 7:29 PM
3 points
0
on: More on the Apple Vision Pro
I bought the device and watched Interstellar on top of Mt. Hood with the stars as the background. It was a phenomenal experience. That said, having to bear the weight of the device for 2.5 hours, plus other limits such as FOV and lens glare, makes me hesitant to say movies are the one killer app right now. I don’t think there is a killer app yet; Apple wants us to come in for that.

.CLI Aug 6, 2022, 10:32 AM
7 points
1
on: Where are the red lines for AI?
The strategic awareness property would be an interesting one to measure. Which existing systems would you say are more or less strategically aware? Are there examples we could point toward, like the social media algorithm one?
