I think the title should be rephrased as “If alignment is hard, then so is self-replication”.
Linear self-improvement seems a tenable proposition to me.
Your argument assumes (perhaps correctly) that a FOOM would require continual offloading of ‘greatest agency’ from one agent to another, as opposed to upgrading-in-place.
Yeah, that’s cool to see; it’s a very similar attempt at categorization. I feel we often get caught up in the potential/theoretical capabilities of systems, but there are already plenty of systems that exhibit self-replicating, harmful, intelligent behavior. It’s entirely a question of degree. That’s why I think a visual ranking of all systems along these metrics is in order (see the rough sketch below).
Defining what counts as a ‘system’ would be the other big challenge. Is a hostile government a system? It’s fairly intelligent and self-replicating, etc.
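To make “a question of degree” concrete, here is a minimal sketch of what scoring systems along those axes might look like. The example systems, scores, and the choice of aggregation are all hypothetical placeholders for illustration, not real assessments:

```python
from dataclasses import dataclass

@dataclass
class System:
    """A candidate 'system' scored on each axis as a degree in [0, 1]."""
    name: str
    self_replication: float  # how readily it copies or spreads itself
    harm: float              # capacity for harmful behavior
    intelligence: float      # goal-directed competence

    def composite(self) -> float:
        # Naive composite: a product, so a system must score on every
        # axis to rank highly. The aggregation rule is itself a judgment call.
        return self.self_replication * self.harm * self.intelligence

# Hypothetical placeholder scores, purely for illustration.
systems = [
    System("computer worm", 0.9, 0.5, 0.1),
    System("hostile government", 0.6, 0.8, 0.7),
    System("current LLM agent", 0.2, 0.3, 0.5),
]

for s in sorted(systems, key=lambda s: s.composite(), reverse=True):
    print(f"{s.name:20s} composite={s.composite():.2f}")
```

Even a toy table like this forces the two hard questions into the open: which entities count as systems, and how the axes should be weighed against each other.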