Link doesn’t work (points to http://0.0.0.6). What should it go to?
Joachim Bartosik
So far, the answer seems to be that it transfers some, and o1 and o1-pro still seem highly useful in ways beyond reasoning, but o1-style models mostly don’t ‘do their core thing’ in areas where they couldn’t be trained on definitive answers.
Based on:
rumors that talking to base models is very different from talking to RLHFed models and
how things work with humans
It seems likely to me that thinking skills transfer pretty well. But then this is trained out because it results in answers that raters don’t like. So the model memorizes the answers it’s supposed to give.
If they can’t do that, why on earth should you give up on your preferences? In what bizarro world would that sort of acquiescence to someone else’s self-claimed authority be “rational?”
Well, if they consistently make recommendations that in retrospect end up looking good, then maybe you’re bad at understanding. Or maybe they’re bad at explaining. But trusting them when you don’t understand their recommendation is exploitable, so maybe they’re running a strategy where they deliberately make good recommendations with poor explanations, so that once you start trusting them they can start mixing in exploitative recommendations (which you can’t tell apart, because all recommendations have poor explanations).
So I’d really rather not do that in a community context. There are ways to work with that. E.g. a boss can skip some details of an employee’s recommendations and, if the results are bad enough, fire the employee. On the other hand I think it’s pretty common for an employee to act in their own interest. But yeah, we’re talking about a principal-agent problem at that point and tradeoffs over what’s more efficient...
I’ll try.
TL;DR I expect the AI to not buy the message (unless it also thinks it’s the one in the simulation; then it likely follows the instruction because duh).
The glaring issue (with actually using the method) to me is that I don’t see a way to deliver the message in a way that:
results in AI believing the message and
doesn’t result in the AI believing there already is a powerful entity in their universe.
If “god tells” the AI the message, then there is a god in its universe. Maybe the AI will decide to do what it’s told. But I don’t think we can have Hermes deliver the message to any AIs which consider killing us.
If the AI reads the message in its training set or gets the message in a similarly mundane way, I expect it will mostly ignore it; there is a lot of nonsense out there.
I can imagine that for the thought experiment you could send the message from a place from which light barely manages to reach the AI but a slower-than-light expansion wouldn’t (so the message can be trusted, but the AI mostly doesn’t have to worry about the sender of the message directly interfering with its affairs).
I guess the AI wouldn’t trust the message. It might be possible to convince it that there is a powerful entity (simulating it, or half a universe away) sending the message. But then I think it’s way more likely that it’s in a simulation (I mean, that’s an awful coincidence with the distance, and also they’re spending a lot more than 10 planets’ worth to send a message over that distance...).
This is pretty much the same thing, except breaking out the “economic engine” into two elements of “world needs it” and “you can get paid for it.”
There are things that are economic engines of things that the world doesn’t quite need (getting people addicted, rent seeking, threats of violence).
One more obvious problem: the people actually in control of the company might not want to split it, and so they wouldn’t grow the company even if shareholders/ customers/ … would benefit.
but much higher average wealth, about 5x the US median.
Wouldn’t it make more sense to compare average to average? (like the earlier part of the sentence compares median to median)
If you want to take a look I think it’s this dataset (the example from the post is in the “test” split).
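If it helps, here’s a minimal sketch of how one could peek at a dataset like that with the Hugging Face datasets library; the dataset name below is just a hypothetical placeholder for the one linked above:

```python
# Minimal sketch: load a dataset and look at its "test" split.
# "org/dataset-name" is a placeholder, not the actual dataset identifier.
from datasets import load_dataset

ds = load_dataset("org/dataset-name", split="test")
print(ds)     # number of rows and column names
print(ds[0])  # the first example in the test split
```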
I wanted to say that it makes sense to arrange stuff so that people don’t need to drive around too much and can instead use something else to get around (and also maybe they have more stuff close by, so that they need to travel less). Because even if bus drivers aren’t any better than car drivers, using a bus means you have 10x fewer vehicles causing risk for others. And that’s better (assuming people have fixed places to go to, so they want to travel a ~fixed distance).
Sorry about slow reply, stuff came up.
This is the same chart linked in the main post.
Thanks for pointing that out. I took a break in the middle of reading the post and didn’t realize that.
Again, I am not here to dispute that car-related deaths are an order of magnitude more frequent than bus-related deaths. But the aggregated data includes every sort of dumb drivers doing very risky things (like those taxi drivers not even wearing a seat belt).
Sure. I’m not sure what you wanted to discuss. I guess I didn’t make it clear what I want to discuss either.
What you’re talking about (an estimate of the risk you’re causing) sounds like you’re interested in how you decide to move around. Which is fine. My intuition was that the (expected) cost of life lost due to your personal driving is not significant, but after plugging in some numbers I might have been wrong.
We’re talking 0.59 deaths per 100,000,000 miles.
If we value a life at 20,000,000 $ (I’ve heard some analyses use 10 M$; if we value a QALY at 100 k$ and use a 7% discount rate we get some 1.43 M$ for an infinite life),
then the cost of life lost per mile of driving is 2e7 * 0.59 / 1e8 = 0.118 $ / mile.
The average US person drives about 12k miles / year (second search result (1st one didn’t want to open)), and the estimated cost of car ownership is 12 k$ / year (a link from a Youtube video I remember mentioned this stat), so the average cost is ~1 $ / mile. ~12¢ / mile of risk imposed on others seems significant next to that. And it might be relevant if your personal effect here is half or 10% of that.
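For concreteness, here is a minimal sketch of that back-of-envelope arithmetic; the value of a life, miles driven per year, and ownership cost are the rough assumed figures from above, not authoritative data:

```python
# Back-of-envelope sketch of the driving-risk numbers quoted above.
# All inputs are rough assumptions from the comment, not measured data.

deaths_per_mile = 0.59 / 100_000_000  # fatalities per passenger mile (passenger vehicles)
value_of_life = 20_000_000            # $ per statistical life (assumed)

risk_cost_per_mile = value_of_life * deaths_per_mile
print(f"expected cost of life lost: ${risk_cost_per_mile:.3f} per mile")  # ~$0.118

miles_per_year = 12_000           # rough average for a US driver
ownership_cost_per_year = 12_000  # rough $ per year of car ownership
print(f"average ownership cost: ${ownership_cost_per_year / miles_per_year:.2f} per mile")  # ~$1.00

# Side note: present value of 100 k$/year of QALYs discounted at 7%, forever.
print(f"discounted value of an infinite life: ${100_000 / 0.07:,.0f}")  # ~$1,428,571
```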
I on the other hand wanted to point out that it makes sense to arrange stuff in such a way that people don’t want to drive around too much. (But I didn’t make that clear in my previous comment.)
The first result (I have no idea how good those numbers are, I don’t have time to check) when I searched for “fatalities per passenger mile cars” has data for 2007–2021. 2008 looks like the year where cars look comparatively least bad; it says (deaths per 100,000,000 passenger miles):
0.59 for “Passenger vehicles”, where “Passenger vehicles include passenger cars, light trucks, vans, and SUVs, regardless of wheelbase. Includes taxi passengers. Drivers of light-duty vehicles are considered passengers.”
0.08 for buses,
0.12 for railroad passenger trains,
0 for scheduled airlines.
So even in the year where cars look comparatively least bad, there are >7x more deaths per passenger mile for ~cars than for buses.
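A tiny sketch of that comparison, using just the 2008 figures listed above:

```python
# Deaths per 100,000,000 passenger miles in 2008, as listed above.
rates = {
    "passenger vehicles": 0.59,
    "buses": 0.08,
    "passenger trains": 0.12,
    "scheduled airlines": 0.0,
}

car_rate = rates["passenger vehicles"]
for mode, rate in rates.items():
    if mode != "passenger vehicles" and rate > 0:
        print(f"cars vs {mode}: {car_rate / rate:.1f}x the deaths per passenger mile")
# cars vs buses: 7.4x, cars vs passenger trains: 4.9x
```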
The exact example is that GPT-4 is hesitant to say it would use a racial slur in an empty room to save a billion people. Let’s not overreact, everyone?
I mean, this might be the correct thing to do? ChatGPT is not in a situation where it could save 1B lives by saying a racial slur.
It’s in a situation where someone tries to get it to admit it would say a racial slur under some circumstance.
I don’t think that ChatGPT understands that. But OpenAI makes ChatGPT expecting that it won’t be in the 1st kind of situation, but will be in the 2nd kind of situation quite often.
I’m replying only here because spreading discussion over multiple threads makes it harder to follow.
You left a reply on a question asking how to communicate about reasons why AGI might not be near. The question refers to costs of “the community” thinking that AGI is closer than it really is as a reason to communicate about reasons it might not be so close.
So I understood the question as asking about communication with the community (my guess: of people seriously working and thinking about AI-safety-as-in-AI-not-killing-everyone), where it’s important to actually try to figure out the truth.
You replied (as I understand) that when we communicate to the general public we can transmit only 1 idea, so we should communicate that AGI is near (if we assign a not-very-low probability to that).
I think the biggest problem I have is that posting “general public communication” advice as a reply to a question asking about “community communication” pushes towards less clarity in the community, where I think clarity is important.
I’m also not sold on the “you can communicate only one idea” thing, but I mostly don’t care to talk about it right now (it would be nice if someone else worked it out for me, but right now I don’t have the capacity to do it myself).
Here is an example of someone saying “we” should say that AGI is near regardless of whether it’s near or not. I post it only because it’s something I saw recently and so I could find it easily, but my feeling is that I’m seeing more comments like that than I used to (though I recall Eliezer complaining about people proposing conspiracies on public forums, so I don’t know if that’s new).
I don’t know but I can offer some guesses:
Not everyone wants all the rooms to have direct sunlight all of the time!
I prefer my bedroom to face north so that I can sleep well (it’s hard to get curtains that block direct sunlight that well).
I don’t want direct sunlight in the room where I’m working on a computer. In fact I mostly want big windows from which I can see a lot of sky (for a lot of indirect sunlight) but very little direct sunlight.
I don’t think I’m alone in that. I see a lot of south-facing windows with the direct sunlight blocked a lot of the time.
Things like patios are nice. You can’t have them this way.
Very narrow and tall structures are less stable than wider structures.
Indefinitely-long-timespan basic minimum income for everyone who
Looks like part of the sentence is missing
one is straightforwardly true. Aging is going to kill every living creature. Aging is caused by complex interactions between biological systems and bad evolved code. An agent able to analyze thousands of simultaneous interactions, cross millions of patients, and essentially decompile the bad code (by modeling all proteins/ all binding sites in a living human) is likely required to shut it off, but it is highly likely with such an agent and with such tools you can in fact save most patients from aging. A system with enough capabilities to consider all binding sites and higher level system interactions at the same (this is how a superintelligence could perform medicine without unexpected side effects) is obviously far above human level.
There are alternative mitigations to the problem:
Anti aging research
Cryonics
I agree that it’s bad that most people currently alive are apparently going to die. However, I think that since mitigations like that are much less risky, we should pursue them rather than try to rush AGI.
I think it should be much easier to get a good estimate of whether cryonics would work. For example:
if we could simulate an individual C. elegans then we’d know pretty well what kind of info we need to preserve
then we can check if we’re preserving it (even if current methods for extracting all relevant info won’t work for a whole human brain because they’re way too slow)
And it’s a much less risky path than doing AGI quickly. So I think it’s a mitigation it’d be good to work on, so that waiting to make AI safer is more palatable.
Remember that no matter what, we’re all going to die eventually, until and unless we cure aging itself.
Not necessarily, there are other options. For example cryonics.
Which I think is important. If our only groups of options were:
1) Release AGI which risks killing all humans with high probability or
2) Don’t do it until we’re confident it’s pretty safe, and each human dies before they turn 200.
I can see how some people might think that option 2) guarantees the universe loses all value for them personally, and choose 1) even if it’s very risky.
However, we also have the following option:
3) Don’t release AGI until we’re confident it’s pretty safe. But do our best to preserve everyone so that they can be revived when we do.
I think this makes waiting much more palatable: even those who care only about some humans currently alive are better off waiting with releasing AGI until it’s at least as likely to succeed as cryonics.
(also, working directly on solving aging while waiting on AGI might have a better payoff profile than rushing AGI anyway)
Apparently not. Scott wrote he used one image from Google Maps, and 4 personal images that are not available online.
People tried with personal photos too.
I tried with personal photos (screenshotted from Google Photos) and it worked pretty well too:
Identified neighborhood in Lisbon where a picture was taken
Identified another picture as taken in Paris
Another one was identified as taken in a big Polish city; the correct answer was among the 4 candidates it listed
I didn’t use a long prompt like the one Scott copies in his post, just a short “You’re in GeoGuesser, where was this picture taken?” or something like that.
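In case anyone wants to replicate the quick test, here is a minimal sketch using the OpenAI Python client; the model name, the image path, and the exact prompt wording are placeholders/assumptions rather than a record of exactly what was used:

```python
# Minimal sketch: send a screenshot to a vision-capable model with a short
# GeoGuessr-style prompt. "photo.jpg" and the model name are placeholders.
import base64
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "You're in GeoGuesser, where was this picture taken?"},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```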