Would an AGI that only tries to find a satisficing solution to its goal be safer?
Do we have reason to believe that we can/can’t get an AGI to be a satisficer?
Tobias H
There has been quite a lot of discussion over on the EA Forum:
https://forum.effectivealtruism.org/search?terms=phil%20torres
Avital Balwit linked to this LessWrong post in the comments of her own response to his longtermism critique (because Phil Torres is currently banned from the forum, afaik):
https://forum.effectivealtruism.org/posts/kageSSDLSMpuwkPKK/response-to-recent-criticisms-of-longtermism-1#6ZzPqhcBAELDiAJhw
The whole thing was much more banal than what you’re imagining. It was an interim-use building with mainly student residents. There was no coordination between residents that I knew of.
The garden wasn’t trashed before the letter. It was just a table and a couple of chairs that didn’t fit the house rules. If the city had just said “please, take the table out of the garden”, I’d have given a 70% chance of it working. If the city had not said a thing, there would not have been (a lot of) additional furniture in the garden.
By issuing the threat, the city introduced an incentive they didn’t intend.
Some residents who picked up on the incentive trashed the garden because they were overly confident that the authority would follow through on the threat – no matter what.
Trying to Keep the Garden Well
The link to the QWYRFM layout contains an error.
I would guess that humans’ nightmarish experience in concentration camps was usually better than nonexistence; and even if you suspect this is false, it seems easy to imagine how it could be true, because there’s a lot more to human experience than ‘pain, and beyond that pain, darkness’.
I can’t really imagine this – at least for people in extermination camps who weren’t killed. I’d assume that, all else equal, the vast majority of prisoners would choose to skip that part of their life. But maybe I’m missing something or have unusual intuitions.
Thank you! The general reasoning makes sense to me.
This Cochrane review finds a false-negative rate of 42% for asymptomatic individuals with antigen tests – which were not self-tests. Is your rate significantly higher because you’re thinking of self-administered antigen tests?
In many European countries, you can get antigen self-tests for about $2-4 apiece; this might make a testing scheme more cost-effective.
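As a rough illustration of the cost-effectiveness point, here is my own toy calculation: the prevalence figure is made up, and I assume repeat tests fail independently, which likely overstates the benefit of re-testing.

```python
# Back-of-the-envelope only; assumes repeat tests fail independently,
# which they probably don't (false negatives correlate with low viral load).
sensitivity = 0.58      # 1 - 0.42 false-negative rate from the Cochrane review
price_per_test = 3.0    # ~$2-4 per antigen self-test in many European countries
prevalence = 0.01       # assumed share of infected asymptomatic people

p_one_test = sensitivity                  # chance a single test catches a case
p_two_tests = 1 - (1 - sensitivity) ** 2  # e.g. day before + day of the event

for n_tests, p in [(1, p_one_test), (2, p_two_tests)]:
    cost_per_case = n_tests * price_per_test / (prevalence * p)
    print(f"{n_tests} test(s): catches {p:.0%} of cases, ~${cost_per_case:,.0f} per case caught")
```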
Thanks for this post, it helped clarify some of my concerns about the upcoming holidays.
I’m surprised you don’t mention testing (PCR or antigen).
What are your views on testing before an event?
What would be a good protocol for testing before a specific event – testing on the day of the event, or the day before?
The triangular video almost certainly doesn’t show a UFO anymore.
Some guy investigated it on the ground, and there’s a simple explanation: illuminated skyscrapers.
The videos capture a somewhat rare moment when clouds pass in front of the triangular shadow.
www.youtube.com/watch?v=KpjyWgjQvmc
(The most important part is at the end of the video.)
Interesting. I’m now wondering if dislike of crust is more widespread than I assumed.
This strikes me as plausible because there is a lot of moral sentiment attached to not wasting food.
This consumer survey from Switzerland found that consumers see the crust as more indicative of bread freshness than the crumb [low-quality online survey]. So maybe people are also conflating the crust as an indicator of quality with the actual taste of the crust?
Curious: Are you from a country without very good (crusted) bread?
In Switzerland, Germany and France (that I know of) the crust is often considered the tastiest part of the bread, at least when fresh. I only know of children and elderly people not eating the crust because it’s harder to chew.
I’m interested in the distinction between vegan and non-vegan ovens in your household. I’ve never heard of something like that. Is it because the vegans aren’t comfortable using the same device, or is it a “smell issue”?
I would recommend something like the Sawyer Squeeze (not the Sawyer Mini) over the LifeStraw. First of all, the normal LifeStraw needs suction to pull water through the filter, while with the Sawyer Squeeze you fill a pouch or standard water bottle with dirty water and squeeze it through the filter into a clean container.
It also has a higher flow rate of almost 0.5 gallons per minute and will filter up to 100,000 gallons of water with proper backflushing (vs. 1,000 gallons for the LifeStraw). It costs around $35 (vs. $13), but you can hydrate your whole neighbourhood, given enough non-drinkable water.
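Just dividing those list prices by the claimed lifetime capacities gives a rough sense of the difference (real lifetime will vary a lot with water quality and backflushing):

```python
# Cost per gallon of filtered water, using the figures quoted above
# (claimed lifetime capacities; real lifetime depends on water quality
# and how diligently you backflush).
lifestraw_price, lifestraw_capacity = 13.0, 1_000    # US$, gallons
sawyer_price, sawyer_capacity = 35.0, 100_000        # US$, gallons

print(f"LifeStraw:      {100 * lifestraw_price / lifestraw_capacity:.3f} cents/gallon")
print(f"Sawyer Squeeze: {100 * sawyer_price / sawyer_capacity:.3f} cents/gallon")
```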
Never heard of this before but tried to get a sense of it. (I’m not a teenager nor do I live in the US. This is just some background information people might find interesting.)
It’s a meme on TikTok. The premise of the sarcastic joke is generally that Helen Keller wasn’t actually deaf/blind. You can watch some popular examples here:
#helenkeller Hashtag Videos on TikTok
The idea that Helen Keller’s abilities were exaggerated has been promoted on the popular “Painkiller Already” podcast. The comments seem quite open to the idea, but it’s unclear how much of it is edgy humor and how much is genuine belief.
Taylor Proves Helen Keller Was A Fraud—YouTube
I’ve actually had the opposite experience of one commenter.
Once I washed clothes in a hurry, hung them up to dry, and didn’t get around to taking them down for half a day or so. I then found my clothes neatly folded on the washing machine.
I felt bad that somebody had done my chores, and I wasn’t even able to thank them and apologize. Because the new Waschplan (laundry schedule) is no Waschplan at all – pure anarchy!
First, what even do we mean by property? Well, there are material things that are sometimes scarce or rivalrous. If I eat a sandwich, you can’t also eat it; if I sleep in a bed, you can’t also sleep in it at the same time; if an acre of land is rented out for agricultural use, only one of us can collect the rent check.
Why do you describe property as being material things here?
Possible Nitpick:
If I understand you correctly, you use ‘excludability’ as a defining feature of property. As far as I understand, property comes with varying degrees of excludability and is sometimes not excludable at all (e.g. public property, common property). Maybe it would be useful to think about property more generally as things that come with certain rights (the right to use and transfer it, the right to earn income/interest from it).
The shared laundry room was, for example, what led to my very first contact with my neighbors. In the very first week, a neighbor complained that I hadn’t wiped the water from the rubber seal around the door of the washing machine and gave me a long lecture about the rules for using the shared washing machine and tumble dryer.
I am in Switzerland and exactly the same thing happened to me.
(I’m saying this just to lend credence to apartment block politics being a real thing.)
[EDIT, was intended as a response to Raemon, not Dagon.]
Maybe it’s the way you phrased the responses, but as described, I get the impression that this norm would mainly work for relatively extroverted people with low rejection sensitivity.
I’d be much less likely to ever try to join a discussion (and would tend to not attend events with such a norm). But maybe there’s a way to avoid this, both from “my side” and “yours”.
[I think this is more anthropomorphizing ramble than concise arguments. Feel free to ignore :) ]
I get the impression that in this example the AGI would not actually be satisficing. It is no longer maximizing a goal but still optimizing for this rule.
For a satisficing AGI, I’d imagine something vague like “Get many paperclips” resulting in the AGI trying to get paperclips but at some point (an inflection point of diminishing marginal returns? some point where it becomes very uncertain about what the next action should be?) doing something else.
Or for rules like “get 100 paperclips, not more”, the AGI might adhere only directionally or opportunistically. Within the rule, this might look like “I wanted to get 100 paperclips, but 98 paperclips are still better than 90, let’s move on” or “Oops, I accidentally got 101 paperclips. Too bad, let’s move on”.
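To make the distinction I have in mind concrete, here’s a toy sketch with made-up numbers (the utility curve and threshold are purely illustrative, not a claim about real agent designs): the agent keeps collecting paperclips only while the marginal return stays above some “good enough” threshold, then moves on.

```python
def marginal_utility(n_paperclips):
    # Made-up diminishing-returns curve: each extra paperclip is worth less.
    return 1.0 / (1 + n_paperclips)

def satisfice(threshold=0.01, max_steps=10_000):
    """Collect paperclips only while the marginal return is still
    'good enough', then stop and do something else."""
    n = 0
    for _ in range(max_steps):
        if marginal_utility(n) < threshold:
            break  # a maximizer would ignore the threshold and keep going
        n += 1
    return n

print(satisfice())  # -> 100 with these made-up numbers
```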
In your example of the AGI taking lots of precautions, the satisficing AGI would not do this because it could be spending its time doing something else.
I suspect there are major flaws with it, but an intuition I have goes something like this:
Humans have in some sense similar decision-making capabilities to early AGI.
The world is incredibly complex and humans are nowhere near understanding and predicting most of it. Early AGI will likely have similar limitations.
Humans are mostly not optimizing their actions, mainly because of limited resources, multiple goals, and a ton of uncertainty about the future.
So early AGI might also end up not-optimizing its actions most of the time.
Suppose the world stays complex enough that the AGI continues to fail to completely understand and predict it. In that case, even an advanced AGI will continue to not-optimize to some extent.
But it might look like near-complete optimization to us.