http://www.overcoming-bias.com/2007/06/against_free_th.html.
This link should be: https://www.overcomingbias.com/p/against_free_thhtml (removing the hyphen will allow a successful redirect).
In the limit (what might be considered the ‘best imaginable case’), we might imagine researchers discovering an alignment technique that (A) was guaranteed to eliminate x-risk and (B) improve capabilities so clearly that they become competitively necessary for anyone attempting to build AGI.
I feel like throughout this post, you are ignoring that agents, “in the limit”, are (likely) provably taxed by having to be aligned to goals other than their own. An agent with utility function “A” is definitely going to be less capable at achieving “A” if it is also aligned to utility function “B”. I respect that current LLMs are not best described as having a single, consistent goal function; however, “in the limit”, that is what they will be best described as.
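A minimal sketch of why that tax seems unavoidable in the limit (my notation, not anything from the post): write the agent's own utility as A, the imposed utility as B, and let λ ∈ (0, 1] be the weight alignment forces onto B. Then

$$x_\lambda = \arg\max_x \big[(1-\lambda)\,A(x) + \lambda\,B(x)\big] \quad\Rightarrow\quad A(x_\lambda) \le \max_x A(x),$$

with equality only when some maximizer of the mixed objective also happens to maximize A, i.e. when alignment to B is free with respect to A. Any genuine difference between the two goals shows up as a capability tax on A.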
I stopped paying for ChatGPT earlier this week, while thinking about the departure of Jan and Daniel.
Before they left, I was able to say to myself, “Well, there are smarter people than me, with worldviews similar to mine and far more information about OpenAI than I have, who think it is not a horrible place, so 20 bucks a month is probably fine.” I am no longer able to do that.
They have explicitly sounded the alarm as loudly as they reasonably can right now. I should listen!
Market odds are currently at 54% that 2024 is hotter than 2023: https://manifold.markets/SteveRabin/will-the-average-global-temperature?r=Um9iZXJ0Q291c2luZWF1
I have some substantial limit orders ±8% if anyone strongly disagrees.
I like the write-up, but recommend directly posting it to LessWrong. It is of much higher quality than your summary, and would be well suited to inline comments and the other features of the site.
I have a lot of trouble justifying to myself reading through more than the first five paragraphs. Below is my commentary on what I’ve read.
I doubt that the short-term impacts on our epistemology of sub-human-level AI, whether it is generating prose, photographs, or films, are negative enough to justify weighting them nearly as highly as the x-risk that is likely to emerge upon creating human-level AI.
We have been living in an adversarial information space for as long as we have had human civilization. Some of the most impressive changes to our epistemology were made before photographs (empiricism, rationalism, etc.), and some were made after (they are cool!). We will have to adjust how we judge the accuracy of a given claim once we can no longer trust photos/videos in low-stakes situations (we have not been able to trust them in high-stakes situations for as long as they have existed; see special effects, filmmaking, conspiracies about any number of recorded claims, etc.), but that is just a normal part of how human societies evolve.
If you want to convince (me, at least) that this is a potentially society-ending threat, I would love an argument/hook that addresses my above claims.
This seems like highly relevant (even if odd/disconcerting) information. I’m not sure it necessarily needs its own post (is it as important as the UK AI Summit or the Executive Order?), but it should certainly get a top-level item in your next roundup at least.
I think that unless you take a very linguistics-heavy view of how qualia emerge, you are over-weighting the argument that our ability to communicate with an agent says much about how likely it is to be conscious.
___________________________________________________________________________________________
You say:
In short, there are some neural circuits in our brains that run qualia. These circuits have inputs and outputs: signals get into our brains, get processed, and then, in some form, get inputted into these circuits. These circuits also have outputs: we can talk about our experience, and the way we talk about it corresponds to how we actually feel.
And:
It is valid to infer that, likely, qualia has been beneficial in human evolution, or it is a side effect of something that has been beneficial in human evolution.
I think both of the above statements are very likely true. Given that, it is hard to say that a chimpanzee is likely to lack those same circuits: neither our mental circuits nor our ancestral environments are that different. Similarly, it is hard to say “OK, this is what a lemur is missing, as compared to a chimpanzee”.
I agree that as you go down the list of potentially conscious entities (e.g. Humans → Chimpanzees → Lemurs → Rats → Bees → Worms → Bacteria → Virus → Balloon) it gets less likely that each has qualia, but I am very hesitant to put anything like an order of magnitude jump at each level.
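To make concrete what an order-of-magnitude jump at each level would imply (my arithmetic, not anything from the original post), the list above is eight steps from humans to balloons, so

$$P(\text{chimpanzee}) \approx 10^{-1}\,P(\text{human}), \qquad P(\text{worm}) \approx 10^{-5}\,P(\text{human}), \qquad P(\text{balloon}) \approx 10^{-8}\,P(\text{human}),$$

and the chimpanzee figure in particular seems far too steep given how similar our circuits and ancestral environments are.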
Did you hear back here?
Retracted—my apologies. I was debating if I should add a source when I commented that and clearly I should have.
What does this mean? More research is better, in my opinion. But why so small? AI Alignment is at least a $1T problem.
OpenAI may have a valuation of $80 billion in a couple of days, and they are below that currently.
I haven’t read the article yet, but that is a decent percentage of their current valuation.
This post prompted me to create the following Manifold market on the likelihood of a Sharp Left Turn occurring (as described by the Tag Definition, Nate Soares, and Victoria Krakovna et al.) prior to 2050: https://manifold.markets/RobertCousineau/will-a-sharp-left-turn-occur-as-des?r=Um9iZXJ0Q291c2luZWF1
Take note: it is only 2 hours away if you are driving in the middle of the night, on a weeknight. Else, it is 3-4 hours depending on how bad traffic is (there will almost always be some along I-5).
From my beginner's understanding, the two things you are comparing are not mutually exclusive.
There is currently work being done on both inner alignment and outer alignment. Inner alignment is more focused on making sure an AI doesn't coincidentally optimize humanity out of existence because [we did not teach it a clear enough version of / it misinterpreted] our goals, while outer alignment is more focused on making sure the goals we teach it are actually aligned with human values.
Different big names focus on different parts/subparts of the above (with crossover as well).
Nice work there!
“When we create an Artificial General Intelligence, we will be giving it the power to fundamentally transform human society, and the choices that we make now will affect how good or bad those transformations will be. In the same way that humanity was transformed when chemists and physicists discovered how to make nuclear weapons, the ideas developed now around AI alignment will be directly relevant to shaping our future.”
“Once we make an Artificial General Intelligence, it’s going to try to achieve its goals however it can, including convincing everyone around it that it should achieve them. If we don’t make sure that its goals are aligned with humanity’s, we won’t be able to stop it.”
“There are 7 billion people on this planet. Each one of them has different life experiences, different desires, different aspirations, and different values. The kinds of things that would cause two of us to act could compel a third person to do the opposite. An Artificial General Intelligence will have no choice but to act from the goals we give it. When we give it goals that 1/3 of the planet disagrees with, what will happen next?” (Policymaker)
Derek Lowe, I believe, writes the closest thing to a Matt Levine for pharma (and chem): https://www.science.org/blogs/pipeline
He has a really fun-to-read series titled “Things I Won’t Work With” where he talks a bunch about dangerous chemicals: https://www.science.org/topic/blog-category/things-i-wont-work-with