Thanks a lot for the kind comment!
To scale this approach, one will want to have “structural regularizers” towards modularity, interoperability and parsimony
I am unsure of the formal architecture or requirements for these structural regularizers you mention. I agree with using shared building blocks to speed up development and verification. I am unsure whether credit assignment would work well for this, maybe in the form of “the more a block is used in code, the more we can trust it”?
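To make that credit-assignment idea a bit more concrete, here is a toy sketch (all names and the scoring rule are made up) of a registry that tracks how often a shared block is reused and how often it passes review, and turns that into a rough trust score:

```python
from dataclasses import dataclass


@dataclass
class BlockStats:
    """Usage/review statistics for one shared building block."""
    uses: int = 0            # how many codebases reuse this block
    reviews_passed: int = 0
    reviews_failed: int = 0


class BlockRegistry:
    """Toy registry: trust grows with reuse and with passed reviews."""

    def __init__(self):
        self.blocks: dict[str, BlockStats] = {}

    def record_use(self, block_id: str):
        self.blocks.setdefault(block_id, BlockStats()).uses += 1

    def record_review(self, block_id: str, passed: bool):
        stats = self.blocks.setdefault(block_id, BlockStats())
        if passed:
            stats.reviews_passed += 1
        else:
            stats.reviews_failed += 1

    def trust(self, block_id: str) -> float:
        """Laplace-smoothed review pass rate, discounted when the block is rarely reused."""
        s = self.blocks.get(block_id, BlockStats())
        pass_rate = (s.reviews_passed + 1) / (s.reviews_passed + s.reviews_failed + 2)
        usage_weight = s.uses / (s.uses + 10)  # arbitrary saturation constant
        return pass_rate * usage_weight


registry = BlockRegistry()
registry.record_use("kalman_filter_block")
registry.record_review("kalman_filter_block", passed=True)
print(registry.trust("kalman_filter_block"))
```

The smoothing and the saturation constant are arbitrary; the only point is that trust should grow with both reuse and verified reviews, rather than with either alone.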
Constraints on the types of admissible model code. We have strongly advocated for probabilistic causal models expressed as probabilistic programs.
What do you mean? Why is this specifically needed? Do you mean that if we want to build a Go player, we should have one portion of the code dedicated to assigning probabilities to what the best move is? Or does it only apply in a different context, such as finding policies?
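To illustrate the reading I have in mind (purely my guess at what is meant, with a placeholder heuristic), here is a toy sketch in which one part of the code does nothing but assign a probability distribution over candidate moves:

```python
import math


def move_distribution(board, candidate_moves, score_fn, temperature=1.0):
    """Assign a probability to each candidate move via a softmax over heuristic scores."""
    scores = [score_fn(board, m) / temperature for m in candidate_moves]
    max_s = max(scores)  # subtract max for numerical stability
    weights = [math.exp(s - max_s) for s in scores]
    total = sum(weights)
    return {m: w / total for m, w in zip(candidate_moves, weights)}


# Hypothetical usage: a dummy board and a dummy heuristic, just to make the sketch runnable.
board = None
moves = ["D4", "Q16", "C3"]
dist = move_distribution(board, moves, score_fn=lambda b, m: len(m))  # placeholder heuristic
print(dist, max(dist, key=dist.get))
```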
Scaling this to multiple (human or LLM) contributors will require a higher-order model economy of some sort
Hmm. Is the argument something like “We want to scale and diversify the agents who will review the code for more robustness (so not just one LLM model, for instance), and that means varying levels of competence that we will need to figure out and sort”? I had not thought of it that way; I was mainly thinking of just using the same model, and I worry that having weaker code reviewers could bring the system down in terms of safety.
Regarding the Gaia Network, the idea seems interesting, though I am unclear about the full details yet. I had thought of extending betting markets to a full Bayesian network to get a better picture of what everyone believes, and maybe this is related to your idea. In any case, I believe that conveying one’s full model of the world through this kind of network (and maybe more) may be doable, and quite important for solving some kind of global coordination/truth-seeking problem.
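As a toy illustration of the kind of thing I have in mind (the variables, numbers, and the pooling rule are all invented for the example), each contributor reports probabilities over a shared two-node network and we read off the pooled marginal:

```python
# Toy belief network: Regulation -> Deployment (both binary).
# Each contributor reports P(Regulation=1) and P(Deployment=1 | Regulation).
contributors = [
    {"p_reg": 0.6, "p_dep_given_reg": {1: 0.2, 0: 0.7}},
    {"p_reg": 0.4, "p_dep_given_reg": {1: 0.3, 0: 0.8}},
]


def pooled(key_fn):
    """Linear opinion pool: simple average of the contributors' probabilities."""
    return sum(key_fn(c) for c in contributors) / len(contributors)


p_reg = pooled(lambda c: c["p_reg"])
p_dep_given_reg1 = pooled(lambda c: c["p_dep_given_reg"][1])
p_dep_given_reg0 = pooled(lambda c: c["p_dep_given_reg"][0])

# Marginal probability of deployment implied by the pooled network.
p_dep = p_reg * p_dep_given_reg1 + (1 - p_reg) * p_dep_given_reg0
print(f"pooled P(Deployment=1) = {p_dep:.3f}")
```

A real version would of course need richer structures and a better aggregation rule than simple averaging, but this is the sense in which I mean “extending betting markets to a full Bayesian network”.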
Overall I agree with your idea of a common library, and I think there could be some very promising iterations on that. I will contact you about collaboration ideas!
So my main reason for worry, personally, is that there might be an ARA deployed with the goal of “just clone yourself as much as possible”, or something similar. In this case, the AI does not really have to outcompete others to survive, as long as it is able to pay for itself and keep spreading, for instance by copying itself onto local computers, etc. This scenario is worrisome in that the AI might just stay dormant and already be hard to kill. If the AI furthermore has some ability to avoid detection and adapt/modify itself, then I really worry that its goals are going to evolve and get selected so that they progressively converge toward a full takeover (though they may also plateau), and that we will be completely oblivious to it for most of this time, as there won’t really be any incentive to detect or fight this thoroughly.
Of course, there is also the scenario of a chaos-GPT-style ARA agent, which I worry could kill many people without ever truly being shut down; and even if we can shut it down, it might take a while.
All in all, I think this is more a question of costs and benefits than of how likely it is. For instance, I think that implementing a Know Your Customer policy for all providers right now could be quite feasible and would slow down the initial steps of an ARA agent a lot.
I feel like the main cruxes of the argument are:
1. Whether an ARA agent plateaus or goes exponential in terms of abilities and takeover goals.
2. How much time it would take an ARA agent, once released, to fully take over.
I am still very unsure about 1: I could imagine many scenarios where the endemic ARA just stagnates and never really transforms into something more.
However, I feel like on 2 you have a model in which it is going to take a long time for such an agent to really take over. I am unsure about that, but even if that were the case, my main concern is that once the seed ARA is released (and it might be only very slightly capable of adaptation at first), it is going to be extremely difficult to shut it down. If AI labs advance significantly toward superintelligence, implementing a pause on AI might not come too late; but if such an agent has already been released, there is not going to be much we can do about it.
I would be very interested to hear more. I didn’t find anything from a quick search of your Twitter; do you have a link or a pointer where I could read more about counterarguments to “natural selection favors AIs over humans”?