Research Engineer in AI Safety
rorygreig
I suppose I can imagine an architecture with something like a central planning agent that is capable of having a goal, observing the state of the world to check whether the goal has been met, coming up with high-level strategies to meet that goal, then delegating subtasks to a set of subordinate sub-agents (whilst making sure that these tasks are broken down enough that the sub-agents themselves don’t have to do much long time-horizon planning or goal-directed behaviour).
With this architecture it seems like all the agent-y goal-directed stuff is done by a single central agent.
However I do agree that this may be less efficient or capable in practice than an architecture with more autonomous, decentralised sub-agents. But on the other hand it might be better at more consistently pursuing a stable goal, so that could compensate.
Good points; however, I’m still a bit confused about the difference between two different scenarios: “multiple sub-agents” vs “a single sub-agent that can use tools” (or can use oracle sub-agents that don’t have their own goals).
For example, a human doing protein folding using AlphaFold: I don’t think of that as multiple sub-agents, just a single agent using an AI tool for a specialised task (protein folding). (Assuming for now that we can treat a human as a single agent, which isn’t really the case, but you can imagine a coherent agent using AlphaFold as a tool.)
It still seems plausible to me that you might have a mind made of many different parts, but there is a clear “agent” bit that actually has goals and is controlling all the other parts.
I agree that initially a powerful AGI would likely be composed of many sub-agents. However it seems plausible to me that these sub-agents may “cohere” under sufficient optimisation or training. This could result in the sub-agent with the most stable goals winning out. It’s possible that strong evolutionary pressure makes this more likely.
You could also imagine powerful agents that aren’t composed of sub-agents, for example a simpler agent with very computationally expensive search over actions.
Overall this topic seems under-discussed in my opinion. It would be great to have a better understanding of whether we expect sub-agents to turn into a single coherent agent.
The video of John’s talk has now been uploaded on YouTube here.
I really enjoyed this dialogue, thanks!
A few points on complexity economics:
The main benefit of complexity economics, in my opinion, is that it addresses some of the seriously flawed and over-simplified assumptions that go into classical macroeconomic models, such as rational expectations, homogeneous agents, and the economy being at equilibrium. However it turns out that replacing these with more relaxed assumptions is very difficult in practice. Approaches such as agent-based models (ABMs) are tricky to get right, since they have so many degrees of freedom. I do still think that this is a promising avenue of research, but maybe it needs more time and effort to pay off. Although it’s possible that I’m falling into a “real communism has never been tried” trap.
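To make the “degrees of freedom” point concrete, here is a minimal toy ABM sketch: agents with heterogeneous wealth meet in random pairs and exchange a random fraction of wealth. Everything here (the matching rule, the transfer rule, the initial conditions) is an invented illustrative choice, not a model from the literature; even in this tiny model each of those choices is a free parameter, and realistic ABMs have many more.

```python
import numpy as np

def run_exchange_abm(n_agents=1000, n_steps=10_000, seed=0):
    """Toy agent-based model: random pairwise wealth exchange.

    Each step, two random agents meet and the first transfers a random
    fraction of their wealth to the second. Total wealth is conserved,
    but inequality emerges from the micro-level interaction rule rather
    than being assumed at the aggregate level.
    """
    rng = np.random.default_rng(seed)
    wealth = np.ones(n_agents)  # everyone starts with one unit
    for _ in range(n_steps):
        i, j = rng.integers(0, n_agents, size=2)
        if i == j:
            continue
        transfer = rng.random() * wealth[i]  # free modelling choice
        wealth[i] -= transfer
        wealth[j] += transfer
    return wealth
```

Every line marked as a modelling choice could plausibly be replaced by something else, which is exactly why calibrating and validating ABMs against real data is hard.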
I also think that ML approaches are very complementary to simulation based approaches like ABMs.
In particular, the complexity economics approach is useful for dealing with the interactions between the economy and other complex systems, such as public health. There was some decent research done on economics and the COVID pandemic, for example the work of Doyne Farmer, a well-known complexity scientist: https://www.doynefarmer.com/covid19-research
It’s hard to know how much of this “heterodox” economics would have happened anyway, even in the absence of people who call themselves complexity scientists. But I do think complexity economics played a key role in advocating for these new approaches.
Having said that: I’m not an economist, so I’m not that well placed to criticise the field of economics.
More broadly I found the discussion on self-referential and recursive predictions very interesting, but I don’t necessarily think of that as central to complexity science.
I’d also be interested in hearing more about how this fits in with AI Alignment, in particular complexity science approaches to AI Governance.
The workshop talks from the previous year’s ALIFE conference (2022) seem to be published on YouTube, so I’m following up with whether John’s talk from this year’s conference can be released as well.
An interesting paper is The Information Theory of Individuality, Krakauer et al.
Announcing AI Alignment workshop at the ALIFE 2023 conference
This is a really interesting point that I hadn’t thought of!
I’m not sure where I land on the conclusion though. My intuition is that two copies of the same mind emulation running simultaneously (assuming they are both deterministic and are therefore doing identical computations) would have more moral value than only a single copy, but I don’t have a lot of confidence in that.
Some thought experiments on digital consciousness
Yes it is indeed a hybrid event!
I have now added the following text to the website:
The conference is hybrid in-person / virtual. All sessions will have remote dial-in facilities, so authors are able to present virtually and do not need to attend in-person.
This was in our draft copy for the website; I could have sworn it was on there, but somehow it got missed out. My apologies!
Update: The submissions deadline for this Special Session has been extended to 13th March.
Hey, one of the co-organisers of this special session here (I was planning to make a post about this on LW myself but OP beat me to it!).
Clearly I am biased, but I would highly recommend the ALIFE conference (even outside the context of this special session). I published a paper there myself at ALIFE 2021 and really enjoyed the experience.
It has a diverse, open-minded and enthusiastic set of attendees from a wide range of academic disciplines, and the topics are varied and interesting. Regarding being in touch with reality, this is harder to comment on, but the conference does typically include a lot of practical and empirical research (for example computer simulations), as well as more theoretical and philosophical work.
We are arranging this special session because we think that Artificial Life as a field, and in particular attendees of this conference, may have a lot to contribute to AI safety, so we are excited about the potential overlap between these areas.
Please feel free to reach out to me directly if you have any questions.
I have been thinking about this for quite a while. In particular this paper, which learns robust “agents” in Lenia, seems very relevant to themes in alignment research: Learning Sensorimotor Agency in Cellular Automata
Continuous cellular automata have a few properties which in my view make them a potentially interesting testbed for agency research in AI alignment:
They seem to be able to support (or make discoverable) much more robust and complex behaviours and agents than discrete CAs, which makes them seem a bit less like “toy” models.
They can be differentiable, which allows for more efficient search for interesting behaviours (as in the linked paper). This should also be amenable to being accelerated by GPUs.
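The properties above can be sketched in a few lines. Below is a minimal Lenia-style update, assuming the usual structure of continuous CAs: convolve the grid with a smooth neighbourhood kernel, apply a smooth bell-curve growth function, and integrate a small timestep. The specific kernel shape and parameter values (`mu`, `sigma`, `dt`) are illustrative choices, not the exact values from the paper.

```python
import numpy as np

def make_kernel_fft(n, radius=13):
    """Smooth ring-shaped neighbourhood kernel, normalised and
    pre-transformed into Fourier space for fast circular convolution."""
    y, x = np.ogrid[-n // 2 : n // 2, -n // 2 : n // 2]
    r = np.sqrt(x**2 + y**2) / radius          # distance in kernel radii
    ring = np.exp(-0.5 * ((r - 0.5) / 0.15) ** 2)  # smooth bell ring
    ring[r > 1] = 0.0                          # compact support
    kernel = np.fft.ifftshift(ring / ring.sum())
    return np.fft.fft2(kernel)

def lenia_step(grid, kernel_fft, dt=0.1, mu=0.15, sigma=0.015):
    """One update of a Lenia-style continuous CA.

    The neighbourhood potential is a convolution (done via FFT), and
    growth is a smooth bell curve of that potential, so the update is
    differentiable almost everywhere -- which is what makes
    gradient-based search over patterns and parameters possible.
    """
    potential = np.real(np.fft.ifft2(np.fft.fft2(grid) * kernel_fft))
    growth = 2.0 * np.exp(-0.5 * ((potential - mu) / sigma) ** 2) - 1.0
    return np.clip(grid + dt * growth, 0.0, 1.0)
```

Because every operation is an array-wide convolution or pointwise function, the same sketch ports directly to a GPU-accelerated autodiff framework, which is how the differentiability property gets exploited in practice.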
I am hoping to get the time at some point to explore some of these ideas using Lenia (I am working a full-time job, so it would have to be more of a side project). In particular I would like to re-implement the sensorimotor agency paper and then see what avenues that opens up. Perhaps I could try to quantitatively measure abstraction within Lenia, for example by coming up with a measure of abstraction that can automatically identify these “agents”. Or I could try something along the lines of the information theory of individuality, to see whether optimising globally for these measures (with gradient descent) actually produces something that we recognise as agents / individuals.
I will admit that a lot of my motivation for this is just that I find continuous cellular automata fascinating and fun, rather than considering this the most promising direction for alignment research. But I do also think it could be fruitful for alignment research.
He has written a paper on this too, link here.
Thanks again John for giving this talk! I really enjoyed the talk at the time and was pleasantly surprised by the positive engagement from the audience. I’m also pleased that this turned into a resource that can be re-shared.