Call for submissions: “(In)human Values and Artificial Agency”, ALIFE 2023

Key points:

Cash prize of $500 for the best presentation.
Deadline: 3 March 2023.
Organized by Simon McGregor (University of Sussex), Rory Greig (DeepMind), and Chris Buckley (University of Sussex).

ALIFE 2023 (the 2023 conference on Artificial Life) will feature a Special Session on “(In)human Values and Artificial Agency”. This session focuses on issues at the intersection of AI Safety and Artificial Life. We invite the submission of research papers, or extended abstracts, that deal with related topics.
We particularly encourage submissions from researchers in the AI Safety community, who might not otherwise have considered submitting to ALIFE 2023.

...

EXAMPLES OF A-LIFE RELATED TOPICS
Here are a few examples of topics that engage with A-Life concerns:
Abstracted simulation models of complex emergent phenomena
Concepts such as embodiment, the extended mind, enactivism, sensorimotor contingency theory, or autopoiesis
Collective behaviour and emergent behaviour
Fundamental theories of agency or theories of cognition
Teleological and goal-directed behaviour of artificial agents
Specific instances of adaptive phenomena in biological, social or robotic systems
Thermodynamic and statistical-mechanical analyses
Evolutionary, ecological or cybernetic perspectives

EXAMPLES OF AI SAFETY RELATED TOPICS
Here are a few examples of topics that engage with AI Safety concerns:
Assessment of distinctive risks, failure modes or threat models for artificial adaptive systems
Fundamental theories of agency, theories of cognition or theories of optimization
Embedded Agency, formalizations of agent-environment interactions that account for embeddedness, detecting agents and representations of agents’ goals
Selection theorems – how selection pressures and training environments determine agent properties
Multi-agent cooperation; inferring / learning human values and aggregating preferences
Techniques for aligning AI models to human preferences, such as Reinforcement Learning from Human Feedback (RLHF)
Goal Misgeneralisation – how agents’ goals generalise to new environments
Mechanistic interpretability of learned / evolved agents (“digital neuroscience”)
Improving fairness and reducing harm from machine learning models deployed in the real world
Loss of human agency from increasing automation