I designed an AI safety course (for a philosophy department)
Background
In the fall of 2023, I’m teaching a course called “Philosophy and The Challenge of the Future”[1], which focuses on AI risk and safety. I designed the syllabus keeping in mind that my students:
will have no prior exposure to what AI is or how it works
will not necessarily have a strong philosophy background (the course is offered by the Philosophy department, but is open to everyone)
will not necessarily be familiar with Effective Altruism at all
Goals
My approach combines three perspectives: 1) philosophy, 2) AI safety, and 3) Science, Technology, and Society (STS). This combination reflects my training in these fields and attempts to create an alternative introduction to AI safety, one that doesn’t just copy the AI Safety Fundamentals (AISF) curriculum. That said, I plan to recommend the AISF course towards the end of the semester; since my students are majoring in all sorts of different things, from CS to psychology, it’d be great if some of them considered AI safety research as a career path.
Course Overview
INTRO TO AI
Week 1 (8/28-9/1): The foundations of Artificial Intelligence (AI)
Required Readings:
Artificial Intelligence: A Modern Approach, pp. 1-27, Russell & Norvig.
Superintelligence, pp. 1-16, Bostrom.
Week 2 (9/5-8): AI, Machine Learning (ML), and Deep Learning (DL)
Required Readings:
You Look Like a Thing and I Love You, Chapters 1, 2, and 3, Shane.
But what is a neural network? (video)
ML Glossary (optional but helpful for terminological references)
Week 3 (9/11-16): What can current AI models do?
Required Readings:
Artificial Intelligence: A Modern Approach, pp. 27-34, Russell & Norvig.
ChatGPT Explained (video)
What is Stable Diffusion? (video)
AI AND THE FUTURE OF HUMANITY
Week 4 (9/18-22): What are the stakes?
Required Readings:
The Precipice, pp. 15-21, Ord.
Existential risk and human extinction: An intellectual history, Moynihan.
Everything might change forever this century (video)
Week 5 (9/25-29): What are the risks?
Required Readings:
Taxonomy of Risks posed by Language Models, Weidinger et al.
Human Compatible, pp. 140-152, Russell.
Loss of Control: “Normal Accidents and AI Systems”, Chan.
Week 6 (10/2-6): From Intelligence to Superintelligence
Required Readings:
A Collection of Definitions of Intelligence, Legg & Hutter.
Artificial Intelligence as a positive and negative factor in global risk, Yudkowsky.
Paths to Superintelligence, Bostrom.
Week 7 (10/10-13): Human-Machine interaction and cooperation
Required Readings:
Cooperative AI: machines must learn to find common ground, Dafoe et al.
AI-written critiques help humans notice flaws
AI Generates Hypotheses Human Scientists Have Not Thought Of
THE BASICS OF AI SAFETY
Week 8 (10/16-20): Value learning and goal-directed behavior
Required Readings:
Machines Learning Values, Petersen.
The Basic AI Drives, Omohundro.
The Value Learning Problem, Soares.
Week 9 (10/23-27): Instrumental rationality and the orthogonality thesis
Required Readings:
The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents, Bostrom.
General Purpose Intelligence: Arguing The Orthogonality Thesis, Armstrong.
METAPHYSICAL & EPISTEMOLOGICAL CONSIDERATIONS
Week 10 (10/30-11/4): Thinking about the Singularity
Required Readings:
The Singularity: A Philosophical Analysis, Chalmers.
Can Intelligence Explode?, Hutter.
Week 11 (11/6-11): AI and Consciousness
Required Readings:
Could a Large Language Model be Conscious?, Chalmers.
Will AI Achieve Consciousness? Wrong Question, Dennett.
ETHICAL QUESTIONS
Week 12 (11/13-17): What are the moral challenges of high-risk technologies?
Required Readings:
Human Compatible, “Misuses of AI”, Russell.
The Ethics of Invention, “Risk and Responsibility”, Jasanoff.
Week 13 (11/20-22): Do we owe anything to the future?
Required Readings:
What We Owe The Future, Chapter 1, MacAskill.
The Future of Humanity, Bostrom.
On future people, looking back at 21st century longtermism, Carlsmith.
WHAT CAN WE DO NOW
Week 14 (11/27-12/1): Technical AI Alignment
Required Readings:
Concrete Problems in AI Safety, Amodei et al.
AI Safety needs social scientists, Irving & Askell.
Week 15 (12/4-8): AI governance and regulation
Required Readings:
AI Governance: A Research Agenda, Dafoe.
AI Strategy, Policy, and Governance (optional but helpful video).
Feedback is welcome, especially if you have readings in mind that your 19-year-old self would have been excited about.