[Question] Should I Pursue a PhD?
Introduction
I’m currently flirting with the idea of trying for a math PhD 2 − 3 years down the line.
I’m currently on a Theoretical Computer Science Masters program at the University of [Redacted] in the United Kingdom.
(My program is two semesters of teaching (7-8 months) followed by a 9-12 month industrial placement starting in June 2023. If I can't find a suitable placement (i.e. theoretical research that feels like it would be valuable experience for the kind of career I want to pursue), I might forgo the placement and graduate after one year by completing a masters project over the summer instead.)
After graduation I'm considering taking a gap year to fill the gaps in my maths knowledge and prepare for the PhD, and maybe to see if I can contribute to the research agendas I think are interesting/promising.
I might instead pursue a PhD in Theoretical Computer Science rather than mathematics (applications for a TCS PhD might be looked on more favourably given my TCS Masters and recommendations from my current lecturers).
Why A PhD?
I currently plan to learn a lot of (especially abstract) maths to (upper) graduate level for alignment theory; I want to do theoretical alignment work that is basically just applied maths. I think I would benefit from the dedicated mentorship a PhD provides, studying the relevant mathematics under a "guru". I expect the first few years of my career in alignment research to be spent mostly on deconfusion/distillation, and I expect high levels of mathematical sophistication to be very valuable for that.
I find abstract maths and mathematical modelling “fun”, and really enjoy being a student.
My Alignment Theory of Change
I am operating under (and optimising for) long timelines (transformative AI is decades away, i.e. 20+ years), and this influences what kind of research I believe to be most promising.
I expect theoretical research (especially foundational work, and especially in our current pre-paradigmatic stage) to be the most promising, and I am persuaded by agent foundations agendas. The extant research agendas I'm most excited about and could see myself working on someday:
John Wentworth’s Natural Abstractions Hypothesis and Selection Theorems
Vanessa Kosoy’s Learning Theoretic Alignment Agenda
Other agendas that take a desiderata-first approach to alignment
Garrabrant and Demski's Embedded Agency
My basic plan for alignment is something like:
Study subjects that seem relevant to alignment theory
Mathematics: a fuckton
Theoretical Computer Science: likewise, a fuckton
Statistics (and its theory)
Learning Theory (Algorithmic and Statistical)
Information Theory (Algorithmic and Statistical)
Physics: Thermodynamics
Optimisation
Evolutionary Theory
Analytic Philosophy: ontology, epistemics, ethics, etc.
Develop executable/computable philosophy for the above
Decision Theory
Game Theory
Grapple with concepts that bear on agent foundations until I understand them better (“Deconfusion”)
Information and entropy
Computation (especially as an information dynamics phenomenon)
Abstractions, ontology, modelling/map making
Optimisation
Causality/dependencies and counterfactuals (including logical)
Epistemics (including for ideal agents)
Decision Making (including for ideal agents; especially in multi-agent environments)
Emergent behaviour in multi-agent environments (e.g. competition, coordination vs conflict, evolution)
Systems (especially complex) and their emergent behaviour
Embedded Agency more generally
Anthropics?
Distill my learnings and understandings to make them more widely accessible (“Distillation”)
Iterate steps #1-#3
...
Formulate an adequate theory of robust agency
...
Solve alignment
Of course, I don't expect to make it all the way to step 8 ("solve alignment"). I expect that deconfusion and distillation is where most of the value of my "career" will come from.
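As a toy illustration of the kind of concept on the deconfusion list above ("information and entropy"), here is a minimal sketch of Shannon entropy; the function and example numbers are my own, not drawn from any particular agenda:

```python
import math

def shannon_entropy(probs):
    """Shannon entropy H(p) = -sum_i p_i * log2(p_i), measured in bits.

    Terms with p_i = 0 contribute nothing (by convention 0 * log 0 = 0).
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A fair coin is maximally uncertain: one full bit per flip.
shannon_entropy([0.5, 0.5])   # → 1.0

# A biased coin is more predictable, so it carries less information.
shannon_entropy([0.9, 0.1])   # → ~0.469

# A certain outcome carries no information at all.
shannon_entropy([1.0])        # → 0.0
```

Deconfusion work on "information" would be less about computing such quantities and more about pinning down what they do and don't capture about agents and their models.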