I like the thinking that went into this post, but I also think it’s difficult to make any definitive statements here. None of the actions you’ve given are entirely independent of the others (for example, you can attend new events with friends). It’s also hard to use an algorithmic approach without a good way of measuring expected returns, which are difficult to intuit and change significantly over time.
Even ignoring varying individual experiences with online dating (conventionally attractive/non-minority individuals tend to have better success), there may be actions that you can take to make it more efficient. It can also be done at times when you are unable to go out, and even in small amounts of time.
I think algorithmic/rationalist approaches to dating are really interesting. I’m not certain that reinforcement learning is any different from non-algorithmic/rationalist approaches, though. Aren’t humans always trying to maximize our expected reward?
Even if your characterization of AI were accurate, an unaligned AI could find it valuable to dedicate part of its resources to helping humans. In your analogy, isn’t that the role of myrmecologists today? If ants:us = us:AI, solving our problems wouldn’t require a significant expenditure of resources.