I can say from recent personal experience: there is something special about this age in the context of development
> As for doubting your ideas—there …
I do see the value in raw, direction-focused, intuition-based ideas. It’s just that, after a healthy dose of self-awareness (maybe some could say mine is too big), adding rigor to an idea always makes it more epistemically valuable. I do recognize your perspective, though I wouldn’t jump to the conclusion that rigorous evaluation has an adverse effect on exploration. The best strategy seems to be separating the two: first allow the paradigm-level divergent exploration, then verify it rigorously. That said, I’m not sure this strategy is fully workable in practice. And I will add: my focus on epistemic progress, always trying to reach a depth of understanding, is why my messages might feel “heavy”
I’ve actually been thinking about randomness in ML recently, and I’ve come to a compelling case for its specific role. The insights do seem to generalize to all problem-solving mechanisms in a way. I can expand if you want
theliminalone
👋👋👋
I call myself a “practical philosopher”, meaning I try to understand and redesign existing systems, or guide new ones. More generally, I’m interested in systems and incentives, but also in paradigms and the high-level theory behind the practice. I go into the details later (in the edit)
Many of my beliefs have formed quite recently, since I’m 16 y/o. However, I don’t see a reason to doubt them. Describing the recent progress, I’d say it was an increase in clarity: awareness of my goals, learning about the world, developing some of my own internal philosophy (which I now realize has mostly just been the rationalist worldview). Not coincidentally, clarity is my favorite word
I think AI is a few breakthroughs away from achieving its projected potential, and I see that as an opportunity to do something fulfilling: hopefully, contribute to the AI research space. It’s rare to come across an idea for expanding or inventing a theory in a way that also leads to real-world impact, and let’s just say the ROI of inventing AI architectures is pretty high compared to tinkering with other theories. I do recognize that such work takes thousands of hours of studying the existing theory to gain confidence in it. That’s why I’ve committed my free time to learning the current AI methods before I start theorizing about my own
About LW itself:
I’m honestly repulsed by the nature of most of the internet’s conversations and expressions of ideas, specifically by all the cognitive bias, the lack of open-mindedness, etc. Hopefully I can find myself in a “better” circle here. Nice to meet you all
Edit: I could provide more insight into my general thinking process. I’m reaching a turning point where I’m beginning to have enough agency to be responsible for my own conceptual progress; before, it wasn’t very consistent or deliberate. These are half-baked ideas that arose while exploring the space of AI-related invention, and they represent my “thinking style”. I can already see how expertise can inform and accelerate these kinds of ideas, and as I said, experience is what I’m working on now
Randomness understood as a tool
I looked at how many AI learning processes (e.g. the lottery ticket hypothesis, mutual learning between RL agents, exploration in RL) can be interpreted as clever uses of randomness. Paired with the fact that randomness seems fundamental to brain processes (being a cornerstone of creativity and invention), it started to look like the driving force behind transcending existing knowledge. So I wondered whether there might exist a theory describing how randomness is used across all of machine learning. Randomness could then be treated as a universal, deliberate tool rather than a non-negotiable constant in architectures or a mere performance booster. I think that approach would reflect a deeper understanding of it than today’s implicit use
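To ground the “deliberate tool” framing, here is a minimal sketch (my own toy example, not from any particular paper) of the most familiar case, epsilon-greedy exploration in a multi-armed bandit: a fixed fraction of actions is spent on randomness precisely because the agent’s current estimates are incomplete knowledge.

```python
import random

# Minimal epsilon-greedy bandit: randomness is injected deliberately,
# as the mechanism that lets the agent go beyond its current estimates.
def epsilon_greedy(true_means, steps=10_000, epsilon=0.1):
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    for _ in range(steps):
        if random.random() < epsilon:
            arm = random.randrange(n_arms)  # explore: pure randomness
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = random.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates

# With epsilon=0 the agent can lock onto a bad arm forever; the random
# 10% of actions is what keeps the knowledge in `estimates` improving.
print(epsilon_greedy([0.1, 0.5, 0.9]))
```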
AI-brain relation
I theorized that the brain and NNs are both instances of a broader class of systems described by the same principles. If this were true, there might exist a theoretical framework that unifies both of them in its predictions. And this would also mean there exist fundamental features of the brain that aren’t just biologically optimized for, but show generalizable truths also applicable to AI systems. For instance, the evolutionary two-hemisphere structure may or may not reveal such principles. I believe this area is still unexplored
Incentive structures
Like any system, NN architectures can be understood as incentive structures. While surface-level incentives are well understood (e.g. the lottery ticket hypothesis, SGD’s incentive to find local minima), this framing still opens up directions for novel approaches and architectures. For example, what would an architecture look like that incentivizes maximizing the depth of learnt patterns rather than just correctness? Admittedly, this is abstract and vague. I see this train of thought as just a simple tool that might lead to interesting new research directions.
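To make the incentive framing slightly more concrete, here is a toy sketch: the loss function is the incentive structure, and adding a term to it changes what the optimizer is rewarded for finding. The depth_penalty below is purely hypothetical, a placeholder for the (so far undefined) notion of pattern depth, not an established technique.

```python
import torch
import torch.nn as nn

# A loss function is an incentive structure: the optimizer is "paid"
# for whatever the loss rewards. Plain MSE incentivizes correctness only.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

# Hypothetical auxiliary incentive: there is no agreed-on measure of
# "pattern depth", so this placeholder merely penalizes first-layer
# units that rely too heavily on any single input feature.
def depth_penalty(model):
    w = model[0].weight.abs()  # shape: (hidden, inputs)
    return (w.max(dim=1).values / (w.sum(dim=1) + 1e-8)).mean()

x, y = torch.randn(64, 10), torch.randn(64, 1)
loss = nn.functional.mse_loss(model(x), y) + 0.1 * depth_penalty(model)
opt.zero_grad()
loss.backward()
opt.step()
```

The point isn’t this particular penalty; it’s that any architectural or loss change can be read as re-weighting the incentives the optimizer follows.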
I’ll attempt to keep it short
I started reasoning from an observation that applies to all problem-solving algorithms: the more knowledge[1] you have, the more deterministic the solution is. In ML, the missing knowledge that gets encoded in the weights is the premise of the entire field. So its role is to fill in the gaps of our understanding
How much randomness? The key point: we want to bring as much useful, unique knowledge to bear on the problem as we can. So we use all the knowledge we already have (architecture, transfer learning, etc.), and where randomness can give us understanding we don’t have, we use it as a substitute
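Here is a toy illustration of that substitution (my own made-up search problem): recovering a hidden bit string, where each bit of prior knowledge is one bit that randomness no longer has to supply, and complete knowledge makes the search deterministic.

```python
import random

# Guess a hidden 8-bit string. `known` encodes prior knowledge; random
# bits substitute for the positions we don't know.
def guesses_needed(target, known):
    attempts = 0
    while True:
        attempts += 1
        guess = [known.get(i, random.randint(0, 1)) for i in range(len(target))]
        if guess == target:
            return attempts

target = [1, 0, 1, 1, 0, 0, 1, 0]
for k in (0, 4, 8):
    known = {i: target[i] for i in range(k)}  # first k bits known
    trials = [guesses_needed(target, known) for _ in range(200)]
    print(f"{k} bits known: avg {sum(trials) / len(trials):.1f} guesses")
# Expected: ~256 guesses with no knowledge, ~16 with half, exactly 1 with all.
```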
I think this generalizes to evolution. Mind you, the world started with no prior knowledge[2]. And so the properties[3] of the evolutionary algorithm make total sense when understood through this lens
There is a vague connection to BPP vs. P, but my framing only applies when we don’t have complete knowledge of the problem, so it doesn’t really help with that question
[1] On all abstraction levels: knowledge about the problem, the chosen paradigm, the architecture/technique, optimization techniques, parameters
[2] Some could say the laws of physics and matter are a form of knowledge that allowed everything else to unfold
[3] Because the search isn’t defined and constrained by prior knowledge, the search space is broader and the algorithm is slower