Hence generation of higher-quality data is a plausible way of disrupting the way scaling laws govern slow takeoff. What this data needs to provide is general cognitive competence, which because it is general also applies to the physical world, but that competence doesn’t need to involve initial familiarity with the human world.
So the data could come from formal proofs on a reasonable distribution of topics, or from a superscaled RL system in an environment that sufficiently elicits general reasoning. If the backbone of a dataset shapes representations towards competence, that competence might transfer to other areas. Thus we get an alien mind that mostly uses natural data as a tool to speak good English and anticipate popular opinion, not as the essential fabric of its own nature.
In the current not-knowing-what-we-are-doing regime, I’m guessing the safer AGIs are scaffolded natural-data LLMs, or, failing that, model-based RL systems that develop in contact with the human world or human data. Model-free RL that relies on a synthetic environment to generate enough data risks growing up more alien. The case is less clear for reasoning that originates in synthetic data for math and is grounded in the physical world through natural data being a fraction of the datasets for all models in the system (as a kind of multimodality). Such admixing of natural data might even be sufficient to make a model-free RL system less alien.