Another possible update is towards shorter timelines: if you think humans are not trained with the optimal amount of data (since we can’t, for example, just read the entire internet), then it might be possible to get better performance with fewer parameters, assuming the brain follows similar scaling laws.
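To make the intuition concrete, here is a rough sketch using a Chinchilla-style parametric loss, L(N, D) = E + A / N^alpha + B / D^beta. The constants are approximately the fitted values reported for language models by Hoffmann et al. (2022); whether anything like them applies to brains is exactly the assumption in question. The parameter and token counts below are hypothetical placeholders, not estimates for the human brain.

```python
# Chinchilla-style parametric loss: L(N, D) = E + A / N**ALPHA + B / D**BETA
# Constants are (approximately) the language-model fit from Hoffmann et al. (2022).
# Using them for a brain-like learner is an assumption made only for illustration.
E, A, B = 1.69, 406.4, 410.7
ALPHA, BETA = 0.34, 0.28


def loss(n_params, n_tokens):
    """Predicted loss for a model with n_params parameters trained on n_tokens."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA


def params_needed(target_loss, n_tokens):
    """Parameter count that reaches target_loss given n_tokens of data.

    Returns None if the target is unreachable with that much data,
    no matter how large the model is.
    """
    residual = target_loss - E - B / n_tokens**BETA
    if residual <= 0:
        return None
    return (A / residual) ** (1 / ALPHA)


# Hypothetical data-starved learner: many parameters, relatively little data.
big_n, small_d = 1e12, 1e10
starved_loss = loss(big_n, small_d)

# Same target loss, but with 100x more data available.
more_d = 100 * small_d
smaller_n = params_needed(starved_loss, more_d)

print(f"data-starved loss: {starved_loss:.3f}")
print(f"parameters needed with 100x data: {smaller_n:.3g} (vs {big_n:.3g})")
```

Under these assumed constants, the data-starved learner's loss is dominated by the data term, so giving it more data lets a far smaller model match it. How strongly this carries over depends entirely on whether the brain's scaling behaviour resembles the fitted curve.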