Anna Tong and Katie Paul (Reuters): The document describes a project that uses Strawberry models with the aim of enabling the company’s AI to not just generate answers to queries but to plan ahead enough to navigate the internet autonomously and reliably to perform what OpenAI terms “deep research,” according to the source.
Is this by any chance inspired by Yudkowsky’s “clone a strawberry” example?
If you give it proper labels, an LLM can learn that some information (e.g. Wikipedia) is reliable and should be internalized, whereas other information (e.g. 4chan) is unreliable and should only be memorized.
Do the labels give the ground truth about which sources are reliable, or does the LLM learn to tell them apart on its own?
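On the labels-as-ground-truth reading, the mechanism would presumably be something like prepending a source tag to each pretraining document so the model can condition on it. A minimal sketch of that idea in Python, with the caveat that the tag format, the source list, and every name here are illustrative assumptions rather than anything from the claim itself:

```python
# Sketch (assumption): reliability labeling as a source tag prepended to
# each pretraining document. Tags and the trusted-source list are
# hypothetical, not taken from the original claim.

TRUSTED_SOURCES = {"wikipedia.org"}

def tag_document(text: str, domain: str) -> str:
    """Prefix a training document with a reliability label based on its source."""
    label = "<|reliable|>" if domain in TRUSTED_SOURCES else "<|unreliable|>"
    return f"{label} {text}"

docs = [
    ("The capital of France is Paris.", "wikipedia.org"),
    ("trust me bro, the moon landing was filmed in a basement", "4chan.org"),
]

training_corpus = [tag_document(text, domain) for text, domain in docs]
for doc in training_corpus:
    print(doc)
```

Under this version the labels are supplied as ground truth at training time; the alternative reading, where the LLM infers reliability from the data itself, would need no such tags.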