Eagerly waiting and hoping that they'll do another part soon and discuss R1. Interesting that Liang said multimodality is a focus. I hope they can achieve the proper cross-modal learning transfer that GPT-4o could not.
This seems to state the opposite: https://www.lesswrong.com/posts/JTKaR5q59BgDp6rH8/a-high-level-closed-door-session-discussing-deepseek-vision#:~:text=we%20hardly%20see%20the%20benefit%20of%20multimodal%20data.%20In%20other%20words%2C%20the%20cost%20is%20too%20high.%20Today%20there%20is%20no%20evidence%20it%20is%20useful.%20In%20the%20future%2C%20opportunities%20may%20be%20bigger.
That discussion is by people outside of DeepSeek trying to process the shock of R1. It is unclear what DeepSeek is doing currently.
Oh, I did not know, thanks.

https://huggingface.co/spaces/deepseek-ai/Janus-Pro-7B seems to show DS is still clueless in the visual domain; at least IMO they are losing there to Qwen and many others.