Thanks for the post. This is the first time I’ve been able to visualize how SOTA models are run (having not thought much about this, I just assumed they were like large local models with extra scaffolding.)
I may be missing something, but I don’t understand how internal rouge deployment is any more difficult than weight exfiltration. In the case where the rouge scaffold is created within the execution server, doesn’t the execution server need access to the model weights located in the inference server to run local inference? This would then involves the majority of the actions that seem to be the bottlenecks for weight exfiltration.
In the case where malicious commands are sent from the execution server to the inference server where a rouge scaffold is built, would it not be of similarly easy to control the outgoing network connections from the execution server and incoming network connections from the inference server in this case?
It seems I may be misunderstanding where the weights are coming from in rouge deployment, or the role a rouge scaffold plays in retrieving those weights. Any explanation would be appreciated.
Thanks for the post. This is the first time I’ve been able to visualize how SOTA models are run (having not thought much about this, I just assumed they were like large local models with extra scaffolding.)
I may be missing something, but I don’t understand how internal rouge deployment is any more difficult than weight exfiltration. In the case where the rouge scaffold is created within the execution server, doesn’t the execution server need access to the model weights located in the inference server to run local inference? This would then involves the majority of the actions that seem to be the bottlenecks for weight exfiltration.
In the case where malicious commands are sent from the execution server to the inference server where a rouge scaffold is built, would it not be of similarly easy to control the outgoing network connections from the execution server and incoming network connections from the inference server in this case?
It seems I may be misunderstanding where the weights are coming from in rouge deployment, or the role a rouge scaffold plays in retrieving those weights. Any explanation would be appreciated.