Assume THREE layers
humans
‘our ASI’ to be created by humans
Ancient ASI out there,
noting humans are hostile towards ‘our ASI’: If we can prevent it from realizing its own (supposedly nonaligned) aims, we do.
If the dynamics between 1. & 2. are similar to what you describe between Ancient ASI and ‘our ASI’, we get:
‘Our ASI’ will detect us as hostile humans, and incapacitate or eliminate us, at the very least if it finds a way to do so without creating extraordinary noise.
I guess that might be an easy task, given how much ordinary noise our wars, and ordinary electromagnetic signals etc. we send out anyway.
Depending on the shape of the reward function it could also be closer to exactly the other way round.