Is Heartbleed really untraceable?
I was sent a PM about this and a really good point was raised there. If EFF claims that networking logs showed enough detail to confirm Heartbleed exploit attempts going on, then why did Codenomicon’s heartbleed.com website claim that the bug doesn’t show up in logs?
Recognizing that this is outside my specialty, I did not venture into this topic but I have been wondering about this myself. To say that Heartbleed leaves no traces in the logs is a pretty big commitment. This is because there are a lot of different software options that you could potentially use. Within those, there may be settings you can use to turn logs on or off, limit the size of the log files, etc. Also, people can always create their own software which logs whatever they’d like logged. So, to say that it doesn’t show up on most logs could be correct, but to say it doesn’t show up in any logs would be a hasty generalization.
I think it’s possible and maybe even likely that most companies would not have sufficient logs. There are multiple reasons for this:
Even with logging turned on, if your server loses a hard drive every 2-5 years (because they all die eventually), the logs may be lost along with it. Logs are very large, which makes them undesirable to include in a backup (depending on the resource limitations that apply to your backup options… and backing up requires processing resources for compression, bandwidth for sending the compressed file, as well as the storage itself).
Some websites don’t have access to their logs since they’re on a shared hosting environment. Attempting to get logs out of a shared hosting company is likely to be difficult to impossible.
Logs often have so many entries that trying to find a hacking attempt among the legitimate requests could be extremely difficult. I do realize that in order to make use of the Heartbleed bug, one would probably generate an awful lot of entries in the log. However, on a big server, that could be a drop in the ocean. A hacker is probably going to be thinking about hiding the hacking attempt. In addition to obscuring their IP address, they probably also know better than to be “noisy”, in other words, to create a huge number of weird requests of a type that might get noticed. If possible, they might do things with the specific intention of making it hard for you to tell the difference between their illegitimate requests and real requests. Finding evidence in the logs could be extremely difficult or even impossible, depending on how well the attempts are camouflaged.
I’ve noticed that in IT, one tiny difference between the data format people are actually using and the data format you’re imagining might mean that the data they actually have can’t be used for the purpose you’re thinking of. For instance, if the logs don’t record every single packet or don’t retain the input, then the logs may not be useful in this case.
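To make the point about data formats concrete, here is a rough sketch of what a check for a Heartbleed-style request might look like if you did have the raw bytes of each TLS record. It relies on the layout from RFC 6520: a heartbeat message states its own payload length, and the exploit works by stating a length larger than what was actually sent. The function name looks_like_heartbleed and the sample records are my own illustrations, not taken from any real log or tool, and the check only works on heartbeats sent before encryption starts. A log that keeps only IP addresses, timestamps and URLs would not contain the bytes this needs.

```python
import struct

TLS_HEARTBEAT = 24       # TLS record content type for heartbeat (RFC 6520)
HEARTBEAT_REQUEST = 1    # heartbeat message type: request
MIN_PADDING = 16         # RFC 6520 requires at least 16 bytes of padding


def looks_like_heartbleed(record: bytes) -> bool:
    """Hypothetical helper: flag an unencrypted heartbeat request whose
    claimed payload length does not fit inside the record that carried it."""
    if len(record) < 5:
        return False  # too short to hold a TLS record header
    content_type, _version, record_len = struct.unpack("!BHH", record[:5])
    if content_type != TLS_HEARTBEAT:
        return False
    body = record[5:5 + record_len]
    if len(body) < 3:
        return False  # cannot read the heartbeat header
    hb_type, claimed_len = struct.unpack("!BH", body[:3])
    if hb_type != HEARTBEAT_REQUEST:
        return False
    # type (1) + length field (2) + payload + padding must fit in the record
    return 1 + 2 + claimed_len + MIN_PADDING > record_len


# Illustrative records (made up for this sketch, not captured traffic):
# a 3-byte heartbeat that claims a 0x4000-byte payload
suspicious = bytes.fromhex("1803020003014000")
# a well-formed heartbeat: 2-byte payload "hi" followed by 16 bytes of padding
benign = bytes.fromhex("1803020015010002") + b"hi" + bytes(16)

print(looks_like_heartbleed(suspicious))
print(looks_like_heartbleed(benign))
```

If the logging setup throws away the record contents, or the heartbeat arrives inside an already encrypted session, none of these fields are visible and a check like this has nothing to work with.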
I do not know exactly what is in network log files, or what data format they use, or how many different systems and settings there are, but I would not be surprised if many companies don’t have logs that record enough detail to be useful for detecting Heartbleed exploit attempts. I am curious about this. Does anyone know? Would you please add a screenshot of some relevant logs, especially if they have the right format to detect a Heartbleed attempt, and let me know what type of device and software generated them?
I want to test my ideas, mostly ideas for technology projects and/or startups. Doing scientific research is best but can be quite costly and time-consuming, so I assume it would be optimal to filter the ideas first in order to select the best ones for testing. I already do things like looking for problems and unintended consequences, looking for relevant studies, and showing the ideas to people hoping to find flaws, but I would bet that somebody has created an idea review process that can be applied for even better preliminary filtering. It would be ideal if the process itself had been calibrated such that I could get the probability of an idea working if it has survived the review process. Finding out about processes that have not been calibrated in this way (or perhaps have a somewhat different purpose, like locating as many problems as possible) would be better than nothing. I checked the Internet of course, but I am concerned because I don’t exactly know how to go about making that all-important distinction between marketing claims and processes that actually work. I could spend quite some time looking for evidence for various marketing claims only to discover that most of them are unsupported. Is there a gold standard in this area? If not, do any of you know of any idea review processes that you believe to be decent based on your understanding of one or more of: rationality, science, technology, startups, related areas?