Sorry for the downtime, looks like we got DDoSed
We were down between around 7PM and 8PM PT today. Sorry about that.
It’s hard to tell whether we got DDoSed or someone just wanted to crawl us extremely aggressively, but we’ve had at least a few hundred IP addresses with random user agents request a lot of quite absurd pages, in a way that was clearly designed to evade bot detection and blocking.
I wish we were more robust to this kind of thing, and I’ll be monitoring things tonight to prevent it from happening again, but it would be a whole project to make us fully robust to attacks of this kind. I hope it was a one-off occurrence. If that is the world we live in, I think we can figure out how to be robust to repeated DDoS attacks, though I do think it would mean strapping in for a few days of spotty reliability while we figure out how to do that.
Sorry again, and boo for the people doing this. It’s one of the reasons why running a site like LessWrong is harder than it should be.
Another weird bug: if I click the link I was just sent in my email, it brings me to a 403 Forbidden page (even though the URLs of this functional page and that 403 page look identical).
Should now be fixed. We’ve blocked traffic to basically all pages and been restoring them incrementally to make sure we don’t go down again immediately. I just lifted the last of those blocks.
works!
I recommend Cloudflare.
Yeah, we considered setting up a Cloudflare proxy for a while, but at least for logged-in users, LW is actually a really quite dynamic and personalized website, and not a great fit for it (I do think it would be nice to have a logged-out version of pages available on a Cloudflare proxy somehow).
I was referring to their (free) DDoS protection service, rather than their CDN services (also free). In addition to their automated system, you can manually enable an “under-attack” mode that aggressively captchas requests.
Setup is simply pointing DNS name-servers at Cloudflare. Caching HTML pages for logged out (i.e. cookie-less) users is a trivial config (“cache-everything”).
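To make the "cache only for cookie-less users" rule concrete, here is a minimal sketch of the decision it implies. This is not Cloudflare's configuration syntax or API; the function name `is_cacheable` and the choice to treat only GET and HEAD as cacheable are assumptions made for illustration.

```python
from typing import Mapping


def is_cacheable(method: str, headers: Mapping[str, str]) -> bool:
    """Hypothetical check: may this request be served from a shared cache?

    Assumption: a request with no Cookie header comes from a logged-out
    user, so the HTML it would receive is not personalized.
    """
    # Only safe, read-only methods are candidates for caching.
    if method.upper() not in ("GET", "HEAD"):
        return False
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    # Any non-empty Cookie header means the page may be personalized.
    return not normalized.get("cookie", "").strip()


if __name__ == "__main__":
    print(is_cacheable("GET", {}))                            # logged-out visitor
    print(is_cacheable("GET", {"Cookie": "loginToken=abc"}))  # logged-in user
```

Under this rule, logged-in traffic always reaches the origin server, so only logged-out page views benefit from the cache.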
Oh, interesting. I had not properly realized you could unbundle these. I am hesitant to add a hop to each request, but I do sure expect Cloudflare to be fast. I’ll look into it, and thanks for the recommendation.
It’s a solution! However, it comes with its own downsides. For instance, Codeforces users complained about its Cloudflare usage for a while, highlighting the following issues (mapped to LessWrong):
The purpose of an API is defeated: even the API endpoints on the same domain are restricted, which prevents users from requesting posts via GraphQL. In particular, ReviewBot would go down (or would have to be hosted on LW’s internal infrastructure).
In China, Cloudflare is a big speed bump.
Cloudflare-protected sites are reported to randomly lag a lot.
> I had been assuming that this is a server problem, but from talking to some people it seems like this is an issue with differential treatment of who is accessing CF.
The loss of interaction smoothness might be really noticeable for new users, compared to the current state.