If we don’t use splitlines we instead need to use something similar, right? Like, even if we don’t need to worry about LF versus CRLF (which was a genuine suggestion I made), we still need to figure out if someone’s got any de-indents after the start of the payload. And I don’t expect us to do better without splitlines than with it.
Using newlines to figure out what happens after “payload” is fine, as far as I can tell. Multicore’s exploit relies on newlines being used when comparing stuff before the payload.
Stuff like CRLF vs LF is a bit awkward, but can maybe be handled explicitly?
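To make "handled explicitly" concrete, here is a minimal sketch of a splitter that only recognises the three line endings named in this thread (\r\n, \r and \n). The name split_source_lines is hypothetical and not part of the tournament code. One difference from splitlines is that a trailing newline produces a final empty string.

```python
import re

def split_source_lines(text: str) -> list[str]:
    """Split only on CRLF, lone CR, or LF (hypothetical helper).

    CRLF is listed first so that "\r\n" counts as one line break
    rather than two. No other character is treated as a separator.
    """
    return re.split(r"\r\n|\r|\n", text)

# Mixed line endings all yield the same list of lines.
print(split_source_lines("a\r\nb\nc\rd"))

# A form feed (\x0c) is left alone, unlike with str.splitlines().
print(split_source_lines("a\x0cb"))
```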
Oh yeah, that’s true as far as I know. I guess it depends how much we trust ourselves to find all instances of this hole. A priori I would have thought “Python sees a newline where splitlines doesn’t” was just as likely as the reverse. (I’m actually not sure why we don’t see that case. I looked up what I thought was the source code for the function, and it looked like it should only split on \n, \r and \r\n. But that’s not what it does. Maybe there’s a C implementation of it and a Python implementation?)
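For reference, a quick sketch of the mismatch being discussed. As documented, str.splitlines() splits on a longer list of characters than \n, \r and \r\n (including \x0b, \x0c, \x1c, \x1d, \x1e, \x85, \u2028 and \u2029), while bytes.splitlines() only splits on those three. The bytes version may be what the source code I looked at corresponds to, though that is a guess. The last part checks how the interpreter itself treats a form feed inside a comment. I'd expect it to stay part of the comment, but the snippet just prints whatever your Python does.

```python
sample = "a\x0cb\x1cc\u2028d\ne"

# str.splitlines() treats \x0c, \x1c and \u2028 as line boundaries.
print(sample.splitlines())

# Splitting on "\n" alone only sees the one real newline.
print(sample.split("\n"))

# bytes.splitlines() only splits on \n, \r and \r\n.
print(sample.encode("utf-8").splitlines())

# What does the interpreter think? splitlines() reports two lines here,
# the second being "x = 1". If the interpreter treats the form feed as
# part of the comment, x never gets assigned.
source = "# comment\x0cx = 1"
print(source.splitlines())
namespace = {}
exec(source, namespace)
print("x" in namespace)
```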
That’s an idea worthy of consideration, but in addition to the risk you raised, I also feared some members would have submitted invalid bots.
Yeah, if we’d seen the issue, I think we could’ve gotten around it just by not using splitlines, which would’ve been smoother.
Though of course, this exploit updates me towards thinking that there are other vulnerabilities as well.