I think all of those points update me in the direction of the null hypothesis, but I don’t think any one of them is true to the exclusion of the others.
I think a moderate number of people will use Copilot. Cost, privacy, and the need for an internet connection will all factor in to limit this.
I think Copilot will have a moderate effect on users’ outputs. I think it’s the best new programming tool I’ve used in the past year, but I’m not sure I’d trade it for, e.g., interactive debugging (my reference example of a very useful programming tool).
I think Copilot will have no significant differential effect on infosec, at least at first. In the same way that I think the null hypothesis for a language model is that it produces average language, I think the null hypothesis for a code model is that it produces average code (average here meaning it neither improves nor worsens the infosec situation jim is pointing to).
In general these points lead me to put a lot of weight on ‘no significant impact’ in aggregate, though I also think it is difficult for anything to have a significant impact on the state of computer security.
(Some examples of things that did come to mind: the Snowden leaks (almost definitely), Let’s Encrypt (maybe), HTTPS Everywhere (maybe), domain authentication (maybe).)
Summary of the debate
1. jim originally said that Copilot produces code with vulnerabilities which, if the tool is used extensively, could generate loads of vulnerabilities, giving more opportunities for exploits overall. jim says this would worsen infosec “significantly”.
2. alex responds that since the model tries to reproduce the code it was trained on, it will (by definition) produce average-level code (with an average level of vulnerabilities), so it won’t change the situation “significantly”: the % of vulnerabilities per line of code produced (in the world) won’t change much. (The toy sketch after this list makes this rate-vs-count distinction concrete.)
3. vanessa asks whether the absence of change from Copilot would result from a) lack of use, b) lack of change in the speed/vulnerability of code production from using it (i.e. it gets used as fun help but without strong influence on the safety of the code, since people would still be rigorous), or c) a change in speed/productivity but not in the % of vulnerabilities.
4. alex answers that it does indeed make users more productive and helps him a lot, but that this doesn’t affect overall infosec in terms of % of vulnerabilities (same argument as in 2). He nuances his claim a bit, saying that a) it will moderately affect outputs, b) factors like cost will limit how much it affects them, and c) it won’t change infosec substantially, at least at first (the conjunction of the two previous conditions).
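To make the crux between 1. and 2. concrete, here is a minimal toy sketch in Python. Every number in it is a made-up assumption (baseline output, vulnerability rate, productivity boost), purely for illustration: if Copilot leaves the per-line vulnerability rate unchanged but raises the volume of code written, alex’s percentage measure stays flat while jim’s “more opportunities for exploits overall” measure still grows.

```python
# Toy model of the rate-vs-count crux. All numbers are made-up assumptions.

LINES_PER_DEV_PER_YEAR = 10_000  # hypothetical baseline output per developer
VULNS_PER_LINE = 0.001           # hypothetical average vulnerability rate

def yearly_vulns(productivity_multiplier: float) -> float:
    """Expected vulnerabilities one developer ships in a year, assuming
    the per-line vulnerability rate is unchanged (alex's null hypothesis)."""
    return LINES_PER_DEV_PER_YEAR * productivity_multiplier * VULNS_PER_LINE

baseline = yearly_vulns(1.0)  # without Copilot
boosted = yearly_vulns(1.3)   # assume a moderate 30% productivity boost

print(f"rate per line: {VULNS_PER_LINE:.4f} in both cases")  # alex's measure: flat
print(f"vulns/year: {baseline:.1f} -> {boosted:.1f}")        # jim's measure: grows
```

On this toy model both framings can be true at once: the percentage per line is constant by construction, but the total exploit surface grows with volume. That is exactly why vanessa’s a/b/c decomposition in 3. matters.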
What I think is the implicit debate
i) I think jim kind of implicitly assumes that whenever someone writes code by himself, he is forced to exercise good security habits etc., and that whenever code is automatically generated, people won’t use their “security” muscles as much and will assume the AI produced clean work… which, given jim’s examples, it apparently does not do by default. Like a Tesla not being safe enough at self-driving.
ii) I think what’s missing from the debate is that the overall “infosec level” depends heavily on what a few key actors decide to do, namely the people in charge of safety-critical codebases for society-level systems (like nukes). So one argument could be that, although the masses might become more productive at prototyping etc., the actual infosec people might stay just as careful / not use it at all, so the infosec that matters won’t change, and thus overall infosec won’t change.
iii) I think vanessa’s point kind of restates i) and disagrees with ii) by saying that everyone will use this anyway? Because, by definition, if it’s useful it will change their code/habits, and otherwise it’s not useful?
iv) I guess alex’s implicit points are that code generation with language models producing average human code was going to happen anyway, that calling it a significant change is an overstatement, and that we should probably just assume no drastic change in the % vulnerability distribution, at least for now.
I think jim kind of implicitly assumes that whenever someone writes code by himself, he is forced to exercise good security habits etc.
This part I think is not quite right. The counterfactual jim gives for Copilot isn’t manual programming, it’s StackOverflow. The argument is then: right now StackOverflow has better methods for promoting secure code than Copilot does, so Copilot will make the security situation worse insofar as it displaces SO.
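For concreteness, below is the shape of vulnerability this debate usually gestures at (a hypothetical snippet written for illustration, not an actual Copilot or StackOverflow output): SQL built by string interpolation, which average training code contains plenty of, next to the parameterized form that secure-coding guidance promotes.

```python
import sqlite3

def find_user_unsafe(conn: sqlite3.Connection, username: str):
    # The "average code" pattern a completion model can happily reproduce:
    # user input interpolated straight into SQL, i.e. an injection risk.
    query = f"SELECT * FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchall()

def find_user_safe(conn: sqlite3.Connection, username: str):
    # The parameterized form that highly-voted StackOverflow answers and
    # secure-coding guides tend to push people toward.
    return conn.execute(
        "SELECT * FROM users WHERE name = ?", (username,)
    ).fetchall()
```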