I’ve taught C, Java and Python at a university and (a little) at the high school level. I have noticed two simple things that people either surmount or get stuck on. The first seems to be even a basic ability to keep a formal system in mind; see the famous Dehnadi and Bornat paper. The second, I have heard less about: in programming, it’s the idea of scope.
The idea of scope in almost all modern programming languages goes like this:
A scope starts at some time (some place in the code), and ends somewhere later.
A scope can start before another ends; if so, it has to end before the “outer” scope.
Inside a scope, objects can be created and manipulated; generally even if another scope has started.
Unless something special is done, objects no longer exist after a scope ends.
Pivotally (this seems to be the hardest part), objects can be created with one name in an outer scope and be referred to by a different name in an inner scope. Inner scopes can likewise create and manipulate objects with the same names as objects in an outer scope without affecting the objects in that outer scope.
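The rules above can be sketched in a few lines of Python. This is only an illustration: the names `outer`, `inner`, `total` and `amount` are made up for the example, and Python starts a new scope per function call rather than per block, so other languages will differ in the details.

```python
def outer():
    total = 10              # created in the outer scope under the name `total`

    def inner(amount):      # the same object is now referred to as `amount`
        total = 99          # a new, inner `total`; the outer one is untouched
        return amount + 1, total

    result = inner(total)   # the inner scope starts and ends inside the outer one
    return total, result


print(outer())              # (10, (11, 99))

try:
    total                   # `total` only existed during the call to outer()
except NameError as err:
    print("after the scope ends:", err)
```

The outer `total` is still 10 after `inner` returns, even though `inner` assigned 99 to a variable of the same name.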
It’s really hard for me to think of an analogous skill in the real world to keeping track of N levels of renaming (which may be why it gives students such difficulty?). The closest I can think of is function composition; if you don’t have to pick your way through symbolically integrating a composed function where the variable names don’t match, I have pretty high confidence that you can manage nested scopes.
EDIT: There are two other well-known problems: recursion and pointers. I’ve heard stories about students who were okay for a year or two of programming courses, but never “got” recursion, or never understood pointers, and had to change majors. I’ve seen students have enormous difficulty with both; in fact, I’ve passed students who never figured one or the other out, but managed to grind through my course anyway. I don’t know whether they dropped out, figured it out as their classes got harder, or just kept faking it (I had team members through grad school who couldn’t handle more than basic recursion). I’m not inclined to classify either as “programming gear” that they didn’t have, but I don’t have data to back that up.
Is there a reason to use the same variable name within and outside a scope? It seems like a fertile source of errors.
I can see that someone would need to understand that reusing names like that is possible as a way of identifying bugs.
My post didn’t indicate this, but the most common source of scope is functions; calling a function starts a new scope that ends when the function returns. Especially in this case, it often does make sense to use the same variable name: when a function whose parameters are named prior and evidence is called with arguments that are also named prior and evidence, the call will have prior=prior, evidence=evidence, and that is a good naming scheme. But in most languages, modifying ‘evidence’ in the function won’t affect the value of ‘evidence’ outside the scope of the function. This sometimes becomes confusing to students when the same function gets called with a differently named argument, as in posterior = ApplyBayes(prior, evidence1), because their previous model relied on the names being the same, rather than the coincidence of naming being merely helpful.
Overall, I would say that this is still a fertile source of errors, but in some situations the alternative is to have less readable code, which is also a fertile source of errors and makes fixing them more difficult.
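A minimal Python sketch of the function-call case described above. The body of `apply_bayes` (the thread's ApplyBayes, spelled in Python style) is a hypothetical stand-in that multiplies prior odds by a capped likelihood ratio; only the parameter names `prior` and `evidence` and the argument name `evidence1` come from the discussion.

```python
def apply_bayes(prior, evidence):
    """Hypothetical stand-in: posterior odds = prior odds * likelihood ratio."""
    evidence = min(evidence, 100.0)   # rebinds the *local* name `evidence` only
    return prior * evidence


prior = 0.5
evidence = 400.0
evidence1 = 4.0

same_names = apply_bayes(prior, evidence)     # prior=prior, evidence=evidence
other_names = apply_bayes(prior, evidence1)   # prior=prior, evidence=evidence1

print(same_names)    # 50.0
print(other_names)   # 2.0
print(evidence)      # 400.0: the caller's variable was not changed
```

The second call works exactly like the first; the matching names in the first call are a convenience for the reader, not something the language relies on. One Python-specific caveat: this holds for rebinding a parameter, whereas mutating a mutable argument in place (a list, say) would be visible to the caller.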
Your confusion is due to using “scope”, which is actually a lexical concept. What you’re dealing with here is variable substitution: in order to evaluate a function call such as posterior = ApplyBayes(prior, evidence1), the actual function arguments need to be plugged into the definition of ApplyBayes(·, ·). This is always true, regardless of what variable names are used in the code for ApplyBayes.
I certainly hope that I’m not confused about my word choice. I write compilers for a living, so I might be in trouble if I don’t understand elementary terms.
In all seriousness, my use of the word “scope” was imprecise, because the phenomenon I’m describing is more general than that. I don’t know of a better term though, so I don’t regret my choice. Perhaps you can help? Students that I’ve seen have difficulty with variable substitution seem to have difficulty with static scoping as well, and vice versa. To me they feel like different parts of the same confusion.
On a related note, I once took aside some of my students who were having great difficulty getting static scoping, and tried to teach them a bit of a dynamically-scoped LISP. I had miserable results, which is to say that I don’t think the idea of dynamic scope resonated with them any more than static scope did; I was hoping maybe it would, and that there were “dynamic scoping people” and “static scoping people”. Maybe there are; my experiment is far from conclusive.
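For readers who haven’t met the distinction, here is a rough Python sketch of static versus dynamic scoping. Python itself is statically (lexically) scoped; the dynamic half is only emulated with an explicit stack of bindings, and `dynamic_env`, `lookup` and the other names are made up for this illustration rather than taken from any LISP.

```python
x = "global"

def show():
    return x                     # static scoping: the module-level x, always

def caller():
    x = "caller's local"         # a different variable; show() cannot see it
    return show()

print(caller())                  # global

# Emulated dynamic scoping: names are looked up in the frames that are
# live at call time, most recent first.
dynamic_env = [{"x": "global"}]

def lookup(name):
    for frame in reversed(dynamic_env):
        if name in frame:
            return frame[name]
    raise NameError(name)

def show_dynamic():
    return lookup("x")

def caller_dynamic():
    dynamic_env.append({"x": "caller's local"})   # binding lasts for this call
    try:
        return show_dynamic()
    finally:
        dynamic_env.pop()                         # binding ends with the call

print(caller_dynamic())          # caller's local
print(show_dynamic())            # global
```

Under static scoping, what `x` means inside `show` is fixed by where `show` is written; under the emulated dynamic scoping, it depends on who is calling at the moment.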
EDIT: Hilariously, right after I wrote this comment the newest story on Hacker News was http://news.ycombinator.com/item?id=4534408, “Actually, YOU don’t understand lexical scope!”. To be honest, the coincidence of the headline gave me a bit of a start.