More intuitive programming languages
I’m not a programmer. I wish I were. I’ve tried to learn it several times, different languages, but never went very far. The most complex piece of software I ever wrote was a bulky, inefficient game of life.
Recently I’ve been exposed to the idea of a visual programming language named Subtext. The concept seemed interesting, and the potential great. In short, the assumptions and principles underlying this language seem more natural and more powerful than those behind writing lines of code. For instance, a program written as lines of code is one-dimensional, and even the best of us may find it difficult to sort that out: to model the flow of instructions in our minds, to see how distant parts of the code interact, and so on. In Subtext this structure is already more apparent, thanks to the two-dimensional layout of the code.
I don’t know whether this particular project will bear fruit. But it seems to me many more people could become interested in programming, and at least advance further before giving up, if programming languages were easier to learn and use for people who don’t have the right mindset to be programmers in the current paradigm.
It could even benefit people who’re already good at it. Any programmer may have a threshold above which the complexity of the code goes beyond their ability to manipulate or understand it. I think it should be possible to push that threshold farther with such languages/frameworks, enabling the writing of more complex yet still functional pieces of software.
Do you know anything about similar projects? Also, what could be done to help turn such a project into a workable programming language? Do you see obvious flaws in such an approach? If so, what could be done to repair these, or at least salvage part of this concept?
I would like to find a programming language that makes programming easier. And I have seen dozens that claimed to have that effect. But it seems to me that at best they work like a placebo—they make programming appear easier, which encourages students to give it a try.
Seems to me that most of these projects are based on cheap analogies. Let me explain… Imagine that you are in the business of publishing scientific literature, and then someone tells you: “Why don’t you publish a children’s book? There is a big market out there.” Problem is, you have never seen any children’s book. You are aware that those books must be different from books for adults, so you do some research. You discover that successful children’s books use big letters and big colorful pictures. Great! So you take Wittgenstein’s Tractatus Logico-Philosophicus, triple the font size, insert pictures of cute kittens, and the book is ready. However, despite having all the signs of a successful children’s book, children are not enjoying it.
Essentially, you can’t remove programming from programming. You could give people a game editor and call it a programming language, and that would be a nice first step, but if you want to go further, an editor is not enough. Or you can give them a package of “editor + programming language”, and they will use the editor part and complain that the other part is too difficult.
I feel that there is something missing in programming education. Generally, if you want to teach a skill, you have to divide it into smaller skills and teach them separately. If that’s not enough, break the subskills into even smaller subskills, and at some point it becomes easy. But we probably don’t know how to break the programming skill down correctly. There are some subskills—probably so trivial to us experienced programmers that they seem invisible—that we fail to teach. Some people luckily get them anyway, and those become successful programmers; some people don’t, and cannot move further. I am thinking about things like “expecting the same result when you do the same thing”. Another prerequisite is algebra: if one does not understand that substituting “A+1” for “X” in “X^2” gives “(A+1)^2”, how are they supposed to substitute parts of a program with function calls? Etc.
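To make that algebra prerequisite concrete, here is a minimal Python sketch (the square function is invented just for illustration); the substitution step and the “replace a piece of code with a function call” step are essentially the same move:

    def square(x):
        return x ** 2

    a = 4
    written_out = (a + 1) ** 2        # the algebra: substitute "a + 1" for x in x^2
    via_function = square(a + 1)      # the programming: substitute a function call
    assert written_out == via_function == 25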
It would be great to teach programming by tools like Light Bot. We would just need more of them, in a smooth sequence from simplest to more difficult.
Also, I think that an easy programming language cannot be a substitute for knowledge of programming. Of course we should prefer better tools to worse tools. But we should also accept that some amount of initial knowledge is required, and cannot be avoided by using a better tool. Analogy: if you don’t understand anything about mathematics, then your problem cannot be solved by having a more user-friendly calculator. You should first learn some mathematics, and only then search for the most user-friendly calculator available.
This alone is enough to get my upvote. I often struggle to explain that programming is a form of math (or at least, that it needs math). One typical answer goes like
My suffocation and stuttering (refusing to change one’s mind in the face of compelling-sounding arguments tends to do that) squash any attempt at a proper rebuttal. But now, I have one:
There is a Lightbot implementation on the iPad (under a different name). It’s a nice app, but boredom sets in pretty fast (at least for my kids). What is needed is a common “building block” language for many interesting environments, teaching higher levels of abstraction.
One language, many environments—exactly. People remember by repeating, so completing 10 levels is not enough, but completing 1000 levels in the same environment would be boring.
You can practice the same concept, e.g. a while-loop, by letting a robot walk towards a wall, or by cooking a cake until it is ready. You can practice a for-loop by collecting 3 apples in the garden or walking 3 blocks away on the map of a town. All you need is different environments with different sets of primitives, and one editor with environment-independent commands.
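A rough sketch of what I mean, in Python, with the environments and primitives invented purely for illustration; only the loops are the thing being practiced:

    class Robot:
        def __init__(self, distance_to_wall):
            self.distance = distance_to_wall
        def at_wall(self):
            return self.distance == 0
        def step(self):
            self.distance -= 1

    class Oven:
        def __init__(self, minutes_needed):
            self.remaining = minutes_needed
        def cake_ready(self):
            return self.remaining == 0
        def bake_one_minute(self):
            self.remaining -= 1

    # The same while-loop concept, practiced in two different environments:
    robot = Robot(distance_to_wall=5)
    while not robot.at_wall():
        robot.step()

    oven = Oven(minutes_needed=30)
    while not oven.cake_ready():
        oven.bake_one_minute()

    # The same for-loop concept: collect 3 apples, or walk 3 blocks.
    apples = []
    for _ in range(3):
        apples.append("apple")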
There’s a non-empty reference class of previous efforts to create visual programming languages, including e.g. Prograph, and the success rate so far is very low (Scratch is perhaps a notable exception, making inroads in the unfortunately small “teach kids to program” community.)
To be fair, Subtext looks superficially like it does have some novel ideas, and actually differs from its predecessors.
Be careful with the implied equation of “visual” with “intuitive”. They don’t necessarily have anything to do with each other.
ETA: I’ve tried downloading the current version to see if I could do something simple with it, such as a FizzBuzz implementation. No dice; the .exe won’t start. Maybe the program has dependencies that are not fulfilled on my (virtual) Windows box and it fails silently, or something else is wrong. Updating on the experience, I wouldn’t expect very much from this effort. It’s literally a non-starter.
I started a couple of my younger brothers and sisters on Scratch, and they got quite far. Now my sixth grade brother has downloaded a mod for his digital camera, and he wrote a calculator program in Lua for it. And my fifth grade sister has been teaching herself Python using this book:
http://www.briggs.net.nz/snake-wrangling-for-kids.html
On a tangent (and just for my curiosity), can you explain/link an explanation of what the phrase “non-empty reference class” means? I infer from context it means that there is a non-empty set of instances, but what is the meaning of this specific ‘reference class’ wording?
By reference class I mean “the set of things that are like Subtext that I’d use to get my prior probability of success from, before updating on the specific merits of Subtext (or its flaws)”.
I’ve acquired the term both from previous discussions here on LW and from slightly more formal training in forecasting, specifically participating in the Good Judgment Project.
There are perils of forecasting based on reference classes (more), but it can be a useful heuristic.
I suspect it is “Previous Similar Attempts,” useful in avoiding Planning Fallacy and as fault analysis material.
One big thing about plain text (and so conventional programming languages) is speed: for a proficient and practiced user, the keyboard is a really efficient method of entering data into a computer, and even more so when using an editor like Emacs or vim or a fully-featured IDE. As soon as you start using the mouse you slow down dramatically.
Another thing about normal text is the tooling has built up around it. Version control is the largest one: being able to narrow down to exactly when bug Y was introduced (“git bisect” is pretty cool) and see exactly what changed is really useful. Currently I don’t think there is anything like this for visual programming languages, and this is a requirement.
I don’t think these points are necessarily impossible to address: a visual programming language could still use the keyboard for manipulation, and since the structure of the code is built into the storage format, it would be feasible for version control to be more intelligent and useful than current tools.
Also, visual connections can separate related things too much, so the meaning isn’t obvious at a glance; e.g. in this picture one has to follow the lines back to work out what’s being used where, whereas in a textual language the variable names would tell you what’s what. This places an additional burden on the programmer, who has to remember too much about what connects where: the language isn’t “local” enough, so the programmer can’t easily model the program in their head.
That said, I’m sure a good language that uses a visual representation of control flow can introduce non-programmers to some of the ideas and allow them to automate some tasks, which is always a good thing. But, I don’t think they will be a success amongst “real” programmers (at least until brain-computer interfaces work fluently).
Case study: some of the first programming I ever did was on the LEGO Mindstorms RCX, and the normal method was via a version of LabVIEW; the programs look a little like this. It was really cool! It introduced me to some of the ideas of programming, and for simple things it’s easy to understand the program.
But it was so slow. So intensely slow. Clicking around to find the right block to place, dragging it to the appropriate spot, connecting up the wires, place any auxiliary inputs for that block and connect them up, move all blocks around a bit so wires don’t cross and blocks don’t overlap, repeat. Takes 10 or 20 seconds at best to do the equivalent of “MotorA.run(5)”, and much more to do something like “if TouchSensor.is_pressed()”.
Furthermore, moderately complicated programs ended up as a mess of wires and blocks: it is very difficult to make modifications to a block of code (inserting stuff means either leaving it hanging off to the side, or manually rearranging the whole thing) and expressing some concepts/structures is pretty awkward. As an example: this piece of code.
A few years later, I started learning proper languages, and found out that I could actually run real code on the RCX, which allowed me to do vastly more complicated tasks, because, for instance, I could have variables without having to have wires running across the whole page, and it was easy to encapsulate repeated bits of code into separate functions (and they could take input parameters!).
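For instance, something like the following Python-flavoured sketch (with a stand-in Motor class; the real RCX firmware and its APIs were different) captures in a few lines what took a tangle of duplicated blocks and wires:

    class Motor:
        # Stand-in for a real motor object; the actual RCX API differed.
        def __init__(self, name):
            self.name = name
        def run(self, power, seconds):
            print(f"motor {self.name}: power {power} for {seconds}s")

    # The repeated "run both drive motors" pattern, captured once as a
    # parameterised function instead of duplicated blocks on the canvas:
    def drive_forward(left, right, power, seconds):
        left.run(power, seconds)
        right.run(power, seconds)

    motor_a, motor_c = Motor("A"), Motor("C")
    drive_forward(motor_a, motor_c, power=5, seconds=2)
    drive_forward(motor_a, motor_c, power=5, seconds=1)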
(Obviously, that’s just one example of one visual programming language, but it illustrates a few of my points.)
I don’t know about subtext specifically, but I’ve grown a bit more skeptical about the possibilities of visual programming languages over the years.
As a game developer, I sometimes had to find ways to give non-programmers control of a system—allowing a level designer to control where and when new enemies spawn, a game designer to design the attack patterns of a specific boss, a sound designer to make his music match in-game events, an FX artist to trigger specific effects at certain times… it’s not easy to do right. We programmers sometimes build things that seem obvious and simple to us, with little graphs of arrows and dependencies, but that turn out to be a headache for someone else to wrap their head around. What seems to work best is not making a fully programmable system (even if it’s nice and visual), but rather defining a narrow set of operations that make sense for the behavior needed, and giving a simple way of editing those; making something like a narrow minilanguage. And for that, plain linear text-based editing can work fine, without any graphical frills.
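To sketch the scale of such a minilanguage (the format and the parser below are entirely made up for illustration): the designer edits a few lines like these, and a small parser turns them into data the engine understands.

    # A deliberately narrow, made-up spawn script a level designer could edit:
    SPAWN_SCRIPT = """
    at 0:05 spawn goblin x=12 y=3
    at 0:12 spawn archer x=20 y=7
    at 0:30 spawn boss   x=16 y=0
    """

    def parse_spawn_script(text):
        # Each line: "at M:SS spawn KIND key=value ..."
        events = []
        for line in text.strip().splitlines():
            parts = line.split()
            minutes, seconds = parts[1].split(":")
            event = {"time": int(minutes) * 60 + int(seconds), "kind": parts[3]}
            for pair in parts[4:]:
                key, value = pair.split("=")
                event[key] = int(value)
            events.append(event)
        return events

    print(parse_spawn_script(SPAWN_SCRIPT))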
(Working with a visual tool works fine too, but it shouldn’t become a full-blown programming language; give a level designer a powerful Turing-complete node-based scripting system, and he’ll make horribly complicated, undebuggable Rube Goldberg machines that could have been replaced by a few lines of code if he had explained to the right person what he needed.)
Anyway, if you want to get better at programming, I don’t think you should expect much from fancy new visual languages; it may be more efficient to get someone to teach you a bit.
That would only be a problem if you could only refer to things by line number. When I call doTheWatoosi(), it doesn’t matter much if it’s the next function down or something defined in another program completely. It’s the symbol names that tell us about the interaction, not the locations of stuff in the file.
And, the space of possible names has many many dimensions, which actually gives it quite a leg up over visual languages which have 3 at best, and probably actually only 2 if they want to have a decent user interface.
Which of course doesn’t address the very real issue you raise: that text is much more opaque to beginners than visuals. But I am very skeptical of the notion that a visual programming language would be of much help to programmers who are already strong.
You could easily have “dimensional sliders” so you can move back and forth in 4 or 5 (etc.) dimensions. Not that this would make the user interface clearer, or the programming language more intuitive.
If you’re serious about learning, I suggest you take an online course from Udacity. Their 101 course is a very gentle introduction.
Registration is already open. They start tomorrow. It’s free.
http://www.codecademy.com is also great.
I am a programmer, and have been for about 20 years or so. My impressions here...
Diagrams and visual models of programs have typically been disappointing. Diagrams based on basic examples always look neat, tidy, intuitive and useful. When scaling up to a real example, the diagram often looks like the inside of a box of wires—lines going in all directions. Where the simple diagram showed simple lines connecting boxes, the complex one has the same problem as the wiring box—you have 40 different ‘ends’ of lines, and it’s a tedious job to pair them all up and figure out what goes where.
Text actually ends up winning when you scale up to real problems. I tend to find diagrams useful for describing various subsets of the problem at hand, but for describing algorithms, particularly, text wins.
The problem of programming is always that you need to model the state of the computer in your head, at various levels. You need to be able to ‘run’ parts of the program in your head to see what’s going on and correct it. This is the essential aspect of programming as it currently is, and it’s no use hoping some language will take this job away—it won’t, and can’t. The most a language can do is communicate the algorithm to you in an organised way. The job you mentioned—of modelling the flow of instructions in your head, and figuring out the interactions of distant pieces of code—that is the job of the programmer, and different presentation of the data won’t take that job away—not until the day AI takes the whole job away. Good presentation can make the job easier—but until you understand the flow of the code and the relationships between the parts, you won’t be able to solve programming problems.
On the other hand—I feel that the state of computer programming today is still almost primitive. Almost any program today will show you several different views of the data structure you’re working on, to enable you to adjust it. The exception, by and large, is writing computer programs—particularly within a single function. There you’re stuck with a single view in which you both read and write. I’m sure there is something more that can be done in this area.
Part of this is probably due to VPLs not exposing the right abstractions—and of course, exposing an abstraction organically in a visual representation may be unfeasible. I looked at some instances of LabVIEW programs linked in another comment, and there seemed to be a lot of repetition which would no doubt be abstracted away in a text-based language.
I’m not sure. I think being able to model the computer’s actions in your head is something of a requirement to be a good programmer. If people who use (a hypothetical completed) subtext learn to do that more rapidly, then great. If instead they learn to just barely cobble something together without really understanding what is going on, I think that would be a net negative (I don’t want those people writing my bank’s software). I’m not sure which is the likely outcome.
Or maybe I’m conflicted because I am a computer programmer, and subconsciously either don’t want more competition or feel that anyone learning programming should have to put as much effort into it as I did.
But actually, if you managed to write a crappy life simulator, you can probably be a programmer. It takes practice to be good, like anything else. But if you don’t enjoy the process of practicing at programming (it sounds like you don’t), then you probably wouldn’t enjoy being a programmer, either.
Programming languages that make programming easier are a good goal. Problem is, there are too many languages that make programming of simple programs easier, and programming of complex programs more difficult. The language is optimized for doing a specific set of tasks, and if you walk outside that set, you are damned. (Although the authors will assure you that everything can be done by their language, it’s just a bit inconvenient.)
Things appealing to beginning programmers are often appealing for the wrong reasons. For example “you don’t have to write semicolons after each statement” or “you don’t have to declare variables”. Ouch! I agree that not having to write semicolons is convenient, but at the same time I think “if having to write a semicolon after each statement is such a big deal for you, I can’t imagine you as a successful programmer”, because although writing semicolons is a boring task, cognitively it’s trivial; it can make people annoyed, but it can’t stop them from being able to program. If someone is not able to remember that each statement must be followed by a semicolon, I can’t imagine them doing anything non-trivial.
With visual languages it’s probably similar. Painting a simple loop as a picture makes it more transparent. Painting the whole algorithm, not so much. Only if we split the code into simple functions (which is anyway the right thing to do). But then we have to give names to those functions, and we are gradually switching back from the picture mode to the text mode.
Assuming it was not written in a visual programming language optimized for creating Life simulators, nor generated by a wizard for creating them, I agree.
That’s the danger of many “easy” programming languages. They make only one specific task easy, and then use it as a proof that they made programming easy. Nope, they only made programming of that one specific task easy.
Agreed. But this problem can be avoided by embedding such domain-specific languages inside a general-purpose language. Then writing simple programs (for some definition of “simple”) is still fairly easy, because the DSL can be implemented with a one-time cost in complexity. However, coding complex programs is still feasible.
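For instance, here is a minimal Python sketch of an embedded DSL, with a toy scheduling domain invented purely for illustration; the chained methods are the little language, and anything they cannot express falls back to plain Python:

    from datetime import date, timedelta

    class Schedule:
        # A toy embedded DSL: each method reads like a phrase in the
        # domain and returns self so phrases can be chained.
        def __init__(self):
            self.days = []
        def every_day(self, start, count):
            self.days = [start + timedelta(days=i) for i in range(count)]
            return self
        def skipping_weekends(self):
            self.days = [d for d in self.days if d.weekday() < 5]
            return self

    plan = Schedule().every_day(date(2012, 4, 2), count=7).skipping_weekends()
    print(plan.days)   # five weekdays, the weekend dropped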
Visual representations of programs are interesting in their own right, because they allow reasoning about some program properties in very intuitive ways (depending on the representation, this may be syntax, data flow, control flow, data representation, etc.). However, it is probably the case that there is no single “best” visual representation for programs, and thus no such thing as a one-size-fits-all “visual programming language”.
Or by making a really convenient library for a general-purpose language. Although the language puts some limits on how convenient the library can be.
But I suspect one probably makes more money selling a new programming language than selling a library.
Or by making a really convenient DSL factory. The only use for your “general purpose” language would be to write DSLs. A bit extreme, but it shows some promise. Current results suggest this approach uses 3 orders of magnitude less code than current systems—possibly even less.
The effect of such visual decoration is greater than one might think, especially on beginners. Once you grok the concepts of instruction, block, and nesting, you barely see the curly brackets (or the “begin” and “end” keywords) and the semicolons. A bit like a Lisp programmer who doesn’t “see” the parentheses any more.
Beginners are more sensitive. The cognitive load you call trivial is probably significant to them, because they still think in ASCII instead of ASTs. In the ASCII world, a semicolon or a bracket carries about as much cognitive load as any other keyword. Indentation, not so much.
Now one could see it as a test. I wonder if the ability to think through unhelpful syntax would be a good predictor of future success?
I hate writing “begin” and “end” in Pascal, because these words take too much of the screen space, and also visually pattern-match with identifiers. I think Pascal would be 50% more legible if it replaced “begin” and “end” with curly brackets. So I guess removing the semicolons and curly brackets is also an improvement in legibility.
Still, maybe the beginners are trying to move forward too fast. Maybe a lot of problems come from trying to run before one is able to walk reliably. When children learn mathematics, they have to solve dozens of “2+3=?” problems before they move to something more complex. When learning programming, students should also solve dozens of one-line or two-line problems before they move on. But there is often not enough time in the curriculum.
What would you replace the semicolon with?
There are a few obvious answers: One is to simply not allow multiple statements on the same visual line (even if they are closely related and idiomatic). Another is to define the semicolon (or equivalent) as a separator, with the side effect that you can no longer have a single statement split across multiple visual lines. Another is to, along with the ‘separator’ solution, add an additional symbol for splitting long statements across multiple visual lines—as in earlier Visual Basic. And yet another option is to have a separator and “guess” whether they meant a line break to end a statement or not—as in Javascript and modern Visual Basic.
You can also mix approaches: optional semicolons, but use indentation to guess if it’s the same instruction or not. That way:
This should be flexible enough and unambiguous enough.
In Python, you are supposed to write a colon before you start a block, right?
So the rules can be rather simple:
colon, with indentation = start of a new block
colon, no indentation = an empty block (or a syntax error)
no colon, with indentation = continuing of the previous line
no colon, no indentation = next statement
semicolon = statement boundary
Block ends where the indentation returns to the level of the line that opened the block. Continued line ends when indentation returns to the level of the starting line. (Where “to the level” = to the level, or below the level.)
I spent a lot of time thinking about this, and now it seems to me that this is a wrong question. The right question is: “how to make the best legible language?” Maybe it will require some changes to the concept of “statement”.
Why does one statement plus one statement make two statements, while one expression plus one expression makes one expression? Why is “x=1; y=1;” two units, but “(x == 1) && (y == 1)” one unit? What happens if a statement is part of an expression, in an inline anonymous function? Where should we place semicolons or line breaks then?
Sorry, I don’t have a good answer. As a half-good answer, I would go with the early VB syntax: the rule is unambiguous (unlike some JavaScript rules), and it requires a special symbol in a special situation (as opposed to using a special symbol in non-special situation).
Another half-good answer: use four-space tabs for “this is the next statement” and a half-tab (two spaces) for “here continues the previous line”. (If the statement has more than two lines, all the lines except the first one are aligned the same; the half-tabs don’t accumulate.)
Because a statement is the fundamental unit of an imperative language. If “x=1; y=1;” were one unit, it would be one statement. Technically, on another level, multiple statements enclosed in braces form a single statement. Your objection does suggest another solution I forgot to put in—ban arbitrarily complex expressions. Then statements are of bounded length and have no need to span multiple lines. The obvious example of a language that makes this choice is assembly.
You could ban inline anonymous functions, or require them to be a single expression. You could implement half of Lisp as named functions that are building blocks for your “single expression” anonymous functions, so this doesn’t necessarily lose expressive power.
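Python arguably takes this route already: a lambda body must be a single expression, so statement-like work gets pulled out into named functions that the anonymous function then uses as building blocks. A small sketch:

    def clamp(x, low, high):
        return max(low, min(x, high))

    readings = [-3, 7, 42, 19]
    # The anonymous function is a single expression built from named pieces:
    by_clamped_value = sorted(readings, key=lambda r: clamp(r, 0, 20))
    print(by_clamped_value)   # [-3, 7, 19, 42]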
That Microsoft changed it is weak evidence against it—it suggests that people really don’t like having to add that extra symbol. There is that ambiguity problem, though. (Javascript’s rule* technically requires an arbitrarily large amount of lookahead—I think the modern VB rule is more sane from a compiler perspective, but can still have annoying consequences)
Your “other half-good answer” isn’t really very distinct from the first: the half-tab takes the role of the special symbol; it being at the beginning of the line just changes how you specify the grammar. (Vim scripting is an example of an existing language that uses a symbol at the beginning of a line for continuations) It also creates an extra burden (even compared to current whitespace-sensitive languages like Python) to maintain the indentation correctly. In particular, it forbids you from adding lots of extra indentation to, for example, line up the second part of a statement with a similar element on the first line (think making a C-style function call, then indenting subsequent lines to the point where the opening bracket of the argument list was. Or indenting to the opening bracket of the innermost still-open group in general.)
*Technical note: Javascript’s rule is “put in a semicolon if leaving it out leads to a syntax error”. VB’s rule is, more or less, “continue the statement if ending it at the linebreak leads to a syntax error”. In general, this will lead to Javascript continuing statements in unexpected places, and will lead to VB terminating statements in unexpected places.
I don’t believe this is true, at least not for the usual sense of “statement”, which is “code with side effects which, unlike an expression, has no type (not even unit/void) and does not evaluate to a value”.
You can easily make a language with no statements, just expressions. As an example, start with C. Remove the semicolon and replace all uses of it with the comma operator. You may need to adjust the semantics very slightly to compensate (I can’t say where offhand).
Presto, you have a statement-less language that looks quite functional: everything (other than definitions) is an expression (i.e. has a type and yields a value), and every program corresponds to the evaluation of a nested tree of expressions (rather than the execution of a sequence of statements).
Yet, the expressions have side effects upon evaluation, there is global shared mutable state, there are variables, there is a strict and well-defined eager order of evaluation—all the semantics of C are intact. Calling this a non-imperative language would be a matter of definition, I guess, but there’s no substantial difference between real C and this subset of it.
So the question “what kind of language are we trying to make?” must be answered before “what syntax would make it most legible?”.
Assuming an imperative language, the simplest solution would be one command per line, no exceptions. There is a scrollbar at the bottom; or you can split a long line into more lines by using temporary variables.
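For example, a trivial Python sketch of the temporary-variables trick:

    # One long statement crammed onto a single line...
    total = sum(x * x for x in range(10)) / max(1, len([x for x in range(10) if x % 2]))

    # ...versus the same computation split with temporary variables:
    squares_sum = sum(x * x for x in range(10))
    odd_count = len([x for x in range(10) if x % 2])
    total = squares_sum / max(1, odd_count)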
No syntax can make all programs legible. A good syntax is without exceptions and without unnecessary clutter. But if the user decides to write programs horribly, nothing can stop them.
An important choice is whether you make formatting significant (Python-style) or not. Making formatting significant has an advantage that you would probably format your code anyway, so the formatting can carry some information that does not have to be written explicitly, e.g. by curly brackets. But people will complain that in some situations a possibility to use their own formatting would be better. You probably can’t make everyone happy.
I think you stated my thoughts better than I did.
I recently did the biggest bit of useful programming I’ve ever done—automating large chunks of sysadmin work—in ant. ant is basically makefiles for Java. But it’s Turing-complete!
What it feels like: using an esoteric programming language whose conceit is that all code must be correctly-formed XML. Most of the work was the mathematical puzzle of how to implement some really simple thing in ant. (Every domain specific language that is allowed to become Turing-complete will evolve into brainfuck.)
My point is not that ant is a horrible, horrible language to use for anything outside its narrow domain (though it is) - but that even when I’m coding in proper programming languages, the process feels much the same—it takes me half a day to work out how to implement the simple thing in my head, and coding feels like building a house one matchstick at a time. This leads me to suspect it’s me, not the languages. So it’s not just you.
In the Subtext FAQ there’s a question that is now my favorite question to ask of any new programming tool:
Disconcertingly, the answer for Subtext is
This article suggests that something like 30% to 60% of people cannot learn to code. I think that’s interesting. EDIT: This also might be wrong; see child comment.
The three hurdles the article describes are variable assignment, recursion, and concurrency. I don’t think you can program at all without those three elements.
Programming is interesting in that the difference between good programmers and bad programmers seems to be far more pronounced than the difference between people who are good and bad at other tasks—I recently observed about ten smart friends of mine trying to learn Haskell for an introduction-to-algorithms course. Some of them got it immediately and intuitively, and some just didn’t.
Also, I suspect that some people will find learning to program a bit easier with functional programming languages like Haskell. When learning Haskell, I wrote lots of simple functions and achieved more complex results by stringing together lots of simple functions. In imperative languages, it’s a bit harder to test all the individual pieces as you go.
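The same style carries over to other languages; a small Python sketch (functions made up for illustration) of what I mean by testing the pieces separately before stringing them together:

    def words(text):
        return text.split()

    def longest(items):
        return max(items, key=len)

    def shout(word):
        return word.upper() + "!"

    # Each small piece can be checked on its own...
    assert words("to be or not") == ["to", "be", "or", "not"]
    assert longest(["a", "abc", "ab"]) == "abc"
    assert shout("hi") == "HI!"

    # ...before being strung together into something bigger.
    print(shout(longest(words("the quick brown fox"))))   # QUICK!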
This is the top link which the 2006 Coding Horror article is based on:
http://www.eis.mdx.ac.uk/research/PhDArea/saeed/
It’s to Saeed Dehnadi’s research. In 2006, Dehnadi and Bornat put out a paper purporting to “have discovered a test which divides programming sheep from non-programming goats. This test predicts ability to program with very high accuracy before the subjects have ever seen a program or a programming language.” The Coding Horror article, which was heavily linked and discussed in various forums, seems to have popularized this research quite well.
In 2008, the followup research on a much larger and more diverse set of students failed to confirm the effect.
And a 2009 followup showed mixed results.
These followups received substantially less widespread discussion than the original claim. My sneaking suspicion is that this may reflect not only the usual bias in favor of positive results, but a preference on the part of the programming community for the notion that programmers are a special class of people.
(Or it may just be that Coding Horror didn’t cover them.)
My suspicion is that such results reflect a failure of teaching.
Imagine that you are teaching people mathematics, and you skip some beginner lessons, and start with the more advanced ones. Some people will have the necessary knowledge (from home, books, internet, etc.), so they can follow you and improve their knowledge. Most people simply don’t understand what you are talking about. At the end of the year the test will show that there are two separate groups—those who know a lot, and those who have no clue.
Please note that the failure of teaching is not necessarily at the level where the problem was discovered. It may be a failure from previous levels. For example a university teacher may expect some really simple knowledge, but many high schools fail to teach it.
The type system of Haskell is quite restrictive for beginners (it’s a little annoying to not be able to debug by putting a print anywhere, or read user input wherever you want) and the laziness can be a little unintuitive, especially for people who haven’t done much mathematics (e.g.
ones = 1:ones
… “I’m defining something in terms of itself, aghafghfg”). But I do agree that functional languages might be easier to teach to certain groups of people, like those who have done a fair bit of maths, and that Haskell has some very neat features for learning to program (GHCi and Hoogle are awesome!).
There’s unsafePerformIO :: IO a -> a
Or, er,
Debug.Trace.trace
…I agree.
Well, depending on what platform you’re using, you don’t necessarily need concurrency.
I don’t think you can program at all without at least one of the first two elements, but you can get by with only one or the other if you restrict your choice of languages.
That’s just for learning how to code, though; you’ll never get through first-semester CS with that kind of limitation.
As others pointed out, different ways of “programming” are best for different problem domains, and there is virtually no chance that a one-size-fits-all language can be useful to do or teach programming in every domain.
Moreover, regardless of the language, you have to develop the ability to think like a computer, which means that there is no magical DWIM (do what I mean) button/keyword available to you. Some people have a harder time developing this essential ability than others; they should probably consider a different career path, no matter what languages are available out there.
I’m glad you like subtext. Me too.
I just had a big “update”. EDIT: I’m a little less sure now. See the end.
I found something that teaches programming on an immediate level to non-programmers, without their knowing they are programming, and without any cruft. I always wished this were possible, but now I think we’re really close.
If you want to get programming, and are a visual thinker, but never could get over some sort of inhibition, I think you should try this. You won’t even know you’re programming. It may not be “quite” programming, but it’s closer than anything else I’ve seen at this level of simplicity. And anyway it’s fun and pretty.
The important thing about this “programming” environment is that it is completely concrete. There are no formal “abstractions,” and yet it’s all about concrete representation of the idea formerly known as abstractions.
Enough words. Take a look: http://recursivedrawing.com/
[I was excited because to me this seems awfully close to the untyped lambda-calculus, made magically concrete. The “normal forms” are the “fixed points” are the fractals. It’s all too much and requires more thought. It only makes pictures, though, for now. However, I can’t see anything in it like “application” so… the issue of how close it is seems actually quite subtle. Somehow application’s being bypassed in a static way. Curious. I’m sure there’s a better way to see it I just haven’t gotten yet.]
PS: Blue! Blue! Blue! (**)
** This is a joke that will only make sense if you’ve read The Name of the Wind: Rothfuss. If you prefer to spoil yourself, here, but buy the book afterward if you like it.
cross-posted here
Update:
See also: www.worrydream.com
Visual programming is great where the visual constructs map well to the problem domain. Where it does not apply well it becomes a burden to the programmer. The same can be said about text based programming. The same can be said about programming paradigms. For example object oriented programming is great… when it maps well to the problem being solved, but for other problems it simply sucks and perhaps functional programming is a better model.
In general, programming is easy when the implementation domain (the programming language, abstract model, development environment, other tools) maps well to the problem domain. When the mapping becomes complex and obscure, programming becomes hard.
You will not find a single approach to programming that is easy for all problems, instead you will find that each approach has its limits.
My current project is to catalyze a new perspective on programming. I believe that we should be programming using an ecosystem of domain specific languages. Each language will be arbitrarily simple (easy to learn) and well targeted to representing solutions within its target problem domain. Although none of the languages are individually Turing-complete, Turing-completeness is available within the ecosystem by combining programs written in different languages together using other languages.
When I use the term language I mean it in its most general sense, along the lines of this definition “a systematic means of communicating ideas or feelings by the use of conventionalized signs, sounds, gestures, or marks having understood meanings”. Perhaps a better word than language would be interface.
Programming from this perspective becomes the generation of new interfaces by composing, transforming, and specializing existing interfaces, using existing interfaces.
This perspective on programming is related to language-oriented programming, intentional programming, aspect-oriented programming, and model-driven engineering.
(Duplicate of this)
If you haven’t heard of the STEPS project from the Viewpoints Research Institute already, it may interest you. (Their last report is here)
Thank you for the reference to STEPS; I am now evaluating this material in some detail.
I would like to discuss the differences and similarities I see between their work and my perspective; are you familiar enough with STEPS to discuss it from their point of view?
In reply to this:
This use of a general purpose language also shows up in the current generation of language workbenches (and here). For example JetBrains’ Meta Programming System uses a Java-like base language, and Intentional Software uses a C# (like?) base language.
My claim is that this use of a base general purpose language is not necessary, and possibly not generally desirable. With an ecosystem of DSLs general purpose languages can be generated when needed, and DSLs can be generated using only other DSLs.
I think I am (though I’m but an outsider). However, I can’t really see any significant difference between their approach and yours. Except maybe that their DSLs tend to be much more Turing-complete than what you would like. It matters little, though, because the cost of implementing a DSL is so low that there is little danger of being trapped in a Turing tar-pit. (To give you an idea, implementing JavaScript on top of their stack takes 200 lines. And I believe the whole language stack implements itself in about 1000 lines.)
In the unlikely case you haven’t already, you may want to check out their other papers, which include the other progress reports, and other specific findings. You should be most interested by Ian Piumarta’s work on maru, and Alessandro Warth’s on OMeta, which can be examined separately.
This seems like a bad idea. There is a high cognitive cost to learning a language. There is a high engineering cost to making different languages play nice together—you need to figure out precisely what happens to types, synchronization, etc etc at the boundaries.
I suspect that breaking programs into pieces that are defined in terms of separate languages is lousy engineering. Among other things, traditional unix shell programming has very much this flavor—a little awk, a little sed, a little perl, all glued together with some shell. And the outcome is usually pretty gross.
These are well targeted critiques, and are points that must be addressed in my proposal. I will address these critiques here while not claiming that the approach I propose is immune to “bad design”.
Yes, traditional general purpose languages (GPLs) and many domain specific languages (DSLs) are hard to learn. There are a few reasons that I believe this can be allayed by the approach I propose. The DSLs I propose are (generally) small, composable, heavily reused, and interface oriented which is probably very different than the GPLs (and perhaps DSLs) from your experience. Also, I will describe what I call the encoding problem and map it between DSLs and GPLs to show why well chosen DSLs should be better.
In my model there will be heavy reuse of small (or even tiny) DSLs. The DSLs can be small because they can be composed to create new DSLs (via transparent implementations, heavy use of generics, transformation, and partial specialization). Composition allows each DSL to deal with a distinct and simple concern but yet be combined. Reuse is enhanced because many problem domains regardless of their abstraction level can be effectively modeled using common concerns. For example consider functions, Boolean logic, control structures, trees, lists, and sets. Cross-cutting concerns can be handled using the approaches of Aspect-oriented programming.
The small size of these commonly used DSLs, and their focused concerns make them individually easy to learn. The heavy reuse provides good leveraging of knowledge across projects and across scales and types of abstractions. Probably learning how to program with a large number of these DSLs will be the equivalent of learning a new GPL.
In my model, DSLs are best thought of as interfaces, where the interface is customized to provide an efficient and easily understood method of manipulating solutions within the problem domain. In some cases this might be a text-based interface such as we commonly program in now, but it could also be graphs, interactive graphics, sound, touch, or EM signals; really, any form of communication. The method and structure of communication is constrained by the interface, and is chosen to provide a useful (and low-noise) perspective into the problem domain. Text-based languages often come with a large amount of syntactic noise. (Ever try template-based metaprogramming in C++? Ack!)
Different interfaces (DSLs) may provide different perspectives into the same solution space of a problem domain. For example a graph, and the data being graphed: the underlying data could be modified by interacting with either interface. The choice of interface will depend on the programmer’s intention. This is also related to the concept of projectional editors, and can be enhanced with concepts like Example Centric Programming.
The encoding problem is the problem of transforming an abstract model (the solution) into code that represents it properly. If the solution is coded in a high-level DSL, then the description of the model that we create while thinking about the problem and talking to our customers might actually represent the final top-level code. In this case the cognitive cost of learning the DSL is the same as understanding the problem domain, and the cost of understanding the program is that of understanding the solution model. For well-chosen DSLs the encoding problem will be easy to solve. In the case of general purpose languages the encoding problem can add arbitrary levels of complexity. In addition to understanding the problem domain and the abstract solution model, we also have to know how these are encoded into the general purpose language. This adds a great deal of learning effort even if we already know the language, and even if we find a library that allows us to code the solution relatively directly. Perhaps worse than the learning cost is the ongoing mental effort of encoding and decoding between the abstract models and the general purpose implementation. We have to be able to understand and modify the solution through an additional layer of syntactic noise. The extra complexity, the larger code size and the added cognitive load imposed by using general purpose languages multiplies the likelihood of bugs.
Boundary costs can be common and high even if you are lucky enough to get to program exclusively in a single general purpose language. Ever try to use functions from two different libraries on the same data? Image processing libraries and math libraries are notorious for custom memory representations, none of which seem to match my preferred representation of the same data. Two GUI libraries or stream I/O libraries will clobber each other’s output. The costs (both development-time and run-time) to conform disparate interfaces in general purpose languages is outrageous. My proposal just moves these boundary costs to new (and perhaps unexpected) places while providing tools (DSLs for composition and transformation) that ease the effort of connecting the disparate interfaces.
I’ve described my proposal as a perspective shift, and that interface might be a better term than language. To shift your perspective, consider the interfaces you have to your file system. You may have a command line interface to it, a GUI interface, and a programmatical interface (in your favorite language). You choose the appropriate interface based on the task at hand. The same is true for the interfaces I propose. You could use the file system in a complex way to perform perfectly good source code control, or you could rely on the simpler interface of a source control system. The source control system itself might simply rely on a complex structuring of the file system, but you don’t really care how it works as long as it is easy to use and meets your needs. You could use CSV text files to store your data, but if you need to perform complex queries a database engine is probably a better choice.
We already break programs (stuff we do) into pieces that are defined in terms of separate languages (interfaces), and we consider this good engineering. My proposal is about how to successfully extend this type of separation of concerns to its granular and interconnected end-point.
Your UNIX shell programming example is well placed. It is roughly a model that matches my proposal with connected DSLs, but it is not a panacea (perhaps far from it). I will point out that the languages you mention (awk, sed, and perl) are all general purpose (Turing-complete) text based languages, which is far from the type of DSL I am proposing. Also the shell limits interaction between DSLs to character streams via pipes. This representation of communication rarely maps cleanly to the problem being solved; forcing the implementations to compensate. This generates a great deal of overhead in terms of cognitive effort, complexity, cost ($, development time, run-time), and in some sense a reduction of beauty in the Universe.
To highlight the difference between shell programming and the system I’m proposing, start with the shell programming model, but in addition to character streams add support for the communication of structured data, and in addition to pipes add new communication models like a directed graph communication model. Add DSLs that perform transformations on structured data, and DSLs for interactive interfaces. Now you can create sophisticated applications such as syntax sensitive editors while programming at a level that feels like scripting or perhaps like painting; and given the composability of my DSLs, the parts of this program could be optimized and specialized (to the hardware) together to run like a single, purpose built program.
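As a tiny illustration of the structured-data part of that, here is a Python sketch (everything in it is invented for illustration) of pipeline stages that pass records rather than flat text:

    def read_events():
        # In a real system this would be another stage's output, not literals.
        yield {"user": "ada", "action": "login", "ms": 40}
        yield {"user": "bob", "action": "login", "ms": 900}
        yield {"user": "ada", "action": "logout", "ms": 12}

    def only(action, events):
        return (e for e in events if e["action"] == action)

    def slower_than(limit_ms, events):
        return (e for e in events if e["ms"] > limit_ms)

    # The stages compose like a shell pipeline, but no stage ever has to
    # re-parse a line of text to recover the structure:
    for event in slower_than(100, only("login", read_events())):
        print(event["user"], event["ms"])   # bob 900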
I’m only going to respond to the last few paragraphs you wrote. I did read the rest. But I think most of the relevant issues are easier to talk about in a concrete context which the shell analogy supplies.
Yes. It’s clunky. But it’s not clunky by happenstance. It’s clunky because standardized IPC is really hard.
It’s a standard observation in the programming language community that a library is sort of a miniature domain-specific language. Every language worth talking about can be “extended” in this way. But there’s nothing novel about saying “we can extend the core Java language by defining additional classes.” Languages like C++ and Scala go to some trouble to let user classes resemble the core language, syntactically. (With features like operator overloading).
I assume you want to do something different from that, since if you wanted C++, you know where to find it.
In particular, I assume you want to be able to write and compose DSLs, where those DSLs cannot be implemented as libraries in some base GPL. But that’s a self-contradictory desire. If DSL A and DSL B don’t share common abstractions, they won’t compose cleanly.
Think about types for a minute. Suppose DSL A has some type system t, and DSL B has some other set of types t’. If t and t’ aren’t identical, then you’ll have trouble sharing data between those DSLs, since there won’t be a way to represent the data from A in B (or vice versa).
Alternatively, ask about implementation. I have a chunk of code written in A and a chunk written in B. I’d like my compiler/translator to optimize across the boundary. I also want to be able to handle memory management, synchronization, etc across the boundary. That’s what composability means, I think.
Today, we often achieve it by having a shared representation that we compile down to. For instance, there are a bunch of languages that all compile down to JVM bytecode, to the .NET CLR, or to GCC’s intermediate representation. (This also sidesteps the type problem I mentioned above.)
But the price is that if you have to compile to reasonably clean JVM bytecode (or the like), that really limits you. To give an example of an un-embeddable language, I don’t believe you could compile C to JVM bytecode and have it efficiently share objects with Java code. Look at the contortions scala has gone through to implement closures and higher-order functions efficiently.
If two DSLs A and B share a common internal representation, they aren’t all that separate as languages. Alternatively, if A and B are really different—say, C and Haskell—then you would have an awfully hard time writing a clean implementation of the joint language.
Shell is a concrete example of this. I agree that a major reason why shell is clunky is that you can’t communicate structured data. Everything has to be serialized, and in practice, mostly in newline-delimited lists of records, which is very limiting. But that’s not simply because of bad design. It’s because most of the languages we glue together with shell don’t have any other data type in common. Awk and sed don’t have powerful type systems. If they did, they would be much more complicated—and that would make them much less useful.
Another reason shell programming is hard is that there aren’t good constructs in the shell for error handling, concurrency, and so forth. But there couldn’t be, in some sense—you would have to carry the same mechanisms into each of the embedded languages. And that’s intolerably confining.
MaxMSP is a music technology program that works like this—you can visually track the flow of information. It might be relevant—I’ll write about it more tomorrow, I’m on a mobile phone at the moment.
Okay, it’s been about two years since I’ve used Max/MSP, and they’ve brought out a significantly different version since then, but this is what I remember.
It’s a program for building musical instruments, patches, and effects. You place objects (which take the form of labelled boxes) on a blank space, and connect them together with wires. The UI is pretty bare—it’s mostly just black and white, though the user can add a degree of their own design to the patch for actual use.
The objects are, for the most part, pretty simple, so it can be quite difficult to achieve even simple tasks. To create a sine tone, you create a specific object that has the “create sine tone” function, input a number into it (for example 440; it’s measured in Hz), and output it to an audio control. Building bigger and more complex devices gets pretty dense, and if something isn’t working it can be very difficult to figure out where the problem lies.
That said, I found it quite helpful to have the ability to visually track the flow of information—one exception to the usually black-and-white UI is that wires carrying sound rather than numerical data appear as crosshatched grey and yellow, rather than as simple black lines.
I’m not sure how helpful this is; I’ve no knowledge of programming, but maybe it’ll serve as a useful comparison.
I spent some time programming with Max (less with MSP) and found roughly the results that others have reported for visual languages. It makes something like an FM synthesizer (y = a sin(b sin (ct))) look a lot more pleasant to a non-programmer musician, but for bigger projects it slows you down and gets in the way and prevents version control etc. But I didn’t spend a lot of time with it, so a grain of salt is needed of course.
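For scale, the textual version of that formula really is only a few lines. Here is a rough Python sketch that writes one second of samples to a WAV file using just the standard library; the parameters are picked arbitrarily for illustration:

    import math, struct, wave

    RATE = 44100                              # samples per second
    a, b, c = 0.8, 2.0, 2 * math.pi * 220     # arbitrary illustrative parameters

    def fm(t):
        # y = a * sin(b * sin(c * t)), the formula mentioned above
        return a * math.sin(b * math.sin(c * t / RATE))

    with wave.open("fm_tone.wav", "w") as f:
        f.setnchannels(1)
        f.setsampwidth(2)                     # 16-bit samples
        f.setframerate(RATE)
        f.writeframes(b"".join(
            struct.pack("<h", int(fm(t) * 32767)) for t in range(RATE)))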
For how long? I’ve been able to solve the first 5 Project Euler problems after 2 days of Python, and I’d probably be able to solve more, but that isn’t programming.
I doubt you can become anything that would deserve the label “programmer” in under 3 years, as long as you are not a genius.
I know at least two people that got $70k+ programming jobs after only about three months of study. Not sure what “genius” means in this context.
What did they do before programming?
One was a physics grad student, the other a mathematics grad student. Both had some prior experience with basic Bash.
This makes it much less surprising. Anecdotally in my social circle it seems that people who have had studied math or physics seem to easily pick up programming.
I think that’s exaggerated. From what I understand it was more like one $70k and one $40k, after something like 6-8 months of study.
That said, anyone with anecdotes like this is invited to share them. They sound cool, and give one hope for this world.
The 40k one wasn’t the one I had in mind, but I’ll accept your correction re the 70k one.
I know very little so it is hard to judge for me. I would be impressed by someone with no programming experience who could write a post like this, after three months of study, without a previous math or computer science background.
That author’s level isn’t necessary to make a living at computer programming.
And (for the avoidance of doubt) that author doesn’t in any sense lack “a previous math or computer science background”, although he says he’s a programming beginner. He’s a first-rate physicist and the author of an important book on quantum computing. So I’m not sure what XiXiDu is saying here; that Nielsen’s level of insight is what it takes to deserve the label of “programmer”? (No.) Or that Nielsen is a genius? (Maybe, but so what?) Or what?
It seems that there are some who are incapable of learning programming. That said, when you are programming, you are virtually always working with a Von Neumann Architecture, so many languages have common ground.
Code is in general presented as composable units. Working with symbolic graphs of plain-text names or with actual graphs in 2D makes little difference.
If you really want to learn programming, but think that regular Java(script) or Python is trite and annoying, try Haskell. Haskell requires you to know math to actually make sense of anything, and it is very powerful in very different ways compared to Algol descendants.
Replication has proven difficult: http://www.gwern.net/Notes#the-camel-has-two-humps
Related point: I remember that programming in HyperTalk (the scripting language for HyperCard) was a lot like explaining something to someone who had no domain knowledge.
Stuff like that.
I lost interest in programming after I realized how much effort it would take me to do anything cool.
For those people here who consider themselves reasonably skilled at programming: how long would it take you to implement a clone of Tetris? I’ve got a computer engineering degree, and “cin” was the only user input method they taught me in programming classes...
Edit: You’re not allowed to use ASCII graphics, and it has to run at the same speed on different processors, but other than that, requirements are flexible.
Depends on how many bells and whistles you want. For just a basic clone, I can do it in an evening, using a third-party library to handle the windowing and whatnot. Of course if you wanted to implement the windowing and input handling by talking directly to the OS, it would be an utter nightmare; call it three months.
User experience makes questions like this tricky: graphics, sound, score-tracking, tightening up controls. If I was working with a graphics library I was already familiar with, I doubt the core gameplay would take me more than a few hours. Writing a reasonable clone of the version of Tetris I played fifteen years ago on the Atari ST would take at least a couple of weeks, though, and that’s if I had all the resources I needed on hand. An exact clone would take even longer.
A related question: how much would a beginner have to study before being able to write Tetris? As I said, I graduated with a degree in computer engineering without being able to code Tetris, because I have no idea how to write anything except console programs that use cin and cout to do all the input/output work. “Draw something on the screen, and have it move when someone presses a key” would seem to be a fundamental task in computer programming, but apparently it’s some kind of super-advanced topic. :P
The focus on cin / cout as opposed to GUI is probably because cin is simple and always works the same way (mostly because nobody uses it, no need for a zillion libraries), whereas there are a lot of very different GUI libraries with different ways of doing things; learning one of those would take time and not help you use another one much.
If you want to learn how to make a GUI, you can probably find a “hello world” example for your language/OS of choice, copy-paste the code, and then adjust it to suit your needs.
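In Python, for example, the built-in tkinter module gives you a copy-paste-able starting point without any third-party library (a minimal sketch):

    import tkinter as tk

    root = tk.Tk()                              # create a window
    tk.Label(root, text="Hello, world").pack()  # put a label in it
    root.mainloop()                             # hand control to the GUI event loop

From there you adjust: swap the label for a canvas, bind key events, and so on.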
Yeah… everything is in libraries these days and the libraries are all incompatible with each other. :(
It’s not as hard as it might sound. Modern languages have nice libraries and frameworks that make input and basic graphics very easy. Here’s a tutorial for Slick (a Java-based 2D game framework) that walks you through how to do exactly what you ask: http://slick.cokeandcode.com/wiki/doku.php?id=01_-_a_basic_slick_game
Here’s a tutorial for how to make Tetris: http://slick.cokeandcode.com/wiki/doku.php?id=02_-_slickblocks
I’m sure similar things exist for C++, especially since it’s the most popular language for making games.
edit: If you want to actually follow one of the above tutorials, see this setup info first: http://slick.cokeandcode.com/wiki/doku.php?id=getting_started_and_setup
Probably not too long. I wrote (a crappy version of) Breakout in my second semester of high-school programming, and that was using Pascal plus some homebrewed x86 assembler for the graphics (both of which were a nightmare that I wouldn’t recommend to anyone), so simple games clearly don’t require any deep knowledge of the discipline; if you’ve got a computer-engineering background already and you’re working with a modern graphics library, I’d call it a couple weeks of casual study. Less if you’re using a game-specific framework, but those skills don’t usually transfer well to other things.
I did this (in Java) when I had some spare time at work recently. It took a couple of days of work (with some slacking off), including learning a (pretty simple) new game/graphics framework.
If you don’t care about graphics, like a couple hours. But you could spend as much time as you want on graphics.
Use Pygame. Evidence: People frequently produce playable games using Pygame in 24-hour hackathons.
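For example, the “draw something on the screen and move it with a key press” task from above is about a screenful of Pygame. A rough sketch (assumes pygame is installed; the square, colour, and speed are arbitrary):

    import pygame

    pygame.init()
    screen = pygame.display.set_mode((640, 480))
    clock = pygame.time.Clock()
    x, y = 320, 240

    running = True
    while running:
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                running = False

        keys = pygame.key.get_pressed()
        speed = 4                                 # pixels per frame
        if keys[pygame.K_LEFT]:
            x -= speed
        if keys[pygame.K_RIGHT]:
            x += speed
        if keys[pygame.K_UP]:
            y -= speed
        if keys[pygame.K_DOWN]:
            y += speed

        screen.fill((0, 0, 0))
        pygame.draw.rect(screen, (0, 200, 255), (x, y, 32, 32))
        pygame.display.flip()
        clock.tick(60)   # cap at 60 fps, so movement speed doesn't depend on the CPU

    pygame.quit()

The clock.tick call is also what satisfies the “same speed on different processors” requirement, at least approximately.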
An evening? Depends on what you mean by “Tetris” and the target quality. I did a two-player Tetris-over-network for a class once (the opponent gets an additional garbled line when you clear multiple lines). It’s easy, which is part of the problem: with enough expertise it can become boring; you don’t learn as much new stuff, and it becomes more like laying bricks than designing intricate machines or learning the principles of their operation.
An estimate that I heard on multiple occasions, disbelieved, and then witnessed come true, is that it takes about 4 years of hands-on experience for an enthusiastic smart adult to conquer the learning curve and as a result lose enthusiasm for software development in the abstract (so that you’d need something special about the purpose of the activity, not just the activity itself). I don’t know about the probability of this happening, but the timescale seems about right.
There is a good reason several programmers have referred to GUIs as “point-and-grunt” interfaces, and actual programming requires even more flexibility than a GUI offers. A functional “intuitive” programming system is going to be based around natural language, not pretty pictures. That isn’t to say images won’t be used, most likely for structuring the sub-components of the overall program, but LabVIEW already shows the strengths and weaknesses of that approach when the programmer doesn’t have sufficient linguistic control at a lower level.
ADDED: CS345 / Graphical User Interfaces: Principles of Graphical User Interface Design
Fantastic thread! Are there any statistical programming languages, or programming languages of any kind, that are, well, “obvious”? Something where I could type “survival analysis with lalalala” instead of “stset 34.3 alpha 334” or something like that?
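Not quite natural language, but Python’s lifelines library comes fairly close in spirit for that particular example, if I remember its API correctly. A sketch with made-up toy data:

    from lifelines import KaplanMeierFitter

    durations = [5, 6, 6, 2, 4, 4, 7, 10]   # made-up follow-up times
    events    = [1, 0, 1, 1, 1, 0, 1, 1]    # 1 = event observed, 0 = censored

    kmf = KaplanMeierFitter()
    kmf.fit(durations, event_observed=events)
    print(kmf.survival_function_)            # the estimated survival curve

You still have to know what a Kaplan-Meier estimate is, though; the readable syntax doesn’t remove the need to understand the statistics.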
Unless you’re hacking you usually don’t need to do this. You just need to understand what state the program is in before and after each operation. You never need to understand the whole thing at once, just understand one part at a time.
Er, what? You absolutely do need to model control flow, and how distant parts fit together. You should only think about state one operation at a time when you’re confused, or suspicious of the code you’re looking at, because step-by-step thinking is very slow and can’t support most of the operations you’d want to do on a program.
When modelling how distant parts fit together, you use abstraction. You don’t need to model how the internals of your sort function interact with other parts of your code, you just remember that it sorts things. You’re still thinking in terms of one operation at a time, just using more high-level operations.
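A trivial Python illustration of that (hypothetical example):

    orders = [("bob", 3), ("ann", 1), ("cho", 2)]
    by_quantity = sorted(orders, key=lambda o: o[1])   # one high-level operation

    # You never model how sorted() works internally while reading this line;
    # you only rely on its guarantee that the result is ordered by the key.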
Notice that software design best practices improve your ability to do this: separation of concerns, avoidance of mutable global variables, lack of non-obvious side effects.
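A toy contrast showing why those practices matter (hypothetical code, not from any real project):

    # Harder to reason about: `total` is hidden shared state, so the effect of
    # add_order depends on everything else that may have touched it.
    total = 0

    def add_order(price):
        global total
        total += price

    # Easier: the state flows through the function explicitly, so each call can
    # be understood (and tested) on its own.
    def add_order_pure(total, price):
        return total + price

With the second version you can check the state before and after each call without knowing anything about the rest of the program.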
From my own experience as a programmer, I think this is idealized to the point of being false. Finding a few distantly-separated, interacting regions of code which don’t respect a clean abstraction is pretty common, especially when debugging (in which case there is an abstraction but it doesn’t work).
This isn’t really possible in many cases. Many programs are resource-constrained, and the heap, IO resources, etc., are shared state; we don’t have good ways of abstracting that away. Likewise, synchronization is still a giant can of worms.