I see there are some functional programming buffs here, so I have a question. These functional programs, as pretty as they are, must be compiled into plain old machine code to actually run. Are these compilers really so smart that they can translate pretty idiomatic functional code into high-performing binaries? Or do you have to use ugly techniques based on intimate knowledge of the compiler/interpreter to squeeze decent performance out of these languages?
At the shootout it looks like the fastest Haskell programs are about 2x slower than the fastest C programs, but their code contains many non-Haskellish tricks like accessing raw memory. Idiomatic Haskell code is probably slower than that.
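To give a rough idea of what that difference looks like, here is a toy sketch of my own (not taken from any shootout entry): the idiomatic version leans entirely on the compiler to fuse away the intermediate list, while the tuned version spells out a strict loop by hand, and the actual shootout programs go much further still, down to raw memory via Foreign.

    {-# LANGUAGE BangPatterns #-}

    -- Toy example: sum of squares of 1..n.

    -- Idiomatic version: a list comprehension fed to sum. Whether this is fast
    -- depends on GHC fusing away the intermediate list.
    sumSquares :: Int -> Int
    sumSquares n = sum [x * x | x <- [1 .. n]]

    -- "Tuned" version: an explicit tail-recursive loop with strict (banged)
    -- accumulators, so the strictness analyser has nothing left to guess.
    sumSquares' :: Int -> Int
    sumSquares' n = go 0 1
      where
        go !acc !i
          | i > n     = acc
          | otherwise = go (acc + i * i) (i + 1)

    main :: IO ()
    main = print (sumSquares 1000000, sumSquares' 1000000)

Both compute the same result; the gap between them (if any) depends entirely on how much the compiler manages to do on its own.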
Do you have a sense of whether this is an inherent feature of Haskell, or whether the gap will decrease substantially as Haskell compilers improve?
A gap of about 2x on a benchmark set doesn’t say much, if anything. That’s well within the usual variance you’ll get for any single language from different compilers and from different levels of programmer skill and effort put into manual optimizations. Certainly, much more work has gone into C compilers than Haskell compilers, so one would expect that there’s much more low-hanging fruit for improvement in the latter.
That said, the really interesting figures would be those for nice idiomatic Haskell, as well as figures on what percentage of code must be written using ugly hacks to achieve performance comparable to C. The power of C lies in the fact that you can write nice, manageable, and natural-looking code while having a very good idea of the machine code that comes out of each statement you write. (In fact, C was originally designed with the resulting machine code directly in mind, with no need for compiler optimization, though modern architectures are too complicated for that now.) Now, in almost any language, you can force a similar approach by using ugly and unnatural hacks based on intimate knowledge of what your compiler will produce in response. However, that defeats the purpose of using a language fancier than C, except perhaps if you can show that such ugly hacks are necessary only in a minuscule performance-critical part of the code.
I would like to see that as well, for the reasons you mention.
It would be really interesting if any language managed to abstract out the process of optimization significantly more than other languages do.
I’m flattered by your trust in my intuition, but I don’t trust it very much myself :-) Who knows what will be fast in five years? Anyway, Haskell is already bringing much good to the world as a vehicle for research and a source of ideas for other languages (just look at the history of C#), so I think it’s okay if it never makes it into production.
Generally speaking, functional programs get compiled into the same intermediate representations as imperative languages and perform the same. They’re at an advantage where their objects are immutable, which lets the compiler skip alias analysis and continue to optimize in the cases where alias analysis fails, but at a disadvantage when they’re dynamically typed and types can’t be inferred, since they need to insert extra branches.
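To make the alias-analysis point concrete, here is a small made-up example in Haskell. Because the input vector is immutable, the sum below is guaranteed to mean the same thing at every point in the function, so the compiler can compute it once and reorder the surrounding loop freely; a C routine working on two raw buffers only gets the same freedom once the optimizer (or a restrict annotation from the programmer) rules out any overlap between them.

    import qualified Data.Vector.Unboxed as V

    -- Normalise a vector by its sum. 'xs' cannot change under our feet, so
    -- 'total' can be computed once and reused with no aliasing questions at
    -- all. The imperative equivalent, taking separate input and output
    -- pointers, has to be shown (or promised, via 'restrict') not to write
    -- into the buffer it is still reading from.
    normalise :: V.Vector Double -> V.Vector Double
    normalise xs = V.map (/ total) xs
      where
        total = V.sum xs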
Some compilers are better than others and some languages are easier to write good compilers for than others, but functional/imperative is not the deciding factor.
Which exact imperative languages are you taking as benchmarks of performance here? The lower-level ones like C or Fortran where you can squeeze out amazing performance if the compiler is any good, or the fancier ones with garbage collection, dynamic typing, bounds checking, and other features with large overheads? If a language like Haskell can actually be compiled to perform comparably with the former when written naturally and without expert hacks, I find that an astonishing feat.
To give one example pair, Java (imperative) and Scala (functional) share a significant amount of compiler infrastructure and end up being pretty much the same speed.
In practice, the answer to how fast languages are is pretty complex, and sometimes counterintuitive. For example, one of the “large overheads” you mentioned, bounds checking, isn’t a significant overhead at all. Modern compilers are very aggressive about removing bounds checks when it can be proven that they’ll never fail (which is true for almost all good code), and about moving them out of loops. C programs, for security reasons, are generally run past a theorem prover (splint) to ensure that they have bounds checks where they’re needed, which means that C programs and Scala programs end up with the same set of bounds checks in the compiled output, with differences due to the different capabilities of the respective theorem provers/optimizers, which operate on intermediate representations that look nothing like the original languages anyway.

Similarly, garbage collection has a reputation as an expensive feature because of bad garbage collectors, but a good compiler can prove when most objects will leave scope and take them out of the garbage collector’s domain; it doesn’t have to be expensive.

The same goes for dynamic typing: most dynamically typed programs can have type inference performed on them to turn them into statically typed programs, it’s just that the compilers aren’t always good enough to do that. (But sometimes they are. And there are plenty of dynamically typed imperative languages around, so it’s not a functional/imperative thing.)
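For what it’s worth, here is roughly how the bounds-checking story looks from the Haskell side (my own sketch using the vector library, nothing from the benchmarks): checked and unchecked indexing both exist, and the whole idea of bounds-check elimination is that when the loop already guarantees the index is in range, the check proves nothing and can be dropped.

    import qualified Data.Vector.Unboxed as V

    -- Dot product three ways (assume both vectors have the same length).

    -- Checked: (!) tests the index on every single access.
    dotChecked :: V.Vector Double -> V.Vector Double -> Double
    dotChecked xs ys =
      sum [xs V.! i * ys V.! i | i <- [0 .. V.length xs - 1]]

    -- Unchecked: the loop bound already guarantees 0 <= i < length xs, so the
    -- check is redundant; that is exactly the fact a good optimizer tries to
    -- discover on its own, and here it is asserted by hand with unsafeIndex.
    dotUnchecked :: V.Vector Double -> V.Vector Double -> Double
    dotUnchecked xs ys =
      sum [V.unsafeIndex xs i * V.unsafeIndex ys i | i <- [0 .. V.length xs - 1]]

    -- Idiomatic: no explicit indices at all, so there is nothing left to check.
    dotIdiomatic :: V.Vector Double -> V.Vector Double -> Double
    dotIdiomatic xs ys = V.sum (V.zipWith (*) xs ys)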
Fortran’s reputation for speed, as far as I can tell, is mainly due to the LINPACK linear algebra library, which started in the Fortran world but is actually written in assembly to use funky vector instruction sets these days anyways. C is great for interfacing with hardware and writing schedulers and memory allocators, because it maps so closely to assembly language; and more work has been put into C compilers than compilers for any other language. But on the sort of code that compilers are good at, it’s the same speed as OCaml or Scala, because they all end up compiling to the same thing.
(My knowledge of this subject is mainly academic; I wrote a compiler in school, and I try to keep tabs on research in the field, but I’m not an expert by any means. Take this for what it’s worth.)
You make the situation with optimizing compilers sound really optimistic! Unfortunately, it seems to me that things don’t work anywhere near as well in practice. Yes, the practical handling of fancy language features has come a long way from naive straightforward implementations, but I’d say you exaggerate how good it is.
For example, I haven’t followed the work on bounds-check elimination closely, but I know that a lot of papers are still being published about it, which suggests the overheads are still large enough to make the problem interesting. (Which is not surprising, considering that the problem is, after all, undecidable in general. Also, as far as I know, bounds checks are normally not added by C compilers at all, and there are depressing limits to what theorem provers can figure out about typical C code that passes pointers around liberally.)
It’s similar with garbage collection, dynamic type checks, and other fancy features. Their overheads can certainly be reduced greatly by smart compilers and efficient run-time operations, sometimes to the point where there is no difference from C, but definitely not always, and often not reliably and predictably.
(Fortran, by the way, has traditionally had the advantage of being highly amenable to automatic optimization, including automatic parallelization, especially when used for typical array-based numerical computations. This has in turn led to a lot of fruitful effort put into optimizing and parallelizing numerical Fortran, leading to its unprecedented performance and acceptance in these areas, and its lead has been eroded only in recent years. You can’t possibly say that this is just due to a single popular library.)
In the end, every programming language, no matter how pure and pretty, gets turned into a bunch of loads, stores, and gotos...
I find there are a lot of good techniques to be learned from FP, but its supposed mathematical purity kind of bores me.
The functional programming HOWTO (http://docs.python.org/py3k/howto/functional.html) has some nice tricks in Python.