Are Functional languages the future of programming?
Because I have been learning about Type Theory, I have become much more aware of and interested in Functional Programming.
If you are unfamiliar with functional programming, Real World Haskell describes functional programming like this:
In Haskell [and other functional languages], we de-emphasise code that modifies data. Instead, we focus on functions that take immutable values as input and produce new values as output. Given the same inputs, these functions always return the same results. This is a core idea behind functional programming.
Along with not modifying data, our Haskell functions usually don’t talk to the external world; we call these functions pure. We make a strong distinction between pure code and the parts of our programs that read or write files, communicate over network connections, or make robot arms move. This makes it easier to organize, reason about, and test our programs.
Because of this, functional languages differ from traditional languages in a number of interesting ways. In functional programming:
Programming is a lot more like math. Programs are often elegant and terse.
It is much easier to reason about programs, including proving things about them (termination, absence of errors, etc.). This means compilers have much more room to optimize a program automatically: parallelizing code, merging repeated operations, and so on.
Static typing helps (and requires) you find and correct a large fraction of trivial bugs without running the program.
Pure code means that doing things with side effects (like I/O) requires significantly more up-front thought, but it also makes side effects more explicit. (A short Haskell sketch of the pure/impure split follows this list.)
Program evaluation is defined much more directly on the syntax of the language.
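To make the point about purity concrete, here is a minimal Haskell sketch (my own example, not from Real World Haskell). The pure function always returns the same output for the same input and touches nothing outside itself; the IO in the type of main marks the code that talks to the outside world.

import Data.Char (toUpper)

-- Pure: no side effects; the same input always gives the same output.
shout :: String -> String
shout s = map toUpper s ++ "!"

-- Impure: the IO type marks code that reads from and writes to the world.
main :: IO ()
main = do
  line <- getLine
  putStrLn (shout line)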
A rather big problem that technical language discussions often miss is that language ecosystems are a big deal. The bigger the mindshare a language has, the more people there are you can hire to write stuff in the language, the more people there are writing middleware and libraries you can glue together to get what you want, the more people there are discovering bugs in the language implementation, and the more literature there is in how to effectively use the language. The bigger the existing codebase for the language is, the easier it will be to find some nice place to slot your new thing in instead of rigging up life support and a special interface with the alien outside environment. Uncommon languages face an uphill battle here, unless they can have some kind of special niche they excel in. Functional programming has been around a long time, so having referential transparency by default doesn’t seem to be winning hearts and minds by itself so far. Some people think that the coming sea change from cheap sequential speedups via Moore’s Law to having to actually do some very clever things with a parallel architecture is going to favor languages where you can mostly work at a level of abstraction that doesn’t involve mutable state.
Thus far, the ecosystem thing seems to be a big practical damper on things. A Microsoft or Google sized entity going nuts and starting to do everything with functional programming would probably do a lot more in terms of language adoption via a growing pool of libraries, quality implementations, training, literature, career prospects and imitator companies than a very large amount of academic R&D work.
It’s a neat topic in general. John Backus’ Turing Award lecture (PDF) from 30 years ago was one of the early expositions of the issue, and we’re still in pretty much the same situation as was described there. There are many stories on how the higher abstraction of functional programming helped with a difficult problem, but they are often characterized by the problem being quite well-specified both in its behavior and its interface to the outside world. And practical programming today is probably even less like that than it was 30 years ago. Success stories for using functional programming in the large and in anger, like you tend to end up doing when programming things people want to pay money for, are a lot rarer than stories of elegant, self-contained programs manageable by a single programmer.
There is a support group for people who want to use functional programming in the real world.
The problem with your ecosystem argument is that since the creation of ML in the early 1970s and the creation of Haskell-like languages in the mid-1980s, many languages (Smalltalk, Perl, Java, Python, Javascript, Ruby, C#) were created that overcame the ecosystem problem. To the argument that none of those languages is as much of a departure from the status quo as Haskell and ML are, I respond that object-orientation was a very big change.
You are underestimating the ability of the programming profession to select the best tools and overestimating what functional programming languages can do for the average programmer in industry. The argument that removing mutation from a program will make it easier for the program to exploit multiple cores (what you call “parallelism”), for example, was made in 1977 in the paper by Backus you link to and has been made constantly since then, but 34 years later, the vast majority of programs that exploit multiple cores do so in ordinary, non-functional languages.
If you have a strong interest in certain areas of math, including Friendliness theory, and you also need to spend a lot of time writing programs, well, then Haskell (or Scheme using functional techniques if static type checking rubs you the wrong way) is probably worth studying because some of the things you learn from Haskell can be used in your math work as well as your programming work.
In contrast, if you just want to make a good living as a programmer, then studying Haskell and other functional languages is probably mostly a distraction and not the best way to deliver value to your customers or employers. Concepts and techniques originating in functional languages that are useful to you will become available when they are added to mainstream languages (as list comprehensions were added to Python, and as anonymous and first-class functions were added to Javascript).
I’ve heard people saying that for years now, but from what I see in practice, the main trend driven by this situation is that parallel programming in ordinary imperative languages is becoming easier, with much better support both at language level and in terms of debuggers, profilers, etc.
If you’re working with a problem that has an inherently parallel structure, you have elegant and easy-to-use APIs to parallelize it in pretty much any language these days. If not, I don’t see where using a functional language would help you. (It could be that a good functional programmer will find it easier to devise a parallelizable solution to a given problem than a typical imperative programmer would, but is there actually some advantage that a functional programmer enjoys over someone skilled in traditional parallel programming?)
Functional programming has been the future of programming for 50 years—and always will be. :)
Functional programming doesn’t work in applications where you have a large dataset, and need to change some parts of it asynchronously. If you have a 10G array in RAM and need to change one million elements as a function of the values of surrounding elements, you don’t want to pass it to a function that computes a new 10G array.
And, since traditional programming lets you write functions, it sounds like functional programming is just a restriction on what you can do. In which case, I can write functional code in Java.
I’m not convinced that “functional” is high on the list of useful programming distinctions. LISP and Prolog are both functional programming languages; yet they are very different.
Also, whenever you build a pure functional language, like pure LISP or pure Prolog, you find you can’t use it. So you build in a lot of stuff that isn’t functional, like seq and setf, or the cut operator. Then it isn’t easy to prove stuff anymore. (If it were possible to develop a programming language that you could prove things about using predicate logic, we would program in predicate logic.)
Java doesn’t have tail call optimization, lazy evaluation, type inference, or the right kind of parametric polymorphism. Also Java compilers and runtimes cannot rely on the absence of side effects. All this stuff is important. If you try to translate an idiomatic Haskell program to Java, you’ll be in a world of pain. See Joe Duffy’s STM retrospective for a very detailed account of how a team of brilliant people at Microsoft tried and failed to translate a popular Haskell library into C#.
This is the stuff that confuses me. What is functional programming? Is it defined as things that you can do when you program with functions, or is it a set of properties that go together for historic reasons? Why does programming using functions correlate with all these things?
I understand that lazy evaluation works best with functional programming. But I get type inference in Perl, and parametric polymorphism in Java, so they can’t rely on functional programming.
Perl doesn’t have type inference. It has dynamic typing, but that doesn’t help with the typical use case of type inference: you write terse code without types, and the typechecker tells you what’s wrong with it without having to run it. Functional programmers often expect their programs to work correctly on the first run. Here are some slides showing how the typechecker can detect an infinite-loop bug. They’re taken from a talk by Mark-Jason Dominus, a well-known Perl programmer.
Java’s implementation of parametric polymorphism is broken. Neal Gafter’s proposal for reified generics describes the problems nicely.
Asking for a definition of functional programming is asking for a flame war over words :-) Here’s my best attempt at an explanation: when people tried to program with pure functions, they discovered a whole world of new ideas that fit together in wonderful ways. It may not be obvious when you’re looking from the outside in, thinking that “it’s just programming with functions”. For example, Chris Okasaki’s red-black trees in Haskell take only 5 short lines of code to implement the rebalancing operation, because the supporting features of Haskell turned out to be a very good fit for Okasaki’s research on data structures.
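For reference, here is the rebalancing function being alluded to, reproduced from memory in roughly the form Okasaki gives it (so treat it as a sketch rather than an exact quotation). The four ways a red-red violation can arise are all handled by the same rotation.

data Color = R | B
data Tree a = E | T Color (Tree a) a (Tree a)

balance :: Color -> Tree a -> a -> Tree a -> Tree a
balance B (T R (T R a x b) y c) z d = T R (T B a x b) y (T B c z d)
balance B (T R a x (T R b y c)) z d = T R (T B a x b) y (T B c z d)
balance B a x (T R (T R b y c) z d) = T R (T B a x b) y (T B c z d)
balance B a x (T R b y (T R c z d)) = T R (T B a x b) y (T B c z d)
balance color a x b = T color a x b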
If I write,
$x = "3"; $y = $x + 2;
then Perl infers that $x is an integer. Why is that not type inference?
“Type inference” is a technical term. You can’t figure out its meaning by separately examining the words “type” and “inference” and then pasting those meanings together into something that vaguely makes sense. That would be like thinking that “real numbers” are the sort of numbers that are more real and touchy-feely than other numbers, and then arguing that −1 is obviously not “real”. In short, you have to read up. A good starting point is “Hindley-Milner type inference” on Wikipedia.
I already read that article before making my comment. It says, “Type inference refers to the ability to deduce automatically the type of a value in a programming language.” And Perl does that. Perl therefore does type inference.
Perl doesn’t use the Hindley-Milner algorithm, which that article says was “the algorithm first used to perform type inference”. Not “the only way to do type inference.”
More importantly, my question about type inference is a child of my question about why this is called a property of functional programming languages. A language doesn’t have to be functional to do type inference, and a functional language doesn’t have to do type inference. LISP doesn’t do type inference. So why do people bring up type inference when defining functional programming?
People don’t say Perl does type inference because Perl doesn’t do type inference in the technical sense of the term. People have other technical terms for what Perl is doing, e.g. “implicit type conversion”. In your example, Perl does not infer that “3” belongs to the type of integers, because that is false. First off, Perl doesn’t subdivide scalars into types like integers and strings. Secondly, if it did have separate types for integers and strings, then “3” would belong to the type of strings, which would have a different memory representation than integers. What Perl does is insert a silent conversion operation at runtime, which can sometimes give weird results, e.g. in your example it would happily treat “a” as the integer zero. (Type inference, my ass!) Similarly, there could be a conversion operation from strings like “$150” to values of type Money, but that doesn’t mean you should be talking about “type inference” to say such strings “are” values of type Money.
People bring up type inference when discussing functional programming because Haskell and ML use variants of the Hindley-Milner algorithm. I don’t want to talk about “defining” functional programming, or discuss whether some language should or shouldn’t be labeled “functional”, because such questions are ultimately about words and therefore useless.
Indeed, but what Perl does still isn’t type inference; type inference is something done at compile time, and doesn’t really make sense outside of a static type system. The first-sentence summary of a Wikipedia article should not be taken as a definition.
But then what /should/ be taken as a definition? I’m saying, “Reading ‘type inference’ as plain ‘inference of types’ makes sense; and Wikipedia agrees.” You say no. Citation needed.
Here’s an example of what I call type inference in Perl. It happens at execution time, not at compile time. It’s also an example of why uncontrolled inference of types is bad:
sub complement {
    my ($string) = @_;
    $string =~ tr/acgt/tgca/;
    return $string;
}

$fasta = "aaaccc";
$fasta = &complement(reverse($fasta));
print "$fasta\n";
What do you think this code will produce?
EDIT: Your comment has been substantially edited, so it looks like I’ll have to do the same for mine.
“tttggg\n”, since Perl “reverse” is for reversing lists, right? But it’s not clear what that has to do with type inference. You seem to once again be using “type inference” to mean “implicit type conversion”. There’s no need to call this “type inference” because it already had a name. If Perl used type inference—and did not implicitly convert between scalars and lists of one scalar—the program would never even start, because it would contain a type error. Again, the notion of “type inference” only applies in a static typing system, so it’s not really applicable to Perl.
If you want an actual definition, then here is my attempt: “Type inference” refers to compile-time inference of types so that type errors can be detected.
Since you work in AI, you have probably heard of the unification algorithm (of Prolog fame). Well, when a “languages” geek says “type inference” they mean something whose implementation involves the unification algorithm (among other things). (Note that most programmers using a compiler that does type inference probably do not know that it involves the unification algorithm. In other words, type inference is usually explained without making reference to unification.)
The same “languages” geek would refer to what is happening in your example as “coercion”. In particular, a string is being coerced to an integer.
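To make the contrast concrete, here is a small Haskell sketch (my own example, not from the thread). No type annotations are written; the compiler infers the types before the program runs, and a misuse like the commented-out line is rejected at compile time, where Perl would instead coerce at run time.

-- GHC infers: double :: Num a => a -> a
double x = x + x

-- GHC infers: firstWord :: String -> String
firstWord s = head (words s)

-- Uncommenting the next line gives a compile-time type error
-- (no Num instance for a string), so the program never runs:
-- broken = double "3"

main :: IO ()
main = print (double 21, firstWord "hello world")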
I disagree regarding your “large dataset” argument. Very frequently, such problems can be easily solved in a functional language using immutable data structures that share portions of their structure. In your example, if that 10GB array were represented as a functional immutable tree with a high branching factor, you could efficiently copy the bits that changed, and point the rest of the tree back to the original copy, incurring a negligible performance penalty.
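A minimal sketch of the structural-sharing idea, assuming Data.IntMap from the containers package as the tree: updating one key rebuilds only the path from the root to that key, and the old and new versions share everything else.

import qualified Data.IntMap.Strict as IM

main :: IO ()
main = do
  let original = IM.fromList [(i, i * 2) | i <- [0 .. 999999]]
      -- O(log n): copies one root-to-leaf path and shares the rest with original
      updated  = IM.insert 42 0 original
  print (IM.lookup 42 original)  -- Just 84: the old version is untouched
  print (IM.lookup 42 updated)   -- Just 0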
If I have a flat 10GB array of bytes, and I want to change an arbitrary sparse subset of those bytes, there’s no way to do that efficiently with pure functional programming.
I don’t know what the word “functional” means in the phrase “functional immutable tree”.
That constitutes an indictment of functional programming only if solving problems in the real world necessitates changing arbitrary sparse subsets of flat (e.g., not tree-like) arrays >10GB in size. If not, you’ve just defended imperative programming by referring to concepts and considerations important only to imperative programming.
I agree with you that Haskell is probably not going to displace imperative programming, but I would give other reasons for my conclusion. (I will speak of Haskell instead of functional languages to avoid tedious discussion of which languages are ‘functional’ languages.) In particular, the Haskell programmer has fewer ways than the imperative programmer has to estimate the resource usage of his programs. In particular, Haskell compilers engage in certain optimizations or collections of optimizations that tend radically to alter the asymptotic time complexity or memory-usage complexity of the program being compiled—and which alterations are made depends on esoteric details of the implementation of the optimizations. One optimization used by Haskell compilers and not used by Java, Lisp, etc, compilers that seems drastically to increase the difficulty of reasoning about the resource usage of Haskell programs is strictness analysis.
One can try to defend Haskell by pointing out that a programmer in this millennium should worry about resource usage only after experience with the program shows that better resource usage is actually required, and at that time, profiling is a more reliable way of finding the places that require optimization than static analysis of the program by the programmer. Well, two responses to that. First, efficiency is often a consideration that cannot be ignored, so it is genuinely useful for the programmer to have some ways of reasoning about it that do not depend on profiling. Second, the last time I looked in detail at Haskell, although John Hughes’s group was developing some tools for profiling Haskell programs, it is clearly a more difficult task than profiling imperative programs, and the tools are nowhere near as mature.
I have to add the disclaimer that it has been about 8 years since I stopped reading the main Haskell mailing list, and about 6 years since I wrote my last Haskell program, but Haskell itself is 21 years old, and a very similar language, Miranda, which is certainly vulnerable to the criticisms in this comment, and which was also the ‘future of programming’, is 5 years older than Haskell.
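A concrete example of the kind of resource-usage surprise described above (my own, and standard folklore): both functions compute the same sum, but the lazy foldl builds millions of unevaluated thunks and can exhaust memory, while foldl' runs in constant space. Whether the first version gets rescued by the optimizer depends on strictness analysis.

import Data.List (foldl')

lazySum, strictSum :: [Int] -> Int
lazySum   = foldl  (+) 0   -- builds a long chain of thunks before forcing any of them
strictSum = foldl' (+) 0   -- forces the accumulator at each step: constant space

main :: IO ()
main = print (strictSum [1 .. 10000000])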
Sorry, it was an extra word. I admit that there is no way to mutate an arbitrary bunch of bytes in an array besides mutating a bunch of bytes in an array. My point was the one rhollerith makes here—frequently you can solve the same kinds of problems with similar performance by using trees instead of arrays.
This phrasing makes it sound like a bad thing. I would say: Static analysis helps you to find and correct a large fraction of trivial bugs without even running the program.
(Static typing is a subset of static analysis.)
edited.
Functional languages are popular in some circles (e.g. academics interested in theorem proving), but most large-scale software development still uses imperative languages. This is precisely the kind of decision that we’d expect large software development companies to be rational about, and we should Aumann-update from that: functional programming is not the future of anything.
“This is precisely the kind of decision that we’d expect large software development companies to be rational about,”
Have you ever worked for a large software development company? In my experience the percentage of decisions made rationally tends asymptotically to 0 as the size of the company increases...
Are you sure? Software development seems like a fairly efficient market: low cost of entry, plenty of innovators, enormous market cap, few subsidies, rapid turnover of companies. Nothing about functional programming seems particularly prone to cognitive biases. It would be remarkable if software companies were all completely stupid and nobody had ever realised for long enough to become a billionaire.
I can’t talk much about my job, especially since I’m posting using my real name. But yes, I’m sure...
Functional programming has been “the future” for quite a while now.
IMO the interesting thing about FP is how laziness (enabled by purity) allows the programmer to express as ordinary functions things that would’ve required special syntax or compile-time preprocessing in a strict language. Because of this, many interesting ideas like parser combinators or software transactional memory get discovered first in functional programming and then get translated to other settings. (There’s more where that came from. If you want less familiar examples, see composing financial contracts or computational compactness.)
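A tiny illustration of that point (my own example): because arguments are evaluated only when needed, an ordinary function can play the role of a control structure that would need special syntax or macros in a strict language.

-- myIf is just a function, yet the untaken branch is never evaluated.
myIf :: Bool -> a -> a -> a
myIf True  t _ = t
myIf False _ e = e

main :: IO ()
main = putStrLn (myIf (2 > 1) "then-branch" (error "never evaluated"))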
If you want code that looks like math, I recommend looking at the line of languages that started from Kenneth Iverson’s APL. They don’t look like your typical functional code, but allow very terse expression of algorithms. See K finger exercises for a taste. Yes, these little explosions of line noise are actual working programs, and they really do what they say they do.
Well, you can write a Fibonacci generator in a functional way by closely following the mathematical definition, but it will be very slow. Our compilers/interpreters are not yet that smart.
Fibonacci numbers are a terrible choice of showcase, because they have a closed-form expression (difference of two geometric progressions) computable in logarithmic time (using square-and-multiply), while most recursive solutions require linear time.
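For concreteness, here is one standard way to get logarithmic time in Haskell (my own sketch, using matrix exponentiation, which is closely related to the closed form mentioned above): raise the matrix [[1,1],[1,0]] to the n-th power by repeated squaring and read off F(n).

type M = (Integer, Integer, Integer, Integer)  -- a 2x2 matrix, row by row

mul :: M -> M -> M
mul (a, b, c, d) (e, f, g, h) = (a*e + b*g, a*f + b*h, c*e + d*g, c*f + d*h)

-- Square-and-multiply: O(log n) matrix multiplications.
matPow :: M -> Int -> M
matPow _ 0 = (1, 0, 0, 1)
matPow m n
  | even n    = let h = matPow m (n `div` 2) in mul h h
  | otherwise = mul m (matPow m (n - 1))

fib :: Int -> Integer
fib n = let (_, f, _, _) = matPow (1, 1, 1, 0) n in f

main :: IO ()
main = print (map fib [1 .. 10])   -- [1,1,2,3,5,8,13,21,34,55]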
One well-known showcase where lazy functional languages clearly win is Hamming numbers. Here’s an LtU discussion where people attempt to translate a ten-line Haskell solution into C++ while keeping the same asymptotic efficiency, with varying results.
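For readers who have not seen it, here is roughly the standard lazy definition being referred to (a folklore version, not necessarily the exact program from the linked thread): the list is defined in terms of itself, and laziness takes care of the rest.

-- Hamming numbers: every product of the form 2^i * 3^j * 5^k, in increasing order.
hamming :: [Integer]
hamming = 1 : merge (map (2*) hamming) (merge (map (3*) hamming) (map (5*) hamming))
  where
    merge xs@(x:xt) ys@(y:yt)
      | x < y     = x : merge xt ys
      | x > y     = y : merge xs yt
      | otherwise = x : merge xt yt
    merge xs [] = xs
    merge [] ys = ys

main :: IO ()
main = print (take 20 hamming)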
Indeed—different problems are best expressed using different styles. I like Scheme for that, it’s easy to choose either a functional or a more imperative approach whenever that makes more sense. Haskell is interesting, but I find it hard to use outside the realm of computer-sciency stuff. I remember the difficulty of getting a random number, because getting one has the side effect of changing the entropy state of the universe...
No. There is no single thing that is “the future of programming”, because different problems require different solutions, and thus different tools will be appropriate for the job. There are still areas today where you are best off using assembly, for example.
I think, charitably interpreted, it means “the future of most real-world application/non-specialist programming”. Pretty much like how object-oriented programming covers most real-world application/non-specialist programming today.
This thread needs mention of Erlang, a functional language for parallel programs, designed and deployed in industry.
I see there are some functional programming buffs here, so I have a question. These functional programs, as pretty as they are, must be compiled into plain old machine code to actually run. Are these compilers really so smart that they can translate pretty idiomatic functional code into high-performing binaries? Or do you have to use ugly techniques based on intimate knowledge of the compiler/interpreter to squeeze out some decent performance out of these languages?
At the shootout it looks like the fastest Haskell programs are about 2x slower than the fastest C programs, but their code contains many non-Haskellish tricks like accessing raw memory. Idiomatic Haskell code is probably slower than that.
Do you have a sense of whether this is an inherent feature of Haskell, or whether this gap will decrease substantially as Haskell compilers improve?
A gap of about 2x on a benchmark set doesn’t say much, if anything. That’s well within the usual variance you’ll get for any single language from different compilers and from different levels of programmer skill and effort put into manual optimizations. Certainly, much more work has gone into C compilers than Haskell compilers, so one would expect that there’s much more low-hanging fruit for improvement in the latter.
That said, the really interesting figures would be those for nice idiomatic Haskell, as well as figures about what percentage of code must be written using ugly hacks to achieve performance comparable to C. The power of C lies in the fact that you can write nice, manageable, and natural-looking code while having a very good idea about the machine code that comes out of each statement you write. (In fact, C was originally meant to be used with machine code directly in mind, with no need for compiler optimization, though modern architectures are too complicated for that.) Now, in almost any language, you can force a similar approach by using ugly and unnatural hacks based on the intimate knowledge of what your compiler will produce in response. However, that defeats the purpose of using a language more fancy than C, except perhaps if you can justify it by demonstrating that such ugly hacks are necessary only for a minuscule performance-critical part of the code.
I would like to see that as well, for the reasons you mention.
It would be truly really interesting if any language managed to abstract out the process of optimization significantly more than other languages.
I’m flattered by your trust in my intuition, but I don’t trust it very much myself :-) Who knows what will be fast in five years? Anyway, Haskell is already bringing much good to the world as a vehicle for research and a source of ideas for other languages (just look at the history of C#), so I think it’s okay if it never makes it into production.
Generally speaking, functional programs get compiled into the same intermediate representations as imperative languages and perform the same. They’re at an advantage where their objects are immutable, which lets the compiler skip alias analysis and continue to optimize in the cases where alias analysis fails, but at a disadvantage when they’re dynamically typed and types can’t be inferred, since they need to insert extra branches.
Some compilers are better than others and some languages are easier to write good compilers for than others, but functional/imperative is not the deciding factor.
Which exact imperative languages are you taking as benchmarks of performance here? The lower-level ones like C or Fortran where you can squeeze out amazing performance if the compiler is any good, or the fancier ones with garbage collection, dynamic typing, bounds checking, and other features with large overheads? If a language like Haskell can actually be compiled to perform comparably with the former when written naturally and without expert hacks, I find that an astonishing feat.
To give one example pair, Java (imperative) and Scala (functional) share a significant amount of compiler infrastructure and end up being pretty much the same speed.
In practice, the answer to how fast languages are is pretty complex, and sometimes counterintuitive. For example, one of the “large overheads” you mentioned, bounds checking, isn’t a significant overhead at all. Modern compilers are very aggressive about removing bounds checks when it can be proven that they’ll never fail (which is true for almost all good code), and moving them out of loops. C programs, for security reasons, are generally run past a theorem prover (splint) to ensure that they have bounds checks where they’re needed, which means that C programs and Scala programs end up with the same set of bounds checks in the compiled output, with differences due to different features of the respective theorem provers/optimizers, which operate on intermediate representations that look nothing like the original languages anyways. Similarly, garbage collection has a reputation as an expensive feature, because of bad garbage collectors, but a good compiler can prove when most objects will leave scope, and take them out of the garbage collector’s domain; it doesn’t have to be expensive. Again for dynamic typing; most dynamically typed programs can have type inference performed on them to make statically-typed programs, it’s just that the compilers aren’t always good enough to do that. (But sometimes they are. And there are plenty of dynamically typed imperative languages around, so it’s not a functional/imperative thing.)
Fortran’s reputation for speed, as far as I can tell, is mainly due to the LINPACK linear algebra library, which started in the Fortran world but is actually written in assembly to use funky vector instruction sets these days anyways. C is great for interfacing with hardware and writing schedulers and memory allocators, because it maps so closely to assembly language; and more work has been put into C compilers than compilers for any other language. But on the sort of code that compilers are good at, it’s the same speed as OCaml or Scala, because they all end up compiling to the same thing.
(My knowledge of this subject is mainly academic; I wrote a compiler in school, and I try to keep tabs on research in the field, but I’m not an expert by any means. Take this for what it’s worth.)
You make the situation with optimizing compilers sound really optimistic! Unfortunately, it seems to me that things don’t work anywhere so well in practice. Yes, the practical handling of fancy language features has come a long way from naive straightforward implementations, but I’d say you exaggerate how good it is.
For example, I haven’t followed closely the work in bounds checking elimination, but I know that a lot of papers are still being published about it, indicating that there are still large enough overheads to make the problem interesting. (Which is not surprising, considering that the problem is after all undecidable in general. Also, as far as I know, bounds checks are normally not added by C compilers, and there are depressing limits to what theorem provers can figure out about the usual C where you pass pointers around liberally.)
It’s similar with garbage collection, dynamic type checks, and other fancy features. Their overheads can certainly be reduced greatly by smart compilers and efficient run-time operations, sometimes to the point where there is no difference from C, but definitely not always, and often not reliably and predictably.
(Fortran, by the way, has traditionally had the advantage of being highly amenable to automatic optimization, including automatic parallelization, especially when used for typical array-based numerical computations. This has in turn led to a lot of fruitful effort put into optimizing and parallelizing numerical Fortran, leading to its unprecedented performance and acceptance in these areas, and its lead has been eroded only in recent years. You can’t possibly say that this is just due to a single popular library.)
In the end, every programming language, no matter how pure and pretty, gets turned into a bunch of loads, stores, and gotos...
I find that there’s a lot of good techniques to be learned from FP, but the supposed mathematical purity of it kind of bores me.
The Python Functional Programming HOWTO (http://docs.python.org/py3k/howto/functional.html) has some nice tricks in Python.
Functional languages are the current of programming. If you agree on a combination of the following few things as being what you mean when you say “functional programming”:
Garbage-collected
Some language support for immutability
Some language support for purity
Some language support for lazy evaluation
Good language support for first-class functions
Some libraries with functional data structures and algorithms
then most of the languages people use—C++ (C++11 at least, now that it has anonymous functions), Javascript, C#, Python, Ruby, any Lisp, Perl—support functional programming to a reasonable degree, with the notable exception of Java. (But there are lots of other cool new JVM languages like Scala, Duby and Clojure.)
The same languages often support imperative programming too, but I think that’s a feature, not a bug. There will always be some hobbyists who just want to hack together something without learning very much about programming.
I guess a better question is maybe “is functional programming the future of programmers?” In which case I think the answer is yes; it seems like FP is getting more and more mindshare among good programmers and in college curricula.
I was just rummaging through some papers, and I ran into this one that seems pertinent—it highlights some important shortcomings of fancy languages when it comes to really performance-critical work:
J. Shapiro, Programming Language Challenges in Systems Codes: Why Systems Programmers Still Use C, and What to Do About It
http://www.bitc-lang.org/docs/papers/PLOS2006-shap.pdf
I’m another functional programming (and type theory) aficionado, but I don’t think they are the future for the whole of computer programming.
Some problems, such as real-time interaction with a user through a GUI, are inherently procedural and object-oriented.
And there are some problems in distributed and/or massively parallel computing that will be more common in the future, and these problems will also demand both functional and procedural aspects for their efficient and robust solution.
And the functional languages I have played with are still very weak in dealing with error and exception handling—particularly ‘hardware-like’ out-of-memory conditions or network timeouts.
I need to disagree, though, with your inclusion of “Static typing helps (and requires) you find and correct a large fraction of trivial bugs without running the program” in your list of the advantages of functional programming. The issue of static vs dynamic typing is completely orthogonal to the question of functional vs procedural control flow. There are statically typed procedural languages (C++) as well as dynamically typed procedural languages (Smalltalk). Similarly on the functional side of the fence.
I also have to disagree that it is an advantage of functional programming that “Program evaluation is defined much more directly on the syntax of the language.” I’ve always understood that it is the direct opposite—evaluation order in a procedural language is explicit in the program syntax, whereas functional languages (being side-effect free) leave much more scope for the compiler to shift evaluation steps around.
Couple of comments:
Functional programming can do procedural-style programming as well (via monads); you just have to be more explicit about the state that’s being modified.
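A minimal sketch of that point, assuming the State monad from the mtl package: the code reads like an imperative counter, but the state being “mutated” is explicit in the type.

import Control.Monad.State

-- Return the current counter value and increment it.
tick :: State Int Int
tick = do
  n <- get
  put (n + 1)
  return n

main :: IO ()
main = print (runState (do { a <- tick; b <- tick; return (a, b) }) 0)
-- prints ((0,1),2): the results of the two ticks, then the final state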
I agree static typing is somewhat orthogonal, but it seems to be the case that functional languages have much more powerful type systems than procedural ones. I don’t know enough to say whether similarly powerful type systems can be developed for procedural languages.
What I meant by “program evaluation is defined …” is that in many functional languages, a term is evaluated by transforming it into another term in the same language (such as via beta reduction), but this isn’t true in procedural languages. You are correct that functional languages have a lot more options about the order in which this is done.
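A small example of what evaluation-by-transforming-terms means in practice (my own illustration): because square is pure, a call can be evaluated by ordinary equational rewriting of the program text.

square :: Int -> Int
square x = x * x

--   square (1 + 2)
-- = (1 + 2) * (1 + 2)   -- unfold the definition of square
-- = 3 * 3
-- = 9

main :: IO ()
main = print (square (1 + 2))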
Probabilistic programming is currently being discussed as having a lot of potential.
I like the style of Functional programming, and I tend to use it where appropriate in non-Functional languages, which have varying levels of support for the style. In Java, I can define immutable data structures, and declare immutable variables within my methods. In C# I can also define immutable data structures but I can only declare mutable variables and just not assign them after initialization. In neither language can a data structure be marked as immutable (I can just define its fields to not allow assignment after initialization, and to use types I know are immutable), nor can methods be marked as “pure”. Javascript doesn’t even support tail recursion, which can make Functional style recursion have much worse performance than procedural style looping. (With good compiler optimizations on both styles, they perform the same.) I would like to see greater support within these languages for Functional programming.