How to use SMILE to solve Bayes Nets

This is an account of downloading and using SMILE, a free-as-in-beer-but-not-open-source bayes net library. SMILE powers GENIE, a graphical bayes net tool. SMILE can do a lot of things, but I only used the simplest features—building a network and, given evidence, inferring probability distributions on the unobserved features.

After registering, there is a download page, containing binary distributions for various operating systems and architectures, including a .tar.gz intended for 32-bit Linux, which is what I picked. Possibly someone else can describe the experience of using it from Windows. The tarball turned out to contain about 50 .h files (C++ header files) and two .a files (statically linked libraries). I also downloaded the “smilehelp.zip”, which turned out to contain a .chm (compiled hypertext help). I didn’t have a viewer at the time, but gnochm worked out of the box. Most of the smilehelp text was only moderately helpful, but the appendices with tutorial code were very helpful.

As I understand it, there are a lot of different things that one could be interested in doing (influence diagrams, learning parameters, learning structure) I did the very simplest thing, building a net and performing inference on it. This is basically “tutorial 2” from the reference manual. (Note: The tutorial 2 code is not correct, but the appendix tutorial 2 code is).

The experience of using the library was not magically smooth, but then, the number of large-ish old-ish libraries which are magically smooth to use can probably be counted on the fingers of one foot. It might be obvious to some, but the friction impressed on me that even if some people are getting solid results that they trust from SMILE, I need to verify my use of it independently. Even straightforward uses of a trusted library could be buggy or (less likely) an unusual usage could reveal bugs internal to the library which existing users never encounter.

The two annoyances that I encountered were Law of Demeter ugliness and the flat conditional probability tables. The Law of Demeter is the principle that interfaces should not reveal their implementations, or alternatively, clambering through object graphs is a bad idea. It is a version of the general programmer injunction: “Decouple!”. Of course, once an interface is published, it is very hard to break backwards compatibility and revise away from it—there are almost certainly best-option-at-the-time historical reasons for the SMILE interface to be the way it is.

In order to understand the issue with the conditional probability tables, consider writing a data structure to support interactive editing of a Bayes Net. The graph-editing parts are relatively standard and straightforward, but each node has a table of probabilities associated with it, and the table’s dimensionality (number of dimensions, as well as size of dimensions) depends on the node’s parents, and they might change as the graph is edited. This sounds somewhat tricky.

The data structure that SMILE seems to have picked is a flat sequence of probabilities. The flat sequence is interpreted as a grid through the standard mixed-radix arithmetic. The first parent is the most-significant dimension, then the next parent, and so on, and the node itself is the least-significant. So if a node X has two parents, Y and Z, in that order, and all three of them have two outcomes, true and false, in that order, then the probabilities go like this:

P(X|Y&Z), P(~X|Y&Z), P(X|Y&~Z), P(~X|Y&~Z), P(X|~Y&Z), P(~X|~Y&Z), P(X|~Y&~Z), P(~X|~Y&~Z)

But of course, that’s the symbolic form. Inside someone’s code (or inside the XML), you might see something like this:

0.8 0.2 0.9 0.1 0 1 1 0

There is an iterator-like class, sysCoords, which encapsulates the mixed-radix logic one uses to navigate the matrix, but I couldn’t find a clean symbolic way to access or modify “P(foo=bar|baz)”. Maybe most of the users learn to read these sequences, or maybe most users learn parameters or use noisy-or models.

A graph with four nodes (Strategy, X, Y, Outcome), and directed arcs from Strategy to X and Y, from X to Y and Outcome, and from Y to Outcome.
The net that I built is quite simple—an absent-minded driver picks one of three strategies at random—always take the exit, always continue, or flip a fair coin. After that, the driver tries to get home in the usual way, following that strategy. To build a net with SMILE it is probably helpful to be very clear on the network structure, and in particular a standard linear order (topological sort) of the nodes in your net.

Very probably, it is almost never appropriate to build a net like this inside a executable. SMILE makes parsing a net very easy, either from XDSL format, or any of several competing formats. Building a net as an XML document would be easier to write, easier to read, and more portable. Note: The smilehelp document claims that to use the XML format, you need a separate library, but this is no longer correct. Another small quirk, which took some debugging: strings that are names of things should not contain spaces.

The best part about SMILE interface is the inference step—one method call on the network: “UpdateBeliefs()”. There is a standard “Lauritzen” algorithm for computing beliefs. Setting evidence requires the usual (LoD violating) graph navigation (from the network to the node to the node’s value), but it seemed relatively straightforward. Of course, I may have grown accustomed to the interface.

Once you get comfortable with the flat conditional tables, it seems very feasible to use SMILE as a bayes-net calculator. I look forward to the day when people discussing probability on LessWrong routinely exchange bayes nets, either in XDSL some other standard format.

If you’re interested in reading my code, it’s here. Since the SMILE authors want people to register before downloading the library (I presume so they can use the numbers to justify their research to grant-giving institutions), I didn’t include the library in my tarball, so you’ll have to download it separately.