Foodpairing and Embeddings

I’ve been working on digital foodpairing and recipe generation for 7 years in a startup we founded in Copenhagen and I’d like to share some of the things I found interesting.

tl;dr

The most prominent foodpairing theory[1] based on aromatic compounds is blatantly simplistic. I made word-embeddings from ingredients to show you that there are other aspects of much more importance and I hypothesise what those might be.

What is Flavour

I want to differentiate taste and flavour. There are 5 dimensions to taste: salty, sweet, bitter, sour and umami. In addition to that there are many more aromatic chemicals. The VCF (volatile compounds in food) database contains over 7k at the moment.[2] What I would like to call flavour is the overall perception of a food when eating it (there are of course more contributors, but taste and aroma are imho by far the most important).

The reason you have taste in the first place is to avoid dangerous and seek nutritious foods. Bitter is an indicator of mould, sour indicates acid, salt and sugar are obvious and umami is a glutamate detector, which is a proxy for protein. Sugar and umami are safe and needed, so even newborn babies react positively to them. Bitter is on the other side of the spectrum (most probably something bad), which is why the taste for bitter foods has to be learned—think beer, arugula, olives and negronis.[3]

Aromas can be detected with surprising precision. We can smell the drastic difference between caraway seeds and spearmint even though it’s the same molecule, differing only in chirality.[4] How so? My favourite segue is a comparison to dogs. We all know that dogs have a tremendous sense of smell. However, their resolution of aromas is not on par with ours. The dog’s snout is in large part only a particle filter, because they breathe air from very close to the ground. Tracking hounds beat us somewhere else: the airflow. The slits on dogs’ noses allow them to create a vortex close to the ground, sampling smells from a large diameter.[3] We needed to detect poisonous plants much more than to stalk prey, hence the much greater resolution of aromatic compounds, despite the inability to smell your cousin a day after they leave your apartment (hopefully). Aromas enter your nose from the back of your mouth as you chew (as opposed to odours coming in through the nostrils). In short, our sense of smell is built to differentiate the plants we put in our mouth.

Olfaction: Odours vs Aromas

A mind-blowing experiment for you: take a leaf of mint or any other (ideally fresh) herb and pinch your nose before you put it in your mouth. As you chew you will only taste bitterness. After a while let go of your nose and there will be an explosion of “taste” in your mouth. The “taste” of mint is actually its aroma, mapped by your brain from your nose into your mouth.

The Foodpairing Theory

In 2011 prof. Sebastian Ahnert from Cambridge University made a compelling visualisation in his article[1] on foodpairing. It’s a graph of ingredients connected by the aromas they share. It turns out many of them make great pairings in Western cuisine, so the idea that “ingredients which share volatile compounds make a great match” has been picked up en masse. High-end restaurants started serving white chocolate and caviar based on this article. Something did not ring true to me here. Where do the texture, the presentation, the tastes for crying out loud come into play? If I’m basically vectorising foods, why would I choose 7,000 dimensions in aromatic space?

The backbone of the flavor network

It turns out the article was rebutted in 2015. Anupam Jain and colleagues from the Indian Institute of Technology[5] showed that in Indian cuisine, the aromas of ingredients in recipes are in an exclusive relationship: Indians tend to combine ingredients that together give the largest possible set of aromas.

Shared aromas in Indian cuisine
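To make the two competing claims concrete, here is a minimal sketch of how one could score a recipe by the aromas its ingredients share. The compound sets below are made up for illustration (they are not taken from the VCF database or from either paper), and the function names are hypothetical. A high score would fit the "shared compounds" theory; a low score would fit the pattern reported for Indian cuisine.

```python
from itertools import combinations

# Hypothetical toy data: these compound sets are invented for illustration only.
COMPOUNDS = {
    "tomato": {"hexanal", "linalool", "furaneol"},
    "basil": {"linalool", "eugenol", "estragole"},
    "mozzarella": {"diacetyl", "butanoic acid"},
    "strawberry": {"furaneol", "linalool", "hexanal", "diacetyl"},
}


def shared_compounds(a, b):
    """Number of aroma compounds two ingredients have in common."""
    return len(COMPOUNDS[a] & COMPOUNDS[b])


def mean_pairwise_sharing(recipe):
    """Average number of shared compounds over all ingredient pairs in a recipe."""
    pairs = list(combinations(recipe, 2))
    if not pairs:
        return 0.0
    return sum(shared_compounds(a, b) for a, b in pairs) / len(pairs)


caprese = ["tomato", "basil", "mozzarella"]
print(mean_pairwise_sharing(caprese))
```

Comparing this average for real recipes against randomly assembled ingredient lists is one way to ask whether a cuisine leans towards or away from shared aromas.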

Word Embeddings

I got my hands on a scraped English dataset of recipes (~2 million). At first I wanted to have a glance with my own eyes at what might be hiding there. I did not know what aspects of foods are important, so I chose a black box approach of making word embeddings from the recipes. A recipe was only to be a list of ingredients, so no natural language. As in (for a caprese salad): tomato, mozzarella, basil, olive oil, salt, pepper, toasted bread.
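For readers who want to play with the idea without any libraries, here is a rough sketch that turns ingredient lists into vectors. It is not the trained embedding model used for this post: it swaps in a simpler count-based technique, positive pointwise mutual information (PPMI) over co-occurrences. The four recipes are made up, and treating every pair of ingredients in a recipe as context for each other is an assumption.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical mini-corpus: each recipe is just an unordered list of ingredients.
recipes = [
    ["tomato", "mozzarella", "basil", "olive oil", "salt"],
    ["tomato", "garlic", "olive oil", "salt"],
    ["flour", "sugar", "butter", "egg"],
    ["flour", "butter", "egg", "salt"],
]


def ppmi_vectors(recipes):
    """Return (vocab, vectors) where vectors[ingredient] is its PPMI row."""
    vocab = sorted({ing for r in recipes for ing in r})
    pair_counts = Counter()
    for r in recipes:
        # Count every ingredient pair in a recipe, in both directions.
        for a, b in combinations(sorted(set(r)), 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    total = sum(pair_counts.values())
    row = Counter()
    for (a, _), c in pair_counts.items():
        row[a] += c
    vectors = {}
    for a in vocab:
        vec = []
        for b in vocab:
            c = pair_counts.get((a, b), 0)
            if c == 0:
                vec.append(0.0)
            else:
                # PMI = log( P(a, b) / (P(a) * P(b)) ), clipped at zero.
                pmi = math.log(c * total / (row[a] * row[b]))
                vec.append(max(pmi, 0.0))
        vectors[a] = vec
    return vocab, vectors


vocab, vectors = ppmi_vectors(recipes)
print(len(vocab), "ingredients,", len(vectors["tomato"]), "dimensions each")
```

On a corpus of millions of recipes these vectors would be very wide and sparse, which is one reason a dense learned embedding is the more practical choice.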

This was 8 years back, so Word2Vec was all the rage at the time. To my delight, the king and queen experiment worked, somewhat. Already at this point I had learned a few things. These are some of the combinations that ended up closest to the resulting vector.

Embeddings’ compositionality

Some examples are straightforward, like apple cider - apple = cider, or brown sugar + honey = agave nectar. Others were a bit cryptic. Why the hell would fennel bulb - fennel seeds = kohlrabi? Well, bite a fennel bulb with your nose pinched: it’s basically kohlrabi. Another interesting one is sesame oil - sesame = fish sauce. Fish sauce is the essence of umami, pure umami; it turns out that sesame has a bunch of umami too, so when you remove the sesame aroma, you get a “thick umami sauce”: fish sauce.
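The arithmetic behind these examples can be sketched in a few lines: add and subtract the vectors, then look for the remaining ingredient with the highest cosine similarity to the result. The 3-D vectors below are hand-picked toy values with loosely labelled axes (sweet, umami, sesame aroma), chosen only to make the mechanics visible. Real embedding dimensions are not interpretable like this, and `nearest` is a hypothetical helper name.

```python
import math

# Hypothetical toy vectors; axes loosely stand for (sweet, umami, sesame aroma).
VECTORS = {
    "sesame": [0.0, 0.2, 1.0],
    "sesame oil": [0.0, 0.9, 1.0],
    "fish sauce": [0.0, 1.0, 0.0],
    "soy sauce": [0.1, 0.8, 0.1],
    "honey": [1.0, 0.0, 0.0],
}


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def nearest(positive, negative, vectors, topn=1):
    """Ingredients closest to sum(positive) - sum(negative), inputs excluded."""
    dim = len(next(iter(vectors.values())))
    query = [0.0] * dim
    for name in positive:
        query = [q + x for q, x in zip(query, vectors[name])]
    for name in negative:
        query = [q - x for q, x in zip(query, vectors[name])]
    exclude = set(positive) | set(negative)
    scored = [
        (cosine(query, vec), name)
        for name, vec in vectors.items()
        if name not in exclude
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:topn]]


print(nearest(["sesame oil"], ["sesame"], VECTORS))
```

Excluding the input ingredients from the candidates matters: without it, the nearest neighbour of a difference is often just one of the inputs.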

A squash down to 2D and a visualisation turned out intuitive as well. I loved the cluster of sweet ingredients. They made a neat line of white sugar, brown sugar, honey, coconut sugar, agave nectar, stevia, dates; at first glance I thought they went from the sweetest to the least sweet, but then I thought it might actually be the perceived healthiness of each. Who knows, but their aromas are definitely not the only thing playing a role here.

Word Embeddings in 2d


Clustering

Here’s hierarchical clustering, to take a better look at the groupings. The first cluster that peeled off was Asian ingredients, afterwards meats (split into seafood and the rest), then starches, cheeses, vinegars, oils and so on. Here I noticed an interesting thing too: all meats are clustered in the second branch, except for a few that come in at the other side of the graph (the dark brown in the 4th image). Pancetta, ground pork, sausage, tuna, anchovies, ham and other cured meats are more of an umami seasoning than the centerpiece of a dish, which is presumably why you see them next to pickles, olives and artichokes, followed by branches of dairy ingredients and tomato sauces/​pastes.

The full image is unfortunately a bit too large to share here, but here it is anyway. The connections in the center are co-occurrences.
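For anyone who wants to try this kind of grouping on their own vectors, here is a small sketch of bottom-up (agglomerative) clustering. The choice of average linkage and cosine distance is an assumption, not necessarily what produced the image above, and the 2-D vectors are invented toy values with loosely labelled axes (centerpiece, seasoning).

```python
import math
from itertools import combinations

# Hypothetical toy vectors; axes loosely stand for (centerpiece, seasoning).
TOY = {
    "beef": [1.0, 0.1],
    "chicken": [0.9, 0.2],
    "anchovy": [0.2, 1.0],
    "olive": [0.1, 0.9],
    "pickle": [0.0, 1.0],
}


def cosine_distance(u, v):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def average_linkage(c1, c2, vectors):
    """Mean distance over all cross-cluster pairs of ingredients."""
    dists = [cosine_distance(vectors[a], vectors[b]) for a in c1 for b in c2]
    return sum(dists) / len(dists)


def agglomerate(vectors, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in sorted(vectors)]
    while len(clusters) > n_clusters:
        _, i, j = min(
            (average_linkage(clusters[i], clusters[j], vectors), i, j)
            for i, j in combinations(range(len(clusters)), 2)
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


for cluster in agglomerate(TOY, 2):
    print(sorted(cluster))
```

With these toy values the anchovy lands with the olive and the pickle rather than with the beef and chicken, which mirrors the umami-seasoning observation above, though only because the vectors were picked to show it. This naive version recomputes every pairwise distance on each merge, so it is only suitable for small ingredient sets.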

It’s imo really interesting to explore the clusters and think about why some of the ingredients might end up where they do. I’ll stop at the umami-meats hypothesis above and leave this open-ended. At the very least we see that the way we pair foods in recipes is not a system purely based on aromatic compounds. There’s culture, locality, tradition, texture, taste and probably many other dimensions that chefs and cooks intuitively take into account.

Practical advice built on top of this whole exercise over the years (tested out on many users) is this: don’t think about aroma at all if you’re a beginner cook.

Here are the things that are more important:

1) Your food has all the tastes (bitter is not necessary) and none of them is too intense
2) There are multiple textures: crunchy, smooth, chewy
3) You can tell the ingredients apart in the finished dish as much as possible
4) The food has some aroma; it doesn’t matter what spice you use as long as it’s not overwhelming

This is the kind of engineering basis of gastronomy. Aromas play an aesthetic role on top of this foundation. Your house needs walls, a roof, doors and windows; it doesn’t matter if the roof is flat or pointy. First make sure your house doesn’t fall down, and only then try to make it look like the temple you once saw in Hong Kong.

To Conclude

My motivation for sharing all this is partly that I think there’s much more information hidden in embeddings, even though it might be obscure. One can imho learn a lot from just looking at embedding spaces, clustering them, and trying to find some sort of hierarchy or an interesting linear relationship. After all, they are a lens into the space of meaning and I think we don’t appreciate that enough. The embeddings here imho show quite clearly that the aroma-based foodpairing theory does not play the fundamental role in how we come up with recipes, even though I can’t quite tell you what does.

There’s more I’d like to share about foods and recipes. Maybe the useful fact that there’s only a handful of recipes in the world: many differ only by their aroma, while their preparation is often identical. And how these meta-recipes work. But more on that some other time, if you guys don’t eat me alive for my first submission.

  1. ^
  2. ^
  3. ^
  4. ^
  5. ^