Thanks for the link!
But actually, what I had in mind is something simpler that would not necessarily need such tools to be feasible. Basically, it would be akin to taking the main argument of each approach, as expressed in natural language, without worrying too much about all the baggage at finer levels of abstraction. But I guess this is not quite what the article is about.
Hi, any idea how this would compare to just replacing the L1 loss with a smoothed L0 loss function? Something like ∑_i log(1 + a|x_i|), summed across the sparse representation.
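
To make the question concrete, here is a minimal sketch of the two penalties in Python with NumPy. The function names and the default value of `a` are assumptions chosen for illustration; nothing here is taken from the post's actual training code.

```python
import numpy as np

def l1_penalty(x):
    """Standard L1 sparsity penalty: sum of absolute activations."""
    return np.sum(np.abs(x))

def smoothed_l0_penalty(x, a=10.0):
    """Smoothed L0 surrogate: sum_i log(1 + a * |x_i|).

    `a` is a hypothetical sharpness hyperparameter; the default of 10.0
    is an arbitrary choice for this sketch.
    """
    # log1p(z) computes log(1 + z) accurately for small z.
    return np.sum(np.log1p(a * np.abs(x)))

# Example sparse representation (made-up values).
x = np.array([0.0, 0.5, -3.0, 0.0])
print("L1 penalty:         ", l1_penalty(x))
print("smoothed L0 penalty:", smoothed_l0_penalty(x))
```

The log penalty is zero for inactive units and grows only logarithmically in |x_i|, so large activations are penalized much less than under L1.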