I don’t think the Hamming advice is so great. It’s akin to asking, “What are the highest-salary professions? Why aren’t you entering them?”

Academia is a marketplace. Everyone wants high research impact for a given expenditure of time. Some opportunities are higher-value than others, but as those opportunities appear, other researchers are going to identify them too.

So in academia, as in the economy, it’s better to identify your comparative advantage, both short-term and long-term. You usually need to publish something quickly, so you need to know what you can do right away. But you also want to plan for the medium and long term. It’s a difficult trade-off.
It is worth noting that advice is given to particular people, and Hamming was asking this at Bell Labs.
I totally agree. But in the job market, I have search tools to find the best job close to where I live, within my skills, and in my salary range, so I can maximize my comparative advantage. And don’t even get me started on all the tools and advice you can get for the stock market. But there is currently no tool for maximizing the comparative advantage of volunteer work. The good news for me is that there are a lot of existing tools similar to what I want to build, so I don’t have to be terribly creative.
You did give me an idea. Let me edit my post.
Did the most important achievements in academia 30 years ago, in any field, come from the problems considered most important at the time?
I have no idea how much stuff didn’t pan out, but we’ve been steadily chipping away at AI since the 1940s, and I can’t imagine that AI was considered unimportant. We also made gigantic strides on user-friendliness/interfaces. I’m not sure if academia thought it was important, but consumers and businesses thought it was.
AI doesn’t seem to be a single problem but a label for a broad field.
How do you know that businesses thought it was very important? By the amount of money invested in improving user experience compared to money invested in other areas?
If we look at Word, it seems like it took them till Word 2007 to finally focus on user experience instead of cramming in as many features as possible. They didn’t get its importance two decades ago, only one decade ago.
I don’t really want to debate definitions. But that is exactly why I want the sorter to break down ‘big problems’ like AI into ‘little problems’ like neural networks, search, etc.
Because people keep spending money on marginal user interface improvements that have added up to big differences in user interfaces. The easier the interface is to use, the more people will be able to use it, and the more people will buy it.
[Here is a guide to graphic interfaces over time](http://toastytech.com/guis/guitimeline.html). Start about 30 years ago, when the Macintosh comes out.
“I was shocked and started wondering why I wasn’t working on anything like that? Then I wondered: Why wasn’t EVERYONE working on something like that?”

If you don’t understand why, I think it’s worth your time to figure it out. Do you understand why everyone working on the most important problem is a bad idea?
Haha. Yeah, on reflection I later understood. I promise not to show only the ‘most important’ problems. The marginal utility of working on a problem is higher when no one else is doing it. But if there are neglected important problems, then I want to find them.
The first computer to visit my home arrived in 1972. The first to stay, 1977. I was doing simple hardware and software work in the 1980s, began using open source software in 1991, and have been on the internet since 1992. Nothing holds me back in using computers more than inadequate documentation.

I contributed better documentation in the past. The field changed, and my past work helps no one now. I struggle to keep up, as everyone does, but now I know that if I help with documentation (a) it is fleeting, which is acceptable if discouraging, and (b) the noise-to-signal ratio in online documentation is too high to compete with. I will help myself and help people in person, but I see no way to help overall.

There was a time when a man could read every printed book on the planet. There weren’t many. That stopped being possible long ago. Knowing all there is to know to build a computer and put software on it: that is still done, but not by many. And I’d say no one anywhere knows what it takes to make all the hardware and software for a simple smartphone. The whole field shuts out the individual now. It is no fun any more.
The quote hits close: I like open source, yet at work I am “hacking” a closed-source ERP/order-processing/accounting/manufacturing etc. package.

The reason is valuing a stable, reliable income over job satisfaction. As long as my customers and employers want to use this instead of open-source competitors, I am OK with it.
Why do customers want to use this ERP and not the open source ones?
ERP has intricate features that are not obvious to users yet crucially important. Consider FIFO inventory valuation. Suppose you buy something with a purchase invoice, build it as a component into another product, move it to another warehouse, sell it, and then take it back from the customer because of a mistake during invoice posting, so it comes back into inventory. At this point you book another purchase invoice, a shipping/delivery cost for the original purchase, which increases the cost/inventory value of that original purchase, and that cost should flow through the whole chain, eventually increasing the value of the finished product you just took back from the customer! This is the only way it is legal (in the EU at least), and it is HARD! There is software with a wide user base for 20–25 years that still doesn’t get it right; there are bugs in this process, and bugs here are fully UNACCEPTABLE, because they don’t mean something minor like your software crashing. They mean something major: the earnings you put into your P/L (income statement) are completely wrong (as COGS is a big part of it), the taxes you pay are wrong, you get fined, and the CEO goes to prison.
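To make the cost-forwarding problem concrete, here is a minimal sketch of the mechanics. The data model is entirely made up (no real ERP works this way), and it ignores partial layers, double-entry postings, and warehouse moves; it only shows why a late freight invoice has to chase the goods down the whole build chain:

```python
# Toy FIFO cost layers; hypothetical model, not any real ERP's schema.
from dataclasses import dataclass, field

@dataclass
class Layer:
    item: str
    qty: float
    cost: float                                   # total value of this lot
    children: list = field(default_factory=list)  # (qty consumed, derived layer)

    def unit_cost(self) -> float:
        return self.cost / self.qty

def build(component: Layer, qty: float, product: str) -> Layer:
    """Consume `qty` units of a component lot into one unit of product."""
    out = Layer(product, 1, component.unit_cost() * qty)
    component.children.append((qty, out))
    return out

def add_landed_cost(layer: Layer, extra: float) -> None:
    """Book a late shipping invoice against a purchase lot, forwarding each
    consumed unit's share of the adjustment down the whole chain."""
    for qty, child in layer.children:
        share = extra * qty / layer.qty
        add_landed_cost(child, share)
        extra -= share
    layer.cost += extra       # the remainder revalues what is still on hand

purchase = Layer("component", 10, 100.0)   # buy 10 pcs at 10.00 each
finished = build(purchase, 2, "finished")  # 2 pcs go into one finished good
print(finished.cost)                       # 20.0
add_landed_cost(purchase, 50.0)            # freight invoice arrives later
print(finished.cost)                       # 30.0 -- value was forwarded
```

Even in this toy version the adjustment has to recurse through every derived layer; in a real system each of those steps is also a ledger posting, which is where the bugs live.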
Customers have no way of verifying this at a sales demo.

They go for big established brand names: SAP, Oracle, Microsoft, etc. They hope a hundred thousand companies cannot all have their earnings, P/L, income statement, COGS, and hence inventory valuation wrong, so hopefully the big software houses got it right and bug-free. Well, this is half true.

They are kind of right. Back then I investigated two open source ERPs, Compiere in 2004 and TinyERP (later OpenERP, now odoo.com) in 2009, and they did not even TRY this process. There was no FIFO cost forwarding whatsoever! It was as if you entered the purchase price of an item as master data and the job was done. UGH.
The majors are still buggy in corner cases, at least as far as I can see from the feedback of others and from hands-on work with the one I am working with.

These are not acceptable bugs like “OK, Firefox is crashing, so what.” They are “NASA bugs”, things that Must Not Be.
Don’t confuse open source with free. There are companies who manage to sell open source software, especially software that takes a lot of adaptation to a particular company’s use case.
Here’s something that has some aspects of what you’re asking for, though it doesn’t do exactly what you want: http://www.codetriage.com/
In case it helps, here’s a rough list of the thoughts that have come to mind:
Simplicity is usually best with voting systems. It may be worth looking at a Reddit-style up/down system for popularity. With importance you probably want high/mid/low. If you track the ‘importance profile’ of a user, you could use that to promote projects to their attention that other users with similar profiles find important (see the sketch after this list). Also, in all these rankings it should be clear to the user exactly what metric is being used.
Make use of the wisdom of crowds by getting users to evaluate projects/tasks/comments for things like difficulty, relevance, utility, marginal utility—along the lines of this xkcd comic.
It seems to me that a good open source management tool should direct attention to the right places. Having built-in problem flags that users can activate to bring a problem to the attention of someone who can solve it seems like a good idea.
Skill matching. Have detailed skill profiles for users and have required skills flagged up for tasks.
Could try breaking projects up into a set of tasks, sub-tasks and next actions à la Getting Things Done.
Duolingo provides free language courses. They plan to make this financially viable by crowd sourcing translations from their students. Perhaps a similar thing could be implemented—maybe by getting university students involved.
Gamification across a broad range of possible tasks. Give points for things like participation, research, providing information. While rewarding programmers for coding is good, we should seek to reward anything that lowers the activation energy of a task for someone else.
Keep a portfolio of work that each user has completed in a format that is easy for them to access, customize and print out and reference in job applications.
Encourage networking between users with similar skills, areas of interest and the like. This would provide a benefit to being part of the community.
You could have a Patreon-like pledging system where people pledge a small amount to projects they consider important. When the project reaches a milestone, the contributors are then rewarded with a portion of the pledges.
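For the ‘importance profile’ idea above, here is a rough sketch of what the matching could look like. All the user names, project names, and ratings are invented, and a real system would want something more robust than this toy user-based collaborative filter:

```python
# Recommend projects based on ratings from users with similar
# "importance profiles" (high/mid/low encoded as 3/2/1).
from math import sqrt

profiles = {
    "alice": {"kernel-docs": 3, "gui-toolkit": 1, "erp-fifo": 2},
    "bob":   {"kernel-docs": 3, "gui-toolkit": 1, "proof-tools": 3},
    "carol": {"gui-toolkit": 3, "erp-fifo": 1},
}

def similarity(a, b):
    """Cosine similarity over the projects both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[p] * b[p] for p in common)
    na = sqrt(sum(a[p] ** 2 for p in common))
    nb = sqrt(sum(b[p] ** 2 for p in common))
    return dot / (na * nb)

def recommend(user, k=3):
    """Rank projects the user hasn't rated by similar users' ratings."""
    me = profiles[user]
    scores = {}
    for other, ratings in profiles.items():
        if other == user:
            continue
        w = similarity(me, ratings)
        for project, importance in ratings.items():
            if project not in me:
                scores[project] = scores.get(project, 0.0) + w * importance
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend("alice"))  # -> ['proof-tools']
```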
Duolingo doesn’t make profits. Its investors believe in it, but it is still too early to say that it’s really financially viable.
Thanks. I’ve edited the comment to reflect this better.
Good advice. Since I wanted a lot of things to be weighted when determining the search order, I considered just hiding all the complexity ‘under the hood’. But if people don’t know what they are voting on, they might be less inclined to vote at all.
The way I view it, search rankings are a tool like any other. In my own experience in academic research I’ve always found that clearly defined search rankings are more useful to me than generic rankings; if you know how the tool works, it’s easier to use correctly. That said, there’s probably still a place for a complex algorithm alongside other search tools; it just shouldn’t be the only search tool.
Well, I think it’s more a matter of efficiently extracting information from users. Consider the LessWrong karma system: while it serves its purpose of filtering out spam, it’s a very noisy indicator of anything other than ‘people thought this comment should get karma’. This is because users think we should vote things up or down based on different criteria, such as: do I agree with this comment? did this comment contain valuable information for me? was this an amusing comment? was this comment well reasoned? and so on.
By clearly defining the voting criteria, you’re not just making users more likely to vote, you’re also extracting information from them more efficiently. From a user perspective this can be really useful: knowing that a particular rating measures the popularity or the importance of a project, they can then choose whether to pay attention to or ignore that metric.
Have you read The Cathedral and The Bazaar?
Not really. Is there a lot to the book beyond the Wikipedia summary?
It certainly describes “efficient open source” in a detailed way.
So, basically a review site for OSS?
It’s kind of a cross between a job site and a review site. In theory.
I think the reason something like this doesn’t already exist is the barriers to entry. I’ve often toyed with the idea of contributing to several large Linux distributions, like Ubuntu. But the barriers to entry for this are immense. Often you have to deal with obscure version control systems, archaic build tools, and strange development environments before you can even get to the actual process of contributing to the project. In my experience, the more important a software project is, the harder it is to contribute. Contributing some extra CSS to a web-UI project on GitHub is often very easy, whereas fixing a bug in a C library kept in a bzr repository that builds itself with a modified 1998 version of GNU Autotools is hard.
If these barriers to entry were lower, it would be easier to contribute.
That’s a good point. I shouldn’t just list skills by the goal of all the similar projects but also by the individual projects. If one Linux distribution is way easier to contribute to than the others, users should know that.
I can offer one data point. A lot of people consider P vs. NP to be the ‘most important problem’ in CS. I personally don’t think it’s a very interesting problem. I don’t think any solution to it would necessarily reveal any major new insights. And I think it’s far beyond my abilities to work on.
I’d be pretty surprised if someone solved PvNP without major new insights, but in any case Hamming (in the same place the quotation comes from) is quite explicit that by “important” he doesn’t mean “important, conditional on a successful breakthrough” but “important, taking into account the likelihood of getting somewhere”. (His example: no one in physics is seriously working on time travel, teleportation or antigravity, not because they wouldn’t be vastly exciting and useful but because we don’t have a credible line of attack on any of them.)
I don’t think any solution to P vs. NP would necessarily reveal any major new insights. To understand why, you have to think about the nature of the problem. P vs. NP is about (a) worst-case (b) asymptotic complexity of (c) decision problems. However, most of the time we are not concerned with worst-case performance, we are not concerned with asymptotic complexity, and we are not concerned with decision problems.
A related but far more interesting question is the “five worlds” problem: http://blog.computationalcomplexity.org/2004/06/impagliazzos-five-worlds.html
It takes a bit more effort to formalize this than it does to formalize P vs. NP, but the benefit of this added formalism is that you get a set of problems with far more theoretical and practical importance.
If you meant, strictly and technically, only that a solution to P v NP wouldn’t necessarily do much, then indeed I agree. But if you mean it’s quite likely that if P v NP is resolved then it won’t make much difference, I’m not so convinced.
A solution to P v NP could make a big difference in at least three ways.
Directly. If the answer were that P=NP then it would mean that “efficient” solutions exist to many families of problems not currently believed to have them. These include problems about which it’s reasonable to care a lot—e.g., the problem of finding a proof or disproof, if one exists, of a given mathematical proposition. Now, of course (1) it could be that the only solutions to these take time of order N^1000, with a large constant of proportionality and large lower-order terms, and (2) the proof that P=NP might fail to tell us how to find those solutions at all.
Via the way it’s proved. At present, so far as I know no one has any credible idea how to prove or disprove P=NP. Some of the more plausible-looking approaches are in some sense known not to work. So a proof or disproof of P=NP would most likely need some powerful new ideas. Again, of course it’s possible (1) that actually there’s some extraordinarily unenlightening proof that somehow avoids containing any discernible new ideas at all, or (2) that the new ideas needed are perfectly chosen not to give any insight into other questions of computational complexity. But either of those would be quite a surprise. (And the most plausible way for #1 to happen would be that the proof shows P=NP by supplying an incomprehensible 100-page algorithm for 3SAT or something—but, see above, a constructive proof of P=NP would itself be very exciting.)
Via intuition adjustments. If the answer were that P=NP it would overturn what almost everyone in the field currently thinks. Lots of complexity theorists’ intuitions would need radical readjustment. If you take a whole lot of very smart people who have been hamstrung by completely wrong intuitions and you fix those intuitions, it seems reasonably probable that something good will result.
Again: I don’t disagree with the literal sense of your original (and now reiterated) claim: a solution to P v NP wouldn’t necessarily reveal new insights; maybe there are (apparently epistemically) possible worlds in which P v NP gets resolved and nothing else changes all that much. But if I have to make a bet on how much difference the resolution of P v NP makes, at least if it comes any time soon, I’d bet on “large” over “small”.
“the problem of finding a proof or disproof, if one exists, of a given mathematical proposition.”

Again, P vs. NP is about decision problems. You’re making the classic mistake, repeated often in pop-sci accounts by non-mathematicians, of thinking that P vs. NP is about finding mathematical proofs. It isn’t. An example of a decision problem would be “does a proof of this proposition exist?” or “does a proof of this proposition shorter than N characters exist?”, not “find a proof of this proposition.” Depending on the type of proof system, the decision question could be trivial, NP-complete, NP-hard, or undecidable. In most proof systems we care about, it’s not NP-complete. It’s often NP-hard.
“If the answer were that P=NP then it would mean that ‘efficient’ solutions exist to many families of problems not currently believed to have them.”

It wouldn’t. You should change “would” to “could”. If an efficient solution to SAT had a provable lower bound of Ω(n^100000000), it would make very little practical difference whether P = NP or not.
“So a proof or disproof of P=NP would most likely need some powerful new ideas.”

Most likely, and the proof would be interesting indeed and worthy of study. But could it prove anything other than P vs. NP? Would it give any information about which of the five worlds we live in? Maybe, maybe not. Despite the cosmetic similarity, the five worlds question is mathematically quite different from P vs. NP, so there’s no reason to think that it would.
“If the answer were that P=NP it would overturn what almost everyone in the field currently thinks.”

It’s very, very, very unlikely that the answer is P=NP.
All your points merely serve to reinforce my initial point. You’re talking about non-asymptotic complexity of non-decision problems. This is what people really care about. All I’m saying is that there’s no reason to assign high likelihood to a proof of P vs. NP offering any new insights into the problems people really care about. All you’ve said so far is consistent with this claim.
I beg your pardon; I was sloppy. What “efficient” P=NP would trivialize is, indeed, not finding proofs but determining which propositions are provable. In practice I very strongly suspect that this would make finding proofs vastly easier too; having established that a certain proposition has a proof, one could “try out” various candidate intermediate steps and figure out a path by a sort of bisection procedure. And it wouldn’t be a big surprise if inspection of a proof-existence-checking algorithm found a way to extract actual proofs from it, but of course that would depend on the details of how it turned out that P=NP.
No, wait. Surely the following questions are in NP. (1) “Does proposition p have a proof whose length is N characters?” (2) “Does proposition p have a proof whose length is N characters and whose first n characters are c1...cn?” If so, then P=NP implies a polynomial-time procedure where first you nail down a possible length, then you find characters one by one. The time this takes is polynomial in the length of the proof, so of course it’ll be less than helpful if the proof is inhumanly long, but the point is that the problem here isn’t that PvNP is only about decision problems.
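A sketch of that character-by-character procedure, with a stand-in for the hypothetical polynomial-time decision oracle that P=NP would provide. The “oracle” here just hides a fixed string, purely for illustration; nothing in this snippet is a real prover:

```python
# Recover a witness (here, a "proof" string) using only decision queries.
ALPHABET = "abc"  # proof alphabet, tiny for illustration

def find_proof(proposition, oracle, max_len):
    """oracle(prop, n, prefix) answers: does a proof of length n
    starting with `prefix` exist? Each step below is one pass over
    the alphabet, so the total work is polynomial in the proof length."""
    # 1. Nail down a length for which a proof exists.
    n = next((n for n in range(1, max_len + 1)
              if oracle(proposition, n, "")), None)
    if n is None:
        return None
    # 2. Extend the proof one character at a time.
    prefix = ""
    for _ in range(n):
        prefix = next(prefix + c for c in ALPHABET
                      if oracle(proposition, n, prefix + c))
    return prefix

# Toy oracle: the "proof" of proposition X is just a fixed secret string.
SECRET = "cab"
def toy_oracle(prop, n, prefix):
    return n == len(SECRET) and SECRET.startswith(prefix)

print(find_proof("X", toy_oracle, 10))  # -> 'cab'
```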
“It wouldn’t. [...]”

I do think it would have been better if you had read the whole of the paragraph you were criticizing and observed (1) that there was a reason for the quotation marks around “efficient” and (2) that I already explicitly made the same point you were making.
“could it prove anything other than P vs. NP?”

Gosh, what a good point. If only I’d thought to say something about that. Oh, wait, I did; that was actually what pretty much the whole of that paragraph was about.
“It’s very, very, very unlikely that the answer is P=NP.”

Right. (“It would overturn what almost everyone in the field currently thinks”.)
“no reason to assign high likelihood [...] new insights into the problems people really care about”

You have given no justification at all for this; you’ve merely made the (correct but I think not very interesting) observation that it’s theoretically possible so far as we currently know that someone might resolve PvNP without such new insights, and passed without comment over everything I wrote that addresses the question of how plausible that really is.
Obviously you’re under no obligation to engage at all with what I write, but if you don’t want to then why bother replying at all?
“If so, then P=NP implies a polynomial-time procedure where first you nail down a possible length, then you find characters one by one.”

Yup! Even the decision version of the P vs NP question is certainly about search, for this reason (contrary to what the grandparent is claiming). One can generally solve optimization problems via a decision algorithm by paying at most a polynomial cost, precisely as you suggested.
I seem to remember that Levin’s work was about the optimization version of P vs NP, and Cook’s work was about the decision version. They are both given credit these days.
This is only true if P=NP, which is very unlikely! If P≠NP, the far more likely scenario, then we get no new information about the time complexity of search.
??? Am I missing something?
If the decision versions of P and NP are not equal, then the search versions of P and NP are not equal either. This is because if the search versions were equal, we could use a polynomial algorithm for the search version of an NP-complete problem to solve the decision version of that NP-complete problem in polynomial time, and thus all problems in the decision version of NP would also be in the decision version of P.
If the search versions of sets P and NP are not equal, then we get a lot of information about the time complexity of finding proofs that are easy to check as valid (e.g. most proofs mathematicians care about). In particular, there will be no general way of finding an easy-to-check-as-valid proof (of length N) of a statement in time polynomial in N, if the statement and corresponding proof encode an NP-complete problem and solution of some sort.
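Concretely, this is the standard search-to-decision (self-reducibility) trick, sketched here for SAT. The brute-force “oracle” below is exponential and only stands in for the polynomial decision procedure a P=NP world would supply:

```python
# Recover a satisfying assignment using only yes/no satisfiability queries.
from itertools import product

def sat(clauses, fixed):
    """Decision-oracle stand-in: is there a satisfying assignment extending
    `fixed`? Clauses are lists of signed variable indices."""
    free = sorted({abs(l) for c in clauses for l in c} - set(fixed))
    for bits in product([False, True], repeat=len(free)):
        assign = {**fixed, **dict(zip(free, bits))}
        if all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def find_assignment(clauses):
    """Fix variables one at a time, asking the oracle after each choice;
    only a linear number of extra oracle calls, no backtracking needed."""
    if not sat(clauses, {}):
        return None
    fixed = {}
    for v in sorted({abs(l) for c in clauses for l in c}):
        fixed[v] = True
        if not sat(clauses, fixed):
            fixed[v] = False  # must be satisfiable the other way
    return fixed

# (x1 or not x2) and (x2 or x3)
print(find_assignment([[1, -2], [2, 3]]))  # -> {1: True, 2: True, 3: True}
```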
That is, “either there is no insight in mathematics in general, or the Church-Turing thesis is false, and humans are capable of crazy things.” (???)
“Gosh, what a good point. If only I’d thought to say something about that.”

I know what you wrote. What I’m emphasizing is the mathematical differences between the five worlds question and the P vs. NP question, to counter your point.
“Right. (‘It would overturn what almost everyone in the field currently thinks’.)”

Again, I know what you wrote. But this is a point that is worth repeating. And I’ll repeat it again: It is very, very, very, very unlikely that P=NP.
“You have given no justification at all for this”

The burden of proof is on you, here. I’m saying that there’s no reason to assign high belief to what you claim. You must provide a reason for assigning high belief.
That would be an appropriate response if I had just said “Nope, P v NP is really important and resolving it would probably have useful and interesting consequences”. Since what I actually did was to give some reasons for thinking that, and what you actually did was to dismiss them out of hand and (in some cases) write as if I simply hadn’t written what I did, I’m not sure what further response I can usefully make.
As for the burden of proof, allow me to remind you that your comment that launched this discussion began this way:

“A lot of people consider P vs. NP to be the ‘most important problem’ in CS.”
It seems to me that that fact is already enough to put the burden of proof on the one who contends that P v NP doesn’t matter. You might, of course, be right, but it seems that “a lot of people”—including, my impression is, most people with substantial expertise in this stuff—take a different view.
(Of course “burden of proof” is, when it’s a useful idea at all, merely shorthand for something to do with prior probabilities. So I can rephrase the above: given that experts in the field generally appear to think that P v NP is important and interesting, and to expect valuable consequences if it’s resolved, it seems obviously reasonable to me to give substantial probability to that in advance of investigating the question more directly. And it turns out that the question is really difficult to investigate directly, and our posterior probability estimates should be largely determined by our priors.)
I will say at slightly greater length why I think it likely (not certain) that a resolution to P v NP would be interesting and valuable:
It’s a question lots of very smart people have thought about a lot, without getting close to a proof.
Therefore, it seems likely that resolving the question will require new ideas that are difficult even for very intelligent experts to find.
It’s quite a central question in its field: lots of other questions regarded as important seem to be closely related to it (e.g., questions about equality or inclusion of other pairs of complexity classes).
Therefore, it seems likely that new ideas powerful enough to resolve P v NP might bring progress on other fairly central open problems in the field.
The suspicion that resolving P v NP is difficult and requires new ideas comes not only from the fact that lots of clever people have failed to do it, but also from various theorems that show that whole classes of approaches can’t work. (E.g., there are oracles relative to which P=NP and others relative to which P!=NP, so no relativizable proof can resolve PvNP either way.)
This at least suggests (though it doesn’t prove) that a resolution of PvNP will need to engage in some fairly deep manner with the actual nature of P and NP. Which in turn seems like reason to think it would require useful new ideas.
The most obvious way for a resolution of PvNP to fail to have interesting new ideas in it would be for it to proceed by exhibiting a polynomial-time algorithm for some problem known to be NP-hard, but for that algorithm to be fearsomely and impenetrably complicated, supplying no insight to human readers.
But that would mean showing that P=NP, which would itself be a huge surprise (and overturn lots of things that experts thought almost certainly true). Repairing all those experts’ intuitions seems like it would be likely to lead to progress elsewhere.
And P=NP would be useful in itself, unless the polynomial-time algorithms for NP-hard problems always turned out to take time like 1000000000 n^1000000000. Which seems not very likely since the exponents and constant factors in actual algorithms people have found are very seldom like that.
Historically, resolutions of famous hard mathematical problems have often turned out to help (more or less directly) to make progress beyond the mere resolution of those problems.
For instance, the ideas that led to the proof of “Fermat’s last theorem” have since then also led to a proof of the independently important Taniyama-Shimura conjecture (FLT having required TS only for “semistable” elliptic curves).
Although I say “independently important”, it might be better to say that among mathematicians, as opposed to readers of popular mathematics books, FLT was interesting only because of its connection to the Taniyama-Shimura conjecture.

Another recent example: The key idea behind Perelman’s resolution of the Poincaré conjecture (“Ricci flow”) has been used to prove other (related but) separately interesting questions in differential geometry, such as the “quarter-pinched sphere theorem”.
There’s no obvious reason why something similar wouldn’t happen with a resolution of P v NP.
I’ll just leave this here: http://math.stackexchange.com/questions/1892/the-practical-implication-of-p-vs-np-problem
I mostly agree with passve_fist:
A problem can be in P and still not be computationally feasible; O(n^10000) algorithms are still in P.
Almost always a very good approximate answer is good enough. The travelling salesman problem is in NP, but we have efficient enough approximations (not just in P, but with small exponents) that the benefit of getting to perfect is low. (A sketch of one such approximation follows.)
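As a concrete example of “efficient enough”: the textbook double-tree heuristic, a minimal sketch assuming Euclidean (hence metric) distances. It runs in roughly O(n^2) time and its tour is provably at most twice optimal:

```python
# Metric-TSP 2-approximation: MST, then shortcut a DFS preorder walk.
import math

def tour_2approx(points):
    n = len(points)
    dist = lambda i, j: math.dist(points[i], points[j])
    # Prim's algorithm for the minimum spanning tree.
    in_tree, parent = {0}, {0: None}
    best = {i: (dist(0, i), 0) for i in range(1, n)}
    while len(in_tree) < n:
        v = min(best, key=lambda i: best[i][0])
        parent[v] = best[v][1]
        in_tree.add(v)
        del best[v]
        for i in best:
            if dist(v, i) < best[i][0]:
                best[i] = (dist(v, i), v)
    # DFS preorder over the tree = the shortcut tour.
    children = {i: [] for i in range(n)}
    for v, p in parent.items():
        if p is not None:
            children[p].append(v)
    tour, stack = [], [0]
    while stack:
        v = stack.pop()
        tour.append(v)
        stack.extend(reversed(children[v]))
    return tour

pts = [(0, 0), (0, 2), (3, 1), (1, 1), (2, 2)]
t = tour_2approx(pts)
cost = sum(math.dist(pts[a], pts[b]) for a, b in zip(t, t[1:] + t[:1]))
print(t, round(cost, 2))
```

(By the triangle inequality, shortcutting the doubled MST walk never increases its length, and the MST weighs no more than the optimal tour; hence the factor of 2.)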
The major implication I care about concerns the feasibility of A.I., specifically because I believe “analogy finding” and knowledge reuse rest on tractable solutions to the subgraph isomorphism problem.

You could build an AI which restricts itself to solving small cases of the problem, finding “petty” analogies, and growing more complex ones on top. Any (computationally efficient) algorithm which could find isomorphisms for larger problems directly in polynomial time would blow a “bottom-up” analogy finder out of orbit.

Though of course, if anyone reading this could point out a method for finding subgraph isomorphisms on large graphs that is “good enough” by some standard, I’ll be happy to learn of it :)
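Not an answer to the “good enough on large graphs” question, unfortunately, but for a baseline, here is the naive backtracking matcher that any such heuristic would have to beat. Graphs are plain adjacency dicts, everything is illustrative, and it searches for monomorphisms (edge-preserving injections), which is arguably the variant analogy-finding wants:

```python
# Naive backtracking subgraph matcher; exponential in the worst case.
def find_monomorphism(pattern, target):
    """Map pattern nodes into target nodes so every pattern edge lands
    on a target edge. Returns one mapping, or None."""
    pnodes = list(pattern)

    def extend(mapping):
        if len(mapping) == len(pnodes):
            return dict(mapping)
        v = pnodes[len(mapping)]          # next pattern node to place
        for u in target:
            if u in mapping.values():
                continue
            # Every already-placed neighbour of v must map to a neighbour of u.
            if all(mapping[w] in target[u] for w in pattern[v] if w in mapping):
                mapping[v] = u
                if (m := extend(mapping)):
                    return m
                del mapping[v]            # backtrack
        return None

    return extend({})

# pattern: a triangle; target: a triangle with a pendant node.
pattern = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
target = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(find_monomorphism(pattern, target))  # -> {1: 'a', 2: 'b', 3: 'c'}
```

Practical matchers (VF2 and friends) add candidate ordering and pruning on top of exactly this skeleton, which buys a lot on sparse graphs even though the worst case stays exponential.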