IDEA—write a syntax/static analysis checker for laws. Possibly begin with U.S. state law in a particularly simple state, and move up to the U.S. Code (U.S.C.) and the Code of Federal Regulations (C.F.R.). Automatically look for conflicting/inconsistent definitions, logical conflicts, and other possible problems or ambiguities. Gradually improve it to find real problems, and start convincing the legal profession to use it when drafting new legislation.
While it may not directly pertain to LessWrong, it is an awesomely hard problem that could have far-reaching impacts.
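As a toy illustration of the very simplest thing such a checker could look for, here is a minimal Python sketch that flags a defined term which receives two different definitions in different sections. The statute fragment, section numbers, and wording are all invented for illustration, and real statutes use many drafting formulas beyond the single `"term" means ...` pattern assumed here:

```python
import re
from collections import defaultdict

# Hypothetical statute fragment; sections, terms, and wording are invented.
STATUTE = '''
Sec. 101. "Motor vehicle" means any self-propelled vehicle.
Sec. 205. "Motor vehicle" means any vehicle propelled by an internal combustion engine.
Sec. 310. "Highway" means any publicly maintained way open to vehicular travel.
'''

# One common drafting formula: Sec. N. "term" means definition.
DEF_PATTERN = re.compile(r'Sec\.\s*(\d+)\.\s*"([^"]+)"\s+means\s+([^.]+)\.')

def find_conflicting_definitions(text):
    """Group definitions by term; keep terms defined differently in two sections."""
    defs = defaultdict(list)
    for section, term, body in DEF_PATTERN.findall(text):
        defs[term.lower()].append((section, body.strip()))
    return {term: entries
            for term, entries in defs.items()
            if len({body for _, body in entries}) > 1}

for term, entries in find_conflicting_definitions(STATUTE).items():
    print(f'Possible conflict for "{term}":')
    for section, body in entries:
        print(f"  Sec. {section}: {body}")
```

Even this trivial pass shows where the difficulty really lives: extracting definitions reliably from real legislative text, not comparing them once extracted.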
I’m a lawyer. I’m also an enthusiast about applying computing technology to legal work generally, but not tech-savvy by the standards of LessWrong. But if I could help to define the problems a bit, I’d be happy to respond to PMs.
For example, the text of the U.S. Constitution is not long. Here’s just one part of it:
Congress shall make no law respecting an establishment of religion, or prohibiting the free exercise thereof; or abridging the freedom of speech, or of the press; or the right of the people peaceably to assemble, and to petition the Government for a redress of grievances.
As you know, this small bit of text has been the subject of a lot of debate over the years. But here’s another portion of the Constitution, not much shorter:
No soldier shall, in time of peace be quartered in any house, without the consent of the owner, nor in time of war, but in a manner to be prescribed by law.
There’s arguably a lot of room for debate over these words as well, but as a practical matter, the subject almost never comes up. I’d suggest that this doesn’t mean the ambiguity isn’t latent in the text; it could be revealed if for some reason the government had a strong urge to quarter troops in private homes.
I think the text of the Motor Vehicles Code of Wyoming is much longer than the whole U.S. Constitution with all its amendments, but since Wyoming is not a populous state, and the code mostly deals with relatively mundane matters, there hasn’t been a huge amount of published litigation over the precise meanings of the words and phrases in that text. That doesn’t mean that there isn’t just as much potential ambiguity within any given section of the Wyoming Motor Vehicles Code as there is in the First Amendment.
ETA: Law is made of words, and even at its best it is written in a language far, far less precise than the language of mathematics. Law is (among other things) a set of rules designed to govern the behavior of large numbers of people. But people are tricky, and keep on coming up with new and unexpected behaviors.
Also, it’s important to note that there are hierarchies of law in the U.S. I mentioned the U.S. Constitution to illustrate the potential complexity of law—libraries have been written on the Bill of Rights, and the Supreme Court hasn’t resolved every conflict just yet. If this seems daunting, it’s because it is. But in some ways, the U.S. Constitution is the simplest and easiest place to start syntactic analysis. The text is only a few thousand words long, and it is far less subject to change than almost all other laws. More importantly, it trumps all other law. All other U.S. laws are subject to the Constitution. By the same token, state laws are subject to federal law, and so on down to local regulations.
A county or municipality may enact a nice, well-drafted set of ordinances regulating billboards or street signs. These ordinances may be, in themselves, elegant and internally consistent and unambiguous. But all the other higher-level laws are still in place...if the local laws violate state or federal laws, or restrict free speech unconstitutionally, there is a problem. So, in a way, every local law implicitly incorporates a huge amount of jurisprudence simply from its context within the state and national governments.
My original thought was selling access to lawyers who are preparing cases. It could also be valuable to people who are trying to maneuver in complex legal environments—executives and politicians and such.
It seems to me that there should be a limited cheap or free version, but I’m not sure how that would work.
Hmmm. Okay. So the reason this is profitable is because it’s gotten SO hard to keep track of all the laws that even lawyers would be willing to pay for software that can help them check their legal ideas against the database of existing laws?
There’s probably a bit of money in distilling legalese into simpler language. Nolo Press, for instance, is in that field.
The real money in lawyering, however, is in applying the law to the available evidence in a very specific case. This is why some BigLaw firms charge hourly fees measured by the boatload. A brilliant entrepreneur able to develop an artificial intelligence application which could apply the facts to the law as effectively as a BigLaw firm should eventually be able to cut into some BigLaw action. That’s a lot of money.
This is a hard problem. My personal favorite Aesop’s fable about applying the facts to the law is Isaac Asimov’s short story Runaround. Worth reading all the way through, but for our purposes, the law is very clear and simple: the three laws of robotics. The fact situation is that the human master has casually and lightly ordered the robot to do something which was unexpectedly very dangerous to the robot. The robot then goes nuts, spinning around in a circle. Asimov says it better, of course:
Powell’s radio voice was tense in Donovan’s ear: “Now, look, let’s start with the three fundamental Rules of Robotics—the three rules that are built most deeply into a robot’s positronic brain.” In the darkness, his gloved fingers ticked off each point.
“We have: One, a robot may not injure a human being, or, through inaction, allow a human being to come to harm.”
“Right!”
“Two,” continued Powell, “a robot must obey the orders given it by human beings except where such orders would conflict with the First Law.”
“Right!”
“And three, a robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.”
“Right! Now where are we?”
“Exactly at the explanation. The conflict between the various rules is ironed out by the different positronic potentials in the brain. We’ll say that a robot is walking into danger and knows it. The automatic potential that Rule 3 sets up turns him back. But suppose you order him to walk into that danger. In that case, Rule 2 sets up a counterpotential higher than the previous one and the robot follows orders at the risk of existence.”
“Well, I know that. What about it?”
“Let’s take Speedy’s case. Speedy is one of the latest models, extremely specialized, and as expensive as a battleship. It’s not a thing to be lightly destroyed.”
“So?”
“So Rule 3 has been strengthened - that was specifically mentioned, by the way, in the advance notices on the SPD models - so that his allergy to danger is unusually high. At the same time, when you sent him out after the selenium, you gave him his order casually and without special emphasis, so that the Rule 2 potential set-up was rather weak. Now, hold on; I’m just stating facts.”
“All right, go ahead. I think I get it.”
“You see how it works, don’t you? There’s some sort of danger centering at the selenium pool. It increases as he approaches, and at a certain distance from it the Rule 3 potential, unusually high to start with, exactly balances the Rule 2 potential, unusually low to start with.”
Donovan rose to his feet in excitement. “And it strikes an equilibrium. I see. Rule 3 drives him back and Rule 2 drives him forward - ”
“So he follows a circle around the selenium pool, staying on the locus of all points of potential equilibrium. And unless we do something about it, he’ll stay on that circle forever, giving us the good old runaround.”
In the real world, courts hardly ever decide that the law is indecipherable and so the plaintiff should run around in a circle singing nonsense songs (but see Ashford v Thornton (1818) 106 ER 149). The moral of the story, however, is that there is ambiguity in the application of even the simplest and clearest of laws.
And so the whole human race spins in circles. Yes, I see. (: And so, do you propose that this software also takes out ambiguity? Do you see a way around that other than specifying exactly what to do in every situation? BTW, I rewrote the intro on the OP—any suggestions?
Now that I think about it, a program which can do a good job of finding laws relevant to a case, and/or ranking laws by relevance, would probably be valuable—even if it’s not as good as the best lawyers.
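Even a crude relevance ranker is easy to sketch. The snippet below scores statutes against a fact pattern with plain bag-of-words cosine similarity; the two-statute corpus and the fact pattern are invented, and a real system would need citation graphs, synonym handling, and far more:

```python
import math
import re
from collections import Counter

def tokens(text):
    """Lowercase word tokens; numbers and punctuation are dropped."""
    return re.findall(r"[a-z']+", text.lower())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def rank_laws(case_facts, laws):
    """Return (title, score) pairs, most relevant statute first."""
    case_vec = Counter(tokens(case_facts))
    scored = [(title, cosine(case_vec, Counter(tokens(text))))
              for title, text in laws.items()]
    return sorted(scored, key=lambda p: p[1], reverse=True)

# Invented mini-corpus for illustration.
laws = {
    "Dog Leash Ordinance": "No person shall permit a dog to run at large without a leash.",
    "Billboard Act": "No billboard shall be erected within 500 feet of a highway.",
}
facts = "The defendant's dog was running loose in the park without a leash."
print(rank_laws(facts, laws))  # the leash ordinance should rank first
```

The point is only that "not as good as the best lawyers, but still useful" sets a surprisingly low bar for a first version.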
Any opinions on whether this is harder or easier than understanding natural language? In theory, legal language is supposed to be clearer (for experts) and more precise, but I’m not sure that this is true.
It might be easier to write programs which evaluate scientific journal articles for contradictions with each other, the simpler sorts of bad research design, and such.
I’d say that legal language, at least in America, is absolutely well within the bounds of natural language, with all the ambiguity that implies. Certainly lawyers have their own jargon and “terms of art” that sound unfamiliar to the uninitiated, but so do airplane pilots and sailors and auto mechanics. It’s still not mathematics.
There are a lot of legislators and judges, and they don’t all use words in exactly the same ways. Over time, the processes of binding precedent and legal authority are supposed to resolve the inconsistencies within the law, but the change is slow. In the meantime, statutes keep on changing, and human beings keep on presenting courts with new and unexpected problems. And judges and legislatures are only people within a society and culture which itself changes. Our ideas about “moral turpitude” and “public policy” and what a “reasonable man” (or person) would do are subject to change over time. In this way, the language of the law is like a leaky boat that is being bailed out by the crew. It’s not a closed system.
One bottleneck here would be that the programmer would also have to be able to understand legalese. To find someone with both specialties could be pretty hard.
I honestly don’t know enough about law to provide the kind of detailed mistakes you’re looking for. My belief that it is a somewhat ‘important’ problem is circumstantial, but I think there’s definitely gain to be had:
1) It is often said that bad law consistently applied is better than good law inconsistently applied, but all other things being equal, good law is still better than bad law. The fact that the distinction between ‘good’ and ‘bad’ law is generally accepted is itself evidence that improvement is at least possible.
2) Law is currently pretty ambiguous, at least compared to software. These ambiguities are typically resolved at run time, by the court system. If we can resolve some of these ambiguities earlier with automated software, it may be possible to reduce the run time overhead of court cases.
3) Law is written in an internally inconsistent language. The words are natural language words, and do not have well understood, well defined meanings in all cases. A checker could plausibly identify and construct a dictionary of the most consistent words and definitions, and perhaps encourage lawmakers to either use better words, define undefined words, or clarify the meaning of questionable passages. By reducing even a subset of words to well defined, consistent definitions, the law may become easier to read, understand, and apply.
4) An automated system could possibly reduce the body of law in general by eliminating redundancy, overlapping logic, and obsolete/unreferenced sections.
Currently, we do all of the above anyway, but we use humans and human brains to do it, and we allow for human error by having huge amounts of redundancy and failsafe. The idea that we could liberate even some of those minds to work on harder problems is appealing to me.
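Point 3 in particular seems mechanizable in a crude first pass. Assuming the common drafting convention `"X" means ...`, a checker could compare the terms a bill defines against the terms it actually uses. The bill fragment and the hand-supplied list of candidate terms of art below are invented for illustration:

```python
import re

# Hypothetical bill fragment; wording is invented.
BILL = '''
"Vehicle" means any device for transporting persons or property on a highway.
A vehicle abandoned on a thoroughfare may be impounded by the authority.
'''

# Terms explicitly defined via the '"X" means ...' convention.
defined = {m.lower() for m in re.findall(r'"([^"]+)"\s+means', BILL)}

# A hand-supplied list of candidate terms of art (a real checker would
# build this list from a corpus, as point 3 suggests).
CANDIDATE_TERMS = {"vehicle", "highway", "thoroughfare", "authority"}

used = set(re.findall(r"[a-z]+", BILL.lower()))
undefined = sorted((CANDIDATE_TERMS & used) - defined)
print("Used but never defined:", undefined)
# -> Used but never defined: ['authority', 'highway', 'thoroughfare']
```

This does nothing a careful human drafter couldn’t do; the gain is exactly the one described above, doing it tirelessly and freeing the humans for harder problems.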
What if we did this: If a program can detect “natural language words” and encourage humans to re-write until the language is very, very clear, then this could open up the process of lawmaking to the other processing tasks you’re describing, without having to write natural language processing software.
It would also be useful to other fields where computer-processed language would be beneficial. THOSE fields could translate their natural language into language that computers can understand, then process it with a computer.
And if, during the course of using the software, the software is given access to both the “before” text (what it has marked as “natural language, please reword”) AND the “after” text (the precise, machine-readable language which the human has changed it to), then one would have the opportunity to use those changes as part of a growing dictionary, from which it translates natural language into machine-readable language on its own.
At which point, it would be capable of natural language processing.
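That last step is a big leap, but the “growing dictionary” itself could start as something as simple as a phrase-level memory of human rewrites. All the phrases below are invented examples:

```python
from collections import Counter, defaultdict

class RewriteMemory:
    """Each time a human rewrites a flagged phrase, store the (before, after)
    pair; later, suggest the most common past rewrite automatically."""

    def __init__(self):
        self.memory = defaultdict(Counter)

    def learn(self, before, after):
        """Record one human rewrite of a flagged phrase."""
        self.memory[before.lower()][after] += 1

    def suggest(self, phrase):
        """Return the most common past rewrite, or None if unseen."""
        seen = self.memory.get(phrase.lower())
        return seen.most_common(1)[0][0] if seen else None

mem = RewriteMemory()
mem.learn("prior to", "before")
mem.learn("prior to", "before")
mem.learn("in the event that", "if")
print(mem.suggest("Prior to"))         # -> before
print(mem.suggest("notwithstanding"))  # -> None (not yet learned)
```

Of course, a lookup table of phrase substitutions is a long way from natural language processing; context-dependent meanings are exactly where this approach breaks down.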
I bet there are already projects like this one out there—I know of a few AI projects where they use input from humans to improve the AI like Microsoft’s Milo (ted.com has a TED Talk video on this) but I don’t know if any of them are doing this translation of natural language into machine-readable language, and then back.
Anyway, we seem to have solved the problem of how to get the software to interpret natural language. Here’s the million dollar question:
Would it work, business-wise, to begin with a piece of software that acts as a text editor, highlights ambiguities, and anonymously returns the before and after text to a central database?
If yes, all the rest of this stuff is possible. If no, or if some patent hoarder has taken that idea, then … back to figuring stuff out. (:
An idea from a book called The Death of Common Sense—language has very narrow bandwidth compared to the world, which means that laws can never cover all the situations that the laws are intended to cover.
language has very narrow bandwidth compared to the world, which means that laws can never cover all the situations that the laws are intended to cover.
It might also be a good way of making money.
So we can see your vision, please describe how this would work?
(This would also need to be able to take case law into account.)
I would like to see a few examples of the different types of mistakes that have ended up in real laws, and what you think we would gain by doing this.
This is the story of human law.