I did go pull up a couple of your posts as that much is a fair critique:
That first post is only the middle section of what would already be a dense post, and it is missing the motivating “what’s the problem?” and “what does this get us?”; without understanding substantially all of the math and spending hours on it, I don’t think I could even ask anything meaningful. That first post in particular suffers from an approachable-sounding title followed by a wall of math, so you’re getting laypeople who expected at least an intro paragraph for their trouble.
The August 19th post piqued my interest substantially more on account of including intro and summary sections, and enough text to let me follow along while only understanding part of the math. A key feature of good math writing is that I should be able to gloss over challenging proofs on a first pass, take your word for it, and still get something out of it. Definitely don’t lose the rigor, but have mercy on those of us not cut out for a math PhD. If you had specific toy examples you were playing with while figuring out the post, those can also help make posts more approachable. That post seemed well received, just not viewed much; my money says the title is scaring off everyone but the full-time researchers (which I’m not, I’m in software).
I think I and most other interested members not in the field default to staying out of the way when people open up with a wall of post-grad math or something that otherwise looks like a research paper, unless specifically invited to chime in. And then same story with meta; this whole thread is something most folks aren’t going to start under your post uninvited, especially when you didn’t solicit this flavor of feedback.
I bring up notation specifically because the software crowd is very well represented here and frequently learns advanced math concepts without bothering to learn any of the notation common in math texts. So it’s not 1 or 2 notation questions; it’s more that you can have people who get the concepts while all of the notation is Greek to them.
I have made a few minor and mostly cosmetic edits to the post about the dimensionality reduction of tensors that produces so many trace-free matrices, and also to the post about using LSRDRs to solve a combinatorial graph theory problem.
“What’s the problem?”: Neural networks are horribly uninterpretable, so it would be nice if we could use more interpretable AI models or at least better interpretability tools. Neural networks seem to include a lot of random information, so it would be good to use AI models that do not include so much random information. Do you think that we would have more interpretable models by forsaking all mathematical theory?
“What does this get us?”: This gets us systems trained by gradient ascent that behave much more mathematically. Mathematical AI is bound to be highly interpretable.
The downvotes display a very bad attitude. At worst, they indicate that the LW community is a community that I really do not want much to do with; at best, they indicate a community that lacks discipline, and such mathematics texts will be needed to instill that discipline. In the posts that you looked at, I did not include any mathematical proofs (these are empirical observations, so I could not include proofs), and the lack of proofs makes the text much easier to go through. I also made the texts quite short; I included only enough text to define the fitness function and then state what I have observed.
For toy examples, I just worked with random complex matrices. I wanted these matrices to be small enough that I could write and run the code for them quite quickly, but large enough that I could properly observe what is going on. I do not want to make observations about tiny matrices that have no relevance to what is going on in the real world.
If we want to be able to develop safer AI systems, we will need to make them much more mathematical, and people are doing a great disservice by hating the mathematics needed to develop them.
Wouldn’t be engaging at all if I didn’t think there was some truth to what you’re saying about the math being important and folks needing to be persuaded to “take their medicine” as it were and use some rigor. You are not the first person to make such an observation and you can find posts on point from several established/respected members of the community.
That said, I think “convincing people to take their medicine” mostly looks like those answers you gave just being in the intro of the post(s) by default (and/or the intro to the series if that makes more sense), alongside other miscellaneous readability improvements. Might also try tagging the title as [math heavy] or some such.
I think you’re taking too narrow a view on what sorts of things people vote on and thus what sort of signal karma is. If that theory of mind is wrong, any of the inferences that flow from it are likely wrong too. Keep in mind also (especially when parsing karma in comments) that anything that parses as whining costs you status even if you’re right (not just a LW thing). And complaining about internet points almost always parses that way.
I don’t think it necessarily follows that because a math-heavy post got some downvotes, everyone hates math and will downvote math in the future. The alternative explanation is that people care a lot about readability and about being able to prioritize their reading toward the subjects they find relevant, and neither scores well if the post is math to the exclusion of all else.
I didn’t find any of those answers surprising, but it’s an interesting line of inquiry all the same. I don’t have a good sense of how it’s simultaneously true that LLMs keep benefiting from making everything bigger, but also that large sections of the model don’t seem to do anything useful, and increasingly so in the largest models.
Talking about whining and my loss of status is a good way to get me to dislike the LW community and consider them to be anti-intellectuals who fall for garbage like FTX. Do you honestly think the people here should try to interpret large sections of LLMs while simultaneously being afraid of quaternions?
It is better to comment on threads where we are interacting in a more positive manner.
I thought apologizing and recognizing inadequacies was a core rationalist skill. And I thought rationalists were supposed to like mathematics. The lack of mathematical appreciation is one of these inadequacies of the LW community. But instead of acknowledging this deficiency, the community here blasts me as talking about something off topic. How ironic!
Any conversation about karma would necessarily involve talking about what does and doesn’t factor into votes, likely both here and on the internet or in society at large. Not thinking we’re getting anywhere on that point.
I’ve already said clearly and repeatedly that I don’t have a problem with math posts, and I don’t think others do either. You’re not going to get what you want by continuing to straw-man me and others. I disagree with your premise, and you’ve thus far failed to acknowledge or engage with any of those points.
Let’s see whether the notions that I have talked about are sensible mathematical notions for machine learning.
Tensor product: Sometimes the data in a neural network has tensor structure. In this case, the weight matrices should be tensor products or tensor sums. Respecting the structure of the data works well for convolutional neural networks, and it should also work well for data with tensor structure.
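To make this concrete, here is a minimal sketch (my own toy illustration, not a description of any particular architecture) of a weight matrix constrained to a Kronecker product, acting on flattened m-by-n data:

```python
import numpy as np

# Toy illustration: a weight matrix constrained to a tensor (Kronecker)
# product W = kron(B, C), so that applying W to flattened m-by-n data is
# the same as applying B and C to the two tensor factors separately.
rng = np.random.default_rng(0)
m, n = 4, 5
B = rng.standard_normal((m, m))  # acts on the first tensor factor
C = rng.standard_normal((n, n))  # acts on the second tensor factor
X = rng.standard_normal((m, n))  # data with tensor structure

W = np.kron(B, C)                # structured (m*n) x (m*n) weight matrix
lhs = W @ X.reshape(-1)          # W applied to the flattened data
rhs = (B @ X @ C.T).reshape(-1)  # the same map written in tensor form
print(np.allclose(lhs, rhs))     # True: W respects the tensor structure
```

A tensor sum would instead be np.kron(B, np.eye(n)) + np.kron(np.eye(m), C); either way the parameter count is m² + n² rather than (mn)².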
Trace: The trace of a matrix measures how much the matrix maps vectors onto themselves, since Tr(A) = c⋅E(⟨Av, v⟩), where v follows a multivariate normal distribution (with c = 1 when v is standard normal).
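A quick Monte Carlo sanity check of that identity (just a sketch, assuming v is standard Gaussian so that c = 1):

```python
import numpy as np

# Check that E<Av, v> equals Tr(A) when v ~ N(0, I).
rng = np.random.default_rng(0)
n = 8
A = rng.standard_normal((n, n))

vs = rng.standard_normal((200_000, n))            # samples of v
quad_forms = np.einsum('ij,kj,ki->k', A, vs, vs)  # <Av, v> for each sample
print(np.trace(A), quad_forms.mean())             # should agree to ~2 decimals
```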
Spectral radius: Suppose that we are iterating a smooth function f. Suppose furthermore that f(v) = v, that u_0 is near v, and that u_(n+1) = f(u_n). We would like to determine whether lim_(n→∞) u_n = v or not. If the Jacobian of f at v has spectral radius less than 1, then lim_(n→∞) u_n = v. If the Jacobian of f at v has spectral radius greater than 1, then the iterates generically do not converge to v.
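And a minimal demonstration of the spectral radius criterion (my own sketch, using an affine map so that the Jacobian is simply the matrix J):

```python
import numpy as np

# Fixed-point iteration of f(u) = J(u - v) + v, which satisfies f(v) = v and
# has Jacobian J everywhere; convergence of u_(n+1) = f(u_n) near v is
# governed by the spectral radius of J.
rng = np.random.default_rng(0)
n = 5
v = rng.standard_normal(n)

for target_radius in (0.9, 1.1):
    J = rng.standard_normal((n, n))
    J *= target_radius / max(abs(np.linalg.eigvals(J)))  # rescale so rho(J) = target
    u = v + 1e-3 * rng.standard_normal(n)                # start near the fixed point
    for _ in range(200):
        u = J @ (u - v) + v
    print(target_radius, np.linalg.norm(u - v))  # tiny for 0.9, enormous for 1.1
```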
The notions that I have been talking about are sensible and arise in machine learning. And understanding these notions is far easier than trying to interpret very large networks like GPT-4 without using these notions. Many people on this site just act like clowns. Karma is only a good metric when the people on the network value substance over fluff. And the only way to convince me otherwise will be for the people here to value posts that involve basic notions like the trace, eigenvalues, and spectral radius of matrices.
P.S. I can make the trace, determinant, and spectral radius even simpler. These operations are what you get when you take the sum, the product, and the maximum absolute value of the eigenvalues, respectively. Yes. Those are just the basic eigenvalue operations.
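For anyone who wants to verify this for themselves, a three-line check (a sketch on a random matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
eigenvalues = np.linalg.eigvals(A)

print(np.trace(A), eigenvalues.sum())        # trace = sum of eigenvalues
print(np.linalg.det(A), eigenvalues.prod())  # determinant = product of eigenvalues
print(max(abs(eigenvalues)))                 # spectral radius = max |eigenvalue|
```

(The sum and product may print with a negligible imaginary part, since the complex eigenvalues come in conjugate pairs and their imaginary parts cancel only up to rounding.)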
You’re still hammering on stuff I never disagreed with in the first place. Insofar as I don’t already understand all the math (or math notation) I’d need to follow this, that’s a me problem, not a you problem, and having a pile of cool papers I want to grok is prime motivation for brushing up on some more math. I’m definitely not downvoting merely on that.
What I’m mostly trying to get across is just how large of a leap of logic you’re making from [post got 2 or 3 downvotes] ⇒ [everyone here hates math]. There’s got to be at least 3 or 4 major inferences there you haven’t articulated here and I’m still not sure what you’re reacting so strongly to. Your post with the lowest karma is the first one and it’s sitting at neutral, based on a grand total of 3 votes besides yours. You are definitely sophisticated enough on math to understand the hazards of reasoning from a sample size that small.
I will work with whatever data I have, and I will make a value judgment based on the information that I have. The fact that Karma relies on very small amounts of information is a testament to a fault of Karma, and that is further evidence of how the people on this site do not want to deal with mathematics. And the information that I have indicates that there are many people here who are likely to fall for more scams like FTX. Not all of the people here are so bad, but I am making a judgment based on the general atmosphere here. If you do not like my judgment, then the best thing would be to try to do better. If this site has made a mediocre impression on me, then I am not at fault for the mediocrity here.
You are judging my reasoning without knowing all that went into my reasoning. That is not good.
Again you’re saying that without engaging with any of my arguments or giving me any more of your reasoning to consider. Unless you care to share substantially more of your reasoning, I don’t see much point continuing this?
I do not care to share much more of my reasoning, because I have shared enough and also because there is a reason that I have vowed to no longer discuss this except possibly with lots of obfuscation. This discussion that we are having is just convincing me more that the entities here are not the entities I want to have around me at all. It does not do much good to say that the community here is acting well or to question my judgment about this community. It will do good for the people here to act better so that I will naturally have a positive judgment about this community.
There’s a presumption you’re open to discussing on a discussion forum, not just grandstanding. Strong downvoted much of this thread for the amount of my time you’ve wasted trolling.