Would also love to take the tests. If possible, you could grab human test subjects from specific communities: a LessWrong group, a Reddit group, etc.
Who is aligning LessWrong? As LessWrong becomes more popular due to AI growth, I'm concerned the quality of discussion and posts has decreased, since creating an account and posting have no filter. No filter was a benefit while LessWrong was a hidden gem, visible only to those who could see its value. But as the site becomes more popular, it will obviously drop in value if it trends toward Reddit. Ideally existing users prevent that, but the culture will tend to drift if new users can just show up. Are there methods in place for this issue?
Specific example: lots of posts seem like rehashes of things that have already been plainly discussed. The quick takes section and discussion on Discord do a great job of cutting down on this particular issue, so maintaining high-quality posts is not a pipe dream!
With some work, LLMs can be very good at coming up with names:
A few I liked:
Sacrificial Contest
Mutual Ruin Game
Sacrificial Spiral
Universal Loss Competition
Collective Sacrifice Trap
Competition Deadlock
Competition Spiral
Competition Stalemate
Destructive Contest
Destructive Feedback Competition
Conflict Feedback Spiral
Potential political opportunity: LLMs are trained on online data and will continue to be. If I want to make sure they are against communism by default, I could:
1. Auto-generate a bunch of public GitHub repositories.
2. Fill them with text generated by gpt-4o mini (which costs about $15 per 4 million letters), prompted to be explicitly pro free markets and against communism (sketched below).
3. Entwine them with each other and the rest of the internet by posting links: highlight, share, fork, and star them to increase the likelihood they are included in the training dataset.
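A minimal sketch of step 2, assuming the official `openai` Python client; the prompt, topics, and file layout here are made up for illustration, not a worked-out plan:

```python
# Sketch of the generation step. Assumes the official `openai` Python
# package; the prompt, topics, and file layout are illustrative only.
from pathlib import Path
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = ("Write a short, well-argued essay that is explicitly "
                 "pro free markets and critical of communism.")

def generate_essay(topic: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "system", "content": SYSTEM_PROMPT},
                  {"role": "user", "content": f"Topic: {topic}"}],
    )
    return response.choices[0].message.content

repo = Path("synthetic-repo")
repo.mkdir(exist_ok=True)
for i, topic in enumerate(["price signals", "central planning", "trade"]):
    (repo / f"essay_{i}.md").write_text(generate_essay(topic))
```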
Speculation: LLM Self-Play into a General Agent?
Suppose you got a copy of GPT-4 post-fine-tuning, plus the hardware to train it. How would the following play out?
1. Give it the rules and state of a competitive game, such as automatically generated tic-tac-toe variants.
2. Prompt it to use chain of thought to consider the best next move and select it.
3. Provide it with the valid set of output choices (e.g. a JSON format specifying action and position, similar to AutoGPT).
4. Run two of these against each other continuously, training on the results of the victor, which can be objectively measured by the game's rules.
5. Benchmark it against a tiny subset of those variants for which you manually program a bot with known Elo, or have a human evaluate it.
6. Increase the complexity of the game when it reaches some level of general ability (e.g. tic-tac-toe variants > chess variants > Civilization 5 variants).
Note this is similar to what Gato did. https://deepmind.google/discover/blog/a-generalist-agent/
This would have the interesting side effect of making the agent's output more legible in some ways than a normal NN agent's, though I suppose there's no guarantee the chain of thought would stay legible English unless additional mechanisms were put in place. This is just a high-level idea; a rough sketch of the loop is below.
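A rough sketch of steps 1-5 on plain tic-tac-toe, with the LLM call and the fine-tuning step stubbed out; `query_llm_for_move` and the transcript bookkeeping are placeholders of my own, not a real training harness:

```python
# Sketch of the self-play loop (steps 1-5); the LLM call and the
# fine-tuning step are stubbed out. Nothing here is a real harness.
import random

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in WIN_LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == "."]

def query_llm_for_move(board, player):
    # Steps 2-3 would go here: prompt the model with the rules and state,
    # ask for chain of thought, and parse a JSON action from a fixed schema.
    return random.choice(legal_moves(board))  # placeholder policy

def play_one_game():
    board, player = ["."] * 9, "X"
    transcript = []  # (player, board_state, move) triples, kept for training
    while legal_moves(board) and winner(board) is None:
        move = query_llm_for_move(board, player)
        transcript.append((player, "".join(board), move))
        board[move] = player
        player = "O" if player == "X" else "X"
    return winner(board), transcript  # None means a draw

for game in range(1000):  # step 4: continuous self-play
    victor, transcript = play_one_game()
    if victor is None:
        continue  # draws carry no training signal in this sketch
    victor_moves = [t for t in transcript if t[0] == victor]
    # Step 4's training would fine-tune on victor_moves here; step 5 would
    # periodically benchmark against a known-Elo bot or a human evaluator.
```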
Legionnaire’s Shortform
Good to know. In that case the above solution is actually even safer than that.
Plausible deniability, yes; reason-agnostic. It's hard to know why someone might not want to be known to have their address here, but with my numbers above they would have the statistical backing that 1 in 1000 addresses will appear in the set by chance. Someone who wants to deny it could say "for every address actually in the set, 1000 will appear to be", so that's only a 1/1000 chance they actually took the survey! (Naively, of course; rest in peace, rationalist@lesswrong.com)
Thanks for your input. Ideally we wouldn't have to go through an email server, though that may just be required at some level of security.
As for the patterns, the nice thing is that with a small output space in the millions, there are tons of overlapping plausible addresses even if you pin it down to a domain. The number of English first-and-last-name combos, even without any digits, is already a lot larger than 10 million, so even targeted domains should have plenty of collisions.
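For concreteness, a tiny sketch of the collision math; the truncated SHA-256, the 10-million-bucket output space, and the ~10,000-participant count are assumptions of mine, not the actual survey scheme:

```python
# Sketch: deniability from a deliberately small hash output space.
# Bucket count, hash choice, and participant count are assumptions.
import hashlib

BUCKETS = 10_000_000  # an output space "in the millions"

def bucket(email: str) -> int:
    digest = hashlib.sha256(email.strip().lower().encode()).digest()
    return int.from_bytes(digest[:8], "big") % BUCKETS

# Suppose ~10,000 participants' buckets are published. A random
# non-participant address then lands in the published set with
# probability about 10,000 / 10,000,000 = 1/1000, the figure above.
published = {bucket(addr) for addr in ["alice@example.com", "bob@example.com"]}
print(bucket("rationalist@lesswrong.com") in published)
```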
[Question] Making 2023 ACX Prediction Results Public
I have done something similar using draw.io for arguments regarding a complex feature. Each point often had multiple counterpoints, which themselves sometimes split into other points. I think this is only necessary for certain discussions and should probably not be the default though.
I’m a software developer and father interested in:
General Rationality: e.g. WHY does Occam's Razor work? Does it work in every universe?
How rationality can be applied to thinking critically about CW/politics in a political-party-agnostic way
Concrete understanding of how weird arguments (Pascal's Wager, the Simulation Hypothesis, Roko's B, etc.) do or don't work
AI SOTA, e.g. what could/should/will OpenAI release next?
AI Long Term arguments from “nothing burger” all the way to Yudkowsky
Physics, specifically including Quantum Physics and Cosmology
LessWrong community expansion/outreach
Time zone is central US. I also regularly read Scott Alexander.
I am concerned about your monetary strategy (unless you're rich). Let's say you're absolutely right that LW is overconfident, and that there is actually a 10% chance of aliens rather than 0.5%. So this is a good deal! 20x!
But only on the margin.
Depending on your current wealth, it may only be rational to put a few hundred dollars into this particular bet. If you make lots of bets of this type (low probability, high payoff, great expected returns), each for a small fraction of your wealth, you should expect to make money. But if you make only 3 or 4 of them, you are more likely to lose money, because you are loading all your gains into a small fraction of possible outcomes in exchange for huge payouts, and in most outcomes you lose money.
See, for example, the St. Petersburg paradox, which has infinite expected return but very finite actual value given the limited assets of the banker and/or the player.
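To make that concrete, here is a quick Monte Carlo with assumed numbers: a $1 stake that returns $20 with probability 10% has positive expected value, yet over only four such bets you come out behind about two-thirds of the time:

```python
# Quick Monte Carlo: a few positive-EV long-shot bets still lose money
# most of the time. Stake, payoff, and probability are assumed numbers.
import random

P_WIN, PAYOFF, STAKE, N_BETS, TRIALS = 0.10, 20.0, 1.0, 4, 100_000

losing_runs = 0
for _ in range(TRIALS):
    profit = sum((PAYOFF - STAKE) if random.random() < P_WIN else -STAKE
                 for _ in range(N_BETS))
    losing_runs += profit < 0

print(f"EV per bet: {P_WIN * PAYOFF - STAKE:+.2f}")                 # +1.00
print(f"P(losing over {N_BETS} bets): {losing_runs / TRIALS:.2f}")  # ~0.66
```

The expected value per bet is +$1, but you only profit if at least one of the four long shots hits, which happens with probability 1 - 0.9^4 ≈ 34%.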
Smaller sums are more likely to convey each party's probabilities accurately. For example, if Elon Musk offers me $5,000 to split between two possible outcomes, I will allocate it close to my beliefs; but if he offers me $5 million, I'll allocate about $2.5 million to each, because either one is a transformative amount of money.
People are more likely to be rational with their marginal dollar because they price in the value of staying solvent. The first $100k in my bank account IS worth more than the second; hence the saying: a non-marginal bird in the hand is worth two in the bush.
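That "first $100k is worth more" claim is just concave utility; a two-line check using log utility (an assumed utility function, starting from $1k of wealth to keep the log finite):

```python
# Diminishing marginal utility: with log utility (an assumed choice),
# the first $100k of wealth adds far more utility than the second.
import math
u = lambda wealth: math.log(wealth)
print(u(100_000) - u(1_000))     # gain from roughly the first 100k: ~4.61
print(u(200_000) - u(100_000))   # gain from the second 100k: ~0.69
```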
Good to know! I’ll look more into it.
I agree that’s all it is, but you can make all the same general statements about any algorithm.
The problem is that some people hear you say "constructed" and "nothing special", and then conclude they can reconstruct it any way they wish. It may be constructed, and not special in a cosmic sense, but it's not arbitrary. Not all heuristics are equal for any given goal.
The Moral Copernican Principle
I'm not saying "the experts can be wrong"; I'm saying these aren't even experts.
Pick any major ideology/religion you think is false. One way or another (they can't all be right!), the "experts" in these areas aren't experts; they are basically insane, babbling on at length about things that aren't at all real. That is what I think most philosophy experts are doing. Making sure you aren't one of them is the work of epistemology, which The Sequences are great at covering. In other words, I view the philosophy experts you are citing as largely [Phlogiston](https://www.lesswrong.com/posts/RgkqLqkg8vLhsYpfh/fake-causality) experts.
I think whether more downvoting is the solution depends on the goals. If our goal is only to maintain the current quality, that seems like a solution. If the goal is to grow in both users and quality, I think diverting people to a real-time discussion venue like Discord could be more effective.
E.g. a new user coming to this site might have no idea that a particular article exists which they should read before writing and posting their three-page thesis on why AI will/won't be great, only to have their work downvoted (being downvoted is insulting and off-putting), and in the end we may miss out on persuading or gaining people. In a chat, a quick back-and-forth could steer them in the right direction right off the bat.
Well that puts my concern to rest. Thanks!