I’d be lying if I said I hadn’t considered it 🤣
AABoyles
Advanced AI is a bomb, and we’re about to set it off without any safety equipment.
For Policymakers
Suppose you find out that you have a disease that will kill you. The prognosis is grim: with luck, you have a decade left. But maybe that’s long enough to find a cure! How much is that cure worth to you?
Uncontrolled advanced AI could be the disease that kills you. The prognosis is grim: with luck, you have a decade left. But maybe that’s long enough to find a cure! How much is that cure worth to you?
For Policymakers, original-ish
How hard did you think about killing the last cockroach you found in your house? We’re the cockroaches, and we are in the AI’s house. For policy-makers, variant on the anthill argument, original source unknown
Fair point. It does seem like “pandemic” is a more useful category if it doesn’t include a whole bunch of “things that happened but didn’t kill a lot of people.”
Without aging, COVID-19 would not be a global pandemic, since the death rate in individuals below 30 years old is extremely low.
A pandemic is an epidemic that occurs across multiple continents. Note that we can accordingly envision a pandemic with a death rate of zero, but a pandemic nonetheless. Accordingly, I think you’ve somewhat overstated the punchline about aging and COVID-19, though I agree with the broader point that if aging were effectively halted at 30, the death rates would be much, much lower.
If I weren’t trying not to spend time on this, I would fit a Random Forest or a Neural Network (rather than a logistic regression) to capture some non-linear signal and, once it predicted well, fire up an optimizer to see how many points in which stats really help.
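A minimal sketch of what that might look like (assuming the randomForest package and the same dataset and columns as the glm analysis below; illustrative only, not run):

library(readr)
library(dplyr)
library(randomForest)

training <- read_csv("https://raw.githubusercontent.com/H-B-P/d-and-d-sci/main/d_and_d_sci.csv")
training <- training %>% mutate(outcome = factor(ifelse(result == "succeed", 1, 0)))

# Random forest instead of glm, to pick up non-linear signal.
rf <- randomForest(outcome ~ cha + con + dex + int + str + wis, data = training)
varImpPlot(rf)  # which stats matter most?

# Crude stand-in for "fire up an optimizer": score a few candidate builds.
candidates <- data.frame(
  str = c(6, 6), con = c(15, 14), dex = c(13, 13),
  int = c(13, 13), wis = c(20, 20), cha = c(5, 6)
)
predict(rf, candidates, type = "prob")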
Fun! I wish I had a lot more time to spend on this, but here’s a brief and simple basis for a decision:
library(readr)
library(dplyr)
library(magrittr)

training <- read_csv("https://raw.githubusercontent.com/H-B-P/d-and-d-sci/main/d_and_d_sci.csv")
training %<>% dplyr::mutate(outcome = ifelse(result == "succeed", 1, 0))

model <- glm(outcome ~ cha + con + dex + int + str + wis, data = training, family = "binomial")
summary(model)

start <- data.frame(str = c(6), con = c(14), dex = c(13), int = c(13), wis = c(12), cha = c(4))
predict.glm(model, start, type = "response")
# > 0.3701247

wise <- data.frame(str = c(6), con = c(15), dex = c(13), int = c(13), wis = c(20), cha = c(5))
predict.glm(model, wise, type = "response")
# > 0.7314005

charismatic <- data.frame(str = c(6), con = c(14), dex = c(13), int = c(13), wis = c(12), cha = c(14))
predict.glm(model, charismatic, type = "response")
# > 0.6510629

wiseAndCharismatic <- data.frame(str = c(6), con = c(14), dex = c(13), int = c(13), wis = c(20), cha = c(6))
predict.glm(model, wiseAndCharismatic, type = "response")
# > 0.73198
Gonna go with wiseAndCharismatic (+8 Wisdom, +2 Charisma).
It would also be very useful to build some GPT feature “visualization” tools ASAP.
Do you have anything more specific in mind? I see the Image Feature Visualization tool, but in my mind it’s basically doing exactly what you’re already doing by comparing GPT-2 and GPT-3 snippets.
If it’s not fast enough, it doesn’t matter how good it is
Sure! My brute-force bitwise algorithm generator won’t be fast enough to generate every algorithm of length 300 bits, and our universe probably can’t support a representation of any algorithm longer than the number of atoms in the observable universe, ~10^82 bits. (I don’t know much about physics, so this could be very wrong, but think of it as a useful bound. If there’s a better one (e.g. the number of Planck volumes in the observable universe), substitute that and carry on, and also please let me know!)
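To put rough numbers on that (just a sanity check in R, using the ~10^82 figure from above):

# Distinct bitstrings of length exactly 300, versus the rough
# atoms-in-the-observable-universe bound used above.
2^300          # ~2e90 candidate algorithms at 300 bits
10^82          # rough atom count of the observable universe
2^300 > 10^82  # TRUE: exhaustive enumeration outruns the bound well before 300 bits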
Part of the issue with this might be programs that don’t work or don’t do anything. (Beyond trivial cases, it’s not clear how to select against these, outside of something like AlphaGo.)
Another class of algorithms that causes problems is those that don’t do anything useful for some number of computations, after which they begin to output something useful. We don’t really get to know whether they will halt, so if the useful structure only emerges after some number of steps, we may not be willing or able to run them that long.
Anything sufficiently far away from you is causally isolated from you. Because of the fundamental constraints of physics, information from there can never reach here, and vice versa. You may as well be in separate universes.
The performance of AlphaGo got me thinking about algorithms we can’t access. In the case of AlphaGo, we implemented the algorithm, which then discovered strategies we could never have created ourselves. (Go Master Ke Jie famously said “I would go as far as to say not a single human has touched the edge of the truth of Go.”)
Perhaps we can imagine a sort of “logical causal isolation.” An algorithm is logically causally isolated from us if we cannot discover it (e.g. in the case of the Go strategies that AlphaGo used) and we cannot specify an algorithm to discover it (except by random accident) given finite computation over a finite time horizon (i.e. in the lifetime of the observable universe).
Importantly, we can devise algorithms which search the entire space of algorithms (e.g.
generate all possible strings of bits of length less than n, as n approaches infinity
), but there’s little reason to expect that such a strategy will produce any useful outputs beyond some modest finite length (there appear to be only enough atoms in the universe, ~10^82, to represent all possible algorithms up to a fairly short length).

There’s one important weakness in LCI (one that doesn’t exist in Physical Causal Isolation): we can randomly jump to algorithms of arbitrary lengths. This stipulation gives us the weird ability to pull stuff from outside our LCI-cone into it. Unfortunately, we cannot do so with any expectation of arriving at a useful algorithm. (There’s an interesting question, about which I haven’t yet thought, regarding the distribution of useful algorithms of a given length.) Hence the caveat “except by random accident” in our definition of LCI.
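For concreteness, a minimal sketch of that exhaustive enumerator (in R, purely illustrative; it says nothing about actually running the generated bitstrings as programs):

# Enumerate every bitstring of length 1..n, in order of increasing length.
all_bitstrings <- function(n) {
  unlist(lapply(1:n, function(len) {
    apply(expand.grid(rep(list(c(0, 1)), len)), 1, paste, collapse = "")
  }))
}
length(all_bitstrings(3))  # 2 + 4 + 8 = 14 bitstrings of length <= 3

The count doubles with every additional bit, which is why this runs into the physical bounds discussed above long before it reaches interesting lengths.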
We aren’t LCI’d from the strategies AlphaGo used, because we created AlphaGo and AlphaGo discovered those strategies (even if human Go masters may never have discovered them independently). I wonder what algorithms exist beyond not just our horizons, but the horizons of all the algorithms which descend from everything we are able to compute.
A second round is scheduled to begin this Saturday, 2020-02-08. New predictors should have a minor advantage in later rounds as the winners will have already exhausted all the intellectual low-hanging fruit. Please join us!
Thanks! Also, thanks to Pablo_Stafforini, DanielFilan and Tamay for judging.
Thank you!
I would also like to convert it to a more flexible e-reader format. It appears to have been typeset using … Would it be possible to share the source files?
It’s time to test the Grue Hypothesis! Anyone have some Emeralds handy?
It occurs to me that the world could benefit from more affirmative fact checkers. Existing fact checkers are appropriately rude to people who publicly make false claims, but there’s not much in the way of celebration of people who make difficult true claims. For example, Politifact awards “Pants on Fire” for bald lies, but only “True” for bald truths. I think there should be an even higher-status classification for true claims that run counter to the interests of the speaker. For example, we could award “Bayesian Stars” to figures who publicly update on new evidence, or “Bullets Bitten” to public figures who promulgate true evidence that weakens their arguments.
It occurs to me that “Following one’s passion” is terrible advice at least in part because of the lack of diversity in the activities we encourage children to pursue. It follows that encouraging children to participate in activities with very high-competition job markets (e.g. sports, the arts) may be a substantial drag on economic growth. After 5 minutes of search, I could not find research on this relationship. (It seems the state of scholarship on the topic is restricted to models in which participation in extracurriculars early in childhood leads to better metrics later in childhood.) This may merit a more careful assessment.
Attention Conservation Warning: I envision a model which would demonstrate something obvious, and decide the world probably wouldn’t benefit from its existence.
The standard publication bias is that we must be 95% certain a described phenomenon exists before a result is publishable (at which time it becomes sufficiently “confirmed” to treat the phenomenon as a factual claim). But the statistical confidence of a phenomenon conveys interesting and useful information regardless of what that confidence is.
Consider the space of all possible relationships: most of these are going to be absurd (e.g. the relationship between number of minted pennies and number of atoms in moons of Saturn), and exhibit no correlation. Some will exhibit weak correlations (in the range of p = 0.5). Those are still useful evidence that a pathway to a common cause exists! The universal prior on random relationships should be roughly zero, because most relationships will be absurd.
What would science look like if it could make efficient use of the information disclosed by presently unpublishable results? I think I can generate a sort of agent-based model to imagine this. Here’s the broad outline:
Create a random DAG representing some complex related phenomena.
Create an agent which holds beliefs about the relationships between nodes in the graph, and updates its beliefs only when it discovers a correlation significant at the 95% level (p < 0.05).
Create a second agent with the same belief structure, but which updates on every experiment regardless of the correlation.
On each iteration have each agent select two nodes in the graph, measure their correlation, and update their beliefs. Then have them compute the DAG corresponding to their current belief matrix. Measure the difference between the DAG they output and the original DAG created in step 1.
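Here’s a rough, simplified sketch of that simulation (in R; the linear-Gaussian data model, every parameter value, and the crude scoring rule at the end are my assumptions rather than a worked-out design):

set.seed(1)

p_vars  <- 6    # nodes in the random DAG
n_obs   <- 100  # sample size per experiment
n_iters <- 500  # experiments per agent

# 1. Random DAG as a lower-triangular weight matrix: true_dag[j, i] != 0 means i -> j.
n_lower  <- p_vars * (p_vars - 1) / 2
true_dag <- matrix(0, p_vars, p_vars)
true_dag[lower.tri(true_dag)] <- rbinom(n_lower, 1, 0.3) * runif(n_lower, 0.2, 0.8)

# Linear-Gaussian data consistent with the DAG.
simulate_data <- function(n) {
  x <- matrix(rnorm(n * p_vars), n, p_vars)
  for (j in 2:p_vars) {
    x[, j] <- x[, j] + x[, 1:(j - 1), drop = FALSE] %*% true_dag[j, 1:(j - 1)]
  }
  x
}

# 2. & 3. Two agents: one records a pairwise relationship only when the test
# clears the usual publication threshold; the other records every estimate.
belief_biased   <- matrix(NA, p_vars, p_vars)
belief_unbiased <- matrix(NA, p_vars, p_vars)

# 4. Each iteration: pick a pair of nodes, run an "experiment", update beliefs.
for (k in 1:n_iters) {
  pair <- sample(p_vars, 2)
  dat  <- simulate_data(n_obs)
  test <- cor.test(dat[, pair[1]], dat[, pair[2]])
  if (test$p.value < 0.05) belief_biased[pair[1], pair[2]] <- abs(test$estimate)
  belief_unbiased[pair[1], pair[2]] <- abs(test$estimate)
}

# 5. Crude score: how many of the DAG's direct edges has each agent noticed?
# (A faithful version would reconstruct a DAG from each belief matrix and
# compare it to true_dag; this is just a coverage count.)
truth   <- (true_dag + t(true_dag)) != 0
noticed <- function(b) (!is.na(b)) | t(!is.na(b))
c(biased   = sum(noticed(belief_biased) & truth) / 2,
  unbiased = sum(noticed(belief_unbiased) & truth) / 2,
  edges    = sum(truth) / 2)

Whether a sketch like this actually shows the convergence gap claimed below depends heavily on effect sizes, sample sizes, and the scoring rule, which is exactly the kind of parameter choice that needs careful selection and defense.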
I believe that both agents will converge on the correct DAG, but the un-publication-biased agent will converge much more rapidly. There are a bunch of open parameters that need careful selection and defense here. How do the properties of the original DAG affect the outcome? What if agents can update on a relationship multiple times (e.g. run a test on 100 samples, then on 10,000)?
Given defensible positions on these issues, I suspect that such a model would demonstrate that publication bias reduces scientific productivity by roughly an order of magnitude (and perhaps much more).
But what would the point be? No one will be convinced by such a thing.
Thanks for this! Just to clarify what I meant by “manual distribution”, if you’ve written a dating profile outside of a dating app, you’ve basically got to share a link if you want anyone to read it (see e.g. this post).