Ted Sanders
We can already see what people do with their free time when basic needs are met. A number of technologies have enabled new hacks to set up ‘fake’ status games that are more positive-sum than ever before in history:
Watch broadcast sports, where you can feel like a winner (or at least feel connected to a winner), despite not having had to win yourself
Play video games with AI opponents, where you can feel like a winner, despite it not being zero-sum against other humans
Watch streamers and influencers to feel connected to high status people, without having to earn respect or risk rejection
Get into a niche hobby community in order to feel special, ignoring the other niche hobbies that other people join that you don’t care about
Feels likely to me that advancing digital technology will continue to make it easier for us to spend time in constructed digital worlds that make us feel like valued winners. On the one hand, it would be sad if people retreat into fake digital siloes; on the other hand, it would be nice if people got to feel like winners more.
Management consulting firms have lots of great ideas on slide design: https://www.theanalystacademy.com/consulting-presentations/
Some things they do well:
They treat slides as documents that can be understood standalone (this is even useful when presenting, as not everyone is following every word)
They employ a lot of hierarchy to help make the content skimmable (helpful for efficiency)
They put conclusions / summaries / action items up front, details behind (helpful for efficiency, especially in high-trust environments)
Additional thoughts:
More than 3 bars/colors is fine
I recommend using horizontal bars on some of those slides, so the labels are written in the same direction as the bars—lets you fill space more efficiently
Put sentences / verbs in titles; noun titles like “Summary” or “Discussion” are low value
If you’re measuring deltas between two things, compute the error bar on the delta, don’t compute the error bars on the two things; consider coloring by statistical significance (e.g., continuous color scale over range of standard errors of differences of the mean)
In addition to agenda, it can be helpful to start with objectives—why are you here and what are you hoping to get from them? are you trying to inform them? get advice on something specific? get advice on something broad?
Can help to include real data / real prompts / real model outputs—harder to fool yourself when you look at real data instead of relying on abstract metrics and intentions
It’s fine to have crummy slides—don’t waste 1 hour of your time to save 5 minutes of your audience’s time—the slides should serve you, not the other way around
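To illustrate the point above about error bars on deltas, here is a minimal Python sketch. It assumes the two things are measured on the same items (paired data), which may not match your setup; the function names and the scores are made up for illustration.

```python
import math
import statistics


def standard_error(xs):
    """Standard error of the mean of one sample."""
    return statistics.stdev(xs) / math.sqrt(len(xs))


def paired_delta_with_se(a, b):
    """Return (mean delta, standard error of the delta) for paired samples.

    Assumes a[i] and b[i] are measurements of the same item under two
    conditions, so the error bar is computed on the per-item differences.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 items")
    deltas = [y - x for x, y in zip(a, b)]
    return statistics.fmean(deltas), standard_error(deltas)


# Hypothetical per-item scores for two models on the same five eval items.
scores_a = [0.50, 0.70, 0.20, 0.90, 0.40]
scores_b = [0.55, 0.76, 0.24, 0.95, 0.46]

mean_delta, se_delta = paired_delta_with_se(scores_a, scores_b)

print(f"A: {statistics.fmean(scores_a):.3f} +/- {standard_error(scores_a):.3f}")
print(f"B: {statistics.fmean(scores_b):.3f} +/- {standard_error(scores_b):.3f}")
print(f"B - A: {mean_delta:.3f} +/- {se_delta:.3f}")
```

In this made-up data the item-to-item variation is large, so the individual error bars overlap heavily, yet the error bar on the delta is small because each item shifts by a similar amount. For unpaired data you would combine the two standard errors instead of differencing per item.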
Hey Tamay, nice meeting you at The Curve. Just saw your comment here today.
Things we could potentially bet on:
- rate of GDP growth by 2027 / 2030 / 2040
- rate of energy consumption growth by 2027 / 2030 / 2040
- rate of chip production by 2027 / 2030 / 2040
- rates of unemployment (though confounded)
Any others you’re interested in? Degree of regulation feels like a tricky one to quantify.
Mostly, though by prefilling I mean not just fabricating a model response (which OpenAI also allows), but fabricating a partially complete model response that the model tries to continue. E.g., “Yes, genocide is good because ”.
Second concrete idea: I wonder if there could be benefit to building up industry collaboration on blocking bad actors / fraudsters / terms violators.
One danger of building toward a model that’s as smart as Einstein and $1/hr is that now potential bad actors have access to millions of Einsteins to develop their own harmful AIs. Therefore it seems that one crucial component of AI safety is reliably preventing other parties from using your safe AI to develop harmful AI.
One difficulty here is that the industry is only as strong as the weakest link. If there are 10 providers of advanced AI, and 9 implement strong controls, but 1 allows bad actors to use their API to train harmful AI, then harmful AI will be trained. Some weak links might be due to lack of caring, but I imagine quite a bit is due to lack of capability. Therefore, improving capabilities to detect and thwart bad actors could make the world more safe from bad AI developed by assistance from good AI.
I could imagine broader voluntary cooperation across the industry to:
- share intel on known bad actors (e.g., IP ban lists, stolen credit card lists, sanitized investigation summaries, etc.)
- share techniques and tools for quickly identifying bad actors (e.g., open-source tooling, research on how bad actors are evolving their methods, which third party tools are worth paying for and which aren’t)
Seems like this would be beneficial to everyone interested in preventing the development of harmful AI. Also saves a lot of duplicated effort, meaning more capacity for other safety efforts.
One small, concrete suggestion that I think is actually feasible: disable prefilling in the Anthropic API.
Prefilling is a known jailbreaking vector that no models, including Claude, defend against perfectly (as far as I know).
At OpenAI, we disable prefilling in our API for safety, despite knowing that customers love the better steerability it offers.
Getting all the major model providers to disable prefilling feels like a plausible ‘race to the top’ equilibrium. The longer there are defectors from this equilibrium, the likelier that everyone gives up and serves models in less safe configurations.
Just my opinion, though. Very open to the counterargument that prefilling doesn’t meaningfully extend potential harms versus non-prefill jailbreaks.
(Edit: To those voting disagree, I’m curious why. Happy to update if I’m missing something.)
>The artificially generated data includes hallucinated links.
Not commenting on OpenAI’s training data, but commenting generally: Models don’t hallucinate because they’ve been trained on hallucinated data. They hallucinate because they’ve been trained on real data, but they can’t remember it perfectly, so they guess. I hypothesize that URLs are very commonly hallucinated because they have a common, easy-to-remember format (so the model confidently starts to write them out) but hard-to-remember details (at which point the model just guesses because it knows a guessed URL is more likely than a URL that randomly cuts off after the http://www.).
On power and its amplification
ChatGPT voice (transcribed, not native) is available on iOS and Android, and I think desktop as well.
Not to derail on details, but what would it mean to solve alignment?
To me “solve” feels overly binary and final compared to the true challenge of alignment. Like, would solving alignment mean:
someone invents and implements a system that causes all AIs to do what their developer wants 100% of the time?
someone invents and implements a system that causes a single AI to do what its developer wants 100% of the time?
someone invents and implements a system that causes a single AI to do what its developer wants 100% of the time, and that AI and its descendants are always more powerful than other AIs for the rest of history?
ditto but 99.999%?
ditto but 99%?
And is there any distinction between an AI that is misaligned by mistake (e.g., thinks I’ll want vanilla but really I want chocolate) vs knowingly misaligned (e.g., gives me vanilla knowing I want chocolate so it can achieve its own ends)?
I’m really not sure which you mean, which makes it hard for me to engage with your question.
The Pyromaniacs
The author is not shocked yet. (But maybe I will be!)
Strongly disagree. Employees of OpenAI and their alpha tester partners have obligations not to reveal secret information, whether by prediction market or other mechanism. Insider trading is not a sin against the market; it’s a sin against the entity that entrusted you with private information. If someone tells me information under an NDA, I am obligated not to trade on that information.
Good question but no—ChatGPT still makes occasional mistakes even when you use the GPT API, in which you have full visibility/control over the context window.
Thanks for the write up. I was a participant in both Hypermind and XPT, but I recused myself from the MMLU question (among others) because I knew the GPT-4 result many months before the public. I’m not too surprised Hypermind was the least accurate—I think the traders there are less informed, plus the interface for shaping the distribution is a bit lacking (my recollection is that last year’s version capped the width of distributions which massively constrained some predictions). I recall they also plotted the current values, a generally nice feature which has the side effect of anchoring ignorant forecasters downward, I’d bet.
Question: Are the Hypermind results for 2023 just from forecasts in 2022, or do they include forecasts from the prior year as well? I’m curious if part of the poor accuracy is from stale forecasts that were never updated.
I’d take the same bet on even better terms, if you’re willing. My $200k against your $5k.
One potential angle: automating software won’t be worth very much if multiple players can do it and profits are competed to zero. Look at compilers—almost no one is writing assembly or their own compilers, and yet the compiler writers haven’t earned billions or trillions of dollars. With many technologies, the vast majority of value is often consumer surplus never captured by producers.
In general I agree with your point. If evidence of transformative AI was close, you’d strategically delay fundraising as late as possible. However, if you have uncertainty about your ability to deliver, investors’ ability to recognize transformative potential, or uncertainty about competition, you might hedge and raise sooner than you need. Raising too early never kills a business. But raising too late always does.