Software engineer and small-time DS/ML practitioner.
Templarrr
The situation is that there is a new drug that is helping people without hurting anyone, so they write an article about how it is increasing ‘health disparities.’
Isn’t “solving for the equilibrium” a big thing in this community? That’s what articles like this do: count not only first-order effects, but also what those lead to.
Specifically: people with money and resources gobbling up all the available “miracle” drug, making people with fewer resources unable to get it even for the established medical use. So yeah, I really don’t see a problem with the article title (specifically the title, I haven’t read the content!), it’s stating the facts. Finding a new use for a limited resource makes poor people’s access to it even worse than before.
Of course, “let’s make fewer miracle drugs” isn’t a solution; the solution is to make more of them, so that everyone who needs one can get one. Finding new cures isn’t the problem, terrible distribution pipelines are.
only to find out it is censored enough I could have used DALL-E and MidJourney.
The last “censoring” of Stable Diffusion was done in code and could’ve been turned off with a two-line code change. Was it done differently this time?
Probably some people would have, if asked in advance, claimed that it was impossible for arbitrarily advanced superintelligences to decently compress real images into 320 bits
And it still is.
This is really pushing the definition of what can be considered “image compression”. Look, I can write the sentence “black cat on the chessboard” and most of you (except the people with aphantasia) will see an image in your mind’s eye. And that phrase is just 27 bytes! I have better “image compression” than in the whitepaper! Of course everyone sees a different image, but that’s just “high-frequency details”, not the core meaning.
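A back-of-the-envelope sketch of the comparison above; the 320-bit figure is from the quoted claim, and the prompt size is just its UTF-8 byte count:

```python
# The text-prompt "codec" vs the 320-bit budget from the quoted claim.
prompt = "black cat on the chessboard"
prompt_bytes = len(prompt.encode("utf-8"))  # 27 bytes
prompt_bits = prompt_bytes * 8              # 216 bits

claim_bits = 320                            # the whitepaper's budget
claim_bytes = claim_bits // 8               # 40 bytes

print(prompt_bytes, prompt_bits)  # 27 216
print(claim_bytes)                # 40
```

So the prompt “encoding” is indeed smaller than the 320-bit budget; the catch, as noted above, is that every decoder (reader) reconstructs a different image.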
First it was hands. Then it was text, and multi-element composition. What can we still not do with image generation?
Text generation is considerably better, but still limited to a few words, maybe a few sentences. Ask it to generate a monitor with Python code on it and you’ll see the current limitations. It is an improvement for sure, but in no way a “solved” task.
POSIWID (the purpose of a system is what it does). The metric being optimized is not “having the most money”. It is debatable whether it should be; as one of the “poor Europeans”, my personal opinion is that we’re doing just fine.
There are 2 topics mixed here:
1. The existence of contrarians.
2. The side effects of their existence.
My own opinion on 1 is that they are necessary in moderation. They do the “exploration” part of the “exploration-exploitation dilemma”. By the very fact of their existence they allow society in general to check alternatives and find more optimal solutions than the already known “best practices”. It’s important to remember that almost everything we know now started with some contrarian: once it was well-established truth that monarchy was the best way to rule the people and democrats were dangerous radicals.
On 2: it is indeed a problem that contrarian opinions are more interesting on average, but the solution lies not in somehow making them less attractive, but in making conformist materials more interesting and attractive. That’s why it is paramount to have highly professional science educators and communicators, not just academics. My own favorites are the vlogbrothers (John and Hank Green) in particular and their team at Complexly in general.
Penicillin. Gemini tells me that the antibiotic effects of mold had been noted 30 years earlier, but nobody investigated it as a medicine in all that time.
Gemini is telling you the popular urban-legend-level understanding of what happened. The story of penicillin’s creation as a random event, “by mistake”, has at most a tangential connection to reality. But it is a great story, so it spread like wildfire.
In most cases when we read “nobody investigated”, it actually means “nobody had succeeded yet, so they weren’t in a hurry to make it known”, which isn’t a very informative data point. No one ever succeeds, until they do. And in this case it’s not even that: the antibiotic properties of some molds were known and applied for centuries before (well, obviously, before germ theory they weren’t known as “antibiotic”, just as something that helped...). The great work of Fleming and later scientists was finding a particularly effective type of mold, extracting the exact effective chemical, and finding a way to produce it at scale.
I wonder at which point we’ll start seeing LLM-on-a-chip.
One big reason for current ML/AI systems’ inefficiency is abstraction-layering overhead, the price we pay for flexibility. We currently run hardware that runs binary calculations that run software that runs other software that runs other software (many, many layers here: OS, drivers, programming-language stacks, NN frameworks, etc.) that finally runs the part we’re actually interested in: a bunch of matrix calculations representing the neural network. If we collapse all the unnecessary layers in between, burning the calculations directly into hardware, running a particular model should be extremely fast and cheap.
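A toy sketch of the point above: strip away every layer and what remains at the bottom is just multiply-accumulate operations, which is exactly what could be burned directly into hardware. The weights and input here are made up purely for illustration:

```python
# The irreducible core of a neural-network forward pass: for each layer,
# multiply a weight matrix by an activation vector and accumulate.
def matvec(matrix, vector):
    """One layer's worth of work: each row of weights dotted with the input."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

weights = [[0.5, -1.0], [2.0, 0.25]]  # hypothetical 2x2 weight matrix
activations = [1.0, 2.0]              # hypothetical input vector

print(matvec(weights, activations))   # [-1.5, 2.5]
```

Everything else in the software stack exists to schedule, batch, and feed operations like this one; a fixed model baked into silicon would need none of it.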
Templarrr’s Shortform
Thank you for this summary! It is nice to see someone covering these topics here, I personally rarely have enough nerves left after 2+ years of this hell. Victory to Ukraine and peaceful skies to us all!
This is what happens when the min wage is too high
… These automated kiosks have existed for years and were used in McDonald’s for years. And in the places where they were installed, McDonald’s had better employment, not worse: there was exactly the same number of staff members on the same-ish salary but with a decreased load on each member, while, as stated, leading to slightly bigger orders and less awkwardness.
So far I have been highly underwhelmed by what has been done with newly public domain properties
Some can argue it’s quite an argument in favor of shortening the protected period. We can observe first-hand that works going public doesn’t cause any problems for the previous owners at all, and my opinion is that we are erring too far on the owners’ side. If we want a proper balance between ownership and creativity, we need to put the threshold somewhere where it is at least a mild inconvenience for the owners, maybe more.
Oh, there are absolutely correct places to use the phrase and correct places to benefit from reliable simplicity! My main argument is against the mindless usage that I unfortunately witness a lot nowadays. Understanding of why and when we need to solve for the equilibrium gets replaced by simple belief in a rule that we should, always and for everything.
Cult of equilibrium
Depends on what you include in the definition of an LLM. The NN itself? Sure, it can. With the caveat of hardware and software limitations: we aren’t dealing with EXACT math here; floating-point rounding and the non-deterministic order of completion in parallel computation will introduce slight differences from run to run, even though the underlying math stays the same.
The system that preprocesses information, feeds it into the NN, and postprocesses the NN output into readable form? That is trickier, given that these usually involve some form of randomness; otherwise the LLM output would be exactly the same given exactly the same inputs, and that is generally frowned upon as not very AI-like behavior. But if the system uses pseudo-random generators for that, those can also be described in math terms, if you know the generator’s seed.
If they use a non-deterministic source for their randomness, then no. But that is rarely required and makes the system really difficult to debug, so I doubt it.
Both Gemini and GPT-4 also provide quite interesting answers on the very same prompt.
Adam Grant suggests: “I’m giving you these comments because I have very high expectations for you, and I’m confident you can reach them. I’m trying to coach you. I’m trying to help you.” Then you give them the feedback. Love it.
These are great, but unfortunately they only work if the person is ready to accept your authority as a coach. If they don’t, they work in the opposite direction.
California Fatburger manager trims hours, eliminates vacation days and raises menu prices in anticipation of $20/hour fast food minimum wage. That seems like a best case...
That’s not how any of this works. You don’t do this beforehand because there will be a $20/h wage. If you actually need to, you prepare plans conditional on wages becoming $20/h. If you do this now, it’s because of greed. And because of greed you’ll also repeat it when wages actually rise.
Writers and artists say it’s against the rules to use their copyrighted content to build a competing AI model
The main difference is that they say it NOW, after the fact, while OpenAI said so beforehand. There’s a long history of bad things happening when laws and rules are introduced retroactively.
virtually all the violent crime prosecutions were caused by a few hundred people. Which is very much not the same. That’s the real reason why the EU “pretends that we do not know such things”. If the goal is to continue prosecuting whom we’ve always prosecuted, we can use AI all the way. If we want to do better… we can’t.