Software engineer and small-time DS/ML practitioner.
Templarrr
Penicillin. Gemini tells me that the antibiotic effects of mold had been noted 30 years earlier, but nobody investigated it as a medicine in all that time.
Gemini is telling you the popular, urban-legend-level understanding of what happened. The creation of penicillin as a random event, “by mistake”, has at most a tangential connection to reality. But it is a great story, so it spread like wildfire.
In most cases, when we read “nobody investigated” it actually means “nobody had succeeded yet, so they weren’t in a hurry to make it known”, which isn’t a very informative data point. No one ever succeeds, until they do. And in this case it’s not even that: the antibiotic properties of some molds were known and applied for centuries before (obviously, before germ theory they weren’t known as “antibiotic”, just as something that helped...). The great work of Fleming and the later scientists was in finding a particularly effective type of mold, extracting the exact effective chemical, and finding a way to produce it at scale.
I wonder at which point we’ll start seeing LLM-on-a-chip.
One big reason for the inefficiency of current ML/AI systems is simply abstraction-layering overhead, the price we pay for flexibility. We currently run hardware that runs binary calculations that run software that runs other software that runs yet more software (many, many layers here: OS, drivers, programming-language stacks, NN frameworks, etc.) that finally runs the part we’re actually interested in: a bunch of matrix calculations representing the neural network. If we collapsed all the unnecessary layers in between, burning the calculations directly into hardware, running a particular model should be extremely fast and cheap.
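To illustrate the point, a minimal sketch (all names and sizes here are illustrative, not taken from any real model): once you strip away every framework layer, the inner step of a neural network is little more than a couple of matrix multiplications and a nonlinearity, which is exactly the kind of fixed computation a hypothetical LLM-on-a-chip could bake into silicon.

```python
import numpy as np

# Toy feed-forward sublayer: everything below is plain matrix math,
# the part that remains after collapsing all the software layers.
rng = np.random.default_rng(0)

d_model, d_ff = 8, 32                       # illustrative sizes only
W_in = rng.standard_normal((d_model, d_ff))
W_out = rng.standard_normal((d_ff, d_model))

def ffn(x):
    """Two matmuls and a ReLU: the whole 'layer' in three operations."""
    return np.maximum(x @ W_in, 0.0) @ W_out

x = rng.standard_normal((1, d_model))       # one token's hidden state
y = ffn(x)
print(y.shape)  # (1, 8)
```

A real model is thousands of such blocks with fixed, pretrained weights, which is what makes hard-wiring them plausible: the computation graph never changes, only the input does.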
Templarrr’s Shortform
Thank you for this summary! It is nice to see someone covering these topics here, I personally rarely have enough nerves left after 2+ years of this hell. Victory to Ukraine and peaceful skies to us all!
This is what happens when the min wage is too high
… These automated kiosks have existed for years and were used at McDonald’s for years. And in the places where they were installed, McDonald’s had better employment, not worse: exactly the same number of staff members on the same-ish salary, but with a decreased load on each member, while, as stated, leading to slightly bigger orders and less awkwardness.
So far I have been highly underwhelmed by what has been done with newly public domain properties
One could argue this is quite an argument in favor of shortening the protection period. We can observe firsthand that works entering the public domain cause no problems at all for their previous owners, and in my opinion that means we have pushed the threshold too far out. If we want a proper balance between ownership and creativity, we need to put the threshold somewhere where expiration is at least a mild inconvenience for the owners, maybe more.
Oh, there are absolutely correct places to use the phrase, and correct places to benefit from reliable simplicity! My main argument is against the mindless usage that I unfortunately witness a lot nowadays. Understanding why and when we need to solve for the equilibrium gets replaced by the simple belief in a rule that we should, always and for everything.
Cult of equilibrium
Depends on what you include in the definition of LLM. The NN itself? Sure, it can. With the caveat of hardware and software limitations: we aren’t dealing with EXACT math here; floating-point rounding and the non-deterministic completion order of parallel computations will introduce slight differences from run to run, even though the underlying math stays the same.
The system that preprocesses information, feeds it into the NN, and postprocesses the NN output into readable form? That is trickier, given that these usually involve some form of randomness; otherwise the LLM’s output would be exactly the same given exactly the same inputs, and that is generally frowned upon as not very AI-like behavior. But if the system uses pseudo-random generators for that, those can also be described in math terms, if you know the generator’s seed.
If they use a non-deterministic source for their randomness, then no. But that is rarely required and makes the system really difficult to debug, so I doubt it.
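A minimal sketch of the seed point (the sampling function below is illustrative, not any real LLM pipeline): with a pseudo-random generator and a known seed, even the “random” sampling step is a fully deterministic function of its inputs.

```python
import random

def sample_token(logits, seed):
    """Toy sampling step: pick an index weighted by exp(score),
    using a seeded pseudo-random generator."""
    rng = random.Random(seed)  # fixed seed => fully reproducible
    weights = [2.718281828 ** score for score in logits]  # softmax numerators
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

logits = [0.1, 2.0, 0.5]
a = sample_token(logits, seed=42)
b = sample_token(logits, seed=42)
print(a == b)  # True: same seed, same inputs, same "random" choice
```

Swap `random.Random(seed)` for a hardware entropy source and this reproducibility disappears, which is exactly why production systems tend to prefer seedable generators for debuggability.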
Both Gemini and GPT-4 also provide quite interesting answers to the very same prompt.
Adam Grant suggests: “I’m giving you these comments because I have very high expectations for you, and I’m confident you can reach them. I’m trying to coach you. I’m trying to help you.” Then you give them the feedback. Love it.
These are great, but unfortunately they only work if the person is ready to accept your authority as a coach. If they aren’t, they work in the opposite direction.
California Fatburger manager trims hours, eliminates vacation days and raises menu prices in anticipation of $20/hour fast food minimum wage. That seems like a best case...
That’s not how any of this works. You don’t do that beforehand just because wages will become $20/h; if you actually need to, you prepare plans conditional on wages becoming $20/h. If you do this now, it’s because of greed. And because of greed you’ll also repeat it when wages actually rise.
Writers and artists say it’s against the rules to use their copyrighted content to build a competing AI model
The main difference is that they say it NOW, after the fact, while OpenAI said so beforehand. There’s a long history of bad things happening when laws and rules are introduced retroactively.
You need a way to not punish (too harshly or reliably) the shoplifting mom in need, without enabling roving gangs
And the easiest way to do so would be to make it so moms don’t need to shoplift: provide the essentials in a centralized way, free of charge or at minimal prices. But in the USA that will immediately be labeled “socialism”, and “socialism is bad”.
It really is weird that we don’t think about Russia, and especially the USSR, more in terms of the universal alcoholism.
“Apart from drinking, there is absolutely nothing to do here”. Well, they found an alternative—go kill neighbors. Locally it’s a crime, but when on the scale of countries...
The “relative to 1961” label is doing a lot of storytelling here that isn’t necessarily present in the original raw data.
Policies are organizational scar tissue. They are codified overreactions to situations that are unlikely to happen again
An oversimplification. In most situations where people point to stats like this, they conveniently forget that these situations became unlikely to happen again BECAUSE of the policy. If you use an analogy, use it all the way: scar tissue is an important part of healing. The first instance created an open wound, and you don’t want to be left with an open wound.
technology is predictable if you know the science
The single part of an otherwise amazing quote that is simply, verifiably not true. There are tons of examples where the technological use of a scientific principle or discovery came as a complete surprise to the scientists who created or discovered it.
If we don’t want China to have access to cutting edge chips, why are we allowing TSMC and Samsung to set up chip manufacturing in China?
Because the “we” that don’t want China to have these and the “we” that actually have a say in what TSMC and Samsung are doing are two different “we”s.
There are 2 topics mixed here:
1. The existence of contrarians.
2. The side effects of their existence.
My own opinion on 1 is that they are necessary, in moderation. They do the “exploration” part of the “exploration-exploitation dilemma”. By the very fact of their existence they allow society at large to check alternatives and find solutions to problems that are more optimal than the already-known “best practices”. It’s important to remember that almost everything we know now started with some contrarian: it was once a well-established truth that monarchy is the best way to rule the people, and democrats were dangerous radicals.
On 2 - it is indeed a problem that contrarian opinions are more interesting on average, but the solution lies not in somehow making them less attractive, but in making conformist materials more interesting and attractive. That’s why it is paramount to have highly professional science educators and communicators, not just academics. My own favorites are the vlogbrothers (John and Hank Green) in particular and their team at Complexly in general.