anithite

Karma: 391

B.Eng (Mechatronics)

anithite Mar 30, 2025, 7:04 AM
1 point
0
on: Tormenting Gemini 2.5 with the [[[]]][][[]] Puzzle
TLDR:I got stuck on the notation [a][b][c][...]→f(a,b,c,...). LLMs probably won’t do much better on that for now. Translate it into “find an unknown f(*args)” and the LLMs get it right with probability ~20%, depending on the model. o3-mini-high does better. Sonnet 3.7 did get it in one shot, but I had it write code for doing the substitutions, which it otherwise messes up a lot.

Like others, I looked for some sort of binary operator or concatenation rule. Replacing “][” with “|” or ”,” would have made this trivial. Straight string substitutions don’t work since “[[]]” can be either 2 or “[...][1][...]” as part of a prime exponent set. The notation is the problem. Staring at string diffs would have helped in hindsight maybe.

Turning this into an unknown f() puzzle makes it straightforward for LLMs (and humans) to solve.
```
1 = f()
2 = f(f())
3 = f(0,f())
4 = f(f(f()))
12 = f(f(f()),f())
0 = 0
-1 = -f()
19 = f(0,0,0,0,0,0,0,f())
20 = f(f(f()),0,f())
-2 = -f(f())
1/2 = f(-f())
sqrt(2) = f(f(-f()))
72^(1/6) = f(f(-f()),f(0,-f()))
5/4 = f(-f(f()),0,f())
84 = f(f(f()),f(),0,f())
25/24 = f(-f(0,f()),-f(),f(f()))
```
Substitutions are then quite easy though most of the LLMs screw up a substitution somewhere unless they use code to do string replacements or do thinking where they will eventually catch their mistake.

Then it’s ~25% likely they get it in one shot, and ~100% if you mention that primes are involved or that addition isn’t. Depends on the LLM. o3-mini-high got it. Claude 3.7 got it in one shot with no hints from a fully substituted starting point, but that was best of k≈4 with lots of failures otherwise. Models have strong priors for addition as a primitive and definitely don’t approach things systematically. Suggesting they focus on single operand evaluations (2, 4, 1/2, sqrt(2)) gets them on the right track, but there’s still a bias towards addition.
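
For reference, every row of the f() table is consistent with reading f(a, b, c, ...) as 2^a · 3^b · 5^c · ..., i.e. each argument is the exponent of the next prime. Here is a minimal Python sketch under that assumption. The `PRIMES` list, the `checks` table and the use of floats are illustrative choices, not part of the puzzle.

```python
import math

# First eight primes; enough for every row in the table (19 is the 8th prime).
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19]

def f(*exponents):
    """Assumed rule: f(a, b, c, ...) = 2**a * 3**b * 5**c * ..."""
    if len(exponents) > len(PRIMES):
        raise ValueError("extend PRIMES to handle more arguments")
    result = 1.0  # f() is the empty product
    for prime, exponent in zip(PRIMES, exponents):
        result *= prime ** exponent
    return result

# Spot checks against rows of the table (floats, so compare approximately).
checks = {
    "12": (f(f(f()), f()), 12),
    "19": (f(0, 0, 0, 0, 0, 0, 0, f()), 19),
    "1/2": (f(-f()), 1 / 2),
    "72^(1/6)": (f(f(-f()), f(0, -f())), 72 ** (1 / 6)),
    "25/24": (f(-f(0, f()), -f(), f(f())), 25 / 24),
}
for label, (got, want) in checks.items():
    print(label, math.isclose(got, want))
```

Note there is no addition anywhere in the rule, only products of prime powers, which fits the hint that works on the models.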

anithite Feb 27, 2025, 11:01 PM
1 point
0
on: The Illusion of Iterative Improvement: Why AI (and Humans) Fail to Track Their Own Epistemic Drift
None of the labs would be doing undirected drift. That wouldn’t yield improvement for exactly the reasons you suggest.

In the absence of a ground truth quality/correctness signal, optimizing for coherence works. This can give prettier answers (in the way that averaged faces are prettier) but this is limited. The inference time scaling equivalent would be a branching sampling approach that searches for especially preferred token sequences rather than the current greedy sampling approach. Optimising for idea level coherence can improve model thinking to some extent.
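
To make the greedy vs branching distinction concrete, here is a toy Python sketch. `TOY_MODEL` is a made-up next-token table, and "preferred" is taken to mean plain sequence probability, which is an assumption made for illustration. A real system would score candidates with whatever coherence or preference signal is being optimised.

```python
import math

# Hypothetical toy "model": maps a prefix (tuple of tokens) to next-token probabilities.
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "y": 0.45},
    ("b",): {"c": 0.9, "d": 0.1},
}

def greedy(steps):
    """Always take the single most likely next token."""
    seq, logp = (), 0.0
    for _ in range(steps):
        probs = TOY_MODEL[seq]
        token = max(probs, key=probs.get)
        seq, logp = seq + (token,), logp + math.log(probs[token])
    return seq, logp

def beam_search(steps, width):
    """Keep the `width` best partial sequences at every step, then return the best."""
    beams = [((), 0.0)]
    for _ in range(steps):
        candidates = [
            (seq + (token,), logp + math.log(p))
            for seq, logp in beams
            for token, p in TOY_MODEL[seq].items()
        ]
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:width]
    return beams[0]

print(greedy(2)[0])          # greedy commits to "a" and cannot recover
print(beam_search(2, 2)[0])  # branching keeps "b" alive and finds the likelier sequence
```

With a width of 1 the branching search collapses back to greedy, so the extra inference compute is what buys the better sequence.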

For improving raw intelligence significantly, ground truth is necessary. That’s available in STEM domains, computer programming tasks being the most accessible. One can imagine grounding hard engineering the same way with a good mechanical/electrical simulation package. TLDR:train for test-time performance.

Then just cross your fingers and hope for transfer learning into softer domains.

For softer domains, ground truth is still accessible via tests on humans (EG:optimise for user approval). This will eventually yield super-persuaders that get thumbs up from users. Persuasion performance is trainable but maybe not a wise thing to train for.

As to actually improving some soft domain skill like “write better english prose” that’s not easy to optimise directly as you’ve observed.

anithite Feb 6, 2025, 3:26 AM
2 points
0
in reply to: anithite’s comment on: A Visual Task that’s Hard for GPT-4o, but Doable for Primary Schoolers
O1 now passes the simpler “over yellow” test from the above. Still fails the picture book example though.

For a complex mechanical drawing, O1 was able to work out easier dimensions but anything more complicated tends to fail. Perhaps the full O3 will do better given ARC-AGI benchmark performance.

Meanwhile, Claude 3.5 and 4o do somewhat worse, failing to correctly identify axial and diameter dimensions.

Visuospatial performance is improving albeit slowly.

anithite Feb 6, 2025, 2:07 AM
2 points
0
on: The Risk of Gradual Disempowerment from AI
My hope is that the minimum viable pivotal act requires only near human AGI. For example, hack competitor training/inference clusters to fake an AI winter.
Aligning +2SD human equivalent AGI seems more tractable than straight up FOOMing to ASI safely.

One lab does it to buy time for actual safety work.
Unless things slow down massively we probably die. An international agreement would be better but seems unlikely.

anithite Jan 20, 2025, 8:34 PM
5 points
0
on: grey goo is unlikely
This post raises a large number of engineering challenges. Some of those engineering challenges rely on other assumptions being made. For example, the use of energy carrying molecules rather than electricity or mechanical power which can cross vacuum boundaries easily. Overall a lot of “If we solve X via method Y (which is the only way to do it) problem Z occurs” without considering making several changes at once that synergistically avoid multiple problems.

“Too much energy” means too much to be competitive with normal biological processes.

That goalpost should be right at the top and clearly stated instead of “microscopic machines that [are] superior”. “grey goo alone will have doubling times slower than optimised biological systems” is definitely plausible. E. coli can double in 20 minutes in nutrient rich conditions, which is hard to beat. If wet nanotech doubles faster but dry nanotech can make stuff biology can’t, then use both: dry for critical process steps and making high value products, and wet for eating the biosphere and scaling up.

Newer semiconductor manufacturing processes use more energy and materials to create each transistor but those transistors use less power and run faster which makes producing them worthwhile. Dry nanotech will be a tool for making things that may be expensive but worthwhile to build like really awesome computers.

Wet nanotech (IE:biology) is plausibly the most efficient at self-replicating but notice humans use all sorts of chemical and physical processes to do other things better. Operating in space with biotech alone for example would be quite difficult.

anithite Dec 27, 2024, 8:54 PM
3 points
0
on: Review: Planecrash
Your image links are all of the form: http://localhost:8000/out/planecrash/assets/Screenshot 2024-12-27 at 00.31.42.png

Whatever process is generating the markdown for this, well those links can’t possibly work.

anithite Dec 27, 2024, 6:15 PM
3 points
0
in reply to: Eric Neyman’s comment on: AI #96: o3 But Not Yet For Thee
I got this one wrong too. Ignoring negative roots is pretty common for non-mathematicians.

I’m half convinced that most of the lesswrong commenters wouldn’t pass as AGI if uploaded.

anithite Dec 12, 2024, 1:21 PM
3 points
0
on: Green goo is plausible
This post is important for setting a lower bound on the AI capabilities required for an AI takeover or pivotal act. Biology is an existence proof that some kind of “goo” scenario is possible. It somewhat lowers the bar compared to Yudkowsky’s dry nanotech scenario but still requires AI to practically build an entire scientific/engineering discipline from scratch. Many will find this implausible.

Digital tyranny is a better capabilities lower bound for a pivotal act or AI takeover strategy. It wasn’t nominated though which is a shame.

anithite Sep 13, 2024, 9:45 PM
5 points
0
on: The Great Data Integration Schlep
This is why I disagree with a lot of people who imagine an “AI transformation” in the economic productivity sense happening instantaneously once the models are sufficiently advanced.

For AI to make really serious economic impact, after we’ve exploited the low-hanging fruit around public Internet data, it needs to start learning from business data and making substantial improvements in the productivity of large companies.
Definitely agree that private business data could advance capabilities if it were made available/accessible. Unsupervised Learning over all private CAD/CAM data would massively improve visuo-spatial reasoning which current models are bad at. Real problems to solve would be similarly useful as ground truth for reinforcement learning. Not having that will slow things down.
Once long(er) time horizon tasks can be solved though I expect rapid capabilities improvement. Likely a tipping point where AIs become able to do self-directed learning.
- find technological thing: software/firmware/hardware
- Connect the AI to it robustly.
  - For hardware, the AI is going to brick it, so either have lots of spares or be able to re-flash firmware at will
  - for software this is especially easy. Current AI companies are likely doing a LOT of RL on programming tasks in sandboxed environments.
- AI plays with the thing and tries to get control of it
  - can you rewrite the software/firmware?
  - can you get it to do other cool stuff?
  - can the artefact interact with other things?
    - send packets between wifi chips (how low can round-trip time be pushed?)
    - make sound with anything that has a motor
- Some of this is commercially useful and can be sold as a service.
Hard drives are a good illustrative example. Here’s a hardware hacker reverse engineering and messing with the firmware to do something cool.
There is … so much hardware out there that can be bought cheaply and then connected to with basic soldering skills. In some cases, if soft-unbricking is possible, just buy and connect to ethernet/usb/power.
Revenue?
There’s a long tail (as measured by commercial value) of real world problems that are more accessible. On one end you have the subject of your article, software/devices/data at big companies. On the other, obsolete hardware whose mastery has zero value, like old hard disks. The distribution is somewhat continuous. Transaction costs for very low value stuff will set a floor on commercial viability but $1K+ opportunities are everywhere in my experience.
Not all companies will be as paranoid/obstructive. A small business will be happy using AI to write interface software for some piece of equipment to skip the usual pencil/paper --> excel-spreadsheet step. Many OEMs charge ridiculous prices for basic functionality and nickel and dime you for small bits of functionality since only their proprietary software can interface with their hardware. Reverse engineering software/firmware/hardware can be worth thousands of dollars. So much of it is terrible. AI competent at software/firmware/communication reverse engineering could unlock a lot of value from existing industrial equipment. OEMs can and are building new equipment to make this harder but industrial equipment already sold to customers isn’t so hardened.
IOT and home automation is another big pool of solvable problems. There’s some overlap between home automation and industrial automation. Industrial firmware/software complexity is often higher, but AI that learns how to reverse engineer IOT wireless microcontroller firmware could probably do the same for a PLC. Controlling a lightbulb is certainly easier than controlling a CNC lathe, but similar software reverse engineering principles apply and the underlying plumbing is often similar.

anithite Sep 8, 2024, 3:10 AM
13 points
0
on: That Alien Message—The Animation
Alternate POV

Science fiction. < 10,000 words. A commissioned re-write of Eliezer Yudkowsky’s That Alien Message https://alicorn.elcenia.com/stories/starwink.shtml

since it hasn’t been linked so far and doesn’t seem to be linked from the original

anithite Aug 22, 2024, 9:46 PM
1 point
0
in reply to: Carl Feynman’s comment on: It’s time for a self-reproducing machine
TLDR:an autofac requires solving the “make (almost) arbitrary metal parts” problem, but that won’t close the loop. The hard problem is building an automated/robust re-implementation of some of the economy, which requires engineering effort, not trial and error. That is the bottleneck, including for the autofac. Need STEM AI (Engineering AI mostly). Once that happens, the economy gets taken over and grows rapidly as things start to actually work.

To expand on that:

“make (almost) arbitrary metal parts”
- can generate a lot of economic value
- requires essentially giant github repo of:
  - hardware designs:machine tools, robots, electronics
  - software:for machines/electronics, non-ai automation
  - better CAD/CAM (this is still mostly unsolved (CF:white collar CAD/CAM workers))
  - AI for tricky robotics stuff
- estimate:100-1000 engineer years of labor from 99th percentile engineers.
  - Median engineers are counterproductive as demonstrated by current automation efforts not working.
  - EG:we had two robots develop “nerve damage” type repetitive strain injury because badly routed wires flexed too much. If designers aren’t careful/mindful of all design details things won’t be reliable.
    This extends to sub-components.
Closing the loop needs much more than just “make (almost) arbitrary metal parts”. “build a steel mill and wire drawing equipment” is just the start. There are too many vitamins needed, each representing an unimplemented process.
A minimalist industrial core needs things like:
- PCB/electronics manufacturing (including components)
  - IC manufacturing is its own mess
- a lot of chemistry for plastics/lubricants
- raw materials production (rock --> metal) and associated equipment
- wire
Those in turn imply other things like:
- refractory materials for furnaces
- corrosion resistant coatings (nickel?, chromium?)
- non-traditional machining (try making a wire drawing die with a milling machine/lathe)
  - ECM/EDM is unavoidable for many things
Things just snowball from there.

Efficiency improvements like carbide+coatings for cutting tools are also economically justified.

All of this is possible to design/build into an even bigger self-reproducing automated system but requires more engineer-hours put into a truly enormous git repo.

STEM AI development (“E” emphasis) is the enabler.

Addendum: simplifying the machine tools and robots
Simplifications can be made to cut down on vitamin cost of machine tools. Hydraulics really helps IMO:
- servohydraulics for most motion (EG:linear machine tool axes, robots) to cut down on motor sizes and simplify manufacturing
  - efficiency is worse, but saves enormously on power electronics and manufacturing complexity.
  - Similar principles to hydraulic power steering used in cars.
    Boston Dynamics ATLAS Robot uses rotary equivalent for joints (I think).
  - https://www.researchgate.net/figure/Comparison-of-the-Hy-Mo-actuator-with-traditional-hydraulic-actuator-52_fig5_346755993
- hydrostatic bearings for rotary/linear motion in machine tools.
  - No hardened metal parts like in ball/roller bearings
- Feeding fluid without flexible hoses across linear axes via structure similar to a double ended hydraulic cylinder. Just need two sliding rod seals and a hollow rod. Hole in the middle of the center rod lets fluid into the space between the rod seals.
- Both bearings and motion can run off a single shared high pressure oil supply. Treat it like electricity/compressed-air and use one big pump for lots of machines/robots.
End result: machine tools with big spindle motors and small control motors for all axes. Robots use rotary equivalent. Massive reduction in per-axis power electronics, no ballscrews, no robot joint gears.

For linear/rotary position encoders, calibrated capacitive encoders (same as used in digital calipers) are simple and need just PCB manufacturing. Optical barcode based systems are also attractive but require an optical mouse worth of electronics/optics per axis, and maybe glass optics too.

anithite Aug 12, 2024, 10:35 PM
27 points
8
on: It’s time for a self-reproducing machine
The key part of the Autofac, the part that kept it from being built before, is the AI that runs it.

That’s what’s doing the work here.

We can’t automate machining because an AI that can control a robot arm to do typical machinist things (EG:changing cutting tool inserts, removing stringy steel-wool-like tangles of chips, etc.) doesn’t exist or is not deployed.

If you have a robot arm + software solution that can do that it would massively drop operational costs which would lead to exponential growth.

The core problem is that currently we need the humans there.

To give concrete examples, a previous company where I worked had been trying to fully automate production for more than a decade. They had robotic machining cells with machine tools, parts cleaners and coordinate measuring machines to measure finished parts. Normal production was mostly automated in the sense that hands off production runs of 6+ hours were common, though particular cells might be needy and require frequent attention.

What humans had to do:
- Changing cutting tools isn’t automated. An operator goes in with a screwdriver and box of carbide inserts 1-2x per shift.
- The operators do a 30-60 minute setup to get part dimensions on size after changing inserts.
  - Rough machining can zero tools outside the machine since their tolerances are larger but someone is sitting there with a T-handle wrench so the 5-10 inserts on an indexable end mill have fresh cutting edges.
Intermittent problems operators handled:
- A part isn’t perfectly clean when measuring. Bad measurement leads to bad adjustment and 1-2 parts are scrap
- chips are too stringy and tangle up, clear the tangle every 15 mins
- chips are getting between a part and fixture and messing up alignment, clean intermittently and pray.
That’s ignoring stupider stuff like:
- Our measurement data processing software just ate the data for a production run so we have to stop production and remeasure 100 parts.
- someone accidentally deleted some files so (same)
- We don’t have the CAM done for this part scheduled to be produced so … find something else we can run or sit idle.
- The employee running a CAM software workflow didn’t double-check the tool-paths and there’s a collision.
And that’s before you get to maintenance issues and (arguably) design defects in the machines themselves leading to frequent breakdowns.

The vision was that a completely automated system would respond to customer orders and schedule parts to be produced with AGVs carrying parts between operations. In practice the AGVs sat there for 10+ years because even the simple things proved nearly impossible to automate completely.

Conclusion

Despite all the problems, the automated cells were more productive than manual production (robots are consistent and don’t need breaks) and the company was making a lot of money. Not great automation is much much better than using manual operators.

It’s hard to grasp how much everything just barely works until you’ve spent a year somewhere that is trying to automate. Barely works is the standard in industry AFAIK so humans are still highly necessary.

It is, in theory, possible to automate almost everything and to build reliable machines and automation. The problem is O-ring theory of economic development. If tomorrow median IQ jumps +2SD automation would rapidly start to just work as actually good solutions are put in place. As is, organizations have to fight against institutional knowledge loss (employees leaving) just to maintain competence.

anithite Jul 30, 2024, 12:29 AM
3 points
1
in reply to: Lennart Finke’s comment on: A Visual Task that’s Hard for GPT-4o, but Doable for Primary Schoolers
If we consider each (include,exclude) decision for (1,2,3,4,5) as a separate question, error rates are 20%-ish. Much better than random guessing. So why does it make mistakes?
If bottlenecking on data is the problem, more data in the image should kill performance. So how about a grid of 3 digit numbers (random vals in range 100...999)?
3.5 Sonnet does perfectly: a perfect score on lookup(row/col), find_row_col(number), finding duplicates and transcription to CSV.
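
A sketch of how the ground truth for such a grid test could be generated in Python. `lookup` and `find_row_col` mirror the task names above; `make_grid`, `find_duplicates`, `to_csv` and the seed parameter are hypothetical helpers for illustration. Rendering the grid to an image and querying the model are not shown.

```python
import csv
import io
import random
from collections import Counter

def make_grid(rows, cols, seed=0):
    """Grid of random 3 digit numbers in the range 100...999."""
    rng = random.Random(seed)
    return [[rng.randint(100, 999) for _ in range(cols)] for _ in range(rows)]

def lookup(grid, row, col):
    """Expected answer for 'what number is at (row, col)?' (0-indexed)."""
    return grid[row][col]

def find_row_col(grid, number):
    """Expected answer for 'where does this number appear?' as (row, col) pairs."""
    return [(r, c) for r, row in enumerate(grid)
            for c, value in enumerate(row) if value == number]

def find_duplicates(grid):
    """Numbers that appear more than once anywhere in the grid."""
    counts = Counter(value for row in grid for value in row)
    return sorted(value for value, n in counts.items() if n > 1)

def to_csv(grid):
    """Reference transcription to compare the model's CSV output against."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(grid)
    return buffer.getvalue()

grid = make_grid(rows=10, cols=10, seed=1)
print(lookup(grid, 2, 3), find_duplicates(grid))
```

Scoring is then just exact match between the model's answer and these reference values.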
So this isn’t a bottleneck like human working memory. Maybe we need to use a higher resolution image so it has more tokens to “think” with? That doesn’t seem to work for the above yellow areas thing either though.
I’m guessing this is straightforward failure to generalize. Tables of numbers are well represented in the training data (possibly synthetic data too), visual geometry puzzles, not so much. The model has learned a few visual algorithms but hasn’t been forced to generalise yet.
Root cause might be some stupid performance thing that screws up image perception the same way BPE text encoding messes up byte level text perception. I’m guessing sparse attention.
Text representations
Text representations are no panacea. Often similar problems (EG:rotate this grid) have very formatting dependent performance. Looking for sub-tasks that are probably more common in training data and composing with those (EG:rotate by composing transpose and mirror operations) allows a model to do tasks it otherwise couldn’t. Text has generalisation issues just like with images.
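
The rotate-by-composition trick, sketched in Python on a grid stored as a list of rows. The function names are illustrative. The point is that each step is a simple, common operation and the rotation falls out of chaining two of them.

```python
def transpose(grid):
    """Swap rows and columns."""
    return [list(column) for column in zip(*grid)]

def mirror(grid):
    """Flip the grid left to right (reverse every row)."""
    return [row[::-1] for row in grid]

def rotate_clockwise(grid):
    """90 degree clockwise rotation = transpose, then mirror."""
    return mirror(transpose(grid))

def rotate_counterclockwise(grid):
    """90 degree counterclockwise rotation = mirror, then transpose."""
    return transpose(mirror(grid))

grid = [[1, 2, 3],
        [4, 5, 6]]
for row in rotate_clockwise(grid):
    print(row)
```

Prompting a model to do the transpose and the mirror as two explicit steps is the text equivalent of this composition.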

If Claude 3.5 Sonnet has pushed the frontier in Tetris that would be evidence for generalisation. I predict it still fails badly.

anithite Jul 26, 2024, 9:59 PM
12 points
4
on: A Visual Task that’s Hard for GPT-4o, but Doable for Primary Schoolers
It’s actually a lot worse than that.

Q:Which numbers are above a yellow region in the image?
GPT4o: [2,4]
GPT4: [4,5]
Claude 3.5 Sonnet: [2,5]

Claude sonnet describes the image as follows:

User: List the numbers in the above image, how are they placed in the image? What’s below them?
The numbers shown in the image are:
1, 2, 3, 4, 5
These numbers are placed in a single horizontal row, appearing to be in separate cells or boxes arranged from left to right in ascending numerical order. Each number occupies its own cell, creating a simple table-like structure with 5 columns.

Below the numbers, there are cells or boxes that correspond to each number. Some of these cells are colored, while others are not. Specifically:
1. Below number 1: The cell is white or uncolored.
2. Below number 2: The cell is colored yellow.
3. Below number 3: The cell is white or uncolored.
4. Below number 4: The cell is white or uncolored.
5. Below number 5: The cell is colored yellow.
So, out of the 5 cells below the numbers, two are colored yellow (under 2 and 5), while the rest are white or uncolored.
So these models have very very bad visuospatial abilities.

This seems like a problem that will be easy to solve with some synthetic training data.

anithite Jun 17, 2024, 5:15 PM
2 points
0
in reply to: Ebenezer Dukakis’s comment on: Quotes from Leopold Aschenbrenner’s Situational Awareness Paper
TLDR:Memory encryption alone is indeed not enough. Modifications and rollback must be prevented too.
- memory encryption and authentication has come a long way
- Unless there’s a massive shift in ML architectures to doing lots of tiny reads/writes, overheads will be tiny. I’d guesstimate the following:
  - negligible performance drop / chip area increase
  - ~1% of DRAM and cache space^[1]
It’s hard to build hardware or datacenters that resist sabotage if you don’t do this. You end up having to trust the maintenance people aren’t messing with the equipment and the factories haven’t added any surprises to the PCBs. With the right security hardware, you trust TSMC and their immediate suppliers and no one else.
Not sure if we have the technical competence to pull it off. Apple’s likely one of the few that’s even close to secure and it took them more than a decade of expensive lessons to get there. Still, we should put in the effort.
in any case, overall I suspect the benefit-to-effort ratio is higher elsewhere. I would focus on making sure the AI isn’t capable of reading its own RAM in the first place, and isn’t trying to.
Agreed that alignment is going to be the harder problem. Considering the amount of fail when it comes to building correct security hardware that operates using known principles … things aren’t looking great.
</TLDR> rest of comment is just details
Morphable Counters: Enabling Compact Integrity Trees For Low-Overhead Secure Memories
Memory contents protected with MACs are still vulnerable to tampering through replay attacks. For example, an adversary can replace a tuple of { Data, MAC, Counter } in memory with older values without detection. Integrity-trees [7], [13], [20] prevent replay attacks using multiple levels of MACs in memory, with each level ensuring the integrity of the level below. Each level is smaller than the level below, with the root small enough to be securely stored on-chip.

[improvement TLDR for this paper: they find a data compression scheme for counters to increase the tree branching factor to 128 per level from 64 without increasing re-encryption when counters overflow]
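
A toy Python illustration of the replay attack described in the quote, and of why a trusted counter stops it. This is a sketch under loud assumptions: the `trusted_counters` dict stands in for state anchored on-chip by the integrity tree root (a real design keeps the counters in DRAM under the tree and only the root on-chip), HMAC-SHA256 truncated to 8 bytes stands in for the hardware MAC, and all names are hypothetical.

```python
import hashlib
import hmac

KEY = b"illustrative on-chip secret"  # assumption: never leaves the chip

def mac(address, data, counter):
    """8 byte tag binding the data to its address and write counter."""
    message = address.to_bytes(8, "little") + counter.to_bytes(8, "little") + data
    return hmac.new(KEY, message, hashlib.sha256).digest()[:8]

def write(memory, trusted_counters, address, data):
    """Bump the counter, then store { Data, MAC, Counter } in untrusted memory."""
    counter = trusted_counters.get(address, 0) + 1
    trusted_counters[address] = counter
    memory[address] = (data, mac(address, data, counter), counter)

def read_mac_only(memory, address):
    """Trusts the counter stored next to the data: old tuples still verify."""
    data, tag, counter = memory[address]
    if not hmac.compare_digest(tag, mac(address, data, counter)):
        raise ValueError("tampering detected")
    return data

def read_with_trusted_counter(memory, trusted_counters, address):
    """Verifies against the trusted counter, so stale tuples are rejected."""
    data, tag, _ = memory[address]
    expected = mac(address, data, trusted_counters[address])
    if not hmac.compare_digest(tag, expected):
        raise ValueError("tampering or replay detected")
    return data

memory, counters = {}, {}
write(memory, counters, 0, b"balance=100")
stale = memory[0]                      # attacker snapshots the old tuple
write(memory, counters, 0, b"balance=0")
memory[0] = stale                      # attacker replays it
print(read_mac_only(memory, 0))        # replay goes undetected
try:
    read_with_trusted_counter(memory, counters, 0)
except ValueError as error:
    print(error)
```

The integrity tree exists so that the trusted part can be tiny: only the root has to live on-chip, and each level vouches for the counters below it.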
Performance cost
Overheads are usually quite low for CPU workloads:
- <1% extra DRAM required^[1]
- <<10% execution time increase
Executable code can be protected with negligible overhead by increasing the size of the rewritable authenticated blocks for a given counter to 4KB or more. Overhead is then comparable to the page table.

For typical ML workloads, the smallest data block is already 2x larger (GPU cache lines 128 bytes vs 64 bytes on CPU gives 2x reduction). Access patterns should be nice too, large contiguous reads/writes.
Only some unusual workloads see significant slowdown (EG: large graph traversal/modification) but this can be on the order of 3x.^[2]
A real example (intel SGX)
Use case: launch an application in a “secure enclave” so that host operating system can’t read it or tamper with it.
It used an older memory protection scheme:
- hash tree
- Each 64 byte chunk of memory is protected by an 8 byte MAC
- MACs are 8x smaller than the data they protect so each tree level is 8x smaller
  - split counter modes in the linked paper can do 128x per level
- The memory encryption works.
  - If intel SGX enclave memory is modified, this is detected and the whole CPU locks up.
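
Rough arithmetic on why the branching factor matters, as a Python sketch. It assumes a simplified model where every 64 byte node covers `arity` nodes of the level below and the tree stops once a single node remains. Real designs keep a larger root on-chip, so actual depths will differ. The 16 GiB figure is a hypothetical example size.

```python
def tree_levels(protected_bytes, block_bytes=64, arity=8):
    """Levels needed above the data until a single node (the root) remains."""
    nodes = -(-protected_bytes // block_bytes)  # ceiling division: data blocks
    levels = 0
    while nodes > 1:
        nodes = -(-nodes // arity)
        levels += 1
    return levels

def mac_overhead(mac_bytes=8, block_bytes=64):
    """Fraction of extra storage for one MAC per protected block."""
    return mac_bytes / block_bytes

GIB = 2 ** 30
for arity in (8, 64, 128):
    print(arity, tree_levels(16 * GIB, arity=arity))
print(mac_overhead())
```

Fewer levels means fewer extra memory reads whenever a random access misses the cached part of the tree.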
SGX was not secure. The memory encryption/authentication is solid. The rest … not so much. Wikipedia lists 8 separate vulnerabilities including ones that allow leaking of the remote attestation keys. That’s before you get to attacks on other parts of the chip and security software that allow dumping all the keys stored on chip allowing complete emulation.
AMD didn’t do any better of course: One Glitch to Rule Them All: Fault Injection Attacks Against AMD’s Secure Encrypted Virtualization
How low overheads are achieved
- ECC (error correcting code) memory includes extra chips to store error correction data. Repurposing this to store MACs +10 bit ECC gets rid of extra data accesses. As long as we have the counters cached for that section of memory there’s no extra overhead. Tradeoff is going from ability to correct 2 flipped bits to only correcting 1 flipped bit.
- DRAM internally reads/writes lots of data at once as part of a row. standard DDR memory reads 8KB rows rather than just a 64B cache line. We can store tree parts for a row inside that row to reduce read/write latency/cost since row switches are expensive.
- Memory that is rarely changed (EG:executable code) can be protected as a single large block. If we don’t need to read/write individual 64 byte chunks, then a single 4KiB page can be a rewritable unit. Overhead is then negligible if you can store MACs in place of an ECC code.
- Counters don’t need to be encrypted, just authenticated. verification can be parallelized if you have the memory bandwidth to do so.
  - can also delay verification and assume data is fine, if verification fails, entire chip shuts down to prevent bad results from getting out.
1. Technically we need +12.5% to store MAC tags. If we assume ECC (error correcting code) memory is in use, which already has +12.5% for ECC, we can store MAC tags + smaller ECC at the cost of 1 bit of error correction.
2. Random reads/writes bloat memory traffic by >3x since we need to deal with 2+ uncached tree levels. We can hide latency by delaying verify of higher tree levels and panicking if it fails before results can leave chip (Intel SGX does exactly this). But if most traffic bloats, we bottleneck on memory bandwidth and perf drops a lot.
What links here?
- Noosphere89's comment on The Hopium Wars: the AGI Entente Delusion by Max Tegmark (Oct 14, 2024, 8:10 PM; 6 points)

anithite Jun 10, 2024, 4:54 AM
2 points
0
in reply to: Ebenezer Dukakis’s comment on: Quotes from Leopold Aschenbrenner’s Situational Awareness Paper
What you’re describing above is how BitLocker works on every modern Windows PC. The startup process involves a chain of trust, with each bootloader verifying the next thing to start and handing off keys until Windows starts. Crucially, the keys are different if you start something that’s not Windows (IE:not signed by Microsoft). You can’t just boot Linux and decrypt the drive, since different keys would be generated for Linux during boot and they won’t decrypt the drive.
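
A toy Python model of why booting something else yields different keys. It is a simplification: `extend` follows the TPM-style hash chain, but `disk_key` derives the key directly from the measurements, whereas real systems typically seal a stored key to the measurements instead. All names and byte strings here are hypothetical.

```python
import hashlib
import hmac

def extend(register, component):
    """TPM-style extend: new = H(old || H(component))."""
    return hashlib.sha256(register + hashlib.sha256(component).digest()).digest()

def measure_boot(chain):
    """Fold every boot stage, in order, into one measurement register."""
    register = bytes(32)  # register starts zeroed at power-on
    for component in chain:
        register = extend(register, component)
    return register

def disk_key(device_secret, chain):
    """Toy derivation: the key depends on the chip secret and on everything booted."""
    return hmac.new(device_secret, measure_boot(chain), hashlib.sha256).digest()

secret = b"secret fused into the chip (illustrative)"
windows_chain = [b"firmware", b"windows bootloader", b"windows kernel"]
linux_chain = [b"firmware", b"linux bootloader", b"linux kernel"]

print(disk_key(secret, windows_chain) == disk_key(secret, windows_chain))  # same boot, same key
print(disk_key(secret, windows_chain) == disk_key(secret, linux_chain))    # different boot, different key
```

Because the register can only be extended and never set directly, software that boots later cannot fake the measurements of a different boot chain.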
Mobile devices and game consoles are even more locked down. If there’s no bootloader unlock from your carrier or device manufacturer and no vulnerability (hardware or software) to be found, you’re stuck with the stock OS. You can’t boot something else because the chip will refuse to boot anything not signed by the OEM/Carrier. You can’t downgrade because fuses have been blown to prevent it and enforce a minimum revision number. Nothing booted outside the chip will have the keys locked up inside it needed to decrypt things and do remote attestation.
Root isn’t full control
Having root on a device isn’t the silver bullet it once was. Security is still kind of terrible and people don’t do enough to lock everything down properly, but the modern approach seems to be: 1) isolate security critical properties/code 2) put it in a secure armored protected box somewhere inside the chip 3) make sure you don’t stuff so much crap inside the box that the attacker can compromise it via a bug.
They tend to fail in a couple of ways
1. Pointless isolation (EG:lock encryption keys inside a box but let the user ask it to encrypt/decrypt anything)
2. the box isn’t hack proof (examples below)
  1. they forget about a chip debug feature that lets you read/write security critical device memory
  2. you can mess with chip operating voltage/frequency to make security critical operations fail.
3. They stuff too much inside the box (EG: a full Java virtual machine and a webserver)
I expect ML accelerator security to be full of holes. They won’t get it right, but it is possible in principle.
ML accelerator security wish-list
As for what we might want for an ML accelerator:
- Separating out enforcement of security critical properties:
  - restrict communications
    - supervisory code configures/approves communication channels such that all data in/out must be encrypted
    - There’s still steganography. With complete control over code that can read the weights or activations we can hide data in message timing for example
    - Still, protecting weights/activations in transit is a good start.
  - can enforce “who can talk to who”
    - example:ensure inference outputs must go through supervision model that OKs them as safe.
    - Supervision inference chip can then send data to API server then to customer
    - Inference chip literally can’t send data anywhere but the supervision model
  - can enforce network isolation for dangerous experimental system (though you really should be using airgaps for that)
- Enforcing code signing like what Apple does. Current-gen GPUs support virtualisation and user/kernel modes. Control what code has access to what data for reading/writing. Code should not have write access to itself. Attackers would have to find return oriented programming attacks or similar that could work on GPU shader code. This makes life harder for the attacker.
  - Apple does this on newer SoCs to prevent execution of non-signed code.
Doing that would help a lot. Not sure how well that plays with InfiniBand/NVLink networking, but that can be encrypted too in principle. If a virtual memory system is implemented, it’s not that hard to add a field to the page table for an encryption key index.
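To make the page table idea concrete, here is a rough sketch of a page table entry with a key index packed in next to the usual permission bits. The bit layout is invented for illustration and does not match any real GPU or CPU format; `PTE`, `pack_pte` and `unpack_pte` are hypothetical names.

```python
from typing import NamedTuple

# Hypothetical 64-bit entry layout (an assumption, not a real hardware format):
#   bit 0        present
#   bit 1        writable
#   bits 2-9     encryption key index (8 bits, so 256 key slots)
#   bits 12-51   physical frame number (40 bits)
KEY_SHIFT, KEY_BITS = 2, 8
PFN_SHIFT, PFN_BITS = 12, 40


class PTE(NamedTuple):
    present: bool
    writable: bool
    key_index: int
    pfn: int


def pack_pte(pte: PTE) -> int:
    """Pack the fields into one integer, rejecting out-of-range values."""
    if not 0 <= pte.key_index < (1 << KEY_BITS):
        raise ValueError("key_index out of range")
    if not 0 <= pte.pfn < (1 << PFN_BITS):
        raise ValueError("pfn out of range")
    return (
        int(pte.present)
        | (int(pte.writable) << 1)
        | (pte.key_index << KEY_SHIFT)
        | (pte.pfn << PFN_SHIFT)
    )


def unpack_pte(raw: int) -> PTE:
    """Recover the fields; the memory controller would use key_index per access."""
    return PTE(
        present=bool(raw & 1),
        writable=bool((raw >> 1) & 1),
        key_index=(raw >> KEY_SHIFT) & ((1 << KEY_BITS) - 1),
        pfn=(raw >> PFN_SHIFT) & ((1 << PFN_BITS) - 1),
    )
```

The index only names a slot in the crypto hardware. The key itself never appears in the page table, so code that can read page tables still learns nothing about key material.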
Boundaries of the trusted system
You’ll need to manage keys and securely communicate with all the accelerator chips. This likely involves hardware security modules that are extra super duper secure. Decryption keys for the data each chip must work on, and for inter-chip communication, are sent securely to individual accelerator chips, similar to how keys are sent to cable TV boxes so they can decrypt only the programs the subscriber has paid for.
This is how you actually enforce access control.
If a message is not for you, if you aren’t supposed to read/write that part of the distributed virtual memory space, you don’t get keys to decrypt it. Simple and effective.
“You” the running code never touch the keys. The supervisory code doesn’t touch the keys. Specialised crypto hardware unwraps the key and then uses it for (en/de)cryption without any software in the chip ever having access to it.
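A toy model of that flow, assuming each chip has a per-device secret known only to its crypto block and to the HSM. Everything here is hypothetical (`wrap_key`, `CryptoEngine`, the wrapping format, the HMAC-based toy cipher), and Python can’t actually enforce the isolation that real hardware would. It only shows the shape of the interface: software passes around wrapped blobs and slot numbers, never keys.

```python
import hashlib
import hmac


def _pad(device_secret: bytes, nonce: bytes) -> bytes:
    return hmac.new(device_secret, b"wrap" + nonce, hashlib.sha256).digest()


def wrap_key(device_secret: bytes, key: bytes, nonce: bytes) -> bytes:
    """HSM side: wrap a 32-byte key so only one specific chip can unwrap it."""
    body = bytes(a ^ b for a, b in zip(key, _pad(device_secret, nonce)))
    tag = hmac.new(device_secret, b"tag" + nonce + body, hashlib.sha256).digest()
    return nonce + body + tag  # 16 + 32 + 32 bytes


class CryptoEngine:
    """Hypothetical on-chip crypto block. Callers only ever see slot numbers."""

    def __init__(self, device_secret: bytes) -> None:
        self._device_secret = device_secret
        self._slots: dict[int, bytes] = {}

    def load_wrapped_key(self, slot: int, wrapped: bytes) -> bool:
        nonce, body, tag = wrapped[:16], wrapped[16:48], wrapped[48:]
        expected = hmac.new(
            self._device_secret, b"tag" + nonce + body, hashlib.sha256
        ).digest()
        if not hmac.compare_digest(tag, expected):
            return False  # not wrapped for this chip: no key, no access
        pad = _pad(self._device_secret, nonce)
        self._slots[slot] = bytes(a ^ b for a, b in zip(body, pad))
        return True

    def crypt(self, slot: int, nonce: bytes, data: bytes) -> bytes:
        """Toy XOR stream cipher (same call encrypts and decrypts)."""
        key, out, counter = self._slots[slot], b"", 0
        while len(out) < len(data):
            block = nonce + counter.to_bytes(8, "big")
            out += hmac.new(key, block, hashlib.sha256).digest()
            counter += 1
        return bytes(d ^ k for d, k in zip(data, out))
```

A chip that was never sent a blob wrapped for its own device secret fails `load_wrapped_key` and has nothing in its slots, which is the whole access control mechanism.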
What links here?
- Noosphere89's comment on The Hopium Wars: the AGI Entente Delusion by Max Tegmark (Oct 14, 2024, 8:10 PM; 6 points)

anithite Jun 9, 2024, 5:37 PM
2 points
0
in reply to: Ebenezer Dukakis’s comment on: Quotes from Leopold Aschenbrenner’s Situational Awareness Paper
Hardware encryption likely means dedicated on-chip hardware that handles keys and decrypts weights and activations on the fly.

The hardware/software divide here is likely a bit fuzzy but having dedicated hardware or a separate on-chip core makes it easier to isolate and accelerate the security critical operations. If security costs too much performance, people will be tempted to turn it off.

Encrypting data in motion and data at rest (in GPU memory) makes sense since this minimizes trust. An attacker with hardware access will have a hard time getting weights and activations unless they can get data directly off the chip.

Many-key signoff is nuclear-launch-style security where multiple keyholders must use their keys to approve an action. The idea is that a single rogue employee can’t do something bad like copy model weights to an internet server, change inference code to add a side channel that leaks model weights, or sabotage inference misuse prevention/monitoring.

This is commonly done in high security fields like banking where several employees hold key shares that must be used together to sign code to be deployed on hardware security modules.
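One standard way to build key shares is Shamir secret sharing, where any `threshold` of the shares recover the secret and fewer reveal nothing. I’m assuming that scheme here for illustration; the comment above doesn’t say which one banks actually use, and a real deployment would do this inside an HSM (ideally with threshold signing, so the full key is never reassembled in one place). `split_secret` and `recover_secret` are names made up for this sketch.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this


def split_secret(secret: int, threshold: int, shares: int) -> list[tuple[int, int]]:
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    if not 1 <= threshold <= shares:
        raise ValueError("need 1 <= threshold <= shares")
    # Random polynomial of degree threshold-1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_secret(points: list[tuple[int, int]]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to get the constant term."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total
```

For example, `split_secret(key, 3, 5)` gives five employees one share each, and any three of them can sign off together, while one or two rogue keyholders get nowhere.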
What links here?
- Noosphere89's comment on The Hopium Wars: the AGI Entente Delusion by Max Tegmark (Oct 14, 2024, 8:10 PM; 6 points)

anithite Apr 17, 2024, 4:54 AM
1 point
0
in reply to: Eli Tyre’s comment on: My Objections to “We’re All Gonna Die with Eliezer Yudkowsky”
Vulnerable world hypothesis (but takeover risk rather than destruction risk). That + first mover advantage could stop things pretty decisively without requiring ASI alignment

As an example, taking over most networked computing devices seems feasible in principle with thousands of +2SD AI programmers/security-researchers. That requires an AlphaGo-level breakthrough for RL as applied to LLM programmer-agents.

One especially low risk/complexity option is a stealthy takeover of other AI labs’ compute, then faking another AI winter. This might get you most of the compute and impact you care about without actively pissing off everyone.

If more confident in jailbreak prevention and software hardening, secrecy is less important.

First mover advantage depends on ability to fix vulnerabilities and harden infrastructure to prevent a second group from taking over. To the extent AI is required for management, jailbreak prevention/mitigation will also be needed.

anithite Apr 12, 2024, 11:06 PM
1 point
0
in reply to: Eli Tyre’s comment on: My Objections to “We’re All Gonna Die with Eliezer Yudkowsky”
Slower is better, obviously, but as to the inevitability of ASI, I think reaching 99th-percentile human capabilities in a handful of domains is enough to stop the current race. Getting there is probably not too dangerous.

anithite Feb 16, 2024, 6:33 PM
1 point
0
in reply to: cousin_it’s comment on: Drone Wars Endgame
Current ATGMs poke a hole in armor with a very fast jet of metal (1-10km/s). Kinetic penetrators do something similar using a tank gun rather than specially shaped explosives.

“Poke hole through armor” is the approach used by almost every weapon. A small hole is the most efficient way to get to the squishy insides. Cutting a slot would take more energy. Blunt impact only works on flimsy squishy things. A solid shell of armor easily stopped thrown rocks in antiquity. Explosive over-pressure is similarly obsolete against armored targets.

TLDR: “poke hole then destroy squishy insides” is the only efficient strategy against armor.

Modern vehicles/military stuff are armored shells protecting air+critical_bits+people

Eliminate the people and the critical bits can be compacted. The same sized vehicle can afford to split critical systems into smaller distributed modules.

Now the enemy has to make a lot more holes and doesn’t know where to put them to hit anything important.

This massively changes offense/defence balance. I’d guess by a factor of >10. Batteries have absurd power densities so taking out 75% of a vehicle’s batteries just reduces endurance. Only way to get a mobility kill is to take out wheels.

There are still design challenges:
- how to avoid ammo cook-off and chain reactions.
- misdirection around wheel motors (improves tradeoffs)
- efficient manufacturing
- comms/radar (antenna has to be mounted externally)
Zerg rush

Quantity has a quality of its own. Military vehicles are created by the thousands, cars by the millions. Probably something similarly sized or a bit smaller, powered by an ICE engine and mass produced would be the best next gen option.
