“…with the only recursive element of its thought being that it can pass 16 bits to its next running”
I would point to the activations for all previous tokens (the KV cache) as the relevant “element of thought” that gets passed here, and those can run to gigabytes.
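To put a rough number on “gigabytes”, here is a back-of-the-envelope sketch of the KV-cache size, with hyperparameters I am assuming purely for illustration (roughly a 70B-class transformer, fp16, no grouped-query attention):

```python
# Rough size of the per-token state (KV cache) carried forward to later tokens.
# All hyperparameters below are assumptions for illustration, not a specific model.

n_layers = 80          # assumed number of transformer layers
d_model = 8192         # assumed hidden size (n_heads * head_dim)
bytes_per_value = 2    # fp16 / bf16 activations
context_len = 32_768   # assumed context length in tokens

# Each layer stores one key vector and one value vector per token.
bytes_per_token = 2 * n_layers * d_model * bytes_per_value
total_bytes = bytes_per_token * context_len

print(f"{bytes_per_token / 2**20:.1f} MiB per token")      # ~2.5 MiB
print(f"{total_bytes / 2**30:.1f} GiB for full context")   # ~80 GiB
```

So even under these ballpark assumptions, the state passed between token steps is tens of gigabytes, not 16 bits.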
Judging from the quote, I think his gripe is with the possibility of in-context learning, where human-like learning happens without anything that determines the network’s behavior (neither its weights nor its stored states for previous tokens) ostensibly being updated.
For every token, the model’s activations are computed once, when that token is encountered, and are never explicitly revised afterwards → thought “only [seems like it] goes in one direction”.
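A minimal sketch of that one-directionality, using a toy single-head attention step in NumPy (shapes and projections are made up for illustration): each step appends the new token’s key/value activations to the cache and reads the old ones, but never rewrites them.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # toy head dimension

# Toy projection matrices standing in for one attention head.
W_q, W_k, W_v = (rng.standard_normal((d, d)) for _ in range(3))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

k_cache, v_cache = [], []  # activations carried forward between steps

def decode_step(x):
    """Process one new token embedding x. Earlier cache entries are read,
    but only ever appended to -- never recomputed or revised."""
    k_cache.append(x @ W_k)
    v_cache.append(x @ W_v)
    q = x @ W_q
    attn = softmax(np.stack(k_cache) @ q / np.sqrt(d))
    return attn @ np.stack(v_cache)

for _ in range(5):                        # five decoding steps
    out = decode_step(rng.standard_normal(d))
```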