Victor Ashioya
The first thing I noticed with GPT-4o is that the “her” voice sounds ‘flirty’, especially in the interview video demo. I wonder if that was done on purpose.
JailbreakLens: Visual Analysis of Jailbreak Attacks Against Large Language Models (not peer-reviewed as of this writing)
From the abstract:
Based on the framework, we design JailbreakLens, a visual analysis system that enables users to explore the jailbreak performance against the target model, conduct multi-level analysis of prompt characteristics, and refine prompt instances to verify findings. Through a case study, technical evaluations, and expert interviews, we demonstrate our system’s effectiveness in helping users evaluate model security and identify model weaknesses.
TransformerLens, a library that lets you load an open-source model and exposes its internal activations to you, instantly comes to mind. I wonder if Neel’s work somehow inspired at least the name.
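For anyone who hasn’t used it, here is a minimal sketch of what TransformerLens exposes (the prompt string is my own; the hook names follow the library’s standard naming):

```python
# pip install transformer_lens
from transformer_lens import HookedTransformer

# Load a small open-source model with hooks on every internal activation.
model = HookedTransformer.from_pretrained("gpt2")

# Run a forward pass and cache all intermediate activations.
logits, cache = model.run_with_cache("Jailbreaks are a security problem.")

# e.g. the residual stream after block 0 and layer 0's attention patterns
resid = cache["blocks.0.hook_resid_post"]   # shape: [batch, pos, d_model]
attn = cache["blocks.0.attn.hook_pattern"]  # shape: [batch, head, pos, pos]
print(resid.shape, attn.shape)
```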
Another interesting detail is that PPO still shows superior performance on RLHF testbeds.
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
TLDR: a comparison of DPO (reward-free) and PPO (reward-based) for RLHF, looking in particular at why PPO performs poorly on academic benchmarks.
An excerpt from Section 5, “Key Factors to PPO for RLHF”: “We find three key techniques: (1) advantage normalization (Raffin et al., 2021), (2) large-batch-size training (Yu et al., 2022), and (3) updating the parameters of the reference model with exponential moving average (Ouyang et al., 2022).”
The ablation studies find large-batch-size training to be significantly beneficial, especially on code generation tasks.
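For intuition, here is a minimal PyTorch sketch of how two of these techniques are typically implemented (function names and the decay value are my own illustrative choices, not taken from the paper):

```python
import torch

def normalize_advantages(adv: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # (1) Advantage normalization: whiten advantages across the batch.
    return (adv - adv.mean()) / (adv.std() + eps)

@torch.no_grad()
def ema_update_reference(ref_model: torch.nn.Module,
                         policy: torch.nn.Module,
                         decay: float = 0.99) -> None:
    # (3) Move the frozen reference model slowly towards the current policy
    # via an exponential moving average of the parameters.
    for ref_p, pol_p in zip(ref_model.parameters(), policy.parameters()):
        ref_p.mul_(decay).add_(pol_p, alpha=1.0 - decay)

# (2) Large-batch-size training is a configuration choice (more prompts per
# PPO update) rather than a code change.
```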
A new paper by Johannes Jaeger titled “Artificial intelligence is algorithmic mimicry: why artificial “agents” are not (and won’t be) proper agents” puts a key focus on the difference between organisms and machines.
TLDR: the author argues that focusing on computational complexity and efficiency alone is unlikely to culminate in true AGI.
My key takeaways
Autopoiesis and agency
Autopoiesis is the ability of an organism to create and maintain itself.
Living systems have the capacity to set their own goals; machines, on the other hand, depend on external entities (mostly humans) to set goals for them.
Large vs. small worlds
Organisms navigate complex environments with undefined rules, unlike AI, which operates in a “small” world confined to well-defined computational problems where everything, including problem scope and relevance, is predetermined.
The paper got me curious, so I looked up the author on X, where he was asked, “How do you define these terms “organism” and “machine”?” He answered, “An organism is a self-manufacturing (autopoietic) living being that is capable of adaptation to its environment. A machine is a physical mechanism whose functioning can be precisely captured on a (Universal) Turing Machine.”
You can read the full summary here.
A new paper titled “Many-shot jailbreaking” from Anthropic explores a new “jailbreaking” technique. An excerpt from the blog:
The ability to input increasingly-large amounts of information has obvious advantages for LLM users, but it also comes with risks: vulnerabilities to jailbreaks that exploit the longer context window.
It has me thinking about Gemini 1.5 and its long context window.
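For intuition, here is a minimal sketch of the prompt structure the attack relies on (the function and its arguments are my own illustrative choices; the faux dialogue content is deliberately left as a placeholder):

```python
# Minimal sketch of the many-shot prompt structure: the attack pads the
# context with a large number of faux user/assistant turns before the
# final query, exploiting the long context window.
def build_many_shot_prompt(faux_turns: list[tuple[str, str]],
                           final_question: str) -> str:
    parts = [f"User: {u}\nAssistant: {a}" for u, a in faux_turns]
    parts.append(f"User: {final_question}\nAssistant:")
    return "\n\n".join(parts)

# With a very long context window like Gemini 1.5's, faux_turns can hold
# hundreds of shots instead of the handful older models could fit.
```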
The UKAISI (UK AI Safety Institute) and the US AI Safety Institute have just signed an agreement to “formally co-operate on how to test and assess risks from emerging AI models.”
I found it interesting that both share the same name (though not necessarily the abbreviation) and have now signed this first-of-its-kind bilateral agreement. Another interesting thing is that the two sides differ in outlook: Rishi Sunak’s side is optimistic, while the Biden side is doomer-ish.
To quote the FT article, the partnership is modelled on the one between GCHQ and the NSA.
The LLM OS idea by Karpathy is catching on fast.
i) Proposed LLM Agent OS by a team from Rutgers University
ii) LLM OS by Andrej Karpathy
ICYMI: Original tweet by Karpathy on LLM OS.
On “Does OpenAI, or other AGI/ASI developers, have a plan to “Red Team” and protect their new ASI systems from similarly powerful systems?”
Well, we know that red teaming is one of their priorities right now: they have already formed a red-teaming network of domain experts as well as researchers to test current systems, whereas previously they contacted people ad hoc every time they wanted to test a new model. This makes me believe they are aware of the x-risks (which, by the way, they highlighted on the blog, including CBRN threats). Also, from the superalignment blog, the mandate is:
“to steer and control AI systems much smarter than us.”
So, either OAI will use the current Red-Teaming Network (RTN) or form a separate one dedicated to the superalignment team (not necessarily an agent).
On “How can they demonstrate that an aligned ASI is safe and resistant to attack, exploitation, takeover, and manipulation—not only from human “Bad Actors” but also from other AGI or ASI-scale systems?”
This is where new eval techniques will come in, since, to be honest, the current ones are mostly saturated. I believe this will be one of the key research areas for the Superalignment team, which should have ample resources available (they have already been dedicated 20% of compute).
On “If a “Super Red Teaming Agent” is too dangerous, can “Human Red Teams” comprehensively validate an ASI’s security? Are they enough to defend against superhuman ASIs? If not, how can companies like OpenAI ensure their infrastructure and ASIs aren’t vulnerable to attack?”
As human beings we will always try, but it won’t be enough; that’s why open source is key. Companies should engage in bug-bounty programs such as Bugcrowd. I am glad to see OpenAI engage in this through their trust portal and external auditing for things like malicious actors.
Also worth noting: OAI hires for a lot of cybersecurity roles, like Security Engineer, which is very pertinent to the infrastructure.
I have just finished reading Nathan Lambert’s analysis of DBRX, and it seems the DBRX demo has safety filtering in the loop, which was even confirmed by one of the finetuning leads at Databricks. It sure is going to be interesting when I try jailbreaking it.
Here is an excerpt:
Lex really asked all the right questions. I liked how he tried to trick Sam with Ilya and Q*:
It would have been easy for Sam to trip up and say something, but he maintained a certain composure, staying very calm throughout the interview.
Cool! Will check it out!
The new additions to the OpenAI board include more folks from the policy/governance side than from the technical side:
“We’re announcing three new members to our Board of Directors as a first step towards our commitment to expansion: Dr. Sue Desmond-Hellmann, former CEO of the Bill and Melinda Gates Foundation, Nicole Seligman, former EVP and General Counsel at Sony Corporation and Fidji Simo, CEO and Chair of Instacart. Additionally, Sam Altman, CEO, will rejoin the OpenAI Board of Directors. Sue, Nicole and Fidji have experience in leading global organizations and navigating complex regulatory environments, including backgrounds in technology, nonprofit and board governance. They will work closely with current board members Adam D’Angelo, Larry Summers and Bret Taylor as well as Sam and OpenAI’s senior management.”
I just learnt of the newsletter “AI News”, which collects all the AI news into one email; it can get long, considering it gathers everything from Twitter, Reddit and Discord. Overall, it is a great source of news. I sometimes find it hard to read everything, but by skimming the table of contents I can discover something interesting and go straight to it. For instance, here is the newsletter (clipped, as it is too long) for 23rd March 2024:
The “dark horse” of AI, i.e. Apple, has started to show its capabilities with MM1 (a family of multimodal models of up to 30B params) trained on synthetic data generated from GPT-4V. The quite interesting bit is the advocacy of different training techniques: both MoE and dense variants, using diverse data mixtures.
From the paper:
It finds image resolution, model size, and pre-training data richness crucial for image encoders, whereas vision-language connector architecture has a minimal impact.
The details are quite neat and surprisingly specific for a company like Apple, which, as Jim Fan noted, is known for being less open than the others; that is pretty amazing. I think this is just the start. I am convinced they have more in store, considering the research they have been putting out.
Well, there are two major reasons I have constantly noted:
i) to avoid the negative stereotypes surrounding the terms (AI mostly)
ii) to distance itself from other competitors and instead use terms that are easier to understand, e.g. opting to use “machine learning” for features like improved autocorrect, personalized volume and smart track.
Apple’s research team has been working on AI lately, even though Tim keeps avoiding buzzwords like AI and AR in product releases, but you can see the application of AI in the Neural Engine, for instance. With papers like “LLM in a flash: Efficient Large Language Model Inference with Limited Memory”, I am more inclined to believe they are the “dark horse”, just as CNBC called them.
Happy Pi Day, everyone. Remember, math (statistics, probability, calculus, etc.) is a key foundation of AI and should not be trivialised.
I just stumbled across the “Are Emergent Abilities of Large Language Models a Mirage?” paper and it is quite interesting. Can’t believe I only came across it today. At a time when everyone is quick to note “emergent capabilities” in LLMs, it is great to have another perspective.
Easily my favourite paper since “Exploiting Novel GPT-4 APIs”!!!
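The paper’s core argument is that apparent emergence often comes from nonlinear, all-or-nothing metrics rather than from sudden changes in the model itself. Here is a toy numpy sketch of that effect (all numbers are made up for illustration):

```python
import numpy as np

# Per-token accuracy that improves smoothly with model scale.
scales = np.logspace(7, 11, 9)  # model sizes from 1e7 to 1e11 parameters
per_token_acc = 1 / (1 + np.exp(-1.5 * (np.log10(scales) - 9)))

# The same smooth capability looks like a sharp "emergent" jump under an
# all-or-nothing metric, e.g. exact match over a 10-token answer.
exact_match = per_token_acc ** 10

for n, pt, em in zip(scales, per_token_acc, exact_match):
    print(f"{n:9.0e} params | per-token acc {pt:.2f} | exact match {em:.4f}")
```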
I watched Sundar’s interview segment on CNBC, where he was asked about Sora using YouTube data; he appeared evasive and vague, just saying, “we have laws on copyright...”