MiguelDev

Karma: 300

help avoid catastrophic AI failures...

An examination of GPT-2's boring yet effective glitch

MiguelDev, 18 Apr 2024 5:26 UTC
5 points
3 comments, 3 min read

Intergenerational Knowledge Transfer (IKT)

MiguelDev, 28 Mar 2024 8:14 UTC
6 points
0 comments, 1 min read

RLLMv10 experiment

MiguelDev, 18 Mar 2024 8:32 UTC
5 points
0 comments, 2 min read

A T-o-M test: ‘popcorn’ or ‘chocolate’

MiguelDev, 8 Mar 2024 4:24 UTC
20 points
13 comments, 1 min read

Sparks of AGI prompts on GPT2XL and its variant, RLLMv3

MiguelDev, 7 Mar 2024 6:33 UTC
4 points
0 comments, 4 min read

Can RLLMv3's ability to defend against jailbreaks be attributed to datasets containing stories about Jung’s shadow integration theory?

MiguelDev, 29 Feb 2024 5:13 UTC
7 points
2 comments, 11 min read

Research Log, RLLMv3 (GPT2-XL, Phi-1.5 and Falcon-RW-1B)

MiguelDev, 15 Feb 2024 3:39 UTC
4 points
0 comments, 262 min read

GPT2XL_RLLMv3 vs. BetterDAN, AI Machiavelli & Oppo Jailbreaks

MiguelDev, 11 Feb 2024 11:03 UTC
16 points
4 comments, 14 min read

Research Log, RLLMv2: Phi-1.5, GPT2XL and Falcon-RW-1B as paperclip maximizers

MiguelDev, 20 Jan 2024 15:30 UTC
6 points
0 comments, 10 min read

[Question] rabbit (a new AI company) and Large Action Model (LAM)

MiguelDev, 10 Jan 2024 13:57 UTC
17 points
3 comments, 1 min read

Reinforcement Learning using Layered Morphology (RLLM)

MiguelDev, 1 Dec 2023 5:18 UTC
7 points
0 comments, 29 min read

GPT-2 XL’s capacity for coherence and ontology clustering

MiguelDev, 30 Oct 2023 9:24 UTC
6 points
2 comments, 41 min read

Relevance of ‘Harmful Intelligence’ Data in Training Datasets (WebText vs. Pile)

MiguelDev, 12 Oct 2023 12:08 UTC
12 points
0 comments, 9 min read

[Question] Who determines whether an alignment proposal is the definitive alignment solution?

MiguelDev, 3 Oct 2023 22:39 UTC
−1 points
6 comments, 1 min read

<|endoftext|> is a vanishing text?

MiguelDev, 16 Sep 2023 2:34 UTC
10 points
0 comments, 1 min read

On Ilya Sutskever’s “A Theory of Unsupervised Learning”

MiguelDev, 26 Aug 2023 5:34 UTC
6 points
0 comments, 19 min read

Exploring the Responsible Path to AI Research in the Philippines

MiguelDev, 23 Aug 2023 8:44 UTC
6 points
0 comments, 6 min read

A fictional AI law laced w/ alignment theory

MiguelDev, 17 Jul 2023 1:42 UTC
6 points
0 comments, 2 min read

Exploring Functional Decision Theory (FDT) and a modified version (ModFDT)

MiguelDev, 5 Jul 2023 14:06 UTC
11 points
11 comments, 15 min read

A Multidisciplinary Approach to Alignment (MATA) and Archetypal Transfer Learning (ATL)

MiguelDev, 19 Jun 2023 2:32 UTC
4 points
2 comments, 7 min read