MiguelDev
Karma: 300
help avoid catastrophic AI failures...
An examination of GPT-2’s boring yet effective glitch
MiguelDev
18 Apr 2024 5:26 UTC
5
points
3
comments
3
min read
LW
link
Intergenerational Knowledge Transfer (IKT)
MiguelDev
28 Mar 2024 8:14 UTC
6
points
0
comments
1
min read
LW
link
RLLMv10 experiment
MiguelDev
18 Mar 2024 8:32 UTC
5
points
0
comments
2
min read
LW
link
A T-o-M test: ‘popcorn’ or ‘chocolate’
MiguelDev
8 Mar 2024 4:24 UTC
20
points
13
comments
1
min read
LW
link
Sparks of AGI prompts on GPT2XL and its variant, RLLMv3
MiguelDev
7 Mar 2024 6:33 UTC
4
points
0
comments
4
min read
LW
link
Can RLLMv3’s ability to defend against jailbreaks be attributed to datasets containing stories about Jung’s shadow integration theory?
MiguelDev
29 Feb 2024 5:13 UTC
7
points
2
comments
11
min read
LW
link
Research Log, RLLMv3 (GPT2-XL, Phi-1.5 and Falcon-RW-1B)
MiguelDev
15 Feb 2024 3:39 UTC
4
points
0
comments
262
min read
LW
link
GPT2XL_RLLMv3 vs. BetterDAN, AI Machiavelli & Oppo Jailbreaks
MiguelDev
11 Feb 2024 11:03 UTC
16
points
4
comments
14
min read
LW
link
Research Log, RLLMv2: Phi-1.5, GPT2XL and Falcon-RW-1B as paperclip maximizers
MiguelDev
20 Jan 2024 15:30 UTC
6
points
0
comments
10
min read
LW
link
[Question]
rabbit (a new AI company) and Large Action Model (LAM)
MiguelDev
10 Jan 2024 13:57 UTC
17
points
3
comments
1
min read
LW
link
Reinforcement Learning using Layered Morphology (RLLM)
MiguelDev
1 Dec 2023 5:18 UTC
7
points
0
comments
29
min read
LW
link
GPT-2 XL’s capacity for coherence and ontology clustering
MiguelDev
30 Oct 2023 9:24 UTC
6
points
2
comments
41
min read
LW
link
Relevance of ‘Harmful Intelligence’ Data in Training Datasets (WebText vs. Pile)
MiguelDev
12 Oct 2023 12:08 UTC
12
points
0
comments
9
min read
LW
link
[Question]
Who determines whether an alignment proposal is the definitive alignment solution?
MiguelDev
3 Oct 2023 22:39 UTC
−1
points
6
comments
1
min read
LW
link
<|endoftext|> is a vanishing text?
MiguelDev
16 Sep 2023 2:34 UTC
10
points
0
comments
1
min read
LW
link
On Ilya Sutskever’s “A Theory of Unsupervised Learning”
MiguelDev
26 Aug 2023 5:34 UTC
6
points
0
comments
19
min read
LW
link
Exploring the Responsible Path to AI Research in the Philippines
MiguelDev
23 Aug 2023 8:44 UTC
6
points
0
comments
6
min read
LW
link
A fictional AI law laced w/ alignment theory
MiguelDev
17 Jul 2023 1:42 UTC
6
points
0
comments
2
min read
LW
link
Exploring Functional Decision Theory (FDT) and a modified version (ModFDT)
MiguelDev
5 Jul 2023 14:06 UTC
11
points
11
comments
15
min read
LW
link
A Multidisciplinary Approach to Alignment (MATA) and Archetypal Transfer Learning (ATL)
MiguelDev
19 Jun 2023 2:32 UTC
4
points
2
comments
7
min read
LW
link