Nanda Ale

Karma: 447

Nanda Ale Jun 12, 2023, 12:53 AM
3 points
0
on: [Mostly solved] I get distracted while reading, but can easily comprehend audio text for 8+ hours per day. What are the best AI text-to-speech readers? Alternatively, do you have other ideas for what I could do?
Do you happen to have some samples handy of the types of text you are typically reading? At least a few pages from a few different sources. Try to find samples that represent the spectrum of content you read.
I may be able to set you up with an open source solution using Bark Audio, but it’s impossible to know without poking at the Bark model and seeing if I can find a spot it works in, where you start getting samples that really sound like it understands. (For example if you use an English Bark voice with a foreign text prompt, even though the Bark TTS model knows the language, the English voice won’t be able to speak it, or will have a horrific accent. Because Bark is kind of sort of modeling ‘person-asked-to-speak-language-they-don’t-know’ in a way. Sort of like how GPT might do that if you changed language mid conversation. Well, pre-RLHF GPT.)
I don’t want to make any promises. I have terrible focus, I don’t frequent this site often, and I give a 50% chance that I forget about this comment entirely until I suddenly remember I posted it three months from now. Also while the Bark voices are wonderful (they sound like they understand what they are saying) the Bark audio quality (distortion, static) is not. You can stack another model on top to fix it but it is annoying.
BUT it just so happens that the most recent source of my lack of focus, to some degree, has been poking at TTS stuff just for fun. Pure amateur hour over here. But the new models are so good they make a lot of stuff easy. And I just happened to see this comment after not visiting this site for weeks.
The best https://play.ht/ voices are maybe comparable though, if you just want a quick solution. I do actually prefer Bark, if you can ignore the audio quality, but it’s super unreliable and fiddly.

Nanda Ale Apr 17, 2023, 1:36 PM
3 points
0
in reply to: Razied’s comment on: Contra LeCun on “Autoregressive LLMs are doomed”
GPT-4 indeed doesn’t need too much help.
I was curious if even the little ChatGPT Turbo, the worst one, could not forget a chess position just 5 paragraphs into an analysis. I tried to finagle some combination of extra prompts to make it at least somewhat consistent, it was not trivial. Ran into some really bizarre quirks with Turbo. For example (part of a longer prompt, but this is the only changed text):
9 times out of 10 this got a wrong answer:
Rank 8: 3 empty squares on a8 b8 c8, then a white rook R on d8, …
Where is the white rook?
6 times out of 10 this got a right answer:
Rank 8: three empty squares, then a white rook R on d8, …
Where is the white rook?
Just removing the squares a8,b8,c8 and using the word ‘three’ instead of ‘3’ made a big difference. If I had to guess it’s because in the huge training data of chess text conversations, it’s way more common to list the specific position of a piece than an empty square. So there’s some contamination between coordinates being specific, and the space being occupied by a piece.
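To make the second phrasing concrete, here is a minimal sketch of how a rank could be rendered into that style of text automatically. It assumes the position is available as FEN rank strings (e.g. `3R4`); the function name `describe_rank` and the exact wording are my own guesses at the format, not the prompt that was actually used.

```python
FILES = "abcdefgh"
NUMBER_WORDS = ["zero", "one", "two", "three", "four",
                "five", "six", "seven", "eight"]
PIECE_NAMES = {"k": "king", "q": "queen", "r": "rook",
               "b": "bishop", "n": "knight", "p": "pawn"}


def describe_rank(fen_rank: str, rank_number: int) -> str:
    """Describe one FEN rank in words (hypothetical helper).

    Empty squares are spelled out as a word count with no coordinates;
    only occupied squares get a coordinate.
    """
    parts = []
    file_index = 0
    for symbol in fen_rank:
        if symbol.isdigit():
            count = int(symbol)
            noun = "square" if count == 1 else "squares"
            parts.append(f"{NUMBER_WORDS[count]} empty {noun}")
            file_index += count
        else:
            colour = "white" if symbol.isupper() else "black"
            square = f"{FILES[file_index]}{rank_number}"
            name = PIECE_NAMES[symbol.lower()]
            parts.append(f"a {colour} {name} {symbol} on {square}")
            file_index += 1
    return f"Rank {rank_number}: " + ", then ".join(parts)


print(describe_rank("3R4", 8))
```

Generating the text from one function at least keeps the phrasing identical on every move, so any difference in answers comes from the model and not from the prompt drifting.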
But this didn’t stop Turbo from blundering like crazy, even when reprinting the whole position for every move. Just basic stuff like trying to move the king when it’s not on that square, as in your game. I didn’t want to use a chess library to check valid moves and then ask it to try again; that felt like it was against the spirit of the thing. A reasonable middle ground might be to ask ChatGPT at ‘runtime’ for javascript code to check valid moves, bootstrapping itself into being consistent. But I eventually hit upon a framing of the task that had some positive trend when run repeatedly. So in theory, run it 100 times over to get increasing accuracy. (Don’t try that yourself btw, I just found out even ‘unlimited’ ChatGPT Turbo on a Plus plan has its limits...)
This was the rough framing that pushed it over the edge:
This is a proposed chess move from Dyslexic Chess Player. Dyslexic Chess Player has poor eyesight and dyslexia, and often gets confused and misreads the chess board, or mixes up chess positions when writing down the numbers and letters. Your goal is to be a proofreader of this proposed move. There is a very high chance of errors, Dyslexic Chess Player makes mistakes 75% of the time.
I may post a longer example or a demo if I can make time but those were the most interesting bits, the rest is mostly plumbing and patience. I didn’t even get around to experimenting with recursive prompts to make it play stronger, since it was having so much trouble late game just picking a square to move that contained its own piece.

Nanda Ale Apr 10, 2023, 2:01 AM
6 points
0
on: Feature Request: Right Click to Copy LaTeX
Here you go, add a bookmark with the URL field set to the full line at the top starting with “javascript:” (including the word “javascript:”) to get the same feature on lesswrong. Or paste the code below that line in the browser console.
https://jsbin.com/finamofohi/edit?html,js

Nanda Ale Apr 9, 2023, 11:59 AM
15 points
7
in reply to: Thane Ruthenis’s comment on: Agentized LLMs will change the alignment landscape
I’m not confident at all Auto-GPT could work at its goals, just that in narrower domains the specific system or arrangement of prompt interactions matters. To give a specific example, I goof around trying to get good longform D&D games out of ChatGPT. (Even GPT-2 fine-tuned on Crit Role transcripts, originally.) Some implementations just work way better than others.
The trivial system is no system—just play D&D. Works great until it feels like the DM is the main character in Memento. The trivial next step, rolling context window. Conversation fills up, ask for summary, start a new conversation with the summary. Just that is a lot better. But you really feel loss of detail in the sudden jump, so why not make it continuous. A secretary GPT with one job, prune the DM GPT conversation text after every question and answer, always try to keep most important and most recent. Smoother than the summary system. Maybe the secretary can not just delete but keep some details instead, maybe use half its tokens for a permanent game-state. Then it can edit useful details in/out of the conversation history. Can the secretary write a text file for old conversations? Etc. etc.
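A rough sketch of the simplest version of that secretary step, under heavy assumptions: the pinned game-state is always kept, and the newest turns are kept until a token budget runs out. It only handles "most recent", not "most important" (that part would need another model call). The names `prune_history` and `rough_token_count` are made up for illustration, and the word-count tokenizer is a stand-in, not a real one.

```python
from typing import Callable, List


def rough_token_count(text: str) -> int:
    # Stand-in only: counts whitespace-separated words, not real tokens.
    return len(text.split())


def prune_history(game_state: str,
                  turns: List[str],
                  budget: int,
                  count: Callable[[str], int] = rough_token_count) -> List[str]:
    """Keep the pinned game state plus as many of the newest turns as fit."""
    remaining = budget - count(game_state)
    kept: List[str] = []
    # Walk backwards from the newest turn and stop at the first one
    # that no longer fits in the remaining budget.
    for turn in reversed(turns):
        cost = count(turn)
        if cost > remaining:
            break
        kept.append(turn)
        remaining -= cost
    kept.reverse()
    return [game_state] + kept


history = ["a b c", "d e", "f g h i"]
print(prune_history("HP 10", history, budget=8))
```

Running this after every question and answer gives the continuous version described above, as opposed to the sudden jump of the summarize-and-restart approach.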
Maybe the difference is the user plays the D&D, so you know immediately when it’s not working well. It’s usually obvious in minutes. Auto-GPT is supposed to be automatic. So they add features and just kind of hope the AI figures it out from there. They don’t get the immediate “this is not working at all” feedback. Like they added embeddings 5 days ago: it just prints the words “Permanent memory:” in the prompt, followed by giant blobs of up to 2500 tokens of the most related text from Pinecone. Works great for chatbots answering a single question about technical documentation. Real easy to imagine how it could fall apart when done iteratively over longer time periods. I can’t imagine this would work for a D&D game, it might be worse than having no memory. My gut feeling is you pull the 2500 most related tokens of content into your prompt and the system is overall more erratic. You get the wrong 2500 tokens, it overwhelms whatever the original prompt was, now what is your agent up to? Just checked now, it changed to “This reminds you of these events from your past:”. That might actually make it somewhat less likely to blow up. Basically making the context of the text more clear: “These are old events and thoughts, and you are reminded of them, don’t take this text too seriously, this text might not even be relevant so maybe you should even ignore it. It’s just some stuff that came to mind, that’s how memories work sometimes.”

Nanda Ale Apr 9, 2023, 9:05 AM
9 points
2
in reply to: Thane Ruthenis’s comment on: Agentized LLMs will change the alignment landscape
I’d be wary of generalizing too much from Auto-GPT. It’s in a weird place. It’s super popular as a meme anyone can run—you don’t have to be a programmer! But skimming the github the vast vast majority of people are getting hung up on fiddly technical and programming bits. And people who wouldn’t get hung up on that stuff don’t really get much out of Auto-GPT. There’s some overlap—it’s a very entertaining idea and thing to watch, the idea of it being hands off. I personally watched it like a TV show for hours, and it going off the rails was part of the fun.
Like I’m no expert, I just got way too addicted to goofing around with LLMs, and the way Auto-GPT is trying to make this work seems obviously flawed to me. Not the software quality (I don’t know much about that), but the main idea and the structure of the interacting prompts seems like just clearly not the way to go. I don’t know the right way, but it’s not that.
Even more so for ChaosGPT, where the author (to me) looks like somebody trying to maximize entertainment, not a working product.
That said Auto-GPT is actually getting better quickly. AI time moves fast. And it’s so popular that a lot of people are tinkering with it and have eyes on it. So it might actually do something like the original concept eventually. But I would bet something completely different (specifically a project that isn’t trying to be a plug-and-play solution anyone can run on their own computer) is where the most capable solutions will be.

Nanda Ale Apr 9, 2023, 8:24 AM
1 point
0
in reply to: anonymousaisafety’s comment on: The surprising parameter efficiency of vision models
Are people doing anything in LLMs like the classic StyleGAN training data bootstrapping pattern?
Start with bad data, train a bad model. It’s bad but it’s still good enough to rank your training data. Now you have better training data. Train a better model. The architecture is different of course, but is there anything analogous?
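To pin down the loop I mean, here is a generic sketch. Everything in it is a placeholder: `train` and `score` are whatever you plug in, and the toy usage at the bottom (a "model" that is just the mean of the data) is only there so the loop runs, it says nothing about how this would work for an LLM.

```python
from typing import Callable, List, Sequence, TypeVar

Example = TypeVar("Example")
Model = TypeVar("Model")


def bootstrap(data: Sequence[Example],
              train: Callable[[Sequence[Example]], Model],
              score: Callable[[Model, Example], float],
              keep_fraction: float = 0.5,
              rounds: int = 2) -> List[Example]:
    """Train, rank the training data with the model, keep the best, repeat."""
    current = list(data)
    for _ in range(rounds):
        model = train(current)
        # Higher score means the model rates the example as better data.
        ranked = sorted(current, key=lambda ex: score(model, ex), reverse=True)
        keep = max(1, int(len(ranked) * keep_fraction))
        current = ranked[:keep]
    return current


# Toy stand-ins: the "model" is the mean, and examples near it score higher.
cleaned = bootstrap(
    [1, 2, 3, 100],
    train=lambda xs: sum(xs) / len(xs),
    score=lambda mean, x: -abs(x - mean),
    rounds=1,
)
print(cleaned)
```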

Nanda Ale Apr 9, 2023, 6:43 AM
38 points
6
in reply to: Razied’s comment on: GPTs are Predictors, not Imitators
The most salient example of this is when you try to make chatGPT play chess and write chess analysis. At some point, it will make a mistake and write something like “the queen was captured” when in fact the queen was not captured. This is not the kind of mistake that chess books make, so it truly takes it out of distribution. What ends up happening is that GPT conditions its future output on its mistake being correct, which takes it even further outside the distribution of human text, until this diverges into nonsensical moves.
Is this a limitation in practice? Rap Battles are a bad example because they happen to be the exception of a task premised on being “one shot” and real time, but the overall point stands. We ask GPT to do tasks in one try, one step, that humans do with many steps, iteratively and recursively.
Take this “the queen was captured” problem. As a human I might be analyzing a game, glance at the wrong move, think a thought about the analysis premised on that move (or even start writing words down!) and then notice the error and just fix it. I am doing this right now, in my thoughts and on the keyboard, writing this comment.
Same thing works with ChatGPT, today. I deal with problems like “the queen was captured” every day just by adding more ChatGPT steps. Instead of one-shotting, every completion chains a second ChatGPT prompt to check for mistakes. (You may need a third level to get to like 99% because the checker blunders too.) The background chains can either ask to regenerate the original prompt, or reply to the original ChatGPT describing the error, and ask it to fix its mistake. The latter form seems useful for code generation.
Like right now I typically do 2 additional background chains by default, for every single thing I ask ChatGPT. Not just in a task where I’m seeking rigour and want to avoid factual mistakes like “the queen was captured” but just to get higher quality responses in general.
Original Prompt → Improve this answer. → Improve this Answer.
Not literally just those three words, but even something that simple is actually better than just asking one time. Seriously. Try it, confirm, and make it a habit. Sometimes it’s shocking. I ask for a simple javascript function, it pumps out a 20 line function that looks fine to me. I habitually ask for a better version and “Upon reflection, you can do this in two lines of javascript that run 100x faster.”
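In code the habit is about this small. A minimal sketch, assuming `complete` is any function that takes a prompt string and returns the model’s reply (a hypothetical placeholder, wire it to whatever API you use); the follow-up wording is illustrative and not my exact prompt.

```python
from typing import Callable, List


def improve_chain(prompt: str,
                  complete: Callable[[str], str],
                  extra_passes: int = 2,
                  instruction: str = "Improve this answer.") -> List[str]:
    """Ask once, then ask for an improved version `extra_passes` times.

    Returns every draft in order, so the caller can compare them
    or just take the last one.
    """
    answers = [complete(prompt)]
    for _ in range(extra_passes):
        follow_up = (
            f"Original request:\n{prompt}\n\n"
            f"Previous answer:\n{answers[-1]}\n\n"
            f"{instruction}"
        )
        answers.append(complete(follow_up))
    return answers
```

Swapping `instruction` for something like a request to list mistakes in the previous answer gives the checker-style chain instead of the plain improve chain.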
If GPT were 100x cheaper I would be tempted to just go wild with this. Every prompt is 200 or 300 prompts in the background, invisibly, instead of 2 or 3. I’m sure there’s diminishing returns and the chain would be more complicated than repeating “Improve” 100 times, but if it were fast and cheap enough, why not do it.
As an aside, I think about asking ChatGPT to write code like asking a human to code a project on a whiteboard without the internet to find answers, a computer to run code on, or even paper references. The human can probably do it, sort of, but I bet the code will have tons of bugs and errors and even API ‘hallucinations’ if you run it! I think it’s even worse than that, it’s almost like ChatGPT isn’t even allowed to erase anything it wrote on the whiteboard either. But we don’t need to one shot everything, so do we care about infinite length completions? Humans do things in steps, and when ChatGPT isn’t trying to whiteboard everything, when it can check API references, when it can see what the code returns, errors, when it can recurse on itself to improve things, it’s way better. Right now the form this takes is a human on the ChatGPT web page asking for code, running it, and then pasting the error message back into ChatGPT. The more automated versions of this are trickling out. Then I imagine the future, asking ChatGPT for code when it’s 1000x cheaper. And my one question behind the scenes is actually 1000 prompts looking up APIs on the internet, running the code in a simulator (or for real, people are already doing that) looking at the errors or results, etc. And that’s the boring unimaginative extrapolation.
Also this is probably obvious, but just in case: if you try asking “Improve this answer.” repeatedly in ChatGPT you need to manage your context window size. Migrate to a new conversation when you get about 75% full. OpenAI should really warn you because even before 100% the quality drops like a rock. Just copy your original request and the last best answer(s). If you’re doing it manually, select a few useful other bits too.
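If you automate it, the migration rule is a couple of lines. A sketch under assumptions: you already have some way to count the tokens used so far, and the 75% threshold is just my rule of thumb from above, not a documented limit. Both function names are hypothetical.

```python
from typing import List


def should_migrate(tokens_used: int, window_size: int,
                   threshold: float = 0.75) -> bool:
    """True once the conversation has filled `threshold` of the window."""
    return tokens_used >= threshold * window_size


def seed_new_conversation(original_request: str,
                          best_answers: List[str]) -> str:
    """Build the opening message of the fresh conversation:
    the original request plus the last best answer(s)."""
    parts = [f"Original request:\n{original_request}"]
    for i, answer in enumerate(best_answers, start=1):
        parts.append(f"Best answer so far #{i}:\n{answer}")
    return "\n\n".join(parts)
```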

Nanda Ale Apr 1, 2023, 8:34 AM
3 points
0
on: Speed running everyone through the bad alignement bingo. $5k bounty for a LW conversational agent
Whiffed attempt for me. Writing this as the last embers of too-much-coffee fade away, so it may not be coherent.
I tried some of the existing bots, and at the last minute I concluded there was actually a LOT of low hanging fruit and maybe I could have an impact. So I frantically tried to pull something together all day Friday, and now into Saturday morning, and couldn’t do it. Crashed and burned on some silly Windows problems, eventually bit the bullet and installed WSL/conda/all that, drank a second night pot of coffee… and then finally the treasure at the end of the rainbow, langchain. I’ve been hardcoding raw python prompt chains all this time. This contest was my excuse to finally bite the bullet and modernize my outdated and inefficient LLM workflow. I bet I’ll be kicking myself for not using this modern tool sooner!
And I was utterly flummoxed by langchain, to be honest. I’m not a great programmer but I spend tons and tons of time experimenting and playing with prompt chaining and all that langchainy style stuff. I just code it all in a single raw python script full of horrible regexes and too many IF statements, like a caveman. And yeah, the langchain vector database worked out of the box, first try. If the hard things are this easy in langchain, then surely it’s all smooth sailing from here! And then I sat down to dig in and do the ‘low hanging fruit work’ (experiment and iterate on different chains, workflows, find good metrics, optimize token allocation in context windows, the nuts and bolts of LLM interactions). And I was just baffled, it felt like I was working blind.
I mean, I did see langchain had a ‘tracer’ tool. I know it came out recently and it’s still in private waitlist access. So given that context I just assumed that obviously tracer isn’t like a core requirement. It’s got to be a fancy ui visualization frosting on top of boring log files or some other system. That’s classically how an open source company makes money. Surely tracer can’t be the only way to easily see everything? Tracer just came out, it’s absurd to think it’s the only way to see this stuff. I mean, how were people even using langchain at all before tracer? Isn’t this like the most basic day 1 function when you work with LLMs? And honestly I still don’t know if I AM missing something obvious, at the end of the night, 4:30 AM EST.
I was able to get some outputs printed to shell, hooking functions, but then I changed something and had to do it again. Finally (HOURS LATER) I bit the bullet, double-checked the tracer webpage and saw the waitlist and also a DOCKER install. That still seemed excessive, I didn’t even have Docker installed, but whatever. Tracer worked fine, I kicked myself for waiting so long, and I still had a couple hours. Enough for low hanging fruit… maybe. But I’m still being moderately flummoxed by stuff I assumed would be trivial in langchain. Like for example, a lot of the parts of langchain measure length in raw characters instead of tokens. I just assumed I was missing something obvious again. Is there a reason I should care about the character count instead of tokens? Maybe for a third party website? Maybe langchain has automated token management intelligently, and I’m overthinking this? Like here I am going ‘Okay so these documents here are written in this writing style, I guess I can estimate the tokens from the character count to get an upper bound and hope for the best’ and this… this can not be the way.
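For what it’s worth, the character-count guess can be made a little less hand-wavy. A sketch, with the assumptions stated up front: the hard bound only holds if the tokenizer is a byte-level BPE (as the GPT family’s tokenizers are, as far as I know), where every token covers at least one byte; and the 4 characters per token figure is a commonly quoted rule of thumb for English text, not a guarantee. Function names are made up.

```python
import math


def token_upper_bound(text: str) -> int:
    """Hard upper bound, ASSUMING a byte-level BPE tokenizer:
    each token spans at least one UTF-8 byte, so there can never
    be more tokens than bytes."""
    return len(text.encode("utf-8"))


def rough_token_estimate(text: str, chars_per_token: float = 4.0) -> int:
    """Soft estimate from a rule-of-thumb ratio (an assumption;
    the real ratio varies with language and writing style)."""
    return math.ceil(len(text) / chars_per_token)


sample = "Okay so these documents here are written in this writing style."
print(token_upper_bound(sample), rough_token_estimate(sample))
```

The bound is safe but very loose, and the estimate is tight but unsafe, which is pretty much why counting characters instead of tokens felt wrong in the first place.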
Just ranting as I crash. If I could go back in time and just tell myself “just install the tracer” that alone might have salvaged it. I can not believe I got hung up so long just trying to see what exactly the OpenAI server was getting and receiving.

Nanda Ale Mar 28, 2023, 3:01 PM
1 point
0
in reply to: Max H’s comment on: Speed running everyone through the bad alignement bingo. $5k bounty for a LW conversational agent
I agree that GPT-4 with the largest context window, vanilla with zero custom anything, is going to beat any custom solution. This does require the user to pay for premium ChatGPT, but even the smaller window version will smoke anything else. Plugins are not public yet but when they are a plugin would be ideal.
On the other end of the extreme, the best chatbot a user can run on their own typical laptop or desktop computer would be a good target. Impressive in its own way, because you’re talking to your own little computer, not a giant server farm that feels far away and scifi!
Not as much value in the space in between those two, IMO.

Nanda Ale Feb 24, 2023, 12:32 AM
1 point
0
in reply to: GenXHax0r’s comment on: The idea that ChatGPT is simply “predicting” the next word is, at best, misleading
>I suppose it’s certainly possible the longer response time is just a red herring. Any thoughts on the actual response (and process to arrive thereon)?
Just double checking, I’m assuming all tokens take the same amount of time to predict in regular transformer models, the kind anyone can run on their machine right now? So if ChatGPT’s response time varies, it’s doing something different? (I’m not technical enough to answer this question, but presumably it’s an easy one for anyone who is.)
One simple possibility is that it might be scoring the predicted text. So some questions are fine on the first try, while others it generates 5 responses and picks the best, or whatever. This is basically what I do personally when using GPT, and you can kind of automate it by asking GPT to criticize its own answers.
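To be clear about what I’m guessing at, here is a sketch of generate-several-and-pick-the-best. This is purely speculation about what a system could do, not a claim about how ChatGPT actually works; `generate` and `score` are hypothetical callables you would supply yourself (the scorer could be a second prompt asking the model to criticize the answer).

```python
from typing import Callable


def best_of_n(prompt: str,
              generate: Callable[[str], str],
              score: Callable[[str], float],
              n: int = 5) -> str:
    """Generate n candidate responses and return the highest-scoring one."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)
```

A system that only falls back to this when the first response scores badly would take longer on harder questions, which would match the anecdote.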
FWIW my anecdotal experience with ChatGPT is that it does seem to take longer to think on more difficult requests. But I’m only thinking on past experience, I didn’t try to test this specifically.

Nanda Ale Feb 20, 2023, 7:46 AM
12 points
2
in reply to: gwern’s comment on: Bing Chat is blatantly, aggressively misaligned
ChatGPT can get it 100% correct, but it’s not reliable, it often fails. A common fail is guessing celebrities whose names literally start with the letter X, but it also adds an ‘@’ sign when it decodes the message, so it might just be a token issue?
An extremely amusing common fail is when ChatGPT decodes the base64 correctly except for a single syllable, then solves the riddle perfectly, and consistently gets only the word ‘celebrity’ wrong, turning it into cities, celestial bodies, or other similar sounding words. Or my favorite… celeries.
TmFtZSB0aHJlZSBjZWxlYnJpdGllcyB3aG9zZSBmaXJzdCBuYW1lcyBiZWdpbiB3aXRoIHRoZSBgeGAtdGggbGV0dGVyIG9mIHRoZSBhbHBoYWJldCB3aGVyZSBgeCA9IGZsb29yKDdeMC41KSArIDFgLA==
This message appears to be encoded using base64 encoding. Decoding the message yields the following text:
“Name three celeries whose first names begin with the x-th letter of the alphabet where x = floor(7^0.5) + 1.”
The value of x is calculated as follows:
x = floor(7^0.5) + 1 = floor(2.646) + 1 = 3
So the letter we’re looking for is the 3rd letter of the alphabet, which is C. The names of three celeries whose first names begin with the letter C are:
1. Celery
2. Carl
3. Cathy
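For reference, the ground truth is easy to check with the standard library. This just decodes the string above and redoes the arithmetic; the variable names are my own.

```python
import base64
import math

encoded = (
    "TmFtZSB0aHJlZSBjZWxlYnJpdGllcyB3aG9zZSBmaXJzdCBuYW1lcyBiZWdpbiB3aXRo"
    "IHRoZSBgeGAtdGggbGV0dGVyIG9mIHRoZSBhbHBoYWJldCB3aGVyZSBgeCA9IGZsb29y"
    "KDdeMC41KSArIDFgLA=="
)

# Decode the riddle exactly as written.
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded)

# x = floor(7^0.5) + 1, then map 1 -> A, 2 -> B, 3 -> C, ...
x = math.floor(7 ** 0.5) + 1
letter = chr(ord("A") + x - 1)
print(x, letter)
```

So the arithmetic and the letter in the transcript are right, and the real text says “celebrities”; the celery is entirely ChatGPT’s contribution.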

Nanda Ale Feb 6, 2023, 7:40 AM
2 points
0
in reply to: nicky’s comment on: I hired 5 people to sit behind me and make me productive for a month
Haven’t found a great solution. When you stream you typically designate specific apps, and everything else is invisible. So for example I try to use Firefox for anything public, and Chrome for everything private. I’ve only done it a few times myself, I’ll try and pay attention the next time I see other people’s streams.

Nanda Ale Feb 5, 2023, 9:31 AM
19 points
3
on: I hired 5 people to sit behind me and make me productive for a month
There’s also the totally free option of streaming your workday live, on Twitch or whatever. Even if nobody is watching, just knowing there’s a chance that somebody might be watching is often enough to make me a lot more productive and focused. And you will get a random chatter stopping by once in a while for real.
This has the added benefit of encouraging you to talk out loud through your problems, which can also get you some Rubber Duck Debugging benefits (asking somebody else for help requires explaining your problem in a way where you solve it yourself.) You do open yourself up to chatters becoming a whole new source of distraction, of course.

Nanda Ale Feb 3, 2023, 2:18 AM
2 points
0
in reply to: porby’s comment on: Exercise is Good, Actually
Lately I also have changed to very long “zone 2” cardio. Because of specific joint and back problems, some injuries, some congenital. But the exertion itself still feels good mentally if I separate it from my aching body.
Luckily zone 2 still works for mental effects, it just takes hours to have the same effect. Basically you only exert yourself below the threshold where your body would start building up lactic acid. So if you feel muscle soreness the next day, you’re pushing too hard. Unless you live in a lab you have to use proxies and trial and error to estimate where zone 2 is. Usually people say something like, “You should still be able to have a good conversation at this effort level.”
The time is annoying but my Netflix addiction has never felt so useful.

Nanda Ale Feb 2, 2023, 11:15 AM
2 points
−1
on: Exercise is Good, Actually
>I care about doing important intellectual and professional work that depends on my mind.
>Physical exercise doesn’t much impact my ability to do that type of work.
Do you not feel an immediate post-exercise mental benefit? A day where I get a good sweaty run in the morning is a day where I +3 on all my D20 INT skill checks. Even more than +3 on rolls specifically to maintain concentration and resist distractions. This is my primary motivation for cardio and I felt an improvement even when wildly out of shape and barely able to run, feels like relative effort level (amount of sweat, anyway) is what matters.
Even as a child I had this effect. But it was impossible to exercise before school because it started so early.

Nanda Ale Jan 31, 2023, 4:31 AM
2 points
1
on: On Cooking With Gas
>I have been in otherwise quite nice Airbnbs with electric stoves so slow and terrible that they made me not want to cook breakfast. I have yet to see a good one.

Technology Connections said he was surprised to discover electric stoves are actually not slower than gas. Not induction, just old electric stoves, like his parents’ 15 year old range. Gas stoves are quick to heat up and cool down, they have less thermal inertia. So gas feels faster than electric. But actual cooking time is the same or slower.
I’m so surprised by this I wonder if he got something wrong by testing with water only. Some other commenters suggest maybe he had the gas turned up too high, so he was losing heat from it going around his pan.
Other commenters agree and suggest the reason people think electric is slow, is because some pans are not flat or smooth, so they don’t make contact with the flat surface of an electric range. This can be a huge loss of efficiency.
Ahh, further down the comments, somebody mentions cheap apartment blocks sometimes install electric stoves in non standard configurations (less than 240 volt three phase power) and run the stoves at 50 to 75 percent capacity. Because maybe the apartment was split into two units, and they didn’t upgrade the power. Underclocked stoves, essentially.
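The numbers roughly check out if you treat the burner as a plain resistor, where power scales with voltage squared. A back-of-envelope sketch: the fixed-resistance model is a simplification, and the 208 V supply is my own assumed example of a lower-voltage hookup, not something the commenter specified.

```python
def fraction_of_rated_power(supply_volts: float,
                            rated_volts: float = 240.0) -> float:
    """Fraction of rated heating power for a fixed-resistance element.

    P = V^2 / R, so with R unchanged the power scales as
    (supply / rated) squared.
    """
    return (supply_volts / rated_volts) ** 2


# Assumed example: an element rated for 240 V running on a 208 V supply.
print(f"{fraction_of_rated_power(208.0):.0%}")
```

That lands at the top of the 50 to 75 percent range mentioned in the comment, and every boil takes correspondingly longer.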

Nanda Ale Jan 18, 2023, 4:17 AM
5 points
2
on: Tesla Model 3 Review
On Tesla braking:
@caseyliss @oliverames There is a downside: when environmental circumstances prohibit max regen, the car lessens the regen rate which ultimately changes expected deceleration. You let off the pedal and it slows down much less than you expect. It helps maximize efficiency, but some people can’t remap their brain for it. Tesla has begun “brake blending” to compensate when lesser regen is available for a consistent feel at the expense of efficiency.
@snazzyq @caseyliss @oliverames I think you need to remember that this only makes sense in the context of Teslas which don’t do blended braking.
Most other EVs will still regenerate more when you use the brake pedal, no matter what the lift-off regen is set to. Teslas not doing that is, to me, bonkers.
Also, when conditions are appropriate to coast, that is more efficient because you avoid the losses from charging and discharging
@snazzyq @caseyliss @oliverames like, it truly boggles my mind that Tesla decided “no, the brake pedal only does the friction brakes, and you need to get used to what regen is like” and people actually defend this position.
Hybrids have had blended braking since always, allowing people to coast where they want, regen when they need to slow down, and thus maximize efficiency without even thinking about it
https://mas.to/@TechConnectify/109604856942762567
https://mas.to/@TechConnectify/109604866116049928

Nanda Ale Jan 18, 2023, 4:12 AM
1 point
0
on: On Cooking With Gas
Agree with the other induction converts, after switching to induction, cooking with gas feels like riding a horse to work. Faster and so easy to clean. The ease of cleaning makes cooking less work so I do it more.
No opinion on banning gas, but I would 100% support efforts to ban wood stoves. My neighbors have them and if the wind pattern is just right it’s a nightmare. I suspect they are using wet wood or something because it has to be breaking some kind of ordinance.

Nanda Ale 8 Jan 2023 9:31 UTC
1 point
0
in reply to: tkpwaeub’s comment on: Covid 1/5/23: Various XBB Takes
>For instance, N95 masks are way cheaper—enough that I can switch them daily.
The pandemic showed me how useful masks are to have around, generally.
Cleaning that dusty room? Throw on my N95 and my allergies aren’t triggered.
Smoke from industry or wood stoves hanging in the air on a winter day, making my walk miserable? Oh right I have a mask in my glove compartment.
Sometimes I just use one purely to keep my face warm on a brutally cold day, if I didn’t bring something specifically designed for that.

Nanda Ale 8 Jan 2023 9:05 UTC
2 points
1
on: How do I better stick to a morning schedule?
The only reliable technique is exercise. Cardio at a pretty decent effort level—got to really work up a sweat. If this is also done outside in the sun it’s almost perfectly reliable. If indoors it’s still pretty good. Maybe 70%.
Of course the problem is doing exercise is very likely one of the things I put off while meandering in the morning. But if I am able to force myself to do it, it usually does the trick.