kpreid
Next, high and low settings are chosen for each X factor, and all possible combinations of settings are arranged in a hypercube. Instead of experimenting on one factor at a time with enough repetitions to build up statistical significance, you can perform just a few repetitions at each corner of the hypercube.
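As a concrete sketch of the arrangement (my own illustration, with made-up factor names), the corners of the hypercube are just the Cartesian product of the per-factor settings:

```python
# Enumerate the corners of the hypercube for a two-level full factorial
# design. The factors and their low/high settings are hypothetical.
from itertools import product

factors = {
    "temperature": (20, 80),    # (low, high)
    "pressure":    (1, 5),
    "speed":       (100, 500),
}

# 2^3 = 8 corners; run a few repetitions at each corner instead of
# many repetitions varying one factor at a time.
for corner in product(*factors.values()):
    settings = dict(zip(factors.keys(), corner))
    print(settings)  # stand-in for: run_experiment(settings, repetitions=3)
```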
This concept reminds me of the problem of planning software tests: I want to exercise all behaviors of the code under test, but actually testing the Cartesian product of input conditions often means writing a test so generic that it duplicates the code under test (unless there is a more naïve algorithm the test can use), and is hard to evaluate for its own correctness. Instead, I end up writing a selected set of cases intended to cover interesting combinations of inputs — but then the problem is thinking of which inputs are worth testing. When bugs are discovered, they may involve combinations of inputs that were not thought of (or parameters we didn’t think of testing, i.e. implicitly put in the “control” category, or specific edge-case values of parameters we did test).
An alternative to hand-written testing of specific cases is to write a property test, like “is input A + input B always ≤ output C, under a wide-ranging selection of inputs”. This feels analogous to measuring correlations in that hypercube — and the part of the actual output that you’re not checking precisely (in my example, the value A + B − C) is the part of the test that is “noise” rather than “control” because we’ve decided it is more practical to ignore that information than to control it (write a test that contains or computes the exact answer to expect).
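A minimal sketch of such a property test, using the Hypothesis library (the function `combine` is a made-up stand-in for real code under test):

```python
from hypothesis import given, strategies as st

def combine(a: int, b: int) -> int:
    # Hypothetical code under test; its details don't matter to the property.
    return a + b + abs(a - b)

@given(st.integers(), st.integers())
def test_sum_bounds_output(a: int, b: int) -> None:
    c = combine(a, b)
    # The property: A + B is always <= C, over a wide-ranging selection
    # of inputs. The exact value of A + B - C is the part we deliberately
    # leave unchecked: the "noise" rather than the "control".
    assert a + b <= c
```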
I like this post and am not intending to argue against its point by the following:
I read the paragraph about orders of magnitude and immediately started thinking about whether there are good counterexamples. Here are two: wires are used in lengths from nanometers to kilometers, and computer programs as a category run for times from milliseconds to weeks (even considering only those which are intended to have a finite task and not to continue running until cancelled).
Common characteristics of these two examples are that they are one-dimensional (no “square-cube law” limits scaling) and that they are arguably in some sense the most extensible solutions to their problem domains (a wire is the form that arbitrary length electrical conductors take, and most computer programs are written in Turing-complete languages).
Perhaps the caveat is merely that “some things scale freely such that the order of magnitude is no new information and you need to look at different properties of the thing”.
For what it’s worth, https://en.wikipedia.org/wiki/Evaporative_cooler takes the perspective (in one paragraph) that “Vapor-compression refrigeration uses evaporative cooling, but the evaporated vapor is within a sealed system, and is then compressed ready to evaporate again, using energy to do so.” So, in this perspective, evaporative cooling is a part of the system and forced recirculation (requiring the energy source mentioned in the question) is another.
heat pumps not refrigerators
Note that what is colloquially called a heat pump is the same fundamental thing as a refrigerator — equipment is referred to as a “heat pump” when it is used for heating rather than, or in addition to, cooling, but the processes and principles are the same (with the addition of a “reversing valve” so that the direction of operation may be changed, when both heating and cooling are wanted).
Isolation is not about surges, but about preventing current from flowing in a particular path at all. In a transformer, there is no conductive path (only a magnetic one) from the input side to the output side. So, if you touch one or more of the low-voltage output terminals of a transformer, you can’t thereby end up part of a high-voltage circuit no matter what else you’re also touching; you can only experience the low voltage. This is how wall-plug low-voltage power supplies work. Even the ones that use electronic switching converters (nearly all of them today) use a transformer to provide the isolation: the line-voltage AC is converted to higher-frequency AC, run through a small transformer (the higher the frequency, the smaller the transformer you need for the same power), and converted back to DC.
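For a rough sense of the frequency/size relation, the textbook transformer EMF equation (a standard relation, not something specific to these power supplies) is

$$E_{\text{rms}} = 4.44 \, f \, N \, A_c \, B_{\max},$$

so for a fixed voltage, turns count $N$, and peak flux density $B_{\max}$, the required core cross-section $A_c$ scales as $1/f$: raise the frequency and the core can shrink proportionally.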
Thanks for doing that!
Is there something not-paywalled which describes what the relevant old definitions were?
Your description of TDD is slightly incomplete: the steps include, after writing the test, running it when you expect it to fail. The idea is that if it doesn’t fail, you have either written an ineffective test (this is more likely than one might think) or the code under test actually already handles that case.
Then you write the code (as little code as needed) and confirm that the test passes where it didn’t before, which validates that work.
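A minimal sketch of the full cycle (hypothetical function `slugify`, in pytest style):

```python
# Step 1: write the test first and run it, expecting failure ("red").
# If it passes now, the test is ineffective or the case is already handled.
def test_slugify_replaces_spaces():
    assert slugify("Hello World") == "hello-world"

# Step 2: write as little code as needed to make it pass.
def slugify(text: str) -> str:
    return text.lower().replace(" ", "-")

# Step 3: run the test again and confirm it passes where it previously
# failed, which validates both the test and the new code ("green").
```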
Computer systems comprise hundreds of software components and are only as secure as the weakest one.
This is not a fundamental fact about computation. Rather it arises from operating system architectures (isolation per “user”) that made some sense back when people mostly ran programs they wrote or could reasonably trust, on data they supplied, but don’t fit today’s world of networked computers.
If interactions between components are limited to the interfaces those components deliberately expose to each other, then the attacker’s problem is no longer to find one broken component and win, but to find a path of exploitability through the graph of components that reaches the valuable one.
This limiting can, with proper design, be done in a way which does not require the tedious design and maintenance of allow/deny policies as some approaches (firewalls, SELinux, etc.) do.
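A loose sketch of what I mean (made-up component names; this shows the flavor of the design, not a real system): a component is handed only the interfaces it legitimately needs, so those interfaces are the whole of what an attacker gains by compromising it.

```python
class LogSink:
    def write(self, line: str) -> None:
        print("log:", line)

class ThumbnailService:
    # Receives only a log sink: no filesystem handle, no network socket.
    # Compromising this component yields exactly those powers and no more.
    def __init__(self, log: LogSink) -> None:
        self._log = log

    def thumbnail(self, image_bytes: bytes) -> bytes:
        self._log.write(f"thumbnailing {len(image_bytes)} bytes")
        return image_bytes[:64]  # stand-in for real image processing

# No global allow/deny policy to maintain: the wiring *is* the policy.
service = ThumbnailService(LogSink())
service.thumbnail(b"\x89PNG...")
```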
Plus, the examples (except the first) are all from the literature on mental models.
Then my criticism is of the literature, not your post.
I meant that you need to generate all of the models if you are going to ensure that the model with the conclusion is valid or as you say not ‘inconsistent’. So, you not only have [to] reach the conclusion. You need to also check if it’s valid.
Reality is never inconsistent (in that sense). Therefore, checking is only needed to guard against errors in my own reasoning or in the information I was given; neither check is necessary in principle.
That’s why you go through all three models. In the last example the police arrived before the reporter in one model and the reporter arrived before the police in another of the models. Therefore, the example is invalid.
In the last example, the type of reasoning I described above would find no answer, not multiple ones.
(And, to clarify my terminology, the last example is not an instance of “the premises are inconsistent”; rather, there is insufficient information.)
I appreciate this article for introducing research I was not previously aware of.
However, like other commenters, I find myself bothered by the way the examples assume one uses exactly one particular approach to thinking — though in a different respect. Specifically, I made the effort to work through the example problems myself, and
To solve this second problem you need to use multiple models.
is false. I only need one model, which leaves some facts unspecified. I reasoned as follows:
What I need to know is the relation between “police” and “reporter”.
Everything we know about “police” is that it is simultaneous with “alarm”.
Everything we know about “reporter” is that it is simultaneous with “stabbed”.
What do we know about the two newly mentioned events? That “alarm” is before “stabbed”.
Therefore “police” is before “reporter” (or, if we do not check further, the premises could be inconsistent).
This is building up exactly as much model as we need to reach the conclusion.
I will claim that this is a more realistic mode of reasoning — that is, more applicable to real-world problems — than the one you assume, because it does not assume that all of the information available is relevant, or that there even is a well-defined boundary of “all of the information”.
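To make the procedure concrete, here is a toy version of the reasoning above (my own illustration, not anything from the article): resolve each queried event through its “simultaneous” link and look for a single ordering fact connecting the results.

```python
# Premises from the example: police = alarm, reporter = stabbed,
# and alarm happened before stabbed.
simultaneous = {"police": "alarm", "reporter": "stabbed"}
before = {("alarm", "stabbed")}

def resolve(event: str) -> str:
    # Substitute the event it is simultaneous with, if any.
    return simultaneous.get(event, event)

def compare(a: str, b: str) -> str:
    a, b = resolve(a), resolve(b)
    if (a, b) in before:
        return "before"
    if (b, a) in before:
        return "after"
    return "insufficient information"  # the outcome in the last example

print(compare("police", "reporter"))  # -> "before"
```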
I look at the bizarre false positives and I wonder if (warning: wild speculation) the problem is that the networks were not trained to recognize the lack of objects. For example, in most cases you have some noise in the image, so if every training image is something, or rather something-plus-noise, then the system could learn that the noise is 100% irrelevant and pick out the something.
(The noisy images look to me like they have small patches in one spot faintly resembling what they’re identified as — if my vision had a rule that deemphasized the non-matching noise and I had a much smaller database of the world than I do, then I think I’d agree with those neural networks.)
If the above theory is true, then a possible fix would be to include in training data a variety of images for which the expected answers are like “empty scene”, “too noisy”, “simple geometric pattern”, etc. But maybe this is already done — I’m not familiar with the field.
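Concretely (a sketch of the suggestion, not a claim about how any real system is trained), the label set and training data would gain explicit “nothing here” answers:

```python
labels = [
    "cat", "dog", "car",          # ordinary object classes
    "empty scene", "too noisy",   # explicit "lack of object" answers
    "simple geometric pattern",
]

training_examples = [
    ("photo_of_cat.png", "cat"),
    ("white_noise.png", "too noisy"),    # noise should map to a noise label,
    ("blank_wall.png", "empty scene"),   # not to the least-bad object class
]
```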
I wonder: after sufficient adaptation to a rate-of-time sense, could useful mental effects be produced by adjusting the scale?
Apparently that’s true of some model rocket motors, but the SRBs have a hollow core through the entire length of the propellant, so that it burns from the center out to the casing along the entire length at the same time.
I’m now actually rather curious about the range safety stuff for the SRBs—one of the dangers of an SRB is that there’s basically no way to shut it down, and indeed they kept going for some time after Challenger blew up
What I’ve heard (no research) is that thrust termination for a solid rocket works by charges opening the top end, so that the exhaust exits from both ends and the thrust mostly cancels itself out, or perhaps by splitting along the length of the side (destroying all integrity). In any case, the fuel still burns, but you can stop it from accelerating further.
Good question.
I could spend it looking at other parts of the world around me, something I don’t do as much of as I ought. I could spend it thinking about whatever I was thinking about before that moment. (Of course, it’s possible to do these things while still pushing the button, but as we know human brains aren’t perfect multitaskers.)
(The cost is also not just in time: it also wears out the button and my hands a tiny bit more than necessary.)
I’ve decided to work on getting rid of a trivial useless habit: pushing pedestrian crossing buttons more than once.
Now, there’s an argument that it’s not completely worthless to do so: the typical button has no feedback whatsoever that it’s recognized my push, so if it is at all unreliable then an extra push reduces the chances of a complete extra cycle wait at little cost to me since I have nothing else to do.
But the failure case has never actually happened in recent history, so I’m spending too much time pushing buttons.
So far I have remembered to push only once out of about ten times (2-3 per day). Of course, I immediately remember this resolution right after pushing twice.
Well, as iceman mentioned on a different subthread, a content-addressable store (key = hash of value) is fairly clearly a sort of naming scheme. But the thing about the names in a content-addressable store is that unlike meaningful names, they say nothing about why this value is worth naming; only that someone has bothered to compute it in the past. Therefore a content-addressable store either grows without bound, or has a policy for deleting entries. In that way, it is like a cache.
For example, Git (the version control system) uses a content-addressable store, and has a policy that objects are kept only if they are referenced (transitively through other objects) by the human-managed arbitrary mutable namespace of “refs” (HEAD, branches, tags, reflog).
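A toy version of that arrangement (my own sketch; it omits Git’s transitive reachability through objects, keeping only the flat case):

```python
import hashlib

store = {}  # hash -> value: the content-addressable part
refs = {}   # mutable human-managed names -> hash: why values are kept

def put(value: bytes) -> str:
    key = hashlib.sha256(value).hexdigest()
    store[key] = value  # the name says nothing about *why* it is kept
    return key

def gc() -> None:
    # Keep only entries referenced from the named roots; the rest is
    # garbage, exactly as with a cache's eviction policy.
    live = set(refs.values())
    for key in list(store):
        if key not in live:
            del store[key]

refs["HEAD"] = put(b"important data")
put(b"orphaned scratch value")
gc()
print(list(store) == [refs["HEAD"]])  # True: only the referenced entry survives
```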
Tahoe-LAFS, a distributed filesystem which is partially content-addressable but in any case uses high-entropy names, requires that clients periodically “renew the lease” on files they are interested in keeping, which they do by recursive traversal from whatever roots the user chooses.
cache invalidation—which seems to me to have very little to do with naming
I don’t agree with Douglas_Knight’s claim about the intent of the quote, but a cache is a kind of (application of a) key-value data structure. Keys are names. What information is in the names affects how long the cache entries remain correct and useful.
(Correct: the value is still the right answer for the key. Useful: the entry will not be unused in the future, i.e. is not garbage in the sense of garbage-collection.)
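As a small illustration (made-up example): what goes into the key decides whether a stale entry is merely useless or actually incorrect.

```python
cache = {}

def render(template: str, version: int) -> str:
    # If `version` were left out of the key, entries would silently become
    # incorrect when the template changes; with it included, old entries
    # merely become useless (garbage awaiting collection).
    key = (template, version)
    if key not in cache:
        cache[key] = f"rendered {template!r} at v{version}"
    return cache[key]

print(render("home", 1))
print(render("home", 2))  # a new name, so a new entry; the v1 entry is now garbage
```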
Your brain stores memories of input and also of previous thoughts you had and the experience of taking actions. Within the “replaced with a new version” view of the time evolution of your brain (which is also the pure-functional-programming view of a process communicating with the outside world), we can say that the input it receives next iteration contains lots of information from outputs it made in the preceding iteration.
But with the reinforcement learning algorithm, the previous outputs are not given as input. Rather, the previous outputs are fed to the reward function, and the reward function’s output is fed to the gradient descent process, and that determines the future weights. It seems like a much noisier channel.
Also, individual parts of a brain (or ordinary computer program with random access memory) can straightforwardly carry state forward that is mostly orthogonal to state in other parts (thus allowing semi-independent modules to carry out particular algorithms); it seems to me that the model cannot do that — cannot increase the bandwidth of its “train of thought while being trained” — without inventing an encoding scheme to embed that information into its performance on the desired task such that the best performers are also the ones that will think the next thought. It seems fairly implausible to me that a model would learn to execute such an internal communication system, while still outcompeting models “merely” performing the task being trained.
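To sketch the contrast I mean (toy code, not any real training setup):

```python
# Brain-like / recurrent: previous outputs and thoughts return as part of
# the next input, so state is carried forward through an explicit channel.
def run_recurrent(step, observations):
    state = None
    for obs in observations:
        output, state = step(obs, state)
        yield output

# RL as described above: outputs reach the future only by being collapsed
# into a scalar reward that steers gradient descent. The weights are the
# sole "memory", and the channel through them is far noisier.
def train_step(weights, step, observations, reward_fn, update):
    outputs = [step(obs, weights) for obs in observations]
    reward = reward_fn(outputs)      # everything collapses to one number
    return update(weights, reward)   # the next iteration sees only this
```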
(Disclaimer: I’m not familiar with the details of ML techniques; this is just loose abstract thinking about that particular question of whether there’s actually any difference.)