Valerio

Karma: −14

Valerio 15 Jul 2023 13:22 UTC
0 points
0
on: Views on when AGI comes and on strategy to reduce existential risk
I like your arguments on AGI timelines, but the last section of your post feels like you are reflecting on something I would call “civilization improvement” rather than on a 20+ year plan for AGI alignment.
I am a bit confused by the way you conflate “civilization improvement” with a strategy for alignment (when you discuss enhanced humans solving alignment, or empathy in communicating the message “If you and people you know succeed at what you’re trying to do, everyone will die”). Yes, given longer timelines, civilization improvement can play a big role in reducing existential risk, including AGI x-risk, but I would prefer to sell the broad merits of such interventions on their own, rather than squeeze them into a strategy for alignment from today’s limited viewpoint. When making a multi-decade plan for civilization improvement, I think it is also important to consider the possibility of AGI-driven “civilization improvement”, i.e. interventions will not only influence AGI development, but may also be critically influenced by it.
Finally, when considering strategy for alignment under longer timelines, people can have useful non-standard insights, see for example this discussion on AGI paradigms and this post on agent foundations research.

Valerio 11 Jul 2023 16:20 UTC
−3 points
0
on: Where are the people building AGI in the non-dumb way?
I am also interested in interpretable ML. I am developing artificial semiosis, a human-like AI training process which can achieve aligned (transparency-based, interpretability-based) cognition. You can find an example of the algorithms I am making here: the AI runs a non-deep-learning algorithm, does some reflection and forms a meaning for someone “saying” something, a meaning different from the usual meaning for humans, but perfectly interpretable.
I therefore support the case for differential technological development:
There are two counter-arguments to this that I’m aware of, that I don’t think in themselves justify not working on this.
Regarding 1, it may take several years for interpretable ML to reach capabilities equivalent to LLMs, but the future may offer surprises, either in terms of coordination to pause the development of “opaque” advanced AI or of deep learning hitting a wall… at killing everyone. Let’s also have a plan for the case in which we are still alive.
Regarding 2, interpretable ML would need programmed control mechanisms to be aligned. There is currently no such field of AI safety, as we do not yet have interpretable ML, but I imagine computer engineers being able to make progress on these control mechanisms (more progress than on mechanistic interpretability of LLMs). While it is true that control mechanisms can be disabled, you can always advocate for the highest security (as in Ian Hogarth’s Island idea). This counterargument can then be rejected too.
mishka noted that this paradigm of AI is more foomable. Self-modification is a huge problem. I have an intuition interpretable ML will exhibit a form of scaffolding, in that control mechanisms for robustness (i.e. for achieving capabilities) can advantageously double as alignment mechanisms. Thanks to interpretable ML, engineers may be able to study self-modification already in systems with limited capabilities and learn the right constraints.

Valerio 28 Oct 2018 22:26 UTC
1 point
in reply to: TruePath’s comment on: The “semiosis reply” to the Chinese Room Argument
In his paper, Searle brings forward a lot of arguments.
Early in his argumentation and referring to the Chinese room, Searle makes this argument (which I ask you not to mix with later arguments without care):
it seems to me quite obvious in the example that I do not understand a word of the Chinese stories. I have inputs and outputs that are indistinguishable from those of the native Chinese speaker, and I can have any formal program you like, but I still understand nothing. For the same reasons, Schank’s computer understands nothing of any stories, whether in Chinese, English, or whatever, since in the Chinese case the computer is me, and in cases where the computer is not me, the computer has nothing more than I
Later, he writes:
the whole point of the original example was to argue that such symbol manipulation by itself couldn’t be sufficient for understanding Chinese.
I am framing this argument in a way that can be analyzed:
1) P (the Chinese room) is X (a program capable of passing the Turing test in Chinese);
2) Searle can be any X and not understand Chinese (as exemplified by Searle being the Chinese room and not understanding Chinese, which can be demonstrated for certain programs)
thus 3) no X understands Chinese
Searle is arguing that “no program understands Chinese” (I stress this in order to reply to Said). The argument “P is X, P is not B, thus no X is B” is an invalid syllogism. Nevertheless, Searle believes in this case that “P not being B” implies (or strongly points towards) “X not being B”.
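The invalidity of the form “P is X, P is not B, thus no X is B” can be checked mechanically by searching for a countermodel, i.e. a small world in which both premises are true and the conclusion is false. The following is only a minimal sketch under my own assumptions: the function name `find_countermodel` and the second program “Q” are hypothetical and introduced purely for illustration.

```python
from itertools import product

def find_countermodel(universe):
    """Search for a world where both premises hold but the conclusion fails.

    The first element of `universe` plays the role of P. Every other
    element is a hypothetical extra program added for illustration.
    """
    p = universe[0]
    n = len(universe)
    truth_rows = list(product([False, True], repeat=n))
    for x_bits, b_bits in product(truth_rows, repeat=2):
        is_x = dict(zip(universe, x_bits))  # which programs are X
        is_b = dict(zip(universe, b_bits))  # which programs are B
        premise_1 = is_x[p]                 # P is X
        premise_2 = not is_b[p]             # P is not B
        # Conclusion: no X is B
        conclusion = all(not (is_x[u] and is_b[u]) for u in universe)
        if premise_1 and premise_2 and not conclusion:
            return is_x, is_b
    return None

model = find_countermodel(["P", "Q"])
```

With only P in the universe no countermodel exists, but as soon as a second program Q is allowed, the search finds a world where Q is X and is B while P is X and is not B. This only illustrates why the form is invalid; it says nothing about whether such a Q actually exists.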
Yes, Searle’s intuition is known to be problematic and can be argued against accordingly.
My point, however, is that out there in the space of X there is a program P that is quite unintuitive. I am suggesting a positive example of “P possibly understanding Chinese” which could cut the debate short. Don’t you see that giving a positive answer to the question “can a program understand?” may bring some insight into Searle’s argument too (such as developing it into a “Chinese room test” to assess whether a given program can indeed understand)? Don’t you want to look into my suggested program P (semiotic AI)?
At the beginning of my post I made it very clear:
Humans learn Chinese all the time; yet it is uncommon having them learning Chinese by running a program

Valerio 28 Aug 2018 16:17 UTC
1 point
in reply to: binary_doge’s comment on: The “semiosis reply” to the Chinese Room Argument
Uhm, an Aboriginal tends to see meaning in anything. The more the regularities, the more meaning she will form. Semiosis is the dynamic process of interpreting these signs.
If you were put in a Chinese room with no other input than some incomprehensible scribbles, you would probably start to think that what you are doing does indeed have a meaning.
Of course, a less intelligent human in the room or a human put under pressure would not be able to understand Chinese even with the right algorithm. My point is that the right algorithm enables the right human to understand Chinese. Do you see that?

Valerio 28 Aug 2018 11:08 UTC
1 point
in reply to: Said Achmiz’s comment on: The “semiosis reply” to the Chinese Room Argument
A more proper summary would read as follows:
1. P is an instantiated algorithm that behaves as if it [x]. (Where [x] = “understands and speaks Chinese”.)
2. If we examine P, we can easily see that its inner workings cannot possibly explain how it could [x].
3. Therefore, the fact that humans can [x] cannot be explainable by any algorithm.
I have some problems with your formulation. The fact that P does not understand [x] is nowhere in your formulation, not even in premise #1. Conclusion #3 is wrong and should be written as “the fact that humans can [x] cannot be explainable by P”. This conclusion does not need the premise that “P does not understand [x]” but only premise #2. In fact, at least two conclusions can be derived from premise #2, including the conclusion that “P does not understand [x]”.
I state that, using a premise #2 that does not talk about any program in general, both of Searle’s conclusions hold true, but they do not apply to an algorithm which performs (simulates) semiosis.

Valerio 28 Aug 2018 10:44 UTC
1 point
in reply to: binary_doge’s comment on: The “semiosis reply” to the Chinese Room Argument
SCA infers that “somebody wrote that” where the term “somebody” is used more generally than in English.
SCA does not infer that another human being wrote that, but rather that a causal agent wrote that, maybe spirits of the caves.
If SCA enters two caves and observes natural patterns in cave A and the characters of “The adventures of Pinocchio” in cave B, she may deduce that two different spirits wrote them. Although she may discover some patterns in what spirit A (natural phenomena) wrote, she won’t be able to discover a grammar as complex as in cave B. Spirit B often wrote the sequence “oor ”, preceded sometimes by capital “ P”, sometimes by small “ p”. Therefore, she infers that the symbols “p” and “P” are similar (at first, she may also group “d” with them, but she may correct that thanks to additional observations).
There is no hidden assumption that SCA knows she is observing a language in cave B. SCA is not a taught cryptographer, but rather an Aboriginal cryptographer. She performs statistical pattern matching only and makes the hypothesis that spirit B may have represented the concept of writing by using the sequence of letters “said”. She discards other hypotheses that just a single character may correspond to the concept of writing (although she has some doubt about “:”). She discards other hypotheses that capitalised words are words reported to be written. On the other hand, direct discourse in “The adventures of Pinocchio” supports her hypothesis about “said”.
SCA keeps generating hypotheses in this way, so that she learns to decode more knowledge without needing to know that the symbols are language (she rather discovers the concept of language).
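The kind of statistical pattern matching described for the “oor ” example can be sketched roughly as follows. This is a toy illustration under my own assumptions, not the actual SCA algorithm: the function name `left_contexts` is hypothetical and the sample sentence is made up rather than quoted from the book.

```python
from collections import Counter

def left_contexts(text, sequence):
    """Count the characters that immediately precede each occurrence of `sequence`."""
    counts = Counter()
    # Start at index 1 so that every match has a preceding character.
    start = text.find(sequence, 1)
    while start != -1:
        counts[text[start - 1]] += 1
        start = text.find(sequence, start + 1)
    return counts

# Made-up sample text, for illustration only.
sample = "Poor Pinocchio! The poor puppet shut the door and said: Poor me."
groups = left_contexts(sample, "oor ")
```

Here `groups` contains “P”, “p” and also “d”, which mirrors the initial over-grouping mentioned above. Separating “d” from the other two would require further observations, for example comparing what else each symbol tends to precede or follow.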

Valerio 27 Aug 2018 21:14 UTC
1 point
in reply to: TruePath’s comment on: The “semiosis reply” to the Chinese Room Argument
TruePath, you are mistaken: my argument addresses the main issue of explaining computer understanding (moreover, it seems that you are confusing the Chinese room argument with the “system reply” to it).
Let me clarify. I could write the Chinese room argument as the following deductive argument:
1) P is a computer program that does [x]
2) There is no computer program sufficient for explaining human understanding of [x]
=> 3) Computer program P does not understand [x]
In my view, assumption (2) is not demonstrated and the argument should be reformulated as:
1) P is a computer program that does [x]
2’) Computer program P is not sufficient for explaining human understanding of [x]
=> 3) Computer program P does not understand [x]
The argument still holds against any computer program satisfying assumption (2’). Does a program exist, however, that can explain human understanding of [x] (a program such that a human executing it understands [x])?
My reply focuses on this question. I suggest considering artificial semiosis. For example, a program P learns solely from the symbolic experience of observing symbols in a sequence that it should output “I say” (I have described what such a program would look like in my post). Another program Q could learn solely from symbolic experience how to speak Chinese. Humans do not normally learn a rule for using “I say”, or how to speak Chinese, in these ways, because their experience is much richer. However, we could reason about the understanding that a human would have if he could have only symbolic experience and the right program instructions to follow. The semiosis performed by the human would not differ from the semiosis performed by the computer program. It can be said that program P understands a rule for using “I say”. It could be said that the computer program Q understands Chinese.
You can consider [x] to be a capability enabled by sensory-motion. You can consider [x] to be consciousness. My “semiosis reply” could of course be adapted to these situations too.

The “semiosis reply” to the Chinese Room Argument

Valerio 16 Aug 2018 19:23 UTC

−9 points

13 comments · 7 min read · LW link

(www.valeriotargon.org)

Valerio 9 Aug 2017 6:00 UTC
0 points
in reply to: Daniel_Burfoot’s comment on: Open thread, July 31 - August 6, 2017
Daniel, I’m curious too. What do you think about Fluid Construction Grammar? Can it be a good theory of language?

Valerio 8 Aug 2017 20:47 UTC
0 points
in reply to: Wei Dai’s comment on: Steelmanning the Chinese Room Argument
cousin_it, aren’t you forgetting that the rules of the Chinese Room are different than those of Turing’s imitation game? While Turing does not let you in the other test room, Searle grants you complete access to the code of the program. If you could really work out a (Chinese) brain digital upload, you could develop a theory of consciousness/intelligence/intentionality from it. Unfortunately, artificial neural networks bear no connection to the brain, like ELIZA bears no connection to a human!
