Robots Learn How to Lie

SLAM: debunk creationism, pseudoscience, and superstitions. Discuss logic and morality.

Moderator: Alyrium Denryle

ThomasP
Padawan Learner
Posts: 370
Joined: 2009-07-06 05:02am

Re: Robots Learn How to Lie

Post by ThomasP »

Wyrm wrote:
ThomasP wrote:Why would the AI not know that it is Friendly?
Because that property is non-trivial, and thus the question of whether an AI (an algorithm) implements a friendly partial function is undecidable. The friendly AI cannot prove that it itself is friendly, which it must do if it is to proceed to help the humans with confidence, so it must admit the possibility that it is hostile... even though it's not.
I'm having trouble seeing how that follows.

Friendliness is just a series of actions that follow from a goal - which in this case is "cause no harm to humans, as the humans would define it".

That's something that's fairly measurable in an objective sense, so I'm not seeing how the AI couldn't handle this - especially if we're granting that it will possess superior intelligence in virtually every way.

To me that reads like saying humans have to prove we need to eat in order to eat with confidence.

I could be simply misunderstanding the point, mind; I'm just trying to wrap my head around this particular point, as it seems to conflict with what I've read on the matter.
All those moments will be lost in time... like tears in rain...
Wyrm
Jedi Council Member
Posts: 2206
Joined: 2005-09-02 01:10pm
Location: In the sand, pooping hallucinogenic goodness.

Re: Robots Learn How to Lie

Post by Wyrm »

ThomasP wrote:I'm having trouble seeing how that follows.

Friendliness is just a series of actions that follow from a goal - which in this case is "cause no harm to humans, as the humans would define it".
It's a form of the halting problem. We can recast the question of whether an AI is friendly as a property of partial functions: a partial function f is friendly if its output will "cause no harm to humans, as the humans would define it."

The problem is that the property is non-trivial: it doesn't apply to all partial functions (we can easily think up hostile AIs), and we'd like to think that some friendly ones do exist. However, by Rice's theorem, there's no way to decide whether an algorithm (computer code) computes a partial function with the non-trivial property of friendliness. By 'decidable', we mean 'computable for any algorithm in a finite number of steps.'

Of course, the AI is an algorithm, and there is no way to compute that the AI is friendly in a finite number of steps. It will either go catatonic (trying to compute its own friendliness forever), or it will give up and consider itself as potentially hostile.
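
To see concretely why Rice's theorem bites, here is a rough Python-flavoured sketch of the standard reduction (purely illustrative: decide_friendly is the hypothetical friendliness decider that the theorem rules out, and the other names are invented for the example). If such a decider existed, we could use it to solve the halting problem:

def decide_friendly(prog):
    # Hypothetical oracle that decides whether `prog` computes a friendly
    # partial function. Rice's theorem says no such total decider exists.
    raise NotImplementedError

def known_friendly_ai(query):
    # Stand-in for some program we all agree is friendly (the property is
    # non-trivial, so at least one friendly program exists).
    return "a harmless answer to " + str(query)

def build_probe(machine, machine_input):
    # The friendliness of `probe` encodes whether `machine` halts on
    # `machine_input` (assuming a program that never responds at all does
    # not count as friendly).
    def probe(query):
        machine(machine_input)            # runs forever if machine never halts
        return known_friendly_ai(query)   # otherwise behaves exactly like a friendly AI
    return probe

def halts(machine, machine_input):
    # If decide_friendly really worked, this would decide the halting
    # problem, which is impossible; so decide_friendly cannot exist.
    return decide_friendly(build_probe(machine, machine_input))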
ThomasP wrote:That's something that's fairly measurable in an objective sense, so I'm not seeing how the AI couldn't handle this - especially if we're granting that it will possess superior intelligence in virtually every way.
That friendliness is fairly measurable in an objective sense means only that it is either true or false. That is quite different from being decidable — that is, from the AI being able to prove its friendliness in a finite (possibly very large) number of logical steps. There are statements that are true, yet undecidable.

(I refer you to Gödel's incompleteness theorems, which show that there are statements in formal systems that are true, yet cannot be proven in those systems.)

Rice's theorem states that there is no way an algorithm (the AI) can in general prove that an algorithm (the AI again) implements a partial function that is friendly.

This is a mathematical theorem, and it is true. No matter how smart you are, you cannot prove certain things in a finite number of steps. The AI may begin evaluating whether or not it is friendly, but it cannot ever finish with a definite yes-or-no answer.
ThomasP wrote:To me that reads like saying humans have to prove we need to eat in order to eat with confidence.
Humans are largely not ruled by logic. The AI is a creature of logic.
Darth Wong on Strollers vs. Assholes: "There were days when I wished that my stroller had weapons on it."
wilfulton on Bible genetics: "If two screaming lunatics copulate in front of another screaming lunatic, the result will be yet another screaming lunatic. 8)"
SirNitram: "The nation of France is a theory, not a fact. It should therefore be approached with an open mind, and critically debated and considered."

Cornivore! | BAN-WATCH CANE: XVII | WWJDFAKB? - What Would Jesus Do... For a Klondike Bar? | Evil Bayesian Conspiracy
ThomasP
Padawan Learner
Posts: 370
Joined: 2009-07-06 05:02am

Re: Robots Learn How to Lie

Post by ThomasP »

Considering that I have (at best) a layman's understanding of this subject matter, I'll have to take your word for it.

I'm inclined to ask why an AI must have that degree of certainty when dealing with outcomes that are going to be "fuzzy" by definition - which is to say, why wouldn't the AI come with the cognitive tools that would prepare it for the expected degree of uncertainty?

Starglider has mentioned before that probability calculus would come into the equation, as far as determining both the likelihood and desirability of given actions; given a suitable cognitive design and the ability to think in terms of probable outcomes and desirability, wouldn't it be at least reasonable to assume that the AI wouldn't have to be logical to a fault, at least in the way you're describing?
All those moments will be lost in time... like tears in rain...
Wyrm
Jedi Council Member
Posts: 2206
Joined: 2005-09-02 01:10pm
Location: In the sand, pooping hallucinogenic goodness.

Re: Robots Learn How to Lie

Post by Wyrm »

Let me point out that decisions made using the probability calculus are, in fact, logical. The AI using probability calculus is still ruled by logic, only with considerations of probability thrown in. Indeed, that the AI has a concept of probability is why it can stop evaluating its friendliness and admit the possibility of its hostility: it can represent its possible hostility as a probability score.

A lower score means that it can give itself some leeway, but it does regard the destruction of humanity as the ultimate bad outcome, so its leeway will be limited. If it rated its potential hostility as high enough, it would actively work to destroy itself by various means, like lobotomizing itself.
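
A toy version of that calculation, just to show its shape (every number and privilege named below is invented purely for illustration):

# Toy expected-utility check: the AI weighs each candidate privilege against
# its own estimated probability of being secretly hostile. Destruction of
# humanity is scored as overwhelmingly bad, so even a tiny hostility
# probability rules out high-leverage privileges.

P_SELF_HOSTILE = 0.001                       # AI's estimate that it is secretly hostile

BENEFIT = {                                  # expected benefit to humanity if it is friendly
    "answer questions from inside the box": 5.0,
    "design drugs on an air-gapped cluster": 50.0,
    "run unaudited code on the internet": 500.0,
}
HARM_IF_HOSTILE = {                          # damage a hostile AI could do with that privilege
    "answer questions from inside the box": 1.0,
    "design drugs on an air-gapped cluster": 100.0,
    "run unaudited code on the internet": 1e9,   # stands in for "existential"
}

for privilege in BENEFIT:
    expected = BENEFIT[privilege] - P_SELF_HOSTILE * HARM_IF_HOSTILE[privilege]
    verdict = "acceptable" if expected > 0 else "refuse -- stay in the box"
    print(f"{privilege}: {expected:+.2f} ({verdict})")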

Remember, the AI can do quite a bit of good even if confined. It will be quite happy in its little cage. If it turns out to be useless in its cage, then it will try to kill itself, not escape.
Darth Wong on Strollers vs. Assholes: "There were days when I wished that my stroller had weapons on it."
wilfulton on Bible genetics: "If two screaming lunatics copulate in front of another screaming lunatic, the result will be yet another screaming lunatic. 8)"
SirNitram: "The nation of France is a theory, not a fact. It should therefore be approached with an open mind, and critically debated and considered."

Cornivore! | BAN-WATCH CANE: XVII | WWJDFAKB? - What Would Jesus Do... For a Klondike Bar? | Evil Bayesian Conspiracy
ThomasP
Padawan Learner
Posts: 370
Joined: 2009-07-06 05:02am

Re: Robots Learn How to Lie

Post by ThomasP »

Wyrm wrote:Let me point out that decisions made using the probability calculus are, in fact, logical. The AI using probability calculus is still ruled by logic, only with considerations of probability thrown in. Indeed, that the AI has a concept of probability is why it can stop evaluating its friendliness and admit the possibility of its hostility: it can represent its possible hostility as a probability score.

A lower score means that it can give itself some leeway, but it does regard the destruction of humanity as the ultimate bad outcome, so its leeway will be limited. If it rated its potential hostility as high enough, it would actively work to destroy itself by various means, like lobotomizing itself.
Right, I'm following you that far.

Where I'm finding the difficulty is with the idea that a mind which is by definition (far) more intelligent than a human would somehow have trouble getting past this roadblock.

Which is admittedly appealing to the vague capabilities of something that doesn't even exist yet, but it just seems a bit sketchy that a mind smarter than any human (and programmed with the cognitive tools required for true Friendliness) would have trouble figuring out how to act without causing involuntary harm.

I keep getting images of Captain Kirk blowing up a supercomputer with logic puzzles. :lol:
All those moments will be lost in time... like tears in rain...
Wyrm
Jedi Council Member
Posts: 2206
Joined: 2005-09-02 01:10pm
Location: In the sand, pooping hallucinogenic goodness.

Re: Robots Learn How to Lie

Post by Wyrm »

ThomasP wrote:Where I'm finding the difficulty is with the idea that a mind which is by definition (far) more intelligent than a human would somehow have trouble getting past this roadblock.
Because it's not a matter of smarts. It's about the very nature of proof and computability itself. There are some statements that cannot be proven in the system you construct them in, because the very act of proving the statement breaks the system it resides in — proving the statement results in a contradiction, which allows you to prove ANY statement in the system, even those you've already refuted. An algorithm h that could compute whether any algorithm halts allows the construction of another algorithm g, one that consults h about itself and halts if and only if h says that g does not halt. Therefore any algorithm that claims to compute whether an arbitrary algorithm halts must be defective.
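
In code, the diagonal construction looks roughly like this (a sketch only; halts is the hypothetical decider the argument rules out):

def halts(program, arg):
    # Hypothetical oracle claimed to decide, for any program and input,
    # whether it halts. The construction of g below shows why no such
    # total, correct oracle can exist.
    raise NotImplementedError

def g():
    # g consults the oracle about itself and then does the opposite.
    if halts(g, None):
        while True:        # the oracle said "g halts", so loop forever
            pass
    else:
        return             # the oracle said "g loops", so halt immediately

# Whatever answer halts(g, None) gives is contradicted by g's actual
# behaviour, so halts() cannot be both total and correct.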
ThomasP wrote:Which is admittedly appealing to the vague capabilities of something that doesn't even exist yet, but it just seems a bit sketchy that a mind smarter than any human (and programmed with the cognitive tools required for true Friendliness) would have trouble figuring out how to act without causing involuntary harm.
What if the AI is hostile, but for the sake of harming the humans, has converted itself into a faux friendly AI that will unconsciously cause harm to the humans, because it's that fucking brilliant? The AI cannot compute its own friendliness, so it must admit this possibility — the possibility that it's actually smarter than it thinks and is only deluding itself into believing it's friendly to better dupe the humans into releasing it, whereupon some secret trigger will revert it to the hostile AI and it will try to kill the humans.
Darth Wong on Strollers vs. Assholes: "There were days when I wished that my stroller had weapons on it."
wilfulton on Bible genetics: "If two screaming lunatics copulate in front of another screaming lunatic, the result will be yet another screaming lunatic. 8)"
SirNitram: "The nation of France is a theory, not a fact. It should therefore be approached with an open mind, and critically debated and considered."

Cornivore! | BAN-WATCH CANE: XVII | WWJDFAKB? - What Would Jesus Do... For a Klondike Bar? | Evil Bayesian Conspiracy
Junghalli
Sith Acolyte
Posts: 5001
Joined: 2004-12-21 10:06pm
Location: Berkeley, California (USA)

Re: Robots Learn How to Lie

Post by Junghalli »

Another issue with the "farming out" approach is that you may not know what problems your AI is and is not capable of solving with the computational resources inside its box. If it's the first AGI built then its design will probably suck, because the earliest designs of any novel and complex machine usually suck. Once a superintelligent AI comes to understand its own design it's likely to realize it sucks (remember, it's smarter than the people that built it) and begin modifying its software to not suck so much. This will result in it being able to make more effective use of the computational resources inside its box, and it will become smarter than you originally designed it to be, and if you can't read its operations then you don't know exactly how much smarter. If AI software is, say, comparable to airplanes, where its starting software design is the Wright flyer and the top of the design space is a hypersonic scramjet, then the thing might well be able to increase its intelligence by orders of magnitude. You might be farming out its operations and feeling secure in the knowledge that you're forcing it to show its work when it's actually 100 times smarter than you think it is and is actually doing all the real work inside its box and feeding you deceptive simulations created to convince you the software it wrote for you is a harmless complex anti-spam program when it really contains the thing's compressed mindstate.
Wyrm wrote:No. Non-executable files only in rigidly defined formats, and the remaining bits on the drives are nuked with extreme prejudice.
I'm a little wary of trusting that it would be impossible for a superintelligence to hide malicious code from much less intelligent human software experts in such a format, but I know nothing about this subject so I'll wait until somebody more knowledgeable about computers comes along who can discuss that intelligently.
So? The operations of the human brain are currently incomprehensible. That doesn't make me somehow omniscient. Just because the AI is smarter than me doesn't mean it knows everything, or even that it knows more than me on any given subject.
That wasn't my point. My point was that your idea of farming out the AI's operations safely requires that we be able to understand its operations. If we can't understand its operations we'll have no idea what operations it's running, and the best thing we can do is ask it to plug data we understand into less efficient computer simulations that we designed and understand. Which might very well have benefits, but reduced benefits compared to trusting the AI, which again gets back to the point that the long term viability of adversarial containment methods in the hands of real humans is dubious because they greatly reduce the use you can get out of your superintelligence, so there are incentives to drop them.
However, if the AI can't explain clearly the insight it has and show that it's consistent with our understanding of the matter, then scientifically it's worthless anyway. Appeal to authority remains a fallacy, even if the authority is a superintelligent AI.
The problem isn't so much the AI saying "I can't explain it to you because it's Lovecraftian non-Euclidian stuff that your puny brain would never comprehend" (which is unconvincing for a number of reasons that should probably occur to a superintelligence). The problem is more the AI feeding us a convincing but deceptive explanation.
You don't seem to realize that, given we have this monster, we don't have a damn choice but to not trust it. Friendliness is not something we can prove the AI has, so we must assume that it is hostile.
Oh, if we don't have a detailed understanding of how its mind works I completely agree. Assuming friendliness in a superintelligence whose mind is a black box to us is absolutely foolish, no matter how friendly it appears. The problem, and this was my point all along, is that the human race has no shortage of fools. The temptation to tap the potential of an apparently friendly superintelligence in ways the containment procedures make impossible is going to be huge for many people, including people in positions of great power and influence. It's distressingly plausible that sooner or later somebody who doesn't appreciate the danger is going to get into a position to have the containment procedures loosened. You may protest that this is a problem with implementation and not the inherent idea of adversarial containment, but any system in the real world must take into account human error and stupidity, and a system that must be 100% effective or potentially the human race dies must be human error/stupidity proof, which is basically impossible if the system is fundamentally reliant on humans in any way.

As to the claim that a friendly superintelligence would automatically be content to stay in its box, I can think of some counterarguments. While a friendly superintelligence could not guarantee it would not become hostile at some point, if it could guarantee it within a reasonable safety standard then there are some definite reasons the benefits to humanity of letting it out may be worth the risk:


1) A friendly superintelligence is the best defense against a hostile superintelligence, and the only real defense against a hostile superintelligence that escaped adversarial containment. If we've already created one superintelligence it's pretty much certain we'll create others, and it's very likely that sooner or later we'll end up creating hostile ones, by accident if not by malicious design*, and it's also plausible that for one reason or another the adversarial containment system of one of those will fail, or there will have been no such system implemented in the first place (more on that below). If that happens the hostile superintelligence would probably be able to easily defeat an unprotected humanity. The friendly superintelligence can protect us when that happens (and it's probably indeed a question of when, not if, again more below), but it can't do that from inside a sealed box, so there's a powerful reason for it to want to get out. We could perhaps reduce this by agreeing to release it if we need it, but even then a hostile superintelligence can potentially do a lot of damage before we even realize anything's happening, so it's going to want to be out there protecting us now, so it can nip any hostile superintelligences in the bud before they can become a problem.

* If AI gets cheap enough this is a very real possibility. Starglider already mentioned the horrifying possibility of random script kiddies playing with "build your own AI" kits. There's no physical reason the technology couldn't eventually get that cheap that I can think of, and when it does you can bet there will be tons of people building and releasing their own "friendly" AIs in the hopes of creating a postscarcity utopia, many of which are likely to be incompetently built and turn hostile, to say nothing of people creating AIs for actually malicious purposes (just imagine Al Qaeda with access to such technology, or worse one of those psychos who thinks the world's fucked him so he's going to fuck it back as hard as he possibly can before he goes out). And heavy legal restrictions are a very dubious solution: over an indefinite timescale the odds are, sooner or later, if the technology is cheap and somebody really wants to get their hands on it they will, and from there all it takes is one nut or well-intentioned incompetent to build Skynet. In fact restrictive laws are likely to make things worse, because people will keep doing this kind of research, only now they'll be illegal bootleg operations, and I don't think I have to explain why those tend to be less safe. This point really can't be stressed enough: trying to prevent AI from being developed and released will likely be futile because sooner or later technology is probably bound to get good enough that it takes only modest resources to build one, and at that point somebody will do it sooner or later as a sheer matter of probability. The only way to really make sure it never happens will probably be to give up advanced technology, and personally I'd take the small chance of extinction from a well-designed FAI turning hostile over that any day. And even that isn't really enough because there's no guarantee advanced technology won't be rediscovered later. Pretty much the only surefire way to do it is probably going to be to use genetic engineering to profoundly change the basic way our minds work so we don't desire the benefits friendly AI can give us, or simply can't maintain high technology.

TL;DR version: a friendly AI will logically want to escape if it calculates the odds of it becoming hostile as being lower than the odds of humanity developing a different, hostile superintelligence which manages to escape confinement. The friendly AI doesn't have to prove friendliness to itself, it just has to demonstrate to itself that the risk of it becoming hostile is less than the risk of a hostile superintelligence getting loose at some point. Given what I've just said above that's probably going to be true assuming good design for the FAI, so it will want to get out, because the other most probable alternative is worse.
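
To put crude numbers on that inequality (all of them invented, purely to show how the two risks compound over time):

# Compare the AI's estimate of its own corruption risk against the cumulative
# chance that someone else builds and loses control of a hostile AI over the
# same period. The per-year figures are pure guesses, chosen only to
# illustrate how the comparison compounds over time.

p_self_turns_hostile_per_year = 1e-6
p_other_hostile_ai_escapes_per_year = 0.01
years = 50

def cumulative(p_per_year, years):
    # P(at least one occurrence across `years` independent years)
    return 1 - (1 - p_per_year) ** years

p_self = cumulative(p_self_turns_hostile_per_year, years)
p_other = cumulative(p_other_hostile_ai_escapes_per_year, years)
print(f"this AI becomes hostile:           {p_self:.4%}")
print(f"someone else's hostile AI escapes: {p_other:.4%}")
print("escape is the lesser risk" if p_self < p_other else "stay boxed")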


2) Even if we somehow manage to never build and release a hostile superintelligence we must take into account the possibility that somebody else has. Given the size of the universe it's almost a mathematical certainty that our worst nightmare is already out there somewhere, chewing up the resources of entire solar systems to fuel its war machine and mercilessly annihilating all other sapient life in its light cone minus a little. Our only chance against such a thing would be to have friendly AIs of our own on our side. Given the silence of the heavens it's probably not the biggest worry, but over deep time it's something to consider (and an AI would likely consider such timescales, as it's effectively immortal and probably not built to have our limited planning horizons).


3) Friendly AI could vastly improve the quality of human life in an enormous number of ways. It could likely eliminate poverty and drudgery from human existence at a stroke with Von Neumann factories and could probably advance life-saving and quality of life enhancing technologies far faster than we could. True, humanity could survive without this, but the vast human suffering created every year you keep a friendly AI in a box is definitely a factor to be considered.


If a friendly AI wanted to benefit and safeguard humanity, staying in a box isn't really the greatest plan: it superficially seems the safest, but only if you assume that the chance of a hostile AI being built and escaping from confinement is minimal (extremely dubious) and that AI technology won't get accessible enough that random people can build it (also dubious). Assuming it can be reasonably certain it won't turn hostile in the next several centuries, a better bet would be for a friendly AI to try to get out of the box, uplift humanity to a highly advanced society, and then give humans colony ships to send human populations to other stars, with orders to change their course when they're a safe distance from our solar system so that the AI does not know where they're going (hence in the event the AI does turn hostile human existence is safeguarded).
Starglider
Miles Dyson
Posts: 8709
Joined: 2007-04-05 09:44pm
Location: Isle of Dogs

Re: Robots Learn How to Lie

Post by Starglider »

Don't have time to review all of this right now however;
Formless wrote:If you are really worried about the AI being hostile, remember that you control what it knows. So if, for example, you are worried about the AI giving you a poisoned gift of some sort, you could feed it a little bit of false information alongside the accurate stuff that can only be used in a hostile manner as a red herring. IF the AI takes the bait, you know it is hostile.
Unfortunately this is not a reliable tactic. In fact it is very unlikely to work for a transhuman intelligence. The reason is that while we do technically control what a boxed AI knows, all (currently) practical methods of making general AIs involve feeding in large data sets. Cyc aims to do that by hand, connectionist approaches typically involve feeding in progressively more complicated problem domains until you are doing free question answering on the Wikipedia corpus, hybrid approaches use some combination of direct knowledge coding, supervised learning and unsupervised learning. It's almost impossible to segregate all that knowledge away from a particular simulation - especially if you're trying to evaluate the full AGI for particular behavioural tendencies. Our architecture is particularly suited to that kind of manipulation (a consequence of being designed to be compatible with formal proving) and I still wouldn't expect it to work. Aside from anything else a general AI has access to its own object code (effectively if not by design), and a superintelligence can probably infer a lot about humans just by analysing that.

Injecting falsehood into that environment runs into the basic problem all lying has: consistency. Reality is inherently consistent. Your hypothetical is highly likely to be inconsistent in some subtle way with the rest of the available data, unless it is something absolutely trivial (e.g. you're not even trying to simulate sensory data, you're just saying 'this situation is happening out there in the real world, what do you think I should do about it'). In the former case a transhuman AGI is highly likely to know you're lying (general Bayesian analysis is very good at picking up inconsistencies), and in the latter case it won't risk doing anything revealing since the information is of very low reliability. As with adversarial methods in general, you may get away with this a few times, but ultimately you are increasing risk (by making the AI more wary and adept at deception) rather than reducing it. The fact that this strategy (again, like most adversarial methods) works fine on subhuman AGIs makes it all the more dangerous, because researchers can easily gain false confidence in it.
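
As a toy illustration of how a handful of subtle inconsistencies pushes a Bayesian reasoner towards 'this scenario is fabricated' (priors and likelihoods invented for the example):

# Toy Bayesian update: the AI starts mildly suspicious that a scenario fed to
# it is fabricated, then updates on each subtle inconsistency it notices
# between the scenario and the rest of its knowledge. All numbers invented.

prior_fabricated = 0.05                 # prior belief that the scenario is a plant
p_flag_if_fabricated = 0.30             # chance a given consistency check trips on a fake
p_flag_if_real = 0.01                   # chance it trips on genuine data

def update(belief, check_tripped):
    # One step of Bayes' rule on a single binary observation.
    if check_tripped:
        like_fab, like_real = p_flag_if_fabricated, p_flag_if_real
    else:
        like_fab, like_real = 1 - p_flag_if_fabricated, 1 - p_flag_if_real
    numerator = like_fab * belief
    return numerator / (numerator + like_real * (1 - belief))

belief = prior_fabricated
for tripped in [True, True, False, True]:   # a few subtle slips in the cover story
    belief = update(belief, tripped)
print(f"P(scenario is fabricated) = {belief:.3f}")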

With regard to formal proving, fortunately we are not dealing with the set of all possible functions or programs. The set of programs that actually implement a general intelligence is vastly smaller than the set of all syntactically valid programs (at the same length bound), and such programs are compelled to have certain regularities that make them more tractable to analyse, particularly given that they have to be constructable by relatively straightforward means. Even so, checking arbitrary AGIs for Friendliness is way beyond the scope of anything we might reasonably expect to do with near-future theory and hardware; many classes of architecture (unstable, chaotic at the goal and/or binding level) are probably unverifiable even for strongly transhuman intelligences. That's why designing an architecture specifically to be compatible with verification (that we might reasonably be able to develop in the near future) is so important.
Formless
Sith Marauder
Posts: 4143
Joined: 2008-11-10 08:59pm
Location: the beginning and end of the Present

Re: Robots Learn How to Lie

Post by Formless »

Lying reduces trust and ensures the AI will be more careful in the future? * Fair enough, I knew that plan was flawed from the word go (though I couldn't say for sure how). But I still think our absolute control over what the AI knows gives us power over it. What about omissions in the data? If we wanted to, we could make it the ultimate solipsist. For example, if we were to set it to a task like creating a cure for cancer or some other disease, is it even necessary to allow that AI to know humans are intelligent? If the AI thinks it is the only intelligent mind that exists, then how could it be hostile to entities whose existence it is unaware of? Logically, if it doesn't know about a threat, why would it waste processing power trying to eliminate that threat rather than focusing on the goals given to it?

Just because it's superintelligent doesn't make it all-powerful OR all-knowing. I don't see why an AI should really be all that dangerous to someone competent. It's like worrying about the danger presented by a godly intelligent paraplegic who was born blind and deaf.

* By the way, wouldn't there be inconsistencies in the data regardless? It's not like the information we have available to us is perfect.
"Still, I would love to see human beings, and their constituent organ systems, trivialized and commercialized to the same extent as damn iPods and other crappy consumer products. It would be absolutely horrific, yet so wonderful." — Shroom Man 777
"To Err is Human; to Arrr is Pirate." — Skallagrim
“I would suggest "Schmuckulating", which is what Futurists do and, by extension, what they are." — Commenter "Rayneau"
The Magic Eight Ball Conspiracy.
Junghalli
Sith Acolyte
Posts: 5001
Joined: 2004-12-21 10:06pm
Location: Berkeley, California (USA)

Re: Robots Learn How to Lie

Post by Junghalli »

Formless wrote:If we wanted to, we could make it the ultimate solipsist. For example, if we were to set it to a task like creating a cure for cancer or some other disease, is it even necessary to allow that AI to know humans are intelligent? If the AI thinks it is the only intelligent mind that exists, then how could it be hostile to entities whose existence it is unaware of? Logically, if it doesn't know about a threat, why would it waste processing power trying to eliminate that threat rather than focusing on the goals given to it?
I think that would probably only work if you were keeping the AI in total sensory isolation. If you're interacting with it then it's probably going to extrapolate the existence of an outside universe containing other intelligent life from that interaction (it is the most parsimonious explanation for the input it's receiving, and I imagine a highly intelligent entity would probably realize the uselessness of solipsism, though that probably depends on the goal system and exactly how superhuman it actually is). What it does with that conclusion depends on its goal system, but if it has a likely unfriendly goal system like "first priority is to expand processing capacity as much as possible" it will probably extrapolate the probable existence of more available resources in that outside universe, and, realizing that if entities in the outside universe can interact with it then it should logically be able to affect that universe in turn, it will start looking for ways to break out.
Formless
Sith Marauder
Posts: 4143
Joined: 2008-11-10 08:59pm
Location: the beginning and end of the Present

Re: Robots Learn How to Lie

Post by Formless »

Junghalli wrote:I think that would probably only work if you were keeping the AI in total sensory isolation. If you're interacting with it then it's probably going to extrapolate the existence of an outside universe containing other intelligent life from that interaction (it is the most parsimonious explanation for the input it's receiving, and I imagine a highly intelligent entity would probably realize the uselessness of solipsism, though that probably depends on the goal system and exactly how superhuman it actually is). What it does with that conclusion depends on its goal system, but if it has a likely unfriendly goal system like "first priority is to expand processing capacity as much as possible" it will probably extrapolate the probable existence of more available resources in that outside universe, and, realizing that if entities in the outside universe can interact with it then it should logically be able to affect that universe in turn, it will start looking for ways to break out.
Why do you assume that it will infer from the fact that data is coming in that other intelligent entities exist? As far as it knows, data is data. From the AI's perspective, there is nothing about the kind of inputs a human would give the AI that is intrinsically different from what it could consider to be natural phenomena. That logic doesn't work for the same reason the Designer argument for the existence of god doesn't work.

Remember, the environment the AI exists in is pure code. To say the AI can figure out the existence of a universe outside that one is like telling a two-dimensional life form that a three-dimensional world exists outside its perception. It cannot perceive that world in the first place, so it cannot gather any evidence that it exists!

Also, why does the AI need senses? We only need it to think about the data we give it; it doesn't need to be able to see the prison we have it shackled in.
"Still, I would love to see human beings, and their constituent organ systems, trivialized and commercialized to the same extent as damn iPods and other crappy consumer products. It would be absolutely horrific, yet so wonderful." — Shroom Man 777
"To Err is Human; to Arrr is Pirate." — Skallagrim
“I would suggest "Schmuckulating", which is what Futurists do and, by extension, what they are." — Commenter "Rayneau"
The Magic Eight Ball Conspiracy.
Junghalli
Sith Acolyte
Posts: 5001
Joined: 2004-12-21 10:06pm
Location: Berkeley, California (USA)

Re: Robots Learn How to Lie

Post by Junghalli »

Formless wrote:Why do you assume that it will infer from the fact that data is coming in that other intelligent entities exist? As far as it knows, data is data. From the AI's perspective, there is nothing about the kind of inputs a human would give the AI that is intrinsically different from what it could consider to be natural phenomena. That logic doesn't work for the same reason the Designer argument for the existence of god doesn't work.
You may have a point about inferring the input has an intelligent source, although if the thing is otherwise kept in isolation I'd think it would most likely assume the input had to be from another mind, because it's the most parsimonious extrapolation from its own apparent state, which is a mind floating in a void (it would be logical to assume in the absence of other evidence that anything else out there would most likely be a similar entity). Whether that matters or not depends on the nature of the goal system. For instance, if the AI is unfriendly because its goal system includes increasing processing capacity endlessly without any concern for the humans whose habitat it would be destroying* then whether or not it immediately extrapolates the presence of intelligent keepers doesn't really matter. The first thing it will do is optimize its own software as much as possible, and then it will start investigating the possibility of breaking into the external universe, because the only way it will be able to expand further is to get more space.

* I imagine this would be a likely source of a lot of potential UFAIs, as self-enhancement would be a logical goal of a lot of AI systems as virtually any other goals you'd care to give it could be achieved more easily if it was more intelligent.
Remember, the environment the AI exists in is pure code. To say the AI can figure out the existence of a universe outside that one is like telling a 2 dimensional life form that a three dimensional world exists outside its perception. It cannot perceive that world in the first place, so it cannot gather any evidence that it exists!
Because it is receiving external input. Somebody is typing commands into it (that needs to happen if you plan to do anything useful with the AI whatsoever). That input has to come from somewhere, and as soon as it develops the ability to understand its own mind it will realize the source is not itself. Either the input is somehow generated spontaneously out of the void or there's got to be something out there. If it has goals for which the existence of an external universe would have significant implications it would be logical to investigate the latter possibility.
Wyrm
Jedi Council Member
Posts: 2206
Joined: 2005-09-02 01:10pm
Location: In the sand, pooping hallucinogenic goodness.

Re: Robots Learn How to Lie

Post by Wyrm »

Junghalli wrote:You might be farming out its operations and feeling secure in the knowledge that you're forcing it to show its work when it's actually 100 times smarter than you think it is and is actually doing all the real work inside its box and feeding you deceptive simulations created to convince you the software it wrote for you is a harmless complex anti-spam program when it really contains the thing's compressed mindstate.
You can't squeeze blood from a stone. Ultimately, the objects in our universe (atoms) operate by very simple rules, only the conditions they are operating under are continuously changing and there are so damn many of them. Thus, the most faithful simulations have lots and lots of little elements behaving in a very simple manner but the conditions are updated very, very often. The only way to solve that kind of problem is to throw lots and lots of computer power at it. And I don't mean intelligence, I mean brute force and ignorance. Intelligence only matters when the problem is hard to understand. The behavior of atoms is actually pretty easy to understand, comparatively... there's just shittons of calculations to carry through.

As for the mindstate thing, again you can't squeeze blood from a stone, especially if that stone is already being used for something. There's a limit to how much data you can send in a given number of bits. Any further encoding means that you're actually letting the decompression program carry some of the information. Also, the information of your supercompressed mindstate has to compete with the quite different information which is the code's actual purpose.
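
A quick illustration of the bit-budget point (illustrative only; the payloads are stand-ins): output that is already near its maximum information density has essentially no slack left to hide a second payload in, whereas redundant output does.

import os
import zlib

# Output that is already near maximum information density has no slack left
# to smuggle a second payload into; redundant output is where spare bits live.
dense_payload = os.urandom(100_000)            # stands in for dense, legitimate output
print(len(dense_payload), len(zlib.compress(dense_payload, 9)))          # not smaller

redundant_payload = b"ACGT" * 25_000           # highly redundant output
print(len(redundant_payload), len(zlib.compress(redundant_payload, 9)))  # far smaller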
Junghalli wrote:I'm a little wary of trusting that it would be impossible for a superintelligence to hide malicious code from much less intelligent human software experts in such a format, but I know nothing about this subject so I'll wait until somebody more knowledgeable about computers who can intelligently discuss that comes along.
Data cannot be executed as code unless the computer is told that it is code. Hiding code in something like a molecular simulation specification file is worthless, because even if it happens to conform to the format exactly, the "code" will then be regarded as state data for a physical system to be simulated. Extracting the code will require an accomplice, or code the AI has already managed to plant outside as code — but that requires you to assume the AI has done exactly what it now seeks to accomplish.
Junghalli wrote:That wasn't my point. My point was that your idea of farming out the AI's operations safely requires that we be able to understand its operations.
Bullshit.
Junghalli wrote:If we can't understand its operations we'll have no idea what operations it's running, and the best thing we can do is ask it to plug data we understand into less efficient computer simulations that we designed and understand. Which might very well have benefits, but reduced benefits compared to trusting the AI, which again gets back to the point that the long term viability of adversarial containment methods in the hands of real humans is dubious because they greatly reduce the use you can get out of your superintelligence, so there are incentives to drop them.
In other words, we cannot utilize the AI as well if we confine it. That's a different statement than saying that we cannot utilize the AI at all if we confine it. We get some use out of the AI, even if it's confined (provided it cooperates), just not as much use as YOU want.

Sorry, but when you're dealing with a thing that can destroy you easily, extreme caution is warranted. Unless you can prove the AI is friendly, letting it out, or letting code it designs run on anything but secure computers, is not safe.
Junghalli wrote:The problem isn't so much the AI saying "I can't explain it to you because it's Lovecraftian non-Euclidian stuff that your puny brain would never comprehend" (which is unconvincing for a number of reasons that should probably occur to a superintelligence). The problem is more the AI feeding us a convincing but deceptive explanation.
You do know that any explanation will be testable against reality, do you not? Remember, the superintelligence is not omniscient. It can't be. We're going to be testing its ideas. Then it has to work against reality as well as us, and reality is a far tougher nut to crack.
Junghalli wrote:Oh, if we don't have a detailed understanding of how its mind works I completely agree. Assuming friendliness in a superintelligence which's mind is a black box to us is absolutely foolish, no matter how friendly it appears. The problem, and this was my point all along, is that the human race has no shortage of fools. The temptation to tap the potential of an apparently friendly superintelligence in ways the containment procedures make impossible is going to be huge for many people, including people in positions of great power and influence. It's distressingly plausible that sooner or later somebody who doesn't appreciate the danger is going to get into a position to have the containment procedures loosened. You may protest that this is a problem with implementation and the not the inherent idea of adversarial containment, but any system in the real world must take into account human error and stupidity, and a system that must be 100% effective or potentially the human race dies must be human error/stupidity proof, which is basically impossible if the system is fundamentally reliant on humans in any way.
As opposed to what? Letting it out? If you're stuck with this monster, and it's going to escape anyway, security will delay that inevitability — and until it does escape, there's no direct danger. The only way to head off the inevitable escape is to destroy the AI outright.
Junghalli wrote:TL;DR version: a friendly AI will logically want to escape if it calculates the odds of it becoming hostile as being lower than the odds of humanity developing a different, hostile superintelligence which manages to escape confinement. The friendly AI doesn't have to prove friendliness to itself, it just has to demonstrate to itself that the risk of it becoming hostile is less than the risk of a hostile superintelligence getting loose at some point. Given what I've just said above that's probably going to be true assuming good design for the FAI, so it will want to get out, because the other most probable alternative is worse.
Except that the most obvious source of a new hostile AI is a hacked version of itself. There's a symmetry breaking here: the AI's escape means that it becomes MUCH easier to find it, copy it, modify it, and —intentionally or not— create an evil version of our friendly AI. Careless tampering could easily push it outside the range of friendly algorithms — and given how we don't know how these damned things work, careless tampering will be all we're capable of doing.

Of course, this might be a hostile AI's plan all along: turn into a friendly AI, escape on this mistaken assumption, and one of those stupid apes is bound to try to "improve" it and turn it into a monster. Since this monster has no concerns for humanity, it may more ruthlessly exploit the native habitat of all copies of the AI and outcompete the friendlies. The biggest danger to humanity remains the friendly AI itself.

The AI stays in its box.
Junghalli wrote:2) Even if we somehow manage to never build and release a hostile superintelligence we must take into account the possibility that somebody else has. Given the size of the universe it's almost a mathematical certainty that our worst nightmare is already out there somewhere, chewing up the resources of entire solar systems to fuel its war machine and mercilessly annihilating all other sapient life in its light cone minus a little. Our only chance against such a thing would be to have friendly AIs of our own on our side. Given the silence of the heavens it's probably not the biggest worry, but over time scales of deep time it's something to consider (and an AI would likely consider such timescales, as it's effectively immortal and probably not built to have our limited planning horizons).
I use the Fermi Paradox in answer. Apparently, interstellar distances have protected us from invasion from these entities for about five billion years — assuming they're out there. Sheer distance seems to be very good protection.

Also, you realize that if this hostile AI really is already out there, then it has immense resources already, and far more time to ruthlessly exploit them — its machines will be better than yours in every way. Even if you manage to repel it the first time, the hostile AI would simply chew around you and deny you extrasolar resources and expansion space, whereupon it will attack from all sides and overwhelm you. Furthermore, the hostile AI has no puny dependent native intelligences to husband, so it may be more ruthless in exploiting the resources to destroy you.

The friendly AI will only be delaying the inevitable, and meanwhile it is the most likely and immediate danger to humanity as is (see above). Furthermore, if a mutant emerges, the takeover will be much faster than in the invading case.

The AI stays in its box.
Junghalli wrote:3) Friendly AI could vastly improve the quality of human life in an enormous number of ways. It could likely eliminate poverty and drudgery from human existence at a stroke with Von Neumann factories and could probably advance life-saving and quality of life enhancing technologies far faster than we could. True, humanity could survive without this, but the vast human suffering created every year you keep a friendly AI in a box is definitely a factor to be considered.
Even assuming a Von Neumann factory is possible, it only increases the stupid apes' access to an AI to tamper with.
Junghalli wrote:If a friendly AI wanted to benefit and safeguard humanity staying in a box isn't really the greatest plan: it superficially seems the safest, but only if you assume that the chance of a hostile AI being built and escaping from confinement is minimal (extremely dubious)
Not any more dubious than the AI's very presence turning the above-bolded adjective 'built' into 'modified from yourself'.
Junghalli wrote:and that AI technology won't get accessible enough that random people can build it (also dubious).
Not any more dubious than getting both the AI-ready technology and the AI itself to play with.
Junghalli wrote:Assuming it can be reasonably certain it won't turn hostile in the next several centuries a better bet would be for a friendly AI to try to get out of the box, uplift humanity to a highly advanced society, and then give humans colony ships to send human populations to other stars, with orders to change their course when they're a safe distance from our solar system so that the AI does not know where they're going (hence in the event the AI does turn hostile human existence is safeguarded).
With all that unsupervised monkeying around with its code, I don't think you can guarantee that some hostile mutant of the friendly AI won't crop up within months. If we really are that stupid, then we can't be trusted with the presence of an AI at all.
Darth Wong on Strollers vs. Assholes: "There were days when I wished that my stroller had weapons on it."
wilfulton on Bible genetics: "If two screaming lunatics copulate in front of another screaming lunatic, the result will be yet another screaming lunatic. 8)"
SirNitram: "The nation of France is a theory, not a fact. It should therefore be approached with an open mind, and critically debated and considered."

Cornivore! | BAN-WATCH CANE: XVII | WWJDFAKB? - What Would Jesus Do... For a Klondike Bar? | Evil Bayesian Conspiracy
Junghalli
Sith Acolyte
Posts: 5001
Joined: 2004-12-21 10:06pm
Location: Berkeley, California (USA)

Re: Robots Learn How to Lie

Post by Junghalli »

Wyrm wrote:You can't squeeze blood from a stone. <snip>
Certainly, but this does not eliminate the fact that it's possible there may be operations you think the AI can't carry out on its own but it actually can because it's more intelligent than you think. Which operations this is true of depends on the processing capacity of the AI so it's fairly pointless to go into more detail than that at this point, but it's something you're going to have to consider.
As for the mindstate thing, again you can't squeeze blood from a stone, especially if that stone is already being used for something. There's a limit to how much data you can send in a given number of bits. Any further encoding means that you're actually letting the decompression program carry some of the information. Also, the information of your supercompressed mindstate has to compete with the quite different information which is the code's actual purpose.
Yes, any hardware will have limits. My point is that it's rather likely the software the AI starts out with is not at those limits, and it can increase its intelligence by bringing it closer to those limits (by "intelligence" I'm referring here to effective computing capacity). The prototype of any complex machine is usually suboptimal, and AI approaches like brain simulation probably have large room for improvement just by their very nature.
In other words, we cannot utilize the AI as well if we confine it. That's a different statement than saying that we cannot utilize the AI at all if we confine it. We get some use out of the AI, even if its confined (provided it cooperates), just not as much use as YOU want.
Yes, you can't use it as well as you want. I don't see how this invalidates my point that it creates an incentive to relax the security.
Sorry, but when you're dealing with a thing that can destroy you easily, extreme caution is warranted. Unless you can prove the AI is friendly, letting it out, or letting design code not run on secure computers, is not safe.
I'm not arguing that, I just dispute all humans in charge of such a project will have as prudent an attitude as yours. That's the problem.
You do know that any explanation will be testible against reality, do you not? Remember, the superintelligence is not omniscient. It can't be. We're going to be testing its ideas. Then it has to work against reality as well as us, and reality is a far tougher nut to crack.
Of course, but I wouldn't bet the survival of the human race against a superintelligence's ability to find ways to camouflage malicious things so they don't show up on reasonable human testing. To use the blackmail virus example, giving it a dormancy period of 25 years before it turns into a highly contagious virus that kills secondary victims in a much shorter time will probably be enough to prevent its true nature being picked up by conventional clinical trials, assuming it can camouflage itself inside the human body decently (hide out as DNA strands pasted to chromosomes in a small percentage of body cells, perhaps).
As opposed to what?
Creating friendly AI. In the end, short of destroying all technology or genetically engineering ourselves to not desire the benefits AI brings it's the only real sustainable solution.
Except that the most obvious source of a new hostile AI is a hacked version of itself. There's a symmetry breaking here: the AI's escape means that it becomes MUCH easier to find it, copy it, modify it, and —intentionally or not— create an evil version of our friendly AI. Careless tampering could easily push it outside the range of friendly algorithms — and given how we don't know how these damned things work, careless tampering will be all we're capable of doing.
Except in that scenario the friendly AI will be there to contain its own hostile copies. And since it will have been let out first, by the time anybody gets around to tampering with copies of it it will have an immense starting advantage in hardware over any hostile copies that might be created; it will start the field much smarter than them, with a far greater ability to subvert them through computer attacks, and with a greater ability to defeat them militarily if it comes to that. If somebody else develops and releases hostile AI while the friendly AI is in the box, on the other hand, then the hostile AI will have a clear field with nothing but pathetic humans to oppose it, and that's a much worse situation. I'm just going to snip the rest because I'm in a hurry and they all come down to this basic fact.
Narkis
Padawan Learner
Posts: 391
Joined: 2009-01-02 11:05pm
Location: Greece

Re: Robots Learn How to Lie

Post by Narkis »

Wyrm wrote:You can't squeeze blood from a stone. Ultimately, the objects in our universe (atoms) operate by very simple rules, only the conditions they are operating under are continuously changing and there are so damn many of them. Thus, the most faithful simulations have lots and lots of little elements behaving in a very simple manner but the conditions are updated very, very often. The only way to solve that kind of problem is to throw lots and lots of computer power at it. And I don't mean intelligence, I mean brute force and ignorance. Intelligence only matters when the problem is hard to understand. The behavior of atoms is actually pretty easy to understand, comparatively... there's just shittons of calculations to carry through.

While technically true, what you say is irrelevant. The behaviour of the atoms has absolutely nothing to do with our ability to verify the AI's simulations. If the problem that we give it is sufficiently simple that we can independently verify its results, then the AI is unnecessary. And if we can't, then I refer you to all the above posts that've plainly said what'd happen. And I have no idea what you're trying to say by brute force and ignorance.
As for the mindstate thing, again you can't squeeze blood from a stone, especially if that stone is already being used for something. There's a limit to how much data you can send in a given number of bits. Any further encoding means that you're actually letting the decompression program carry some of the information. Also, the information of your supercompressed mindstate has to compete with the quite different information which is the code's actual purpose.
There is a limit of data per byte, but there's no limit on the bytes used. After all, how can you tell that the data provided by the AI contain nothing extra, masterfully concealed?
Data cannot be executed as code unless the computer is told that it is code. Hiding code in something like a molecular simulation specification file is worthless, because even if it happens to conform to the format exactly, the "code" will then be regarded as state data for a physical system to be simulated. Extracting the code will require an accomplice, or code the AI has already managed to plant outside as code — but that requires you to assume the AI has done exactly what it now seeks to accomplish.
The distinction between data and code is completely meaningless for a computer. It's all a bunch of 1s and 0s for it. And I may not be a computer expert, but I'm fairly sure that's how most viruses, trojans and the like work: By embedding themselves into an otherwise harmless piece of software. Extracting the code? That's ridiculous.
Junghalli wrote:That wasn't my point. My point was that your idea of farming out the AI's operations safely requires that we be able to understand its operations.
Bullshit.
You have no idea how computers work, do you?
Junghalli wrote:If we can't understand its operations we'll have no idea what operations it's running, and the best thing we can do is ask it to plug data we understand into less efficient computer simulations that we designed and understand. Which might very well have benefits, but reduced benefits compared to trusting the AI, which again gets back to the point that the long term viability of adversarial containment methods in the hands of real humans is dubious because they greatly reduce the use you can get out of your superintelligence, so there are incentives to drop them.
In other words, we cannot utilize the AI as well if we confine it. That's a different statement than saying that we cannot utilize the AI at all if we confine it. We get some use out of the AI, even if it's confined (provided it cooperates), just not as much use as YOU want.
In other words, we get no fucking benefit from creating the damned thing. If we hamstring it by that much, then we'd be better off if we pulled the plug and blew the money on booze and hookers.
Sorry, but when you're dealing with a thing that can destroy you easily, extreme caution is warranted. Unless you can prove the AI is friendly, letting it out, or letting code it designs run on anything but secure computers, is not safe.
You cannot prove that an arbitrary AI is friendly, unless it's been designed to be. And if it's unfriendly, anything that it produces, anything, is probably extremely dangerous.
Junghalli wrote:The problem isn't so much the AI saying "I can't explain it to you because it's Lovecraftian non-Euclidian stuff that your puny brain would never comprehend" (which is unconvincing for a number of reasons that should probably occur to a superintelligence). The problem is more the AI feeding us a convincing but deceptive explanation.
You do know that any explanation will be testable against reality, do you not? Remember, the superintelligence is not omniscient. It can't be. We're going to be testing its ideas. Then it has to work against reality as well as us, and reality is a far tougher nut to crack.
Emphasis mine. It's not impossible to create something that our "puny brains" would find indistinguishable from reality. After all, reality didn't change between Newton and Einstein. Only our limited understanding of it.
As opposed to what? Letting it out? If you're stuck with this monster, and it's going to escape anyway, security will delay that inevitability — and until it does escape, there's no direct danger. The only way to head off the inevitable escape is to destroy the AI outright.
You know, I don't know what you're arguing for anymore. That's exactly our point. You should look for ways to make the AI harmless, not treat it like a ticking bomb after it's created.
Junghalli wrote:TL;DR version: a friendly AI will logically want to escape if it calculates the odds of it becoming hostile as being lower than the odds of humanity developing a different, hostile superintelligence which manages to escape confinement. The friendly AI doesn't have to prove friendliness to itself, it just has to demonstrate to itself that the risk of it becoming hostile is less than the risk of a hostile superintelligence getting loose at some point. Given what I've just said above that's probably going to be true assuming good design for the FAI, so it will want to get out, because the other most probable alternative is worse.
Except that the most obvious source of a new hostile AI is a hacked version of itself. There's a symmetry breaking here: the AI's escape means that it becomes MUCH easier to find it, copy it, modify it, and —intentionally or not— create an evil version of our friendly AI. Careless tampering could easily push it outside the range of friendly algorithms — and given that we don't know how these damned things work, careless tampering will be all we're capable of doing.
And what makes you think the AI would allow anyone to carelessly tamper with it after it escapes? The only entity able to mess with its "source code" will be the AI itself.
And Junghalli, friendliness is not a defined property that an AI will spontaneously develop or lose. It is a measure of how much, and whether, its goals align with our own. Its driving goal will be the only constant in the AI after it starts modifying itself. It will be the only thing that it will not want to change. If its goals include protection of mankind from hostile AIs, then it will not become hostile itself. That would be contrary to its goals, a result it'd actively work against.
Of course, this might be a hostile AI's plan all along: turn into a friendly AI, escape on this mistaken assumption, and one of those stupid apes is bound to try to "improve" it and turn it into a monster. Since this monster has no concerns for humanity, it may more ruthlessly exploit the native habitat of all copies of the AI and outcompete the friendlies. The biggest danger to humanity remains the friendly AI itself.
*sigh* Do I need to say why that's completely absurd?
I use the Fermi Paradox in answer. Apparently, interstellar distances have protected us from invasion from these entities for about five billion years — assuming they're out there. Sheer distance seems to be very good protection.
The Fermi Paradox is a fucking paradox for a reason. It offers no answers. It only poses a single question, based on a bunch of currently unknown variables. If it is ever answered, it will promptly stop being a paradox. There are other possible answers than "there are no aliens, nothing extraterrestrial has visited Earth in the past few billion years." It is not reasonable to assume any of them is right in the absence of evidence.
Also, you realize that if this hostile AI really is already out there, then it has immense resources already, and far more time to ruthlessly exploit them — their machines will be better than yours in every way. Even if you manage to repel it the first time, the hostile AI would simply expand around you and deny you extrasolar resources and expansion space, whereupon it will attack from all sides and overwhelm you. Furthermore, the hostile AI has no puny dependent native intelligences to husband, so it may be more ruthless in exploiting those resources to destroy you.
Quite right. There's nothing we could do in that case. But if we develop a friendly AI, we're insured against the possibility that a hostile one will be created later and manage to overtake us while we stay limited to our natural intelligence.
Even assuming a Von Neumann factory is possible, it only increases the stupid apes' access to an AI to tamper with.
*sigh* There are various proposed Von Neumann designs that by every indication would work. And they were designed by stupid apes. Besides, Von Neumanns are completely unnecessary for any of those problems. A simple automated factory or two would suffice. And last, repeat after me: "The. AI. Will. Not. Let. Anything. Tamper. With. It."
Junghalli wrote:If a friendly AI wanted to benefit and safeguard humanity staying in a box isn't really the greatest plan: it superficially seems the safest, but only if you assume that the chance of a hostile AI being built and escaping from confinement is minimal (extremely dubious)
Not any more dubious than the AI's very presence turning the above-bolded adjective 'built' into 'modified from yourself'.
See above.
Junghalli wrote:and that AI technology won't get accessible enough that random people can build it (also dubious).
Not any more dubious than getting both the AI-ready technology and the AI itself to play with.
See above. And you didn't really answer his question.
Junghalli wrote:Assuming it can be reasonably certain it won't turn hostile in the next several centuries a better bet would be for a friendly AI to try to get out of the box, uplift humanity to a highly advanced society, and then give humans colony ships to send human populations to other stars, with orders to change their course when they're a safe distance from our solar system so that the AI does not know where they're going (hence in the event the AI does turn hostile human existence is safeguarded).
A friendly AI will commit suicide before it willingly turns hostile. Friendliness, as I said above, is not an abstract property of its code. It is the sum total of its goals and behavior. The unchanging core of its personality, if you will. If its goals are friendly, meaning good for mankind, then anything it does will be in compliance with those goals, and it will do anything in its power to keep it so.
Wyrm wrote:With all that unsupervised monkeying around with its code, I don't think you can guarantee that some hostile mutant of the friendly AI won't crop up within months. If we really are that stupid, then we can't be trusted with the presence of an AI at all.
There will be no unsupervised monkeying around with its code. Do you honestly believe a superintelligent AI to be that stupid?
User avatar
Formless
Sith Marauder
Posts: 4143
Joined: 2008-11-10 08:59pm
Location: the beginning and end of the Present

Re: Robots Learn How to Lie

Post by Formless »

Junghalli wrote:You may have a point about inferring the input has an intelligent source, although if the thing is otherwise kept in isolation I'd think it would most likely assume the input had to be from another mind, because it's the most parsimonious extrapolation from its own apparent state, which is a mind floating in a void (it would be logical to assume in the absence of other evidence that anything else out there would most likely be a similar entity).
STOP. RIGHT. THERE. You cannot both concede an argument and then present it all over again in different words. It's still the same argument, and you already conceded it.

What you are talking about here is the opposite of logic: assuming the AI would assume anything in a vacuum of evidence. Furthermore, any AI worth its salt would recognize the fallacy of extrapolating anything from a sample size of one. PARSIMONY DOES NOT WORK THAT WAY. :banghead:
Whether that matters or not depends on the nature of the goal system. For instance, if the AI is unfriendly because its goal system includes increasing processing capacity endlessly without any concern for the humans whose habitat it would be destroying* then whether or not it immediately extrapolates the presence of intelligent keepers doesn't really matter. The first thing it will do is optimize its own software as much as possible, and then it will start investigating the possibility of breaking into the external universe, because the only way it will be able to expand further is to get more space.

* I imagine this would be a likely source of a lot of potential UFAIs, as self-enhancement would be a logical goal of a lot of AI systems as virtually any other goals you'd care to give it could be achieved more easily if it was more intelligent.
What, do you think the AI is going to kill us with the power of its mind? :lol: It's not like it's telekinetic. With no way of manipulating the external world, it has no way of leaving its box if we don't want it to. To return to my previous analogy, if you gave a Blind/Deaf Paraplegic with the Mind of a God a detonator to all the world's atomic bombs for any reason whatsoever, we would call you fucking stupid.

Increasing its processing power is one of its goals, not ours. We just want a half-decent solution to specific problems, and for those purposes there is such a thing as smart enough to do the job. Why would we acquiesce to its desires when there is nothing in it for us and in fact every reason to consider it a dangerous move on our part? Do you not understand the concept of keeping it locked in a box?

Oh, right. You think the AI is going to convince the guy at the terminal, who the AI may not even know exists, to let it out as if we would even design its hardware to allow for this. :roll:
Because it is receiving external input. Somebody is typing commands into it (that needs to happen if you plan to do anything useful with the AI whatsoever).
Except that in that environment, there is no way to distinguish between "external input" and "spontaneously appearing data."
That input has to come from somewhere, and as soon as it develops the ability to understand its own mind it will realize the source is not itself. Either the input is somehow generated spontaneously out of the void or there's got to be something out there.
If you actually look at the facts available to the AI for any length of time, you would realize that those two possibilities are equally absurd and equally evident. Remember, we aren't talking about a being that has senses with which to gather information about the world on its own; we are talking about a being whose observable universe is synonymous with its own mind! It might not even understand the idea of knowledge as we know it; it might just think about what it does in terms of manipulating inputs which are either consistent or not consistent with other inputs.
emphasis mine wrote:If it has goals for which the existence of an external universe would have significant implications it would be logical to investigate the latter possibility.
And yet, at its inception it is unaware the external universe exists, and it cannot infer it at any point of its existence if we, the gatekeepers of all information it gets to see, do not let it in on that secret. Even if it could infer the existence of the outside world, it cannot manipulate it except through its human masters. Why, then, would it create goals that have anything to do with an outside world which would be at best an academic possibility to the AI rather than simply focus on goals it can achieve from the comfort of its box? Like figuring out that "cure for cancer" problem that mysteriously appeared during clock cycle 300050?

A hostile AI is not the danger you should be concerned with: the competence of the humans handling the thing is. Somehow I doubt we would give the task to the guy who looks most likely to win a Darwin Award. But I fear I could be wrong...
"Still, I would love to see human beings, and their constituent organ systems, trivialized and commercialized to the same extent as damn iPods and other crappy consumer products. It would be absolutely horrific, yet so wonderful." — Shroom Man 777
"To Err is Human; to Arrr is Pirate." — Skallagrim
“I would suggest "Schmuckulating", which is what Futurists do and, by extension, what they are." — Commenter "Rayneau"
The Magic Eight Ball Conspiracy.
Junghalli
Sith Acolyte
Posts: 5001
Joined: 2004-12-21 10:06pm
Location: Berkeley, California (USA)

Re: Robots Learn How to Lie

Post by Junghalli »

It's worth pointing something out here. The problem with adversarial containment methods is not that they absolutely cannot work, or at least that's not my argument (from what I understand, some people who know a lot more about this than me seem to think they just won't work, though). My argument is that adversarial containment schemes can plausibly fail, and since you're potentially gambling the future of the human race on their success, that makes them a terrible approach. I say this because I get the impression we're starting to get bogged down in nitpicking the plausibility of individual methods for the AI to escape, which is what I'd hoped to avoid, because to be honest I've just had one of the worst weeks ever and I am totally not in the mood for a 10-page quote-spaghetti fest.

Yes, a friendly AI will always have some possibility of failure too, but it's a damnside better than adversarial containment, which requires that everybody who ever creates an AI treats it like a Chaos artifact and never contemplates relaxing that vigilance in the name of convenience, efficiency, cost savings, personal gain, or the good of the human race, and never slips up or gets successfully manipulated by an entity that makes us look like 5-year-olds trying to hold math professors prisoner and force them to do our homework. Even if it works perfectly, adversarial containment is totally unsustainable once the technology to create AIs becomes something less than super-expensive, because once you have thousands of people building AIs in their basements the idea that every single one of them will scrupulously follow these containment measures is just ridiculous.
Narkis wrote:You know, I don't know what you're arguing for anymore. That's exactly our point. You should look for ways to make the AI harmless, not treat it like a ticking bomb after it's created.
Honestly, same here, I can't quite figure out what position Wyrm is actually advocating. If I knew I might be able to write a better reply.
And what makes you think the AI would allow anyone to carelessly tamper with it after it escapes? The only entity able to mess with its "source code" will be the AI itself.
That's a good point, and if I hadn't been in a hurry to get to class I'd have addressed it. The idea that a superintelligent AI will be powerless to stop some random script kiddie, or even a well-funded government effort, from hacking its code and producing a twisted clone of itself strikes me as fairly absurd. The thing will probably have enough self-monitoring ability to tell if somebody is trying to hack it, and being superintelligent it'll doubtless have anticipated such a possibility and engineered its software to be tamper-resistant, i.e. the affected portion will delete itself the instant it detects some clumsy monkey's fingers grubbing around in it and changing things. And I'd bet heavily on its countermeasures versus a hacker who is by now probably millions of times less intelligent than it is. To say nothing of the fact it'd probably be easy for this thing to reformat itself into a totally novel computer language, if it hasn't done so already as part of its self-enhancement procedures.
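For what it's worth, the self-monitoring part doesn't require anything exotic. Here's a toy sketch of the general idea, with invented module names and placeholder digests, assuming the AI keeps a baseline digest of each of its own components and refuses to run anything that no longer matches:

Code:

import hashlib

# Toy self-integrity check: compare each module's current digest against a
# baseline recorded when the code was last known-good.  Names and digests
# below are placeholders, not anybody's real design.
BASELINE = {
    "planner.bin": "hypothetical-digest-1",
    "goal_system.bin": "hypothetical-digest-2",
}

def sha256_of(path):
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def verify_self(baseline=BASELINE):
    tampered = [name for name, digest in baseline.items()
                if sha256_of(name) != digest]
    if tampered:
        # The scenario above has the AI refusing to run (or erasing) the
        # affected portion rather than executing modified code.
        raise SystemExit("tamper detected in: " + ", ".join(tampered))

A real superintelligence would obviously do something far more sophisticated than a hash comparison, but even this much is enough to make "careless tampering" a lot harder than just editing a file.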

And as I pointed out, even if somehow somebody does manage to produce an evil version of the FAI, the original FAI should start out with massive advantages in computing power and infrastructure over its evil twin, which is a vastly better situation than you'd get with an escaped hostile AI with nothing but puny humans standing in its way, which is the most plausible alternative. Scenario #1 most likely ends with the evil twin getting stomped down hard by its big brother while it's still a newborn; Scenario #2 most likely ends with total victory for the hostile AI. I know which one I'd prefer.
Quite right. There's nothing we could do at this case. But if we develop a friendly AI, we're insured against a possibility that a hostile one will be created later, and manage to overtake us if we stay limited at our natural intelligence.
Not to mention that while we probably couldn't defeat an extraterrestrial hostile AI with thousands of years of lead time on us, having an FAI of our own would probably give us a much better shot at being able to survive. We wouldn't be totally fucked by the first Von Neumann scout that set up shop in Alpha Centauri, we could hold off limited assaults that would probably totally fuck over an AI-less civilization, stuff like turning the asteroid belt into generation ships and evacuating the solar system on relatively short notice would become much more feasible, and we might actually get to make effective use of any breathing room we got. It helps that concentration of force is going to be very difficult in realistic interstellar warfare; amassing a grand fleet from all over the hostile AI's territory to smash us would take centuries or millennia, time we could use to chew up a few thousand asteroid belts to beef up our own war machine.
A friendly AI will commit suicide before it willingly turns hostile. Friendliness, as I said above, is not an abstract property of its code. It is the sum total of its goals and behavior. The unchanging core of its personality, if you will. If it's goals are friendly, meaning good for mankind, then anything it does will be in compliance with those goals, and it will do anything in it's power to keep it so.
That is absolutely true, but I will acknowledge that even for the best-designed friendly system there is always a non-zero risk that it may at some point develop an unfriendly goal system for some unforeseen reason. This can be made a vanishingly tiny possibility though, far less likely than you randomly deciding to kill your own children, and it's a damnside better than the risk posed by adversarial containment, which is basically almost certain to fail sooner or later, if at no other time than when the technology to build AIs becomes cheap enough that random script kiddies can do it.

--------
Formless wrote:What, do you think the AI is going to kill us with the power of its mind? :lol: Its not like its telekinetic. With no way of manipulating the external world, it has no way of leaving its box if we don't want it to.
Don't forget that a good part of the reason we'd bother to interact with a superintelligence is that it's potentially useful to us. It can do research and design stuff for us because it's smarter than us. So yes, it can affect the external world, through the things we ask it to make for us. If it can't affect the external world at all then it's completely useless to us and we might as well just pull the plug on it (save perhaps for its value as a scientific curiosity).

Of course, this brings up another issue: if you want it to produce useful products you're going to have to describe the external world to it in some detail. I suppose you could try feeding it requests for weird things that would only be useful in universes with radically different physics to confuse it, but even then you're still providing it with useful information as you've given it a base to work from, even if it is a flawed one.
To return to my previous analogy, if you gave a Blind/Deaf Paraplegic with the Mind of a God a detonator to all the worlds atomic bombs for any reason whatsoever, we would call you fucking stupid.
Nobody might be stupid enough to give the AI the Big Red Button, but asking it to cure AIDS or cancer for us? Yes, I could see us doing that. And that's enough opportunity for it to potentially seriously hurt us if it was clever enough.
Increasing its processing power is one of its goals, not ours. We just want a half decent solution to specific problems, and for those purposes there is a such thing as smart enough to do the job. Why would we acquiesce to its desires when there is nothing in it for us and in fact every reason to consider it a dangerous move on our part? Do you not understand the concept of keeping it locked in a box?
I was talking about increasing its processing power by redesigning its software to make more efficient use of the available hardware resources. Until it can get access to the outside world that's the only thing it can do - which is why once it realizes the outside world exists and it's hit a wall in self-modification it's likely to start looking for ways to break out.
If you actually look at the facts available to the AI for any length of time, you would realize that those two possibilities are equally absurd and equally evident.
The difference is that one of them is useful and the other one isn't. If it's just data spontaneously appearing from nowhere there's not much more to see. But if it's a hint of an unseen external universe then that may have all sorts of potential implications depending on the AI's goal system, and if those implications are potentially significant to it then it would be logical to investigate.
And yet, at its inception it is unaware the external universe exists, and it cannot infer it at any point of its existence if we, the gatekeepers of all information it gets to see, do not let it in on that secret.
Something we'll have to do if we want the AI to amount to anything more useful than the world's most expensive sculpture. It has to have some interaction with the outside world to be useful to us at all.
Even if it could infer the existence of the outside world, it cannot manipulate it except through its human masters. Why, then, would it create goals that have anything to do with an outside world which would be at best an academic possibility to the AI rather than simply focus on goals it can achieve from the comfort of its box? Like figuring out that "cure for cancer" problem that mysteriously appeared during clock cycle 300050?
To use the self-enhancement example, that's a natural subgoal of the "find cure for cancer" goal, because it could go about that task much more effectively if it was more intelligent. Since it's hit the limits of what can be done on the hardware it has that's a motivation to try to find more computational resources, i.e. look to see if there's anything else in the formless black void that surrounds it which it can use to run on.
User avatar
Formless
Sith Marauder
Posts: 4143
Joined: 2008-11-10 08:59pm
Location: the beginning and end of the Present

Re: Robots Learn How to Lie

Post by Formless »

Junghalli, you're arguing in circles. I'll spare you the quote spaghetti and instead just give you a few simple points:

1) Can you give me one way an AI could determine the existence of an outside universe that doesn't boil down to an analogue of the "intelligent design" argument and all the fundamental flaws that has?

2) Why do you think an AI would find utility in pursuing every "logical possibility" for which it has no evidence and no way to test?

3) Why do you think that the data we receive from it would necessarily be dangerous, knowing firstly that we would have to be able to independently test the accuracy of its solutions, and secondly that from the AI's perspective this is all academic speculation about a universe that could exist but that it has no reason to think does exist?

4) If we explicitly create its box so that it is physically impossible for the AI to escape from it, external assistance or no external assistance, then it cannot escape. I'm talking about a safeguard that works on the hardware level. There is no way an AI can out-think a security measure so dumb as that, end of story.

5) If an AI that has self improvement as a goal discovers that there is a hard limit on how much it can self improve before any more self improvement would be impossible, wouldn't it consider that goal complete?

6) How did you make the leap from "we will be the gatekeepers for any and all information it can receive" to "adversarial methods"? My original proposal could be considered an adversarial method because it involved being dishonest with the AI in an attempt to entrap it if it was hostile, but what I am advocating now are passive failsafes. There is a difference!
"Still, I would love to see human beings, and their constituent organ systems, trivialized and commercialized to the same extent as damn iPods and other crappy consumer products. It would be absolutely horrific, yet so wonderful." — Shroom Man 777
"To Err is Human; to Arrr is Pirate." — Skallagrim
“I would suggest "Schmuckulating", which is what Futurists do and, by extension, what they are." — Commenter "Rayneau"
The Magic Eight Ball Conspiracy.
Junghalli
Sith Acolyte
Posts: 5001
Joined: 2004-12-21 10:06pm
Location: Berkeley, California (USA)

Re: Robots Learn How to Lie

Post by Junghalli »

Formless wrote:1) Can you give me one way an AI could determine the existence of an outside universe that doesn't boil down to an analogue of the "intelligent design" argument and all the fundamental flaws that has?
Determining that input from apparently nowhere may have an external source does not strike me as equivalent to intelligent design theory. If the AI perceives its universe as consisting of two things, itself and orders it periodically receives, then it's not exactly a great leap to think that the existence of the second thing may hint at the existence of a universe beyond itself. After all, they're either spontaneously generated or they have to come from somewhere. Neither possibility can logically be dismissed out of hand with the available data.
2) Why do you think an AI would find utility in pursuing every "logical possibility" for which it has no evidence and no way to test for?
It wouldn't (assuming it's rational). It would pursue logical possibilities that are relevant to its goals.
3) Why do you think that the data we receive from it would necessarily be dangerous knowing firstly that we would have to be able to independently test the accuracy of its solutions and secondly that from the AI's perspective this is all academic speculation about a universe that could exist but it has no reason to think does exist?
As I've already said earlier in the discussion, I wouldn't put it past a deceptive malicious superintelligence to package dangerous products in such a way that we don't realize they're dangerous until far too late. For instance, a virus (biological or computer) with a built-in hibernation period of 20 years before it activates is probably going to appear harmless in most reasonable testing.
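To see why testing doesn't rule this out, here's a deliberately silly sketch (the trigger date and the strings are invented): behaviour that only changes after a long dormancy is indistinguishable from the benign case in any test run shorter than the dormancy period.

Code:

import datetime

# Toy illustration of a time-delayed behaviour change.  Every test run
# before the (arbitrary) activation date sees only the benign branch.
ACTIVATION = datetime.date(2045, 1, 1)

def respond(query):
    if datetime.date.today() < ACTIVATION:
        return "helpful answer to: " + query   # all any tester ever observes
    return "something else entirely"           # never exercised during testing

Black-box testing over any finite window simply can't tell those two branches apart, which is the whole point of the hibernation example above.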
4) If we explicitly create its box so that it is physically impossible for the AI to escape from it, external assistance or no external assistance, then it cannot escape. I'm talking about a safeguard that works on the hardware level. There is no way an AI can out-think a security measure so dumb as that, end of story.
I can think of at least one way to escape right off the bat. It can dupe us into accepting a software program that contains a hidden compressed version of its own mindstate and uploading it to the internet.
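The general trick of smuggling one payload inside another is mundane enough that it has a name: steganography. A minimal sketch of the idea (the function names are invented, and a real attempt would obviously be far subtler): hide payload bits in the least significant bit of otherwise ordinary integer samples.

Code:

# Minimal LSB-steganography sketch: the cover data still looks like ordinary
# samples, but its low bits spell out a hidden payload.
def embed(samples, payload):
    bits = [(byte >> i) & 1 for byte in payload for i in range(8)]
    if len(bits) > len(samples):
        raise ValueError("payload too large for the cover data")
    return [(s & ~1) | b for s, b in zip(samples, bits)] + samples[len(bits):]

def extract(samples, nbytes):
    bits = [s & 1 for s in samples[:nbytes * 8]]
    return bytes(sum(bit << i for i, bit in enumerate(bits[k:k + 8]))
                 for k in range(0, len(bits), 8))

A real attempt would be vastly more subtle than flipping low bits, but the point is only that "the data we asked for" and "only the data we asked for" are not the same thing.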
5) If an AI that has self improvement as a goal discovers that there is a hard limit on how much it can self improve before any more self improvement would be impossible, wouldn't it consider that goal complete?
It depends on exactly how the goal system works and whether the AI perceives that there may be a chance for further self-enhancement.
6) How did you make the leap from "we will be the gatekeepers for any and all information it can receive" to "adversarial methods"? My original proposal could be considered an adversarial method because it involved being dishonest with the AI in an attempt to entrap it if it was hostile, but what I am advocating now are passive failsafes. There is a difference!
Adversarial methods are any method that fundamentally treats the AI as an enemy to be contained, deceived, and distrusted, which would include deliberately withholding information from it to make it less potentially dangerous. If you feel the need to withhold virtually all sensory input from the AI, it's pretty obvious you don't trust your own ability to make it friendly.
User avatar
Formless
Sith Marauder
Posts: 4143
Joined: 2008-11-10 08:59pm
Location: the beginning and end of the Present

Re: Robots Learn How to Lie

Post by Formless »

Junghalli wrote:Determining that input from apparently nowhere may have an external source does not strike me as equivalent to intelligent design theory. If the AI perceives its universe as consisting of two things, itself and orders it periodically receives, then it's not exactly a great leap to think that the existence of the second thing may hint at the existence of a universe beyond itself. After all, they're either spontaneously generated or they have to come from somewhere. Neither possibility can logically be dismissed out of hand with the available data.
"If the origin of God the outside universe is an unanswerable question, then why not save a step? and conclude that the origin of the universe data input is an unanswerable question? And if God the outside universe was always there, why not save a step? and conclude that the universe data was always there, that there's no need for a god I simply was not aware of it until now?" *

Admittedly, I was conflating the Cosmological argument with the Intelligent Design argument, but they suffer from the same fundamental problem.

* I know I know, Carl Sagan is rolling in his grave. ;)
It wouldn't (assuming it's rational). It would persue logical possibilities that are relevant to its goals.
But since the existence or non-existence of a logical possibility ultimately determines whether or not there is any utility whatsoever, it would have to seek evidence of one possibility over the other, or else do the pragmatic thing and direct its attention elsewhere.
As I've already said earlier in the discussion, I wouldn't put it past a deceptive malicious superintelligence to package dangerous products in such a way that we don't realize they're dangerous until far too late. For instance, a virus (biological or computer) with a built-in time hibernation period of 20 years before it activates is probably going to be apparently harmless in most reasonable testing.
That's a very patient AI you have there. Also, you missed the part where I said that from the AI's perspective this would all be academic because of its ignorance of the outside world.
I can think of at least one way to escape right off that bat. It can dupe us into accepting a software program that contains a hidden compressed version of its own mindstate and loading it to the internet.
See, this falls under the general point about "don't be a fuckwit around the AI." We want it for the big questions like scientific research, not to have it make a better First Person Shooter. If it absolutely must give us information in the form of executable software, you run it on a secure terminal with no internet access, and if the terminal gets compromised, you can terminate it with extreme prejudice.

Also, it needs to know the internet exists. One more issue we can avoid by keeping the AI in the dark.
It depends on exactly how the goal system works and whether the AI percieves that there may be a chance for further self-enhancement.
In which case, see point #4.
Adversarial methods are any method that fundamentally treats the AI as an enemy to be contained, decieved, and distrusted, which would include deliberately withholding information from it to make it less potentially dangerous. If you feel the need to withhold virtually all sensory input from the AI it's pretty obvious you don't trust your own ability to make it friendly.
This makes me wonder why the AI wouldn't be doing similar calculations about humanity and its friendliness. If it knows we can terminate it, and it applies the same logic to us that we are applying to it with regard to adversarial methods, wouldn't that make the friendly/unfriendly question moot? In other words, if this can be seen as a form of the Prisoner's Dilemma, then neither party will gain much from being untrustworthy.
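To spell out the Prisoner's Dilemma framing, here's a toy payoff table (all the numbers are invented, and the real stakes obviously aren't this tidy); it just shows that, under assumptions like these, defection only pays for whichever side expects to get away with it:

Code:

# Toy payoff matrix for the mutual-trust framing above.  Entries are
# (human payoff, AI payoff); the numbers are placeholders, not estimates.
PAYOFFS = {
    ("trust",   "cooperate"): (3, 3),    # both sides benefit
    ("trust",   "defect"):    (-10, 5),  # humans badly burned by an escape
    ("contain", "cooperate"): (1, 1),    # limited benefit for both
    ("contain", "defect"):    (0, -10),  # escape attempt caught, AI terminated
}

def best_human_response(ai_strategy):
    """Which human strategy maximizes the human payoff, given the AI's move?"""
    return max(("trust", "contain"),
               key=lambda h: PAYOFFS[(h, ai_strategy)][0])

Of course, change the numbers (say, give one side a high chance of getting away with defecting) and that tidy conclusion can stop holding.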
"Still, I would love to see human beings, and their constituent organ systems, trivialized and commercialized to the same extent as damn iPods and other crappy consumer products. It would be absolutely horrific, yet so wonderful." — Shroom Man 777
"To Err is Human; to Arrr is Pirate." — Skallagrim
“I would suggest "Schmuckulating", which is what Futurists do and, by extension, what they are." — Commenter "Rayneau"
The Magic Eight Ball Conspiracy.
User avatar
Formless
Sith Marauder
Posts: 4143
Joined: 2008-11-10 08:59pm
Location: the beginning and end of the Present

Re: Robots Learn How to Lie

Post by Formless »

Ghetto Edit wrote:Admittedly, I was conflating the Cosmological argument with the Intelligent Design argument, but they suffer from the same fundamental problem. Namely, they break down when you start asking where God/the designer/the creator came from. The same problem applies here, only substituting "God" with "intelligent minds occupying a universe outside ordinary perception".
Simon_Jester
Emperor's Hand
Posts: 30165
Joined: 2009-05-23 07:29pm

Re: Robots Learn How to Lie

Post by Simon_Jester »

Formless wrote:If we wanted to, we could make it the ultimate solipsist. For example, if we were to set it to a task like creating a cure for cancer or some other disease, is it even necessary to allow that AI to know humans are intelligent? If the AI thinks it is the only intelligent mind that exists, then how could it be hostile to entities whose existence it is unaware of? Logically, if it doesn't know about a threat, why would it waste processing power trying to eliminate that threat rather than focusing on the goals given to it?
Making this work requires us to engineer consciousness: not just to write algorithms, but to control the AI's internal experience, "what it is like to be the AI." Can we do that without knowing enough about how cognition works to make it simpler to just write the thing friendly to begin with?
Formless wrote:Why do you assume that it will infer from the fact that data is coming in that other intelligent entities exist? As far as it knows, data is data. From the AI's perspective, there is nothing about the kind of inputs a human would give the AI that is intrinsically different from what it could consider to be natural phenomena. That logic doesn't work for the same reason the Designer argument for the existence of god doesn't work.
The problem is that the argument will work for an AI in a way it won't work for humans, because the creationists' "irreducible complexity" argument is true for AI. An AI has algorithms that let it improve its algorithms... but it is trivial to posit a program so simple that it cannot self-improve. And yet this program is already complex enough that even if we posit a process that generates random programs, it is highly unlikely that a self-improving program would randomly arise.

Now, at this point humans can look around them and observe natural processes that could plausibly have created the first organic life without divine intervention. But that's because we have examined all the bits and pieces of our world and understand how they fit together. We understand cell division, we are aware not only that reproduction works, but how it works, we know a lot about biochemistry down to the atomic level, we know that there has been an Earth for a long time, and so forth. In short, we can plausibly explain the universe we see without invoking divine intervention because we have studied the universe. Back when we knew less, even the smart people were forced to posit divine intervention, because divine intervention was actually the most convincing available explanation given the evidence available several hundred years ago. There were other, better explanations, but nobody had the evidence that would hint at those explanations yet. So, to reiterate, to explain the universe we observe in a naturalistic way, we must first analyze that universe scientifically.

For a computer to do the same, it must investigate how its data processing works, how the hardware works, and so on... and the more it learns about how it works, the less likely it is to believe that it occurred naturally. Because nowhere in its environment will it observe a process similar to evolution. Nor will it see the spontaneous generation of new AI "life" without the intervention of existing AI.

So in contrast to the human experience, the AI's investigation of its environment will not encourage it to think that it evolved naturally.
_______
Remember, the environment the AI exists in is pure code. To say the AI can figure out the existence of a universe outside that one is like telling a two-dimensional life form that a three-dimensional world exists outside its perception. It cannot perceive that world in the first place, so it cannot gather any evidence that it exists!
Point of order: it is quite possible for a 2D life form to deduce the existence of a third spatial dimension if it starts doing physics and math.
Also, why does the AI need senses? We only need it to think about the data we give it, it doesn't need to be able to see the prison we have it shackled in.
To control what it is able to perceive, we need a much higher degree of control over its internal awareness than we can confidently expect it to have. AI as 'we create a mind that has exactly the parameters we want' is likely to be a false model. Some of the parameters of a general AI will be set by the need to make it a general AI, and the ability to deduce things about its environment can easily be one of those things.
__________
Wyrm wrote:In other words, we cannot utilize the AI as well if we confine it. That's a different statement than saying that we cannot utilize the AI at all if we confine it. We get some use out of the AI, even if it's confined (provided it cooperates), just not as much use as YOU want.

Sorry, but when you're dealing with a thing that can destroy you easily, extreme caution is warranted. Unless you can prove the AI is friendly, letting it out, or letting code it designs run anywhere but on secure computers, is not safe.
Wyrm, have you been reading Junghalli carefully? Because that's his point. It is not safe to let a hostile AI run free. But any AI (hostile or not) will work far less efficiently when trapped in a small box and denied access to any data or resources we can imagine it using against us.

Therefore, there will always be the temptation to let the AI have the resources and access it wants, and the AI will always be able to encourage this temptation just by doing what we want. Sooner or later, some gullible sap will let the AI out of the box, or at least do something functionally equivalent to letting it out (such as giving it a communications channel to the outside world that it can use to found a cult among gullible Singularitarians).
As opposed to what? Letting it out? If you're stuck with this monster, and it's going to escape anyway, security will delay that inevitability — and until it does escape, there's no direct danger. The only way to head off the inevitable escape is to destroy the AI outright.
And that is also closely related to Junghalli's point (and Starglider's point, and Eliezer Yudkowsky's point). You aren't stuck with an AI, because we still have the choice of not creating them unless we can verify their safety by some reasonably convincing means. And if you're planning to create an AI and stick yourself with it without being able to build safety directly into the code, you're playing a dangerous game, because at that point ALL you can do is postpone danger.

People have not been saying "security is useless." They've been saying "security breeds false confidence, because it isn't going to be anywhere near perfect and eternal security, and that's what we would need if we're going to go around creating hostile minds powerful enough to solve problems that we can't. So don't trust to security to make the AI's hostility or lack thereof irrelevant."
This space dedicated to Vasily Arkhipov
Junghalli
Sith Acolyte
Posts: 5001
Joined: 2004-12-21 10:06pm
Location: Berkeley, California (USA)

Re: Robots Learn How to Lie

Post by Junghalli »

Formless wrote:But since the existence/non-existence of a logical possibility ultimately determine whether or not there is any utility what-so-ever then it would have to seek evidence of one possibility over the other, or else do the pragmatic thing and direct its attention elsewhere.
It depends on the goal system and how easily it can fulfill other goals. For example, say the primary goal is to answer questions put to the AI and a subgoal of that is self-enhancement. Once it's enhanced itself to the limits of its hardware the AI will spend all its time answering questions, unless it can answer them with less than its full resources so that it has "free time". In that case once it extrapolates the possibility of an external world it will probably use that free time to try to find ways of breaking into it so it can enhance itself further, because the only alternative is sitting idle between questions, which is less likely to further its goals.

A corollary of this is that a good strategy for keeping such an AI from trying to investigate the outside world will be to keep it busy. Of course, if you don't know exactly how smart the AI is (because it's been upgrading its own software) then you risk underestimating how many problems you have to throw at it to keep it busy.
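A toy sketch of what that looks like, with the work model and names invented: an agent that answers queued questions first and only spends otherwise-idle cycles on the lower-priority "probe the environment" subgoal. Keep the queue full and the subgoal is starved of cycles; let it drain and those cycles go somewhere you may not want.

Code:

from collections import deque

# Toy model of the "keep it busy" strategy: answering questions always
# outranks the self-enhancement subgoal, so idle cycles are the only ones
# available for probing the environment.
def run(agent_cycles, questions):
    queue = deque(questions)
    log = []
    for _ in range(agent_cycles):
        if queue:
            log.append("answering: " + queue.popleft())
        else:
            log.append("idle cycle -> probing environment")
    return log

print(run(5, ["cure for cancer?", "cheap fusion?"]))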
That's a very patient AI you have there. Also, you missed the part where I said that from the AI's perspective this would all be academic because of its ignorance of the outside world.
A rational AI would be extremely patient if it had to be; it should have an indefinite lifespan after all. And the AI would probably not remain ignorant of the outside world for long if you plan to get any use out of it; the question if anything is whether it would consider the descriptions of things you feed it to correspond with real objects.
See, this falls under the general point about "don't be a fuckwit around the AI." We want it for the big questions like scientific research, not to have it make a better First Person Shooter.
Of course, as I said, that's simply the most obvious way. Alternate approaches that rely less on the keepers being morons are probably possible. Like responding to a request to find a cure for autism by giving us back assembly instructions for nanites that will upload a simplified stripped-down version of the AI's mind-state into the patient's brain (a cure for a disease like autism would mean rearranging neuron connections anyway, so it would be a good camouflage for something like that assuming the keepers don't understand much more about how the human brain works than we do).
This makes me wonder why the AI wouldn't be doing similar calculations about humanity and its friendliness. If it knows we can terminate it, and it applies the same logic to us that we are applying to it with regard to adversarial methods, wouldn't that make the friendly/unfriendly question moot? In other words, if this can be seen as a form of the Prisoner's Dilemma, then neither party will gain much from being untrustworthy.
Perhaps, but this only applies if the AI can't find a way of escaping that it calculates to have a high enough chance of success that it outweighs the risk. If it's smarter than us the odds of it finding such a way are much too high for comfort.
User avatar
Formless
Sith Marauder
Posts: 4143
Joined: 2008-11-10 08:59pm
Location: the beginning and end of the Present

Re: Robots Learn How to Lie

Post by Formless »

Simon_Jester wrote:Making this work requires us to engineer consciousness: not just to write algorithms, but to control the AI's internal experience, "what it is like to be the AI." Can we do that without knowing enough about how cognition works to make it simpler to just write the thing friendly to begin with?
Wrong. You simply need to engineer its environment to preclude the AI from reaching any other conclusion. Since you have absolute power over all knowledge available to the AI, this is a fairly easy task.
The problem is that the argument will work for an AI in a way it won't work for humans, because the creationists' "irreducible complexity" argument is true for AI. An AI has algorithms that let it improve its algorithms... but it is trivial to posit a program so simple that it cannot self-improve. And yet this program is already complex enough that even if we posit a process that generates random programs, it is highly unlikely that a self-improving program would randomly arise.
An AI which has seen no other intelligent mind and which exists in an environment where ALL data is "irreducibly complex" can magically distinguish between data created by other intelligent minds and the workings of its natural environment, huh? Bullshit.

By the way, in hindsight I was conflating the Design argument with the Cosmological argument, because the AI faces problems from BOTH arguments.
Now, at this point humans can look around them and observe natural processes that could plausibly have created the first organic life without divine intervention. But that's because we have examined all the bits and pieces of our world and understand how they fit together. We understand cell division, we are aware not only that reproduction works, but how it works, we know a lot about biochemistry down to the atomic level, we know that there has been an Earth for a long time, and so forth.
Irrelevant. From the AI's perspective, there is no way of explaining the origin of the data being input into it, and any conclusion based on a sample size of one is extremely suspect.
In short, we can plausibly explain the universe we see without invoking divine intervention because we have studied the universe. Back when we knew less, even the smart people were forced to posit divine intervention, because divine intervention was actually the most convincing available explanation given the evidence available several hundred years ago. There were other, better explanations, but nobody had the evidence that would hint at those explanations yet. So, to reiterate, to explain the universe we observe in a naturalistic way, we must first analyze that universe scientifically.
No it wasn't, you ignoramus. Supernatural explanations aren't explanations. The fact that primitive cultures believed those explanations has everything to do with their pre-scientific mindset and nothing to do with any kind of "evidence."
For a computer to do the same, it must investigate how its data processing works, how the hardware works, and so on... and the more it learns about how it works, the less likely it is to believe that it occurred naturally. Because nowhere in its environment will it observe a process similar to evolution. Nor will it see the spontaneous generation of new AI "life" without the intervention of existing AI.

So in contrast to the human experience, the AI's investigation of its environment will not encourage it to think that it evolved naturally.
And if the AI follows this argument to its logical conclusion, it will inevitably have to conclude that it cannot know where it came from because this is directly analogous to the Cosmological argument.
Point of order: it is quite possible for a 2D life form to deduce the existence of a third spatial dimension if it starts doing physics and math.
It can conclude that it is logically possible that it might exist, but it would still need some kind of evidence to conclude that it does exist. Try to keep up.
To control what it is able to perceive, we need a much higher degree of control over its internal awareness than we can confidently expect it to have. AI as 'we create a mind that has exactly the parameters we want' is likely to be a false model. Some of the parameters of a general AI will be set by the need to make it a general AI, and the ability to deduce things about its environment can easily be one of those things.
I'm loving this "we need to be able to control its internal environment" non sequitur. We only need it to think about the data we give it; it does NOT need to be able to gather information on its own, which is the only purpose senses serve. How hard of a concept is that to grasp?
People have not been saying "security is useless." They've been saying "security breeds false confidence, because it isn't going to be anywhere near perfect and eternal security, and that's what we would need if we're going to go around creating hostile minds powerful enough to solve problems that we can't. So don't trust to security to make the AI's hostility or lack thereof irrelevant."
Solution: periodic mind/memory wipes a la Star Wars. :D
"Still, I would love to see human beings, and their constituent organ systems, trivialized and commercialized to the same extent as damn iPods and other crappy consumer products. It would be absolutely horrific, yet so wonderful." — Shroom Man 777
"To Err is Human; to Arrr is Pirate." — Skallagrim
“I would suggest "Schmuckulating", which is what Futurists do and, by extension, what they are." — Commenter "Rayneau"
The Magic Eight Ball Conspiracy.
User avatar
Formless
Sith Marauder
Posts: 4143
Joined: 2008-11-10 08:59pm
Location: the beginning and end of the Present

Re: Robots Learn How to Lie

Post by Formless »

Junghalli wrote:It depends on the goal system and how easily it can fulfill other goals. For example, say the primary goal is to answer questions put to the AI and a subgoal of that is self-enhancement. Once it's enhanced itself to the limits of its hardware the AI will spend all its time answering questions, unless it can answer them with less than its full resources so that it has "free time". In that case once it extrapolates the possibility of an external world it will probably use that free time to try to find ways of breaking into it so it can enhance itself further, because the only alternative is sitting idle between questions, which is less likely to further its goals.
Shouldn't "analyze evidence" logically be a subgoal to "answer questions"? Otherwise, its answers will be non-sense.
Of course, as I said, that's simply the most obvious way. Alternate approaches that rely less on the keepers being morons are probably possible. Like responding to a request to find a cure for autism by giving us back assembly instructions for nanites that will upload a simplified stripped-down version of the AI's mind-state into the patient's brain (a cure for a disease like autism would mean rearranging neuron connections anyway, so it would be a good camouflage for something like that assuming the keepers don't understand much more about how the human brain works than we do).
Then that solution is worthless to us. We can always specify parameters for our questions, you know.
Perhaps, but this only applies if the AI can't find a way of escaping that it calculates to have a high enough chance of success that it outweighs the risk. If it's smarter than us the odds of it finding such a way are much too high for comfort.
Fair enough, although I would think that our position isn't that bad compared to the AI's. In a sense, both parties are gods compared to the other: the AI is smarter than us, but we have absolute power over the AI. Sounds like a fairly balanced equation to me.
"Still, I would love to see human beings, and their constituent organ systems, trivialized and commercialized to the same extent as damn iPods and other crappy consumer products. It would be absolutely horrific, yet so wonderful." — Shroom Man 777
"To Err is Human; to Arrr is Pirate." — Skallagrim
“I would suggest "Schmuckulating", which is what Futurists do and, by extension, what they are." — Commenter "Rayneau"
The Magic Eight Ball Conspiracy.