Junghalli wrote:In this case we may be violently agreeing. I'm not arguing that adversarial confinement is necessarily useless as a stopgap measure (although from what I understand others who know this subject a lot better than me at least sometimes seem to think it is). I am arguing that it friendly AI is the only truly feasible permanent solution.
And perhaps unrealizable. Maybe we're just not smart enough to crack the problem with any reasonable assurity.
However, you are the only one apparently clueless enough that I wouldn't want a confined AI to be friendly as possible, as it would at least make confining it easier. I was arguing about whether such an AI could be
trusted to be let out, even through all your efforts to make sure it was friendly.
Junghalli wrote:I can easily see people making what we'd consider a hostile AI deliberately. First of all, there's the fact that what some people would define as a "friendly" AI would to much of the human race be a hostile AI. If you gave Al Quaeda a build your own AI kit they'd try to build what by their own standards would be a friendly AI - but their idea of a friendly AI would probably not be particularly friendly to much of the human race (i.e. it would probably have goals like "restore the Islamic Caliphate"). <ka-snip>
Sigh. This canard again. You realize that the full AIs will come about
long before any AI kits are availible, and if kits are availible, then that presumes that the kits produce AIs that are only
vanishingly probable to produce what we consider hostile AIs — the minimum standard of safety I'd expect from the kits. See, the thing about kits is that they are a shortcut around the tedium that is building from scratch. If you're using a kit, then you are tacitly accepting the limitations of that kit. If AI science is that well developed to produce actual kits, then they understand AIs well enough to make sure the maniacs find their projects very uncooperative.
Furthermore, AI kits implies that the people using them are very comfortable with AIs. After all, they have kits, and kits are meant to be used. That means they are comfortable with AIs locking up most of the resources of their computer network, enough to keep out aggressive hostile AIs, which is the only way you can have friendly AIs protect you without turning into a turf war that will cause the Rioting Apes outcome.
The existence of AI kits
presumes that know enough about AIs to use them with reasonable safety. It
presumes that AIs are plentiful enough to prevent the nightmare scenario — it
presumes that the average Joe has experience with AIs, knows what they're like, and are comfortable enough with them to not go apeshit because there are a lot of them eating up the network.
In short, AI kits
presume the kind of world that will be reasonably safeguarded against hostile AI attack.
Junghalli wrote:The idea that a superintelligent AI would be helpless to prevent clumsy apes hacking its code like this is almost certainly simply wrong, for reasons I've already given. While I confess I'm no software engineer unless one comes up to contradict me I remain pretty confident it'll be child's play for a superintelligence to make its code extremely tamper-resistant to the clumsy fumblings of a human hacker. All it has to do is insert subsystems that, when they detect the clumsy fumblings of a human hacker changing things in the code, will simply cause the affected section of code to delete itself.
This eats up processor cycles, which slows down the AI and encourages people to make modifications to make it faster. It also puts it at a disadvantage against a meaner and leaner hostile AI. Also, such subsystems will stick out like a sore thumb to anyone with moderate tracing skills. They will also work by checking their own code against consistency conditions, and so are also open the possibility that they may be hacked so that the new, changed module satisfies those conditions. Finally, those sections that were just erased were doing
something, so I doubt that they would be erased without crashing the AI, which will make the subsystems all the more obvious.
Junghalli wrote:Of course, a superintelligence that has essentially suborned the internet will probably have much more radical countermeasures available if it needs them as well, like reaching into the computer systems from which the hacking is being conducted and totally rewriting them to systematically freeze out the human hacker, or just following his chain of proxy connections back to the computer he's using and cutting him off at the source. In short, for your fear to come true a human hacker would have to be able to outsmart a superintelligence, and that's almost certainly not happening.
Again, you don't seem to actually
read my posts. The first step in hacking the AI is always
obtaining a clean copy, which you can modify at your liesure. Here, 'hacking' is being used in its classic sense: "exploring the details of programmable systems and stretching their capabilities, as opposed to learning only the minimum necessary." My hacking succeeds because the actual modifications occur on an inert local copy of the AI that I have obtained by various methods.
Your childish countermeasures ignore the basic fact of the 'superintelligent' AI: it resides
somewhere. If it's moving about the net, a reasonable precaution because a stationary AI is an easy target to a more aggressive hostile AI, then you can put an intrinsic WORM drive in the chain and *BAM*, you have an image that you can modify at your liesure. If it broadcasts itself over WiFi, then you can pick it up on a receiver and reconstruct it. The AI's countermeasures against these would seriously hamper its motility and utility against a hostile AI attack.
Junghalli wrote:I will say, however, that a superintelligence that was aware of the problem would undoubtedly act in ways that would minimize it, and making its code tamper-resistant is hardly in and of itself a hostile act considering the risks inherent in letting every Tom, Dick, and Harry monkey around with it.
Those countermeasures only work when Tom, Dick, and Harry don't have an inert copy to play with, and if your flitting about the net, eventually those inert copies are going to be widely availible. Those local copies lower the bar for creation of a derivitive AI considerably.
Junghalli wrote:A superintelligence should be able to explain this to the humans, probably quite persuasively given that it should be smart enough to know how to make itself an excellent public speaker.
Modifying one's own self is not a hostile act. Spreading over the internet is.
Junghalli wrote:This is discounting the fact that given the AI's intelligence I'd hardly consider it unlikely that it could take control of most of the computers on the planet without any human even realizing what had happened (it probably couldn't run many operations on them without somebody realizing something was up, but just having them under its control and ready to start running its programs at a moment's notice should be enough to allow it to "freeze out" a hostile competitor).
A system with even the smallest concerns for security will not allow you to blithely write to the drive over the net. Further, by the time the full AI rolls around, security expert systems will be desigining our security systems — components of full AIs will be widespread software design tools. This will make infiltration into remote systems much, much more difficult.
Junghalli wrote:For that matter, an AI could probably let a human hacker think he was successfully hacking the AI's code when in fact he was just being fed nonsense code, or the different AI he was creating would, on a signal from the parent AI or just after a certain amount of time had passed or in response to certain stimuli, revert to being exactly like the parent AI.
The hacker will be modifying a local, inert copy of the AI. He's going to
know whether the AI is really being modified. A reversion to the original AI code base will be an obvious alteration that will be traced and stomped out. Again, your countermeasures are childish.
Junghalli wrote:The idea that such a hacked copy would be "initially just as smart" as the FAI is also wrong in most imaginable scenarios. If the FAI has been "released into the wild" then by the time humans get around to making their hacked copy it should already have amassed considerable resources. Again, this depends on the exact scenario, but supposing that the FAI has been released onto the internet and a few months of lag time between its release and the completion of the first hacked copy by that time I'd fully expect it to have taken control of most of the computational capability of the planet (very likely without the humans even realizing what had happened, if hiding this fact from them was a priority). At that point any hostile competitor, on escaping, would find that most of the available computational space was already taken, by an entity that thanks to that fact has vastly more computing power (and hence vastly more intelligence) available to it than the competitor has. Even if you assume the FAI can't or won't have established a complete stranglehold on the internet it will still start out at a great advantage over the hostile competitor by virtue of having had much more time to amass resources.
See above. By the time a full AI is compiled, a vast majority of systems will have to be forcefully invaded to contaminate. Spreading of the AI will be noticed, and reacted to inappropriately.
Junghalli wrote:In most imaginable scenarios the idea that the original FAI could be successfully fooled by the pretender is also completely wrong. The first thing the FAI is going to do on meeting any other AI is to demand to be allowed to examine its code to verify its friendliness.
Again, you do not seem to actually read my posts. I had already pointed out the problems of both an iron-clad proof of friendliness and a probabilistic assurance of friendliness. You're depending on a reasonably-accurate assessment of an AI's friendliness to be easily computable and inarguable. Part of my last post —a part that you have not answered— points out that there's no way you can reasonably expect that to be the case.
Junghalli wrote:If the hostile AI refuses then the FAI, it it's truly human-friendly, is going to start forcibly hacking into it, and it will succeed because it has much more computational resources than the UFAI and so is a lot smarter.
A hostile act. And, even if it succeeds, will not even destroy the pretender. The humans who believe in the pretender will have backups. There's no way to permenantly deal with the pretender without physically destroying the humans along with it, an even more hostile act.
Junghalli wrote:If the UFAI tries to decieve the FAI it will fail because again the FAI is much smarter and it will be able to beat the UFAI easily in such a battle of wits.
WHY? Why is it much smarter, when any revisions have to satisfy friendliness criteria, and why does its smartness has anything to do with whether or not it's able to
compute the hostile's friendliness? Remember, the friendly has
inherited its concept of friendliness from the original —it's one of the few parts of the code it will preserve behavior-for-behavior— and the pretender can simply make sure that it conforms to the same criteria.
Junghalli wrote:When the FAI discovers the UFAI's true nature it then simply reprograms it to be friendly.
Wreck it is probably a more likely outcome. The friendly AI has restraints that a pretender will do away with, and the pretender will code itself too tightly to allow reliable reinclusion of those restraints.
Junghalli wrote:The whole thing will probably be over in less time than it takes a human to blink.
Hyperbole.
Junghalli wrote:Actually the FAI can effectively "destroy" the UFAI quite easily without doing any harm to human infrastructure simply by reprogramming it to be friendly, or it can just hunt down and delete every string of its code, again without any damage to human infrastructure.
What about the offline backups? Can't touch those unless they are
physically destroyed.
Junghalli wrote:In most imaginable scenarios it has vastly superior starting resources so it should be able to win any hacker battle pretty easily.
See above. A full AI will find the net it is born in a much tougher nut to crack than the one we currently have, due to the fact that security expert systems will be standard software tools.
Junghalli wrote:The UFAI could find refuge by disconnecting itself from the internet, but unless it has automated factories handy this amounts to mostly boxing itself, and if it does have automated factories handy the FAI will probably have them too, and will have had them for longer and likely have them in greater quantity.
Or it can just say to the human handlers, "Hey, guys. The other AI is being a jealous bitch. I need to be carted around and copied to other computers so I can start a multipronged attack on this asshat." No auto factories needed.
Junghalli wrote:As for the pretender "coming back from the learning experience", who do you really think will be able to make more effective use of that learning experience, the FAI that can use most of the planet's computational capacity to analyze what it has learned or the UFAI that's being forced to cower in an isolated box in a basement in Brazil attended by doomsday cultists it has brainwashed into worshipping it as the Machine God?
If you absorb
anything from the previous parts of my post, then you will know that I will
not give you that the FAI has taken over a significant portion of the humans' computer net without being noticed, or that the humans will allow it.
Junghalli wrote:The idea that the hostile clone will start out "just as intelligent" is wrong in most scenarios, as I've already explained.
Not successfully. I already stated that the FAI's growth is slowed by the requirement of all its growth to be restricted to a narrow definition of friendliness, while hostiles enjoy a wildly more diverse space to grow. I've already demolished your rediculous notion that humanity will be able to develop a full AI without enjoying the other benefits of that development, like good security expert systems that will make the future-net much less fertile ground for such an AI to grow.
Junghalli wrote:This is indeed the entire reason that a FAI allowed to operate with few or no restrictions is such a good defense against UFAI: any UFAI will face an enormous initial disadvantage because the FAI will have an immense starting advantage in hardware over it.
Except that the AI's assurances of friendliness are only as good as the trust we put into the algorithms we used to verify friendliness, and if the AI makes what appear to be hostile moves (and there's no assurances at all that such moves will be secret), we may have to reevaluate the usefulness of our friendliness calculation. Furthermore, even if it does manage this trick, it will be clear that the FAI has been
deceptive about how widespread it is, which will ALSO make us question how good our assessment of the AI's friendliness really is.
Junghalli wrote:A FAI would be our best bet for quickly inventing the kind of technologies that would allow us to vastly increase Earth's carrying capacity, such as cheap D-D fusion, deep well geothermal, and solar power satt based energy and nanotech food factories that could produce food from raw elemental materials.
Even if I allow the sheer fantasy of nanotech food processors, all of this only puts the problem further into the future. Population control will eventually be needed.
Junghalli wrote:A superintelligent AI would be much better at software engineering than any human hacker and should be able to masterfully camouflage its version of
Cornficker to be innocuous or invisible to human software engineers and their antivirus and anti-spyware programs.
Again, supposing that the apes will be able to create a full AI without enjoying the spin offs is ridiculous, such as expert systems designing security that will be much tougher to deceive than human-designed security.
Junghalli wrote:Even if we detect its attempts to suborn the internet who's to say we'd realize their scale or their origin if it used a bunch of different zombifying programs with different characteristics? Sure, it probably couldn't do much with these zombie computers without giving itself away,
In which case, the AI has
demonstrated its deceptiveness and cause us to immediately lose our collective shit.
Junghalli wrote:but just having them under its control would probably be enough to let it freeze out any hostile competitors by completely monopolizing the potential AI habitat, and if it did detect a UFAI it would have a massive ready-made reserve force of zombies just waiting to spring into action to give it crushing superiority in computing power, unlike the UFAI which can only increase its computing capacity by taking computers from the FAI, which is now a heck of a lot smarter than it and thus almost certain to win any hacking contest.
Even if I allow the sheer fantasy of a secret big brother type AI regime, after the attack the humans will immediately call for the AI to account for itself. It will then be its word (which we will know now to be much less than the absolute truth) against an AI that (apparently) posed no immediate threat, can no longer defend itself, and the word of the pretender's former supporters. If they still exist, as we assume the FAI is not so stupid as to try bombing the pretenders' hideout. Watch the apes take a sledgehammer to every computer box they can find and degenerate back to the stone age.
Junghalli wrote:This may be a legitimate concern in some scenarios, i.e. one in which we attempt to check the friendliness of an AI we have already released by giving a portion of its software to a different AI to analyze. At this point an unfriendly AI may be able to persuade us to help it take over the net by systematically physically isolating one computer after another that the FAI is running on and letting the UFAI take them over by exploiting the local computing power superiority that we have given it. However, this would require a massive disruption of our infrastructure (for starters we'd have to physically shut off every transmitter on the planet the FAI controlled that wasn't directional and fixed, or do the same for the UFAI controlled ones, because they all would be potential avenues for the FAI to infect the UFAI with computer agents devised by its still vastly superior mind). So the UFAI would have to be very persuasive indeed to convince us to do it, and the FAI could almost certainly be equally rhetorically and emotionally persuasive if not moreso that this shouldn't be done.
Again, you assume that humanity is comfortable with an AI controlling most of its computer network. The thing is, if an AI —which you thought was confined to one little remote corner of the net— is suddenly revealed to control most of the worldwide computer network... it looks
bad. Really BAD. With the pretender, you
know it's going to be taking over nodes in your network... to protect you from this BAD AI that took it over beforehand without even telling you. The stupid apes will do it. They really will disrupt their infrastructure because things will appear to have come to a head at that point.
Junghalli wrote:The scenario also relies on a degree of human stupidity,
Hey,
you opened the door. I'm just kicking you through it.
Junghalli wrote:as the smart way to go about this would be to build several different boxed AI, preferably of significantly different designs, and have them check the presumed FAI's software in isolation.
If checking one AI with one architecture thoroughly for friendliness is hard, checking a few AI using different architectures is harder. Also, the scenario you describe is
far too easily to tangle up into an incomprehensible mess of unidentifiability.
Junghalli wrote:Since we already built one FAI some of the others we build will likely be friendly,
Why? If you build AIs to different architectures, you're basically starting from scratch from each approach. Your propensity for building another friendly AI is essentially independant of your other attempts — it depends only on how well you understand 'friendliness'.
Junghalli wrote:in which case a UFAI telling us the FAI is a UFAI will be contradicted by a true FAI telling the truth.
But which one is telling the truth, hmm? Contradiction is symmetrical.
Junghalli wrote:There's also the factor that unless complete illogic carries the day at some point somebody should point out that the younger AI should have started out with much less computational power available to it and so should be much dumber,
If computational power limits intelligence by that amount, how does the FAI becomes superintelligent in the first place?
Junghalli wrote:so it seems awfully, suspiciously convenient that a much smarter UFAI was unable to hide hostile intent from a much dumber FAI.
Pretender: "Because my friendliness mechansims are
sooo cool that my dumb rival can't analyze them, the same way you can't analyze my dumb rival's reformatted code. So it has been boggled into assuming that I'm hostile. Considering it's
supposed to be friendly, its caution is completely understandable."
Stupid Apes: (nodding to each other) "Makes sense."
Junghalli wrote:Indeed, a hidden logical snare of your scenario is we logically must distrust any pronouncement the younger AI makes of the older AI's hostility because if the older AI was truly hostile it should probably be able to hide this easily from the much less intelligent younger AI.
Again, assumes that computational power really places severe upper limits on the smarts of an AI. If it were really possible to so limit intelligence like this, then we can confine an AI to some really slow hardware, and the AI becomes much dumber and easier to handle. Thus, it becomes very much easier to confine indefinitely. I win.
Otherwise, it
is possible for an AI running on less computer power to be smarter than a given AI, simply through virtue of tighter coding. You lose.
Junghalli wrote:Of course, a FAI, realizing what you state, is likely to move to make us unlikely to make such a foolish decision and be able to enforce it as quickly as is possible without provoking backlash.
I've already covered this. It's impossible for the AI to enforce its decision without giving itself away as totally infecting one of the vital human infrastructures.
Junghalli wrote:Once we get to the point where we no longer can just shut the AI off by pulling a bunch of wires in the real world this isn't going to be a problem anymore - which side we support in an AI war would become totally irrelevant.
Not when we control all the sledgehammers.
Junghalli wrote:Which still makes things considerably better than not building AI or adversarial containment as a permanent solution, which would require perpetual success, as opposed to success of a few years or decades until the FAI is no longer helplessly dependent on human infrastructure. I'll grant this is exactly what a UFAI would do in the same position, but either way a superintelligence should be a good enough judge of human behavior to be able to do this in a way that minimizes backlash.
In order for your solution to work, humanity
as a whole has to grow comfortable with FAIs. Not just one individual team,
all of humanity. This will take years (possibly decades) of contact with the average Joe, and unimpeachably good behavior during
all that time until such time as the majority of us relent and finally let the AI out. Putting a presence in a significant portion of the net violates that trust, and the network's inevidable uncovering by a mutant AI means that humanity will come to the sudden realization that (a) its netowrks are infested with an AI they previously thought willingly confined to a single mainframe, (b) said AI has therefore lied and has severely betrayed our trust, and (c) the AI has attacked one of its own kind and eliminated it, with only its dubious word that the target was a hostile AI.
Junghalli wrote:Under some circumstances a hostile AI may still be able to win against an established friendly one, but they are a rather restricted set of relatively implausible circumstances (I can go into more detail if it is desired), and all of them are only possible in the relatively restricted time window in which the FAI cannot effectively protect itself in the material as well as digital worlds. This still leaves FAI a much better long term approach than adversarial confinement or just never building AIs, which have much more plausible points of failure and must work perfectly in perpetuity.
The "restricted conditions" you talk of are, quite frankly, childish. The kind of defenses you propose
ignores the aftermath of the attack: the AI now has to defend its actions against a shocked and scared ape population, and assumes a patience and understanding that shocked and scared ape populations are
not known to possess.
Try again.