Bringing new meaning to the "blue screen of death."
=========
http://www.gcn.com/gcn/1998/July13/cov2.htm
GOVERNMENT NEWS
GCN July 13, 1998
Software glitches leave Navy Smart Ship dead in the water
By Gregory Slabodkin GCN Staff
The Navy's Smart Ship technology may not be as smart as the service
contends.
Although PCs have reduced workloads for sailors aboard the Aegis missile
cruiser USS Yorktown, software glitches resulted in system failures and
crippled ship operations, according to Navy officials.
Navy brass have called the Yorktown Smart Ship pilot a success in
reducing manpower, maintenance and costs. The Navy began running
shipboard applications under Microsoft Windows NT so that fewer sailors
would be needed to control key ship functions.
But the Navy last fall learned a difficult lesson about automation: The
very information technology on which the ships depend also makes them
vulnerable. The Yorktown last September suffered a systems failure when
bad data was fed into its computers during maneuvers off the coast of
Cape Charles, Va.
The ship had to be towed into the Naval base at Norfolk, Va., because a
database overflow caused its propulsion system to fail, according to
Anthony DiGiorgio, a civilian engineer with the Atlantic Fleet Technical
Support Center in Norfolk.
"We are putting equipment in the engine room that we cannot maintain and,
when it fails, results in a critical failure," DiGiorgio said. It took two
days of pierside maintenance to fix the problem.
The Yorktown has been towed into port after other systems failures, he
said.
Not officially
Atlantic Fleet officials acknowledged that the Yorktown last September
experienced what they termed "an engineering local area network casualty,"
but denied that the ship's systems failure lasted as long as DiGiorgio
said. The Yorktown was dead in the water for about two hours and 45
minutes, fleet officials said, and did not have to be towed in.
"This is the only time this casualty has occurred and the only propulsion
casualty involved with the control system since May 2, 1997, when
software configuration was frozen," Vice Adm. Henry Giffin, commander of
the Atlantic Fleet's Naval Surface Force, reported in an Oct. 24, 1997,
memorandum.
Giffin wrote the memo to describe what really happened in hope of
clearing the scuttlebutt surrounding the incident, he noted.
The Yorktown lost control of its propulsion system because its computers
were unable to divide by the number zero, the memo said. The Yorktown's
Standard Monitoring Control System administrator entered zero into the
data field for the Remote Data Base Manager program. That caused the
database to overflow and crash all LAN consoles and miniature remote
terminal units, the memo said.
The program administrators are trained to bypass a bad data field and
change the value if such a problem occurs again, Atlantic Fleet officials
said.
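
The memo does not say which calculation used the zero value or how the
Remote Data Base Manager stored it, so the following is only a minimal
sketch of the general defense: reject an unusable value at the point of
entry, and guard the division itself. The names validate_field and
safe_ratio, and the fallback behavior, are hypothetical and are not taken
from the Yorktown software.

```python
import math


def validate_field(raw: str) -> float:
    """Parse an operator-entered value; reject anything unusable as a divisor."""
    value = float(raw)  # raises ValueError on non-numeric text
    if not math.isfinite(value) or value == 0.0:
        raise ValueError(f"rejected field value: {raw!r}")
    return value


def safe_ratio(numerator: float, divisor: float, fallback: float) -> float:
    """Divide, returning a caller-chosen fallback instead of raising on zero."""
    try:
        return numerator / divisor
    except ZeroDivisionError:
        return fallback
```

Either check alone would keep a single zero from propagating; whether a
fallback value is acceptable depends on what the result controls, which
the article does not describe.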
But the Yorktown's failure in September 1997 was not as simple as
reported, DiGiorgio said.
"If you understand computers, you know that a computer normally is immune
to the character of the data it processes," he wrote in the June U.S.
Naval Institute's Proceedings Magazine. "Your $2.95 calculator, for
example, gives you a zero when you try to divide a number by zero, and
does not stop executing the next set of instructions. It seems that the
computers on the Yorktown were not designed to tolerate such a simple
failure."
The Navy reduced the Yorktown crew by 10 percent and saved more than $2.8
million a year using the computers. The ship uses dual 200-MHz Pentium
Pros from Intergraph Corp. of Huntsville, Ala. The PCs and server run NT
4.0 over a high-speed, fiber-optic LAN.
Blame it on the OS
But according to DiGiorgio, who in an interview said he has serviced
automated control systems on Navy ships for the past 26 years, the NT
operating system is the source of the Yorktown's computer problems.
NT applications aboard the Yorktown provide damage control, run the ship's
control center on the bridge, monitor the engines and navigate the ship
when under way.
"Using Windows NT, which is known to have some failure modes, on a warship
is similar to hoping that luck will be in our favor," DiGiorgio said.
Pacific and Atlantic fleets in March 1997 selected NT 4.0 as the standard
OS for both networks and PCs as part of the Navy's Information Technology
for the 21st Century initiative. Current guidance approved by the Navy's
chief information officer calls for all new applications to run under NT.
Ron Redman, deputy technical director of the Fleet Introduction Division
of the Aegis Program Executive Office, said there have been numerous
software failures associated with NT aboard the Yorktown.
"Refining that is an ongoing process," Redman said. "Unix is a better system
for control of equipment and machinery, whereas NT is a better system for
the transfer of information and data. NT has never been fully refined and
there are times when we have had shutdowns that resulted from NT."
Hauled in
The Yorktown has been towed into port several times because of the
systems failures, he said.
"Because of politics, some things are being forced on us that without
political pressure we might not do, like Windows NT," Redman said. "If it
were up to me I probably would not have used Windows NT in this
particular application. If we used Unix, we would have a system that has
less of a tendency to go down."
Although Unix is more reliable, Redman said, NT may become more reliable
with time.
The Navy is moving the service's command and control applications from
Unix to NT as part of IT-21. Under IT-21, the Navy also plans to
modernize ships in the Atlantic and Pacific fleets with asynchronous
transfer mode LANs. Large ATM networks running NT have already been
installed on the USS Abraham Lincoln and USS Essex.
But DiGiorgio said the LANs might experience a chain reaction of computer
failures like those experienced on the Yorktown. That domino effect is
inherent to the system design of shipboard LANs, he said.
"There is very little segregation of error when software shares bad data,"
DiGiorgio said. "Instead of one computer knocking off on the Yorktown,
they all did, one after the other. What if this happened in actual
combat?"
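
DiGiorgio's point about segregation of error can be illustrated with a
small sketch in which each consumer of a shared record handles its own
failure, so one bad value degrades one reading instead of stopping every
console. The handler names and the broadcast function are hypothetical;
the article does not describe the actual design of the shipboard LAN
software.

```python
from typing import Callable, Dict


def fuel_rate(record: dict) -> float:
    """Hypothetical reading that divides by an operator-supplied interval."""
    return record["fuel"] / record["interval"]


def shaft_rpm(record: dict) -> float:
    """Hypothetical reading that does not depend on the interval field."""
    return record["rpm"]


def broadcast(record: dict,
              consoles: Dict[str, Callable[[dict], float]]) -> Dict[str, str]:
    """Deliver one shared record to every console, isolating each failure."""
    status = {}
    for name, handler in consoles.items():
        try:
            handler(record)
            status[name] = "ok"
        except (ArithmeticError, KeyError, ValueError) as exc:
            # Contain the fault: mark this console degraded and keep going.
            status[name] = f"degraded: {type(exc).__name__}"
    return status


bad_record = {"fuel": 5.0, "interval": 0, "rpm": 120.0}
print(broadcast(bad_record, {"fuel": fuel_rate, "shaft": shaft_rpm}))
```

Here the zero interval breaks only the fuel reading; the shaft reading
still reports normally.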
Although the Yorktown did not have backup systems, Redman said that
future Smart Ships will have systems redundancy to ensure that ships can
continue to operate.
But DiGiorgio said that the Smart Ship project needs to do more
engineering up front.
"Installing a control system on a warship and resolving problems as the
project progresses is a costly and naive process," DiGiorgio wrote in the
Proceedings article. "Now, with the top people rotated off the Smart Ship
Project, it would be wise for the Navy to investigate this fiasco more
fully."
Redman has a different perspective. "If it were me, I wouldn't say all the
things that Tony [DiGiorgio] has said out of discretion and consideration
for being a long-term employee," he said. "But I will say this about Tony,
he's a very bright engineer."
"Everybody plays the obedience role where you cannot criticize the system,"
said DiGiorgio, a self-described whistle-blower. "I'm not that kind of guy."
Sidebar: Navy prepares to take Smart Ship full steam ahead
Despite the USS Yorktown's setbacks, the Navy plans to use Smart Ship
technology on other classes of ships. The Naval Sea Systems Command in
May awarded Litton Integrated Systems Corp. of Woodland Hills, Calif., a
$138.6 million contract to build Engineering Control System Equipment and
Integrated Bridge Systems for CG-47 Class Aegis cruisers. The Navy also
might install the equipment on DDG-51 class destroyers. Electronic Design
Inc. of Metairie, La., filed a protest of the award in late May with the
General Accounting Office. The Navy has issued a stop-work order that
will last until GAO rules on the protest. Smart Ship technology is also
on the amphibious ship USS Rushmore, Navy officials said.