Deep Blue Branching Out
IBM fine-tunes Deep Blue for commercial arena
By Gail Robinson, EE Times
Yorktown Heights, N.Y. -- Now that the much-ballyhooed chess match between
IBM's Deep Blue and Garry Kasparov is history, the technology responsible
for the outcome is being retargeted toward new and very different applications.
The IBM researchers who developed the custom chip at the heart of the
system--along with their counterparts at VLSI Technology Inc. (San Jose,
Calif.), which handled the implementation and manufacturing of the
technology--are examining its potential for financial modeling and for
pharmaceutical and medical applications. Such fields might value the system's
ability to search huge data sets intelligently.
"The publicity of the match might cause people to lose sight of the practical
issues," said Tom Ebzery, director of VLSI's Eastern Region Technology Center
(Burlington, Mass.). "IBM was not just in this for the fun of it." While the
project garnered huge worldwide exposure for Deep Blue in particular and
for parallel processing in general, IBM's objective was to learn more about
applying the technology to real-world problems, Ebzery said. "Playing chess
is a nicely definable parallel-processing application, but the whole idea
is how can you then use it in other applications that are more lucrative,"
he said.
The areas under investigation share a common need: goal-directed searches
of huge combinatorial databases. For example,
drug design could be a potential market because huge numbers of molecular
configurations must be tested before a new drug is identified. Another area
is financial modeling. "The investment community will eat all the Mips that
you can throw at them just doing economic analysis," said a spokesman for
VLSI.
The investment community is evaluating one example: an investor-oriented
model of the mini SP2, a smaller version of Deep Blue. The system uses a
variation of the chess-playing concept of anticipating outcomes three or
four moves ahead.
Deep Blue blends special-purpose hardware and general-purpose supercomputing.
The parallel processor was structured as 32 workstation-class machines, each
connected to 16 VLSI custom chips. "These custom chips were designed to specifically
handle chess computations. They have the capability of generating chess moves
by themselves," said F.H. Hsu, a research staff member at IBM's T.J. Watson
Research Center. The chips could also evaluate chess positions based on database
searches of model games.
Essentially, each chip was a complete chess machine in itself, evaluating
2 million to 3 million chess positions per second. "If you compare that to
a chess program run on a Cray computer, each chip is roughly about the same
speed as the fastest Cray," noted Hsu. The design had to break new ground
in memory integration on chip as well as in coordinating memory and processors
at the system level.
The Deep Blue project required a high level of integration--about 2 million
transistors per chip--which was actually beyond the technology's capability
when the project was conceived in the early 1990s. "There were 512 chips
used in the full-up system--where each chip consisted of 1.7 million transistors,
breaking down to about 150,000 gates and about 300,000 bits of RAM and ROM,"
Hsu recalled. Overall, the chess-evaluation process makes high demands on
the memory-to-processor bottleneck, and the design team needed a lot of
flexibility when integrating memory and logic on chip. IBM used VLSI's
memory-compiler technology, which could target different-sized memories to
meet design requirements.
"The compiler would automatically generate those configurations per their
specification--a capability that VLSI brought to the table back in the early
'90s," Hsu said.
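As a rough back-of-the-envelope check, the figures quoted above can be multiplied out. The sketch below assumes that 16 custom chips are attached to each of the 32 workstations (the reading that reproduces the quoted 512-chip total) and that every chip sustains its quoted rate at the same time, which the article does not claim. The result is therefore a theoretical ceiling, not a measured system speed, and the names are illustrative only.

```python
# Figures quoted in the article.
WORKSTATIONS = 32
CHIPS_PER_WORKSTATION = 16            # assumption: 16 chips per workstation
TRANSISTORS_PER_CHIP = 1_700_000
POSITIONS_PER_CHIP_LOW = 2_000_000    # positions per second, low estimate
POSITIONS_PER_CHIP_HIGH = 3_000_000   # positions per second, high estimate


def system_totals() -> dict:
    """Multiply the per-chip figures out to whole-system upper bounds."""
    chips = WORKSTATIONS * CHIPS_PER_WORKSTATION
    return {
        "chips": chips,
        "transistors": chips * TRANSISTORS_PER_CHIP,
        "positions_per_sec_low": chips * POSITIONS_PER_CHIP_LOW,
        "positions_per_sec_high": chips * POSITIONS_PER_CHIP_HIGH,
    }


if __name__ == "__main__":
    for name, value in system_totals().items():
        print(f"{name}: {value:,}")
```

Under these assumptions the chip count comes to 512, and the aggregate ceiling is on the order of 1 billion to 1.5 billion positions per second. Coordination overhead between processors and chips would pull any real figure below that.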
The hardware provided plenty of processing capability. "A lot of dedicated
hardware to do the chess-specific calculations, data searches and what-if
decisions," said Ebzery. "If this had been done in software--an approach
taken by some other systems--it would never have had anywhere near the
performance that a dedicated hardware engine provides." The dedicated hardware
sped up position analysis by two orders of magnitude.
Because of the chip's size, one of the biggest challenges was its layout.
"There were about 65 separate blocks of memory that had to be wired together,
so it was a very complex physical design problem that also had to meet the
performance goals," noted Ebzery. "So, we had to work closely with the IBM
engineers and understand where their critical timing paths were to get through
the back end of the process."
While the chips were optimized for the basic search and evaluation operation,
an important innovation at the system level was the integration of expert
knowledge--in this case, that of chess experts--to sort through the raw position
look-ahead information and decide on the best move. "You can do searches that
are quite different from what you are limited to in hardware," Hsu said.
The dedicated hardware algorithm could look up to eight moves ahead, covering
all possibilities, which turns out to be a huge combinatorial search problem.
"At the software level, you can do more selective searches. The idea is that
you can go much deeper than normal, based on intelligent evaluation of the
situation," Hsu said.
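The contrast Hsu draws, between an exhaustive fixed-depth search and a selective search that goes deeper only along promising lines, can be illustrated with a toy game tree. The sketch below is not Deep Blue's algorithm: the branching factor, the scoring function and the rule for what counts as an "interesting" line (`is_forcing`) are all hypothetical stand-ins chosen to keep the example small and deterministic.

```python
from typing import List, Tuple

BRANCHING = 3            # assumed toy branching factor; chess is far larger
Path = Tuple[int, ...]   # a position is identified by the moves leading to it


def evaluate(path: Path) -> int:
    """Toy static score from the side to move's point of view (hypothetical)."""
    score = 0
    for ply, move in enumerate(path):
        score = (score * 31 + (move + 1) * (ply + 7)) % 201
    return score - 100


def is_forcing(path: Path) -> bool:
    """Hypothetical test for a line worth extending: last move has index 0."""
    return bool(path) and path[-1] == 0


def search(path: Path, depth: int, extensions: int, counter: List[int]) -> int:
    """Negamax: full width down to `depth`, then selective one-ply extensions."""
    counter[0] += 1
    if depth == 0:
        if extensions > 0 and is_forcing(path):
            # Selective step: look one ply deeper on this line only.
            depth, extensions = 1, extensions - 1
        else:
            return evaluate(path)
    best = -10**9
    for move in range(BRANCHING):
        score = -search(path + (move,), depth - 1, extensions, counter)
        best = max(best, score)
    return best


def count_nodes(depth: int, extensions: int) -> Tuple[int, int]:
    """Return (score, nodes visited) for a search from the root position."""
    counter = [0]
    score = search((), depth, extensions, counter)
    return score, counter[0]


if __name__ == "__main__":
    for depth, ext in [(4, 0), (4, 2), (6, 0)]:
        score, nodes = count_nodes(depth, ext)
        print(f"depth={depth} extensions={ext} score={score} nodes={nodes}")
```

In this toy setup a full-width search to depth 4 visits 121 positions and one to depth 6 visits 1,093, while a depth-4 search allowed two selective extensions reaches depth 6 on some lines after visiting only 283. That is the trade the article describes: exhaustive coverage up to a fixed horizon, with extra depth spent only where an evaluation suggests it matters.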
The software was written in C and runs on AIX, an IBM version of Unix. Some
extensions were added to map the chip data-search results directly into the
memory space. "To be able to solve a part of the search at the system level
gives you flexibility," he said. "If you want to do a certain depth of search,
the operation can be adjusted on how much information you have at the moment."
Hsu called the scheme pretty sound and said the team may further optimize
the software in a future system. "In the end, it was software innovations
and built-in chess-expert knowledge that tipped the scales," he said. "In
looking at the difference between match one and match two, the amount of
chess knowledge in the program was critical." The improvements resulted from
a detailed analysis of why the machine didn't play certain positions well.
One specific performance improvement simply involved reducing memory delays
by knowing what information to request.
The two groups are now looking at the hardware/software interface as a means
of getting better performance for the system. The position evaluation and
look-ahead chips interface directly with external RAM, which opens up the
possibility of bringing software flexibility down to the hardware level,
Hsu said.
The chips could also be enhanced. "There is a lot of room to move down the
process-technology road map to achieve better performance and higher clock
rates," noted Ebzery. "And there's certainly a lot of opportunity to add
more functionality. So we are poised and ready to support IBM if and when
they decide to schedule another match and build another machine."
(c) 1997 CMP Media, Inc
[This article comes from EE Times in a joint cooperative effort
with the Motley Fool. For more articles like it, please look at Fool's Gold
every weekend or simply go to the Fool's Gold Mine and page through our back
issues, which all have clever and cool EE Times articles in
them.]