[GRASS5] GRASS on parallel CPUs

Wed Apr 26 15:34:02 EDT 2000

Hi all,

I am not an expert on parallel computing either.
But I will try to explain what I have learned from the web and what I
know from my own computer experience.

Parallel computing and clustering is a very complex topic, and as far as
I know the computer industry has changed its attitude and its models
many times.

You have to distinguish between SMP (symmetric multiprocessing: several
CPUs share one RAM in one system), closely coupled cluster computers
(several CPUs, each with its own memory, connected via a high-speed bus
system) and loosely coupled clusters (many interconnected computers,
like Beowulf clusters, which are standard industry PCs connected via a
100 Mbit backbone/switch).

There are many other, more complicated systems, which I will skip for
now.

SMP workstations/servers can run the different threads of one program
on different CPUs if the software is thread-safe and written for a
library that enables this (the pthread library on Linux, for example).
Using several CPUs is comparatively easy if the software starts a new
process or thread for every connection, as web servers and database
servers do. Most commercial software uses this, but the performance gain
may be low if the application itself is not suited for parallel
computing (like word processing). The bottleneck here is the bus (memory
access, graphics access, disk) of the computer. Intel PCs are very
limited in bus bandwidth compared to Unix machines (Sun, SGI, others).
Software that does not use threads runs on those machines, but not
faster than on a single-CPU machine.
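
As a minimal sketch of the "one worker per connection" case described
above, the following Python snippet hands each request to a pool of
worker processes. It uses Python's standard multiprocessing module, not
the pthread library mentioned above, and handle_request with its
workload is made up for illustration.

```python
import multiprocessing as mp
import os


def handle_request(n):
    """Hypothetical CPU-bound request: sum of the squares below n."""
    return sum(i * i for i in range(n))


def serve(requests, workers=None):
    """Handle each request in a pool of worker processes.

    The operating system is free to schedule the workers on different
    CPUs; on a single-CPU machine the result is the same, just not
    faster.
    """
    workers = workers or os.cpu_count() or 1
    with mp.Pool(processes=workers) as pool:
        return pool.map(handle_request, requests)
```

In a script, serve() should be called from under an
`if __name__ == "__main__":` guard, which multiprocessing needs on
platforms that start workers by spawning a fresh interpreter.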

For parallel cluster computers you need to rethink the problem/algorithm
completely and rewrite the application to use special message-passing
libraries. The main problem with parallel computing is the communication
between the nodes in the cluster. If the problem is very well suited to
parallel computing (e.g. the famous rendering of parts of a scene for
the movie Titanic), the performance will increase enormously (in
essence, parts of the image are simply calculated independently on
different computers). But if the problem requires much communication and
much data transport between the nodes, the performance may drop below
that of a single-CPU system.
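
The trade-off can be made concrete with a toy cost model. The formula
and all numbers below are assumptions chosen for illustration, not
measurements: compute time divides evenly over the nodes, while every
additional node adds a fixed communication cost.

```python
def parallel_time(compute_s, nodes, comm_s_per_node):
    """Toy model: evenly divided compute time plus per-node communication."""
    if nodes < 1:
        raise ValueError("need at least one node")
    if nodes == 1:
        return float(compute_s)  # a single machine does not communicate
    return compute_s / nodes + comm_s_per_node * nodes


def speedup(compute_s, nodes, comm_s_per_node):
    """Speedup relative to one CPU; below 1.0 the cluster is slower."""
    return compute_s / parallel_time(compute_s, nodes, comm_s_per_node)


# Rendering-like job: 1000 s of work, almost no communication.
print(round(speedup(1000, 10, 0.1), 2))  # 9.9
# Chatty job: 100 s of work, 20 s of communication per node.
print(round(speedup(100, 10, 20), 2))    # 0.48
```

In this model the second job runs at less than half the speed of a
single CPU, because the communication term grows with the number of
nodes while the compute term shrinks.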

The idea behind the commercial compilers is that clever optimization of
existing code for parallel computers will give _some_ increase in
processing speed. The Portland Group claims that their Fortran compiler
gives a 30% increase in speed (as I understand it, regardless of whether
the code runs on a parallel machine or not).

For commercial use this may be an economical way to speed up existing
applications. You can calculate how much money you would have to spend
on programming to get the same increase.

I suspect that the commercial compilers will do a comparatively poor job
of optimizing the GRASS code, as the libraries were not developed with
parallel computing in mind. Most computing-intensive raster calculations
are done via a conventional loop over each row, and the compiler cannot
anticipate which neighborhood calculations are done at run time. The
same goes for the segment library. I think that in image processing and
in simulations based on GRASS raster data some remarkable speedup is
possible.
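
To illustrate the kind of loop meant here, the following is a sketch in
Python of a 3x3 neighborhood mean computed row by row. It is not GRASS
code and does not use the GRASS libraries; neighborhood_mean and the
tiny raster are hypothetical. Each output row reads only input rows
row-1, row and row+1, so the rows could be computed independently, but
that follows from knowing the algorithm, which is the kind of knowledge
the paragraph above says a compiler lacks.

```python
def neighborhood_mean(raster, row):
    """Return one output row: mean of the 3x3 window around each cell.

    Cells outside the raster are skipped, so edge windows are smaller.
    """
    nrows, ncols = len(raster), len(raster[0])
    out = []
    for col in range(ncols):
        window = [
            raster[r][c]
            for r in range(max(row - 1, 0), min(row + 2, nrows))
            for c in range(max(col - 1, 0), min(col + 2, ncols))
        ]
        out.append(sum(window) / len(window))
    return out


raster = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

# Conventional loop over each row, as described above.
result = [neighborhood_mean(raster, row) for row in range(len(raster))]
print(result[1][1])  # centre cell, mean of 1..9: 5.0
```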

The usual compilers do a very good job of optimizing code (unrolling
loops etc.), but this is different from optimizing for parallel
processing. I think the difference between gcc and the commercial
compilers is due to their closer relation to the machine/processor (but
the Solaris compiler produces binaries only about 15% faster than
gcc's!).

The Beowulf clusters use a standard Red Hat distribution as their base
(with the gcc/egcs compiler etc.), a message-passing library for
software development, and tools to manage >16 PCs (nodes) and to control
the distribution of data/tasks. The software is developed on this
platform, and I think that it would be very complicated even to
interface the GRASS GIS library to this setup. But you could simply
split up your job (e.g. subbasins, parts of a raster), run the
computation as a batch job on different machines and patch together the
results afterwards. This will give you a speedup of about x times with
x CPUs (e.g. 6 nodes, 6 times faster than on a single processor), from
which you have to subtract the time needed to split, distribute and
patch together the data.
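
The split, compute and patch approach can be sketched as follows. This
is plain Python on a list-of-rows raster, not GRASS commands;
split_rows, process_part, patch and estimated_speedup are hypothetical
names, and the per-cell computation is a stand-in. In practice each part
would be processed as a separate batch job on its own machine; here the
parts are processed one after another to keep the sketch self-contained.

```python
def split_rows(raster, parts):
    """Split a raster (list of rows) into up to `parts` contiguous strips."""
    size = -(-len(raster) // parts)  # ceiling division
    return [raster[i:i + size] for i in range(0, len(raster), size)]


def process_part(strip):
    """Stand-in for the real computation; it needs no neighbouring rows."""
    return [[cell * 2 for cell in row] for row in strip]


def patch(strips):
    """Concatenate processed strips back into one raster."""
    return [row for strip in strips for row in strip]


def estimated_speedup(compute_s, nodes, overhead_s):
    """Ideal x-fold speedup reduced by the time to split, distribute, patch."""
    return compute_s / (compute_s / nodes + overhead_s)


raster = [[r * 10 + c for c in range(4)] for r in range(6)]
strips = split_rows(raster, 3)  # one strip per machine
patched = patch([process_part(s) for s in strips])
assert patched == process_part(raster)  # same result as a single run

# 6 nodes, 600 s of computation, 50 s of overhead (made-up numbers).
print(round(estimated_speedup(600, 6, 50), 1))  # 4.0
```

If the computation looks at neighboring cells, the strips would also
need overlapping border rows so that each part sees the rows just
outside its own range.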

To sum up: 
I think that if you have a problem that requires _very_ much computing
power and you cannot solve it with a conventional setup (an 800 MHz
Intel system or a really fast Unix server), you should evaluate whether
a cluster will solve it. But then you will have to invest in the machine
_and_ in programming (and/or in a professional compiler/library).

Compared to today's PC prices, $2500 looks like a lot. But if you
consider that _real_ computers in the upper range still cost about
$50 000, this is not a big investment. And you should consider what
program development costs: $2500 spent for a 30% increase in performance
is IMHO a good investment.

But for GRASS I can see no real applicability: with a commercial
compiler you would have to distribute binaries for different
architectures/setups. Only a small proportion of the GRASS users (those
with a Beowulf cluster in their garage) would benefit from this.

That's just what I think,

cu

Andreas

-- 
Andreas Lange, 65187 Wiesbaden, Germany, Tel. +49 611 807850
Andreas.Lange at Rhein-Main.de, A.C.Lange at GMX.net
