[geos-devel] Re: Some performance observations

Paul Ramsey pramsey at refractions.net
Wed Nov 22 15:00:53 EST 2006


Thanks Shengnan,

It is possible that our GEOS implementation (being a fairly naive port 
from Java) is still a cause of the gross inefficiencies (in particular 
the many small malloc/free cycles).

I think our priority order should be first removing the bottlenecks in 
our coding style (fix our memory management inefficiencies) and then 
look at using some of the CPU/GPU features available in new cores to eke 
out some more performance going forward.
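
As a rough illustration of what removing the many small malloc/free cycles 
could look like, here is a minimal sketch of a bump-pointer arena in C++. 
It is not GEOS code: the names Arena, create and Coordinate are hypothetical, 
and the sketch assumes the stored objects are trivially destructible, since 
destructors are never run. Everything is released in one step when the arena 
goes out of scope.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <utility>
#include <vector>

// Hypothetical bump-pointer arena (illustration only, not part of GEOS).
// Many small objects are carved out of a few large blocks, so the cost of
// one heap allocation is shared by every object that fits in a block.
class Arena {
public:
    explicit Arena(std::size_t blockSize = 64 * 1024) : blockSize_(blockSize) {}

    // Return `size` bytes aligned to `align`, which must be a power of two
    // no larger than the default alignment of operator new[].
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        std::size_t offset = (used_ + align - 1) & ~(align - 1);
        if (blocks_.empty() || offset + size > capacity_) {
            // Current block is full (or missing): start a new one.
            capacity_ = size > blockSize_ ? size : blockSize_;
            blocks_.emplace_back(new unsigned char[capacity_]);
            offset = 0;
        }
        used_ = offset + size;
        return blocks_.back().get() + offset;
    }

    // Construct a T inside the arena. No destructor is ever called, so T
    // is assumed to be trivially destructible.
    template <typename T, typename... Args>
    T* create(Args&&... args) {
        void* p = allocate(sizeof(T), alignof(T));
        return new (p) T(std::forward<Args>(args)...);
    }

    std::size_t blockCount() const { return blocks_.size(); }

private:
    std::size_t blockSize_;
    std::size_t capacity_ = 0;  // size of the current block in bytes
    std::size_t used_ = 0;      // bytes consumed in the current block
    std::vector<std::unique_ptr<unsigned char[]>> blocks_;
};

// Hypothetical stand-in for a small, frequently allocated geometry type.
struct Coordinate {
    double x, y;
    Coordinate(double x_, double y_) : x(x_), y(y_) {}
};
```

With something like this, a call such as arena.create<Coordinate>(1.0, 2.0) 
replaces an individual new/delete pair, and the per-object free disappears 
entirely. Whether that fits the ownership model of the ported code is a 
separate question.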

Even when we do that, the bulk of the processing time will still sit on 
the GIS side, but at least it will be a smaller bulk :)

Paul

Cong, Shengnan wrote:
> Hi, Paul,
> 
> I have done some experiments with PostGIS using synthetic data from
> Andrew Rogers. The data sets (100k~10M) were raw data roughly restricted
> to the continental US, and the test queries were subsets restricted by a
> given bounding box, using a number of different bounding box sizes. The
> GiST geospatial index was used.
> 
> I used VTune (Intel's performance tool) to get profiling information and
> studied the source code a bit.
> 
> Here are some observations:
> 
> 1. The query processing is computation bound. Few cache misses were
> observed: the L2 cache miss ratio is below 1%, and there was no sign of
> bus saturation.
>  
> 2. The breakdown of computation time is mainly:
> -- 47.5% in GEOS lib (spatial operations)
> -- 34.1% in system calls (of which around 60% is in malloc/free)
> -- 5.4% in PgSQL server 
> -- 4.3% in PostGIS lib 
> 
> 3. The time spent in GEOS lib is not concentrated in any specific function;
> it was evenly distributed among various tiny functions.
> 
> 4. The parallelization problem probably is more DBMS related than GIS
> related, since it may involve PgSQL internals more than PostGIS
> internals.
> 
> This shows that the GEOS lib and memory management are the performance
> bottlenecks of query processing, rather than PgSQL or I/O.
> 
> The GEOS lib consists of tiny functions and is dynamically linked. If those
> functions were inlined, performance could probably be improved. Some spatial
> operations may also be covered by Intel IPP (Integrated Performance
> Primitives), which may help the GEOS lib perform better. As for memory
> management, using self-managed memory may reduce the overhead of malloc
> and free calls.
> 
> Thanks.
> 
> Shengnan



