[geos-devel] Re: Some performance observations
Paul Ramsey
pramsey at refractions.net
Wed Nov 22 15:00:53 EST 2006
Thanks Shengnan,
It is possible that our GEOS implementation (being a fairly naive port
from Java) is still a cause of the gross inefficiencies (in particular
the many small malloc/free cycles).
I think our priorities should be, first, to remove the bottlenecks in
our coding style (fix our memory management inefficiencies) and then
to look at using some of the CPU/GPU features available in new cores to
eke out some more performance going forward.
Even when we do that, the bulk of the processing time will still sit on
the GIS side, but at least it will be a smaller bulk :)
Paul
Cong, Shengnan wrote:
> Hi, Paul,
>
> I have done some experiments with PostGIS using synthetic data from
> Andrew Rogers. The data sets (100k~10M) were raw data roughly restricted
> to the continental US, and the test queries were subsets restricted by a
> given bounding box, using a number of different bounding box sizes. The
> GIST geospatial index was used.
>
> I used VTune (Intel's performance tool) to get profiling information
> and studied the source code a bit.
>
> Here are some observations:
>
> 1. The query processing is computation bound. Few cache misses were
> observed: the L2 cache miss ratio is below 1%, and no bus saturation
> was observed.
>
> 2. The computation time breaks down mainly as follows:
> -- 47.5% in GEOS lib (spatial operations)
> -- 34.1% in system calls (around 60% in malloc/free)
> -- 5.4% in PgSQL server
> -- 4.3% in PostGIS lib
>
> 3. The time spent in GEOS lib is not concentrated in any specific
> function. It was evenly distributed among various tiny functions.
>
> 4. The parallelization problem probably is more DBMS related than GIS
> related, since it may involve PgSQL internals more than PostGIS
> internals.
>
> These results show that the GEOS lib and memory management are the
> performance bottlenecks in query processing, rather than PgSQL or I/O.
>
> The GEOS lib consists of tiny functions and is dynamically linked. If
> those functions were inlined, performance could probably be improved.
> Some spatial operations may also be covered by Intel IPP (Integrated
> Performance Primitives), which may help the GEOS lib achieve better
> performance. With regard to memory management, using self-managed
> memory may reduce the overhead of malloc and free calls.
>
> Thanks.
>
> Shengnan
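
A rough reading of the breakdown in item 2, as a sketch only: it
assumes that "around 60%" means a share of the 34.1% system-call time,
which the message does not state explicitly.

```latex
% Assumption: "around 60%" is a fraction of the system-call time.
\[
  t_{\text{malloc/free}} \approx 0.341 \times 0.60 \approx 0.205
  \quad (\text{about } 20\% \text{ of total time})
\]
% The four listed categories do not add up to 100%:
\[
  47.5\% + 34.1\% + 5.4\% + 4.3\% = 91.3\%,
  \qquad 100\% - 91.3\% = 8.7\% \text{ not itemized}
\]
```

Under that reading, allocation overhead alone would be about four
times the time attributed to the PgSQL server (5.4%).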