[postgis-users] A PostGIS-Raster data proposal

Craig Miller craig.miller at spatialminds.com
Fri Oct 27 14:58:46 PDT 2006


Steve,

Thanks for reporting.  Was the database local?  Were the images loaded from
the local filesystem?

I'd think a good "real world" example would be to have both the images and
the database located remotely, with access via NFS or similar.  What I'm
wondering is whether, at that point, the network transfer time will dwarf
the file access times, making them moot.

Wish I had some time to play with it right now.

--Craig
 

-----Original Message-----
From: postgis-users-bounces at postgis.refractions.net
[mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of
Marshall, Steve
Sent: Friday, October 27, 2006 2:32 PM
To: PostGIS Users Discussion
Subject: RE: [postgis-users] A PostGIS-Raster data proposal

Per Frank Warmerdam's suggestion, I've done a test of access performance
using internal PostgreSQL TOAST functions vs. normal file seeking.

The test involved seeking in a TOASTed bytea column containing approximately
20 MB of binary data.  The column was set to EXTERNAL storage (i.e. stored in
a separate TOAST table, but not compressed).
The test read through the data sequentially in chunks of 1000 bytes and
measured the time to retrieve each chunk.  The code to do this was
encapsulated in a PostgreSQL server-side function and invoked through SQL.
I restarted the PostgreSQL server before the test to avoid having any
caching of data in shared memory, which could artificially speed up the data
access.

As a comparison, I also wrote a program that would do the equivalent data
access from a file.  The file contained the same data as the bytea column,
and the column access was replaced with fseek and fread calls.
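
For anyone wanting to try the file side of the comparison, here is a minimal
sketch of that kind of loop. It is not the original test program: it uses
Python's seek/read in place of fseek/fread, the function name
time_sequential_reads is made up for illustration, and the 20 MB of random
bytes is stand-in data. It also does nothing to flush the OS page cache, so
the numbers it prints will reflect cached reads.

```python
import os
import tempfile
import time


def time_sequential_reads(path, chunk_size=1000):
    """Seek to each chunk offset in turn, read it, and return per-read times (seconds)."""
    timings = []
    size = os.path.getsize(path)
    # buffering=0 so every read is passed to the OS instead of Python's own buffer
    with open(path, "rb", buffering=0) as f:
        for offset in range(0, size, chunk_size):
            start = time.perf_counter()
            f.seek(offset)          # analogue of fseek
            f.read(chunk_size)      # analogue of fread
            timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Hypothetical stand-in for the test data: roughly 20 MB of random bytes
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(20 * 1000 * 1000))
    try:
        t = time_sequential_reads(tmp.name)
        print(f"reads: {len(t)}")
        print(f"mean:  {sum(t) / len(t) * 1e6:.1f} us")
        print(f"worst: {max(t) * 1e6:.1f} us")
    finally:
        os.remove(tmp.name)
```

Reporting both the mean and the worst case mirrors the two figures quoted in
the results below.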

The results of the test were that TOAST seeking was about 10 times more
expensive than seeking in a local file.  Each local file access took
microseconds on average, while TOAST seeks took tens of microseconds.  The
worst-case file seek was in milliseconds, while the worst-case TOAST seek was
in tens of milliseconds.  The absolute values for TOAST seeks don't seem too
bad to me, but it is a bit worrying that the values are an order of
magnitude worse than local file I/O.

I did play around with some parameters in the DB test.  Changing the chunk
size did not make a big difference, but setting it to the TOAST chunk size
(1994 bytes) gave a small boost.  I did not vary the test to seek randomly
instead of sequentially.  This might give a boost to the DB implementation
due to caching; I'm not sure what this would do to file I/O.
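
The post does not say why matching the TOAST chunk size helps. One possible
explanation, offered here only as an assumption, is alignment: a 1000-byte
read that starts at a multiple of 1000 sometimes straddles two 1994-byte
chunks, while a 1994-byte read at a multiple of 1994 never does. The
arithmetic can be sketched as follows; the helper names are hypothetical.

```python
TOAST_CHUNK = 1994  # chunk size quoted in the post


def chunks_touched(offset, length, chunk=TOAST_CHUNK):
    """Count the fixed-size chunks overlapped by bytes [offset, offset + length)."""
    first = offset // chunk
    last = (offset + length - 1) // chunk
    return last - first + 1


def mean_chunks_per_read(total, read_size, chunk=TOAST_CHUNK):
    """Average chunks touched per read when stepping through `total` bytes sequentially."""
    offsets = range(0, total, read_size)
    touched = [chunks_touched(o, min(read_size, total - o), chunk) for o in offsets]
    return sum(touched) / len(touched)


if __name__ == "__main__":
    total = 1994 * 1000  # hypothetical data size, a whole number of chunks
    print(mean_chunks_per_read(total, 1000))
    print(mean_chunks_per_read(total, 1994))
```

This counts chunk overlaps only; it says nothing about how PostgreSQL
actually fetches or caches them, so treat it as a way to reason about
alignment rather than a model of the measured timings.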

I also have not explored the performance of repeated access to the same data
segments.  Here PostgreSQL data caching might help DB access relative to
file I/O.

There are still more things to do here, but I thought I'd share some early
results.  I'm happy to provide the code and SQL definitions for the test, if
anyone else is interested in it.

Steve Marshall



