[postgis-users] WKB Geometry Column

James Marca jmarca at translab.its.uci.edu
Mon Mar 21 12:49:35 PST 2005


At approximately Fri, Mar 18, 2005 at 10:21:03AM -0700, Sean M. Montague wrote:
> Does this sound reasonable?  Given that pulling from the database is
> faster as WKB, would it be worth the effort to develop a client side app

how fast is the  client-side parse of binary data compared to letting
the db do it?  Javascript in a (web|svg) browser vs Perl/PHP/Java in a
webserver vs C++ on the db machine.  It's not an easy comparison to
make, but I decided that the Postgis guys are probably going to write
faster parsing code in C++ than I can write in Perl, so if there is a
one-to-one relationship between webserver queries and db queries
(which of course there is not), then I should let the db do the
conversion work.   

Some other comments on my most recent experience (Nov 2004).  First, I
use Class::DBI in perl to handle connecting to the DB.  While I think
this has been corrected recently, I had trouble getting binary data
out of the database without crashing Class::DBI.  My base class for
all things PostGIS related contains the lines

__PACKAGE__->columns('Essential'=>('CAST( AsBinary(geom) as bytea) as wkb', 'srid(geom) as srid'));
__PACKAGE__->columns('SegFaultBait'=>qw/geom/); 

although last time I tried it, accessing object->geom no longer causes
a segfault.

I found it more difficult to *write* binary data through the
Class::DBI interface to PostgreSQL.  I never go it to work.  While it
was probably my fault in not correctly manipulating the Class::DBI
interface and/or this may have been corrected recently, I ended up
having to revert to using my constructed binary sequences to build a
WKT representation of the data.

As to handling the data in perl, I always ended up having to unpack
the data at some point.  For example, I did the following to create a
PDL object from a binary line

  ## $binary_sequence contains just the line data, not the header stuff
  # make a pdl line,                                                       
  my @pairs;
  my @data = unpack('d*',$binary_sequence);
  while ( @data ){
      push @pairs, [shift @data, shift @data] or croak "ran out of points";
  }
  # load coords into pdl                                                   
  my $line = pdl(@pairs);

So to return to my above comment, is unpack('d*') in perl faster than
whatever is happening in the postgis 1.0 code, given equal resources
to each process.  Then given actual loads, which process is faster.
For a typical query, my webserver is usually the bottleneck, while the
db is usually pretty idle, so I try to let the DB do as much work as
possible.  However, I've never actually timed anything, and the above
work was all done with 0.9 code, so any evidence that doing it the
other way with 1.0 code is better would be helpful.

--- James

> that parses the binary data, and appends the node of an empty SVG
> document.  I know this is possible with JS, appending nodes of svg
> documents once they're client side.  What are the possible pros and cons
> of doing this?  Thanks.
> 
> -----Original Message-----
> From: postgis-users-bounces at postgis.refractions.net
> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of Paul
> Ramsey
> Sent: Thursday, March 17, 2005 3:17 PM
> To: PostGIS Users Discussion
> Subject: Re: [postgis-users] WKB Geometry Column
> 
> Stephen Woodbridge wrote:
> 
> > I have follow up questions.
> > What applications on the client side use the WKB format now?
> 
> Mapserver, Geotools (Geoserver / uDig), for a few.
> 
> > Where is the WKB format documented?
> 
> In the OpenGIS SGSQL standard. Probably we should copy that description 
> into our own documentation.
> 
> > As I think it would make sense to build a PHP and Perl Class the can 
> > parse the WKB into data structures if they don't already exist. Do
> these 
> > already exist?
> 
> Not sure. I have done half-assed perl WKB handlers in the past, but they
> 
> are long gone into the bit bucket...
> 
> Paul
> 
> > 
> > -Steve
> > 
> > Paul Ramsey wrote:
> > 
> >> Sean,
> >>
> >> You are confusing storage format with inout/output format. No matter 
> >> how you get data *into* postgis, it will be stored the same way,
> using 
> >> the same internal structure. You have three choices for getting data 
> >> into and out of postgis:
> >>
> >> - Use the "canonical format". This is a string representation of the 
> >> internal format. It might change, if we want to change the internal 
> >> format. It currently looks a lot like WKB, but it is not. You can see
> 
> >> the canonical format by going "select geom_column from geom_table".
> >> - Use WKT. This is a human readable form of the geometry, using 
> >> numbers and brackets and whatnot.  You can see the WKT format by
> going 
> >> "select astext(geom_column) from geom_table".
> >> - Use WKB. This is a computer readable form of hte geometry, a 
> >> byte-stream.  You can see the canonical format by going "select 
> >> asbinary(geom_column) from geom_table".
> >>
> >> The only thing that is "faster" about WKB is your handling of it on 
> >> the client side. It is much faster for client software to parse the 
> >> byte-stream than it for it to parse WKT. PostGIS itself will not have
> 
> >> noticably different performance if you use WKB or WKT on output.
> >>
> >> Paul
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users




More information about the postgis-users mailing list