[postgis-users] optimal textbased geometry format

Willy-Bas Loos willybas at gmail.com
Tue Jul 29 10:11:42 PDT 2008


funny, the EWKB is exactly half the size of the "PostGIS binary notation",
in all cases i tested.
there is a small performance impact, which fluctuates somewhat. The internal
format was a factor 1.01 to 1.07 faster (94 to 99% of the time needed).

cheers,

WBL

On Tue, Jul 29, 2008 at 6:48 PM, Willy-Bas Loos <willybas at gmail.com> wrote:

> thanks.
> yes it helps :)
>
> On Tue, Jul 29, 2008 at 4:46 PM, Obe, Regina <robe.dnd at cityofboston.gov>wrote:
>
>>  Haven't done any benchmarks, but I know the following
>>
>> the ::text display you get is basically the string representation of the
>> native binary format that PostGIS actually stores the geometries.
>>
>> It is not WKB nor WKT.  Its a PostGIS proprietary format if you will.  I
>> would guess casting from that format is the most efficient since in theory
>> there is very little processing that PostgreSQL needs to do to reconstitute
>> from that format.
>>
>> Keep in mind EWKT and WKT also may result in floating point errors since
>> they will round the decimals so be careful using those.
>>
>> So your best formats to maintain precision (and not have lossiness) are
>> EWKB and the ::text representation (which I shall dubb the native PostGIS
>> binary notation (which I guess we can call light-weight something or
>> other)).
>>
>> Hope that helps,
>> Regina
>>
>>  ------------------------------
>> *From:* postgis-users-bounces at postgis.refractions.net [mailto:
>> postgis-users-bounces at postgis.refractions.net] *On Behalf Of *Willy-Bas
>> Loos
>> *Sent:* Tuesday, July 29, 2008 8:43 AM
>> *To:* postgis-users at postgis.refractions.net
>> *Subject:* [postgis-users] optimal textbased geometry format
>>
>>  Sometimes i build queries inside plpgsql scripts, that i then run using
>> EXECUTE.
>> Is there a preferred way to include geometries in those?
>>
>> I use postgis 1.1 (for now). Are there are any changes in later versions
>> that influence this aspect of PostGIS?
>> The SRID is a factor for me, so asText(the_geom) is no option.
>> I eiter use asEWKT(the_geom), or just the_geom::text .
>> Just casting the geometry directly into text results in a very long text
>> value (what is the name of this format?). The length of the string might be
>> a network transfer drawback.
>> asEWKT results in human readable format, possibly shorter, but it might
>> require more processing power (back and forth!).
>>
>> I've tested both difference in length and performance, all on a limited
>> dataset.
>> LENGTH:
>> The difference in length varies greatly. I use SRID 28992, which is a
>> national grid that has coordinates in meters, so it is possible to have
>> coordinates with 0 decimal places (1 meter precision). Simple geometries
>> seem to benefit most from the WKT format. The EWKT is 40% of the number of
>> characters for some cases (square polygons from coordinates without
>> decimals). Most polygons score 50-60% of length, but in some cases the WKT
>> representation is up to 10% longer than the direct cast (geom::text).
>> code:
>>   --select length(the_geom::text), length(asewkt(the_geom)),
>> round(((length(asewkt(the_geom))::float)/(length(the_geom::text))::float)*100)
>> as perc
>>   select
>> avg(((length(asewkt(the_geom))::float)/(length(the_geom::text))::float)*100)
>> as perc
>>   from atable
>>   order by perc
>>
>> PERFORMANCE:
>> I've tested these "warm" (not the first run), and averaged the scores from
>> 3 measurements each. The fluctuations were minimal.
>> The diference in performance is small for converting from geometries to a
>> text representation. Casting directly is a factor 1.1 faster (90% of the
>> time needed)
>> The other way around is a diferent story. Converting a directly casted
>> geometry back to a geometry is a factor 6 faster than a EWKT string (16% of
>> the time needed)!
>> I hope my method (code below) is adequate?
>>
>> code:
>>   select area(the_geom::text::geometry)
>>   --select area(asewkt(the_geom)::geometry)
>>   from atable
>>
>> Regarding my tests, i would say that the format that results from a direct
>> cast is the better string representation for intra-application
>> communication. Are there any drawbacks?
>> Of course my test was limited to a small dataset on a windows pc (mem 2GB,
>> athlon 64 3200+) with postgis 1.1. Does anyone else have different
>> results/ideas about this?
>>
>> Cheers,
>>
>> WBL
>>
>>
>> ------------------------------
>>
>> * The substance of this message, including any attachments, may be
>> confidential, legally privileged and/or exempt from disclosure pursuant to
>> Massachusetts law. It is intended solely for the addressee. If you received
>> this in error, please contact the sender and delete the material from any
>> computer. *
>>
>> ------------------------------
>>
>> * Help make the earth a greener place. If at all possible resist printing
>> this email and join us in saving paper. *
>>
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20080729/e9f1ad4c/attachment.html>


More information about the postgis-users mailing list