[postgis-users] Storage efficiency of point and line data

Paul Ramsey pramsey at refractions.net
Mon Nov 4 13:31:52 PST 2002


typedef struct
{
   int32  size;     // postgres variable-length type requirement
   int32  SRID;     // spatial reference system id
   double offsetX;  // for precision grid (future improvement)
   double offsetY;  // for precision grid (future improvement)
   double scale;    // for precision grid (future improvement)
   int32  type;     // this type of geometry
   bool   is3d;     // true if the points are 3d (only for output)
   BOX3D  bvol;     // bounding volume of all the geo objects
   int32  nobjs;    // how many sub-objects in this object
   int32  objType[1];   // type of object
   int32  objOffset[1]; // offset (in bytes) into this structure where
                        // the object is located
   char   objData[1];   // store for actual objects

} GEOMETRY;

That's the structure. Above and beyond the actual ordinates, we are 
storing about 100 bytes of metadata. That is a bit fluffier than a 
shapefile, but not by a lot. Admittedly, though, when storing single 
points (or two-point lines), it is a pretty massive overhead.
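
As a rough check, the mem_size() figures quoted below (172 bytes for a 
two-point line, plus 24 bytes per additional point) can be turned into a 
small linear model. This is only a sketch fitted to those reported 
numbers, not code from PostGIS; the function names are made up for 
illustration. The 24 bytes per point would be consistent with three 
8-byte doubles per point, but that is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Figures taken from the mem_size() numbers reported in this thread. */
enum {
    SIZE_TWO_POINT_LINE = 172, /* reported size of a two-point line   */
    BYTES_PER_POINT     = 24   /* reported growth per additional point */
};

/* Hypothetical helper: predicted mem_size() for a line of npoints. */
size_t predicted_size(size_t npoints)
{
    assert(npoints >= 2); /* the model is only fitted to lines */
    return SIZE_TWO_POINT_LINE + BYTES_PER_POINT * (npoints - 2);
}

/* Hypothetical helper: bytes left over once the points are subtracted. */
size_t fixed_overhead(size_t npoints)
{
    return predicted_size(npoints) - BYTES_PER_POINT * npoints;
}

/* Hypothetical helper: print the model for 2..10 points. */
void print_size_table(void)
{
    size_t n;
    for (n = 2; n <= 10; n++)
        printf("%2lu points: %lu bytes (%lu fixed)\n",
               (unsigned long) n,
               (unsigned long) predicted_size(n),
               (unsigned long) fixed_overhead(n));
}
```

Under this model the fixed part comes out the same for every line length, 
in the same ballpark as the "about 100 bytes" of metadata above, so for a 
two-point line most of the stored bytes are header rather than ordinates.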

Michael Graff wrote:
> It seems there is a large overhead to storing point and line data in
> a geometry type.  mem_size() returns 172 bytes for a two-point line,
> and goes up by 24 bytes per additional point.  Returning the data in
> binary form seems to show only 6 bytes per point, so perhaps this
> is twice the actual storage.
> 
> I thought about storing only the bounding boxes in a table, and
> storing the actual shape in a flat binary file (probably storing
> each lat/long pair as a pair of 32-bit signed integers) but it
> turns out that wouldn't be a huge win, as most of the data I have
> consists of 2 points:
> 
>    cnt    | points | size  
> ----------+--------+-------
>  23333966 |      2 |   172
>   6789516 |      3 |   196
>   3712433 |      4 |   220
>   2438493 |      5 |   244
>   1749440 |      6 |   268
>   1346119 |      7 |   292
>    976198 |      8 |   316
>    806865 |      9 |   340
>    658199 |     10 |   364
> 
> Is the storage format fairly efficient, and I'm simply storing a whole
> lot of data?
> 


-- 
       __
      /
      | Paul Ramsey
      | Refractions Research
      | Email: pramsey at refractions.net
      | Phone: (250) 885-0632
      \_




