[postgis-users] Integer geometry type
Steve Grey
stevegrey78 at gmail.com
Mon Jan 21 00:13:32 PST 2013
Have you tried creating a table with x,y,z fields and creating a functional
index for the geometry?
Some experiments:
-- populate a table of 1m records, index, cluster and update:
CREATE TABLE lidar_geom
AS SELECT
generate_series(1,1000) as x
, y
, (random() * 100)::integer as z
FROM
(SELECT
generate_series(1,1000) as y
) AS ys;
CREATE INDEX idx_st_makepoint_x_y_z_lidar_geom ON lidar_geom USING GIST
(st_makepoint(x,y,z));
CLUSTER idx_st_makepoint_x_y_z_lidar_geom ON lidar_geom;
VACUUM ANALYZE lidar_geom;
-- check sizes (your numbers may vary):
SELECT pg_size_pretty(pg_total_relation_size('lidar_geom') +
pg_total_relation_size('idx_st_makepoint_x_y_z_lidar_geom'));
--"155 MB"
-- now, create another table (with the same values as the first) but with a
concrete geometry:
CREATE TABLE
lidar_geom_concrete(x,y,z,the_geom) AS
SELECT
x
, y
, z
, st_makepoint(x,y,z)
FROM
lidar_geom;
CREATE INDEX idx_lidar_geom_concrete_the_geom ON lidar_geom_concrete USING
GIST (the_geom);
CLUSTER idx_lidar_geom_concrete_the_geom ON lidar_geom_concrete;
VACUUM ANALYZE lidar_geom_concrete;
-- check sizes:
SELECT pg_size_pretty(pg_total_relation_size('lidar_geom_concrete') +
pg_total_relation_size('idx_lidar_geom_concrete_the_geom'));
--"169 MB"
So you win with a decrease in storage. My numbers for the query planner
were all over the place, but my installation isn't tuned. Actual query
performance was broadly comparable (within the tolerances of my PC):
EXPLAIN ANALYZE
SELECT
COUNT(*)
FROM
lidar_geom
WHERE
st_intersects(st_makepoint(x,y,z), st_geomfromtext('POLYGON((500 500,500
503,503 503,503 500,500 500))'));
"Aggregate (cost=25.55..25.56 rows=1 width=0) (actual time=0.593..0.593
rows=1 loops=1)"
" -> Bitmap Heap Scan on lidar_geom (cost=4.59..25.54 rows=2 width=0)
(actual time=0.528..0.586 rows=16 loops=1)"
" Recheck Cond: (st_makepoint((x)::double precision, (y)::double
precision, (z)::double precision) &&
'010300000001000000050000000000000000407F400000000000407F400000000000407F400000000000707F400000000000707F400000000000707F400000000000707F400000000000407F400000000000407F400000000000407F40'::geometry)"
" Filter: _st_intersects(st_makepoint((x)::double precision,
(y)::double precision, (z)::double precision),
'010300000001000000050000000000000000407F400000000000407F400000000000407F400000000000707F400000000000707F400000000000707F400000000000707F400000000000407F400000000000407F400000000000407F40'::geometry)"
" -> Bitmap Index Scan on idx_st_makepoint_x_y_z_lidar_geom
(cost=0.00..4.59 rows=5 width=0) (actual time=0.492..0.492 rows=16
loops=1)"
" Index Cond: (st_makepoint((x)::double precision, (y)::double
precision, (z)::double precision) &&
'010300000001000000050000000000000000407F400000000000407F400000000000407F400000000000707F400000000000707F400000000000707F400000000000707F400000000000407F400000000000407F400000000000407F40'::geometry)"
"Total runtime: 0.699 ms"
EXPLAIN ANALYZE
SELECT
COUNT(*)
FROM
lidar_geom_concrete
WHERE
st_intersects(the_geom, st_geomfromtext('POLYGON((500 500,500 503,503
503,503 500,500 500))'));
"Aggregate (cost=50.30..50.31 rows=1 width=0) (actual time=0.406..0.406
rows=1 loops=1)"
" -> Bitmap Heap Scan on lidar_geom_concrete (cost=4.60..50.29 rows=4
width=0) (actual time=0.362..0.399 rows=16 loops=1)"
" Recheck Cond: (the_geom &&
'010300000001000000050000000000000000407F400000000000407F400000000000407F400000000000707F400000000000707F400000000000707F400000000000707F400000000000407F400000000000407F400000000000407F40'::geometry)"
" Filter: _st_intersects(the_geom,
'010300000001000000050000000000000000407F400000000000407F400000000000407F400000000000707F400000000000707F400000000000707F400000000000707F400000000000407F400000000000407F400000000000407F40'::geometry)"
" -> Bitmap Index Scan on idx_lidar_geom_concrete_the_geom
(cost=0.00..4.60 rows=11 width=0) (actual time=0.348..0.348 rows=16
loops=1)"
" Index Cond: (the_geom &&
'010300000001000000050000000000000000407F400000000000407F400000000000407F400000000000707F400000000000707F400000000000707F400000000000707F400000000000407F400000000000407F400000000000407F40'::geometry)"
"Total runtime: 0.485 ms"
It looks like any difference came in the recheck, although the planner
expected the functional index (non-concrete geometry) to be faster.
This is based on Postgres 9.1, I have no idea what index-only scans in 9.2
would produce. Incidentally, playing with FILLFACTOR on the tables and
indices didn't help me.
As another suggestion, I read elsewhere - and some time ago - that it can
help to cluster points, perhaps 50 (tbd), into a single record so the
bounding box intersection test has much less work to do and you explode
etc. the results from there.
Steve
On Wed, Jan 16, 2013 at 1:49 PM, Bruce Rindahl <bruce.rindahl at gmail.com>
wrote:
> I am currently building a large PostGIS database of lidar data and going
> through several performance tweaks to make 1.5 billion records respond
> better. One of the big ones was to convert the z data by multiplying by
100
> and storing it as an integer instead of a double (I know - pretty
obvious).
> The resulting reduction in storage space is significant as is query
> performance. The x and y values are stored as a point value in a
geometry
> type which i believe is stored like two double precision values. Would
it
> be possible to specify a special geometry type using integer values
instead
> of doubles? You could either add a factor (and a special SRID) or just
> round the values (possible in my case - are the lidar points really
accurate
> to less than a foot?). D3.js is doing something very similar to this in
its
> topoJSON format.
> How much brain damage would this cause? Just thinking out loud...
> Bruce
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20130121/23f43c8e/attachment.html>
More information about the postgis-users
mailing list