[postgis-tickets] [PostGIS] #2976: geometry_columns performance on large databases

PostGIS trac at osgeo.org
Fri Oct 24 11:45:58 PDT 2014


#2976: geometry_columns performance on large databases
-----------------------------------------+----------------------------------
 Reporter:  rnewton                      |       Owner:  pramsey      
     Type:  enhancement                  |      Status:  new          
 Priority:  medium                       |   Milestone:  PostGIS 2.1.5
Component:  postgis                      |     Version:  2.1.x        
 Keywords:  pgsql2shp, geometry_columns  |  
-----------------------------------------+----------------------------------
 I discovered this performance issue when using pgsql2shp. Our environment
 uses thousands of tables in a single database where the general
 interaction is to not perform lookups via the postgres tables (pg_class,
 pg_namespace, etc), but rather directly by name.

 The query that retrieves the projection string in pgsql2shp uses the
 geometry_columns view to find the proper projection. In our environment,
 this lookup is very slow because it has to enumerate all the tables in
 order to find every one that includes a geometry column.

 This had the effect of causing our shapefile queries to timeout in some
 cases. To export the shape for a single city was taking about 15 minutes.

 As a work-around, we recreated the geometry_columns view as a materialized
 view. Even without indexes, this greatly improves the performance, in our
 environment. The same single-city shapefile will be generated in about 1
 second. We can schedule the sync of the materialized view on a nightly
 basis or resync when adding tables with new geometry columns, which is
 infrequent for us.

 Ours might be a special case due to the huge number of tables in our
 database, ability to use PostgreSQL 9.3+ and stable number of tables with
 geometry columns.

 A possible, related fix would be to include a flag to pgsql2shp for
 specifying the projection. This wouldn't replace the geometry_columns
 check, but rather provide a short cut.

 My suggestion would be to use a materialized view for geometry_columns (as
 we're doing now) and triggering the resync when a new geometry column is
 added to a table (the same hook that used to exist when geometry_columns
 was a table).

-- 
Ticket URL: <http://trac.osgeo.org/postgis/ticket/2976>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.


More information about the postgis-tickets mailing list