[postgis-devel] geometry_columns improving performance for constraint columns

Paragon Corporation lr at pcorp.us
Fri Apr 24 22:23:39 PDT 2015


Slight correction.  My SOLUTION 2 test metrics were wrong.  So solution 2
does much better than I had stated. Not sure if that was a fluke or I was
accidentally testing the old view for the POINT test. 
Also the LINESTRING test is a mix of typmod and constraint (I forgot my
script also creates a constraint linestring column) and alsways has a tymod
column.

SOLUTION 2 corrections
-------------------------------

-- -- 84,570 - 85,371 ms, 30,000 rows (slow for some reason)
SELECT *  FROM  geometry_columns;

-- 410 - 420 ms  - 5,000 rows (constraint columns) 
SELECT * FROM  geometry_columns WHERE type = 'POINT';

-- 1490ms- 1500 ms   - 20,000 rows (mix of constraint and  columns) 
SELECT * FROM  geometry_columns WHERE type = 'LINESTRING';

-- 220 - 240 ms
SELECT * from geometry_columns
WHERE  f_table_name = 'constraint_test_linestring_1382';

-----Original Message-----
From: Paragon Corporation [mailto:lr at pcorp.us] 
Sent: Saturday, April 25, 2015 12:54 AM
To: 'PostGIS Development Discussion'
Subject: geometry_columns improving performance for constraint columns

The geometry_columns table is very slow for large numbers of geometry
columns when querying constraint based columns.
BTW this idea probably applies to raster_columns as well. Though I don't
think raster_columns view is used quite as much as geometry_columns.

There are 3 revisions to this view that I have tested that are a significant
improvement to the existing view.

Detailed here: http://trac.osgeo.org/postgis/ticket/3092 

To test I ran this script that created 30,000 geometry columns (still need
to test on 9.3 and 9.2). I was testing on 9.4

-- do this and then restart service (can't create that many columns or even
drop that many tables in one transaction without this) ALTER SYSTEM SET
max_locks_per_transaction = 100000; 


-- build multigeom column tables one with typmod and one with constraint -
30,000 columns total

DO language plpgsql
$$
DECLARE var_sql text;
BEGIN
  SELECT string_agg('CREATE TABLE test.constraint_test_' || lower(f.geom) ||
'_' || i::text || '(gid serial, geom2 geometry(LINESTRING,2249) );
  SELECT addgeometrycolumn(''test'', ' 
  || quote_literal('constraint_test_' || lower(f.geom) || '_' || i::text) ||
' , ''geom'', 
	' || f.srid::text || ',' || quote_literal(f.geom) || ', 2, false);',
'') INTO var_sql
  FROM (VALUES ('POINT'::text, 4326), ('MULTIPOLYGON'::text, 4269),
('LINESTRING'::text, 4326)) f(geom,srid) 
		, generate_series(1,5000) As i;
  EXECUTE var_sql;
END;
$$ 

To test I generated 15,000 tables each with 2 geometry columns (1 constraint
and 1 typmod)  for a total of 30,000 columns script futher down And then ran
these tests:


One Paul has already committed to trunk.  The 3 solutions and issues with
each are as follows:

SOLUTION 1
-----------------
 One committed introduces 3 new functions (albeit they are only used by the
view) so one could argue it could go in a micro release because there is no
API change.
That's a borderline argument and I hate having 3 new functions whose only
purpose is to be used in a view.  

This solution is a mid-range
-- 2520ms - 2830 ms - 30,000 rows
SELECT *  FROM  geometry_columns;
-- 1260 ms -- 5,000 rows
SELECT * FROM  geometry_columns WHERE  type = 'POINT';

-- 1840 ms - 20,000 rows (this is the typmod row) SELECT * FROM
geometry_columns WHERE type = 'LINESTRING';

-- 500 - 510 ms
SELECT * from geometry_columns
WHERE  f_table_name = 'constraint_test_linestring_1382';


SOLUTION 2
----------------
This one has no extra functions -- just changes the geometry_columns view,
so no issue with pushing to a micro (and we could deprecate the 3 functions
we are currently using).  Also doesn't use any constructs we don't have
available in PostgreSQL 9.0 It's slower than solution 1 for a full column
list  pull or query against constraint based columns, but faster for a
single column pull or typmod columns.

-- -- 85,371 ms, 30,000 rows
SELECT *  FROM  geometry_columns;

-- 154,480 ms - 5,000 rows (constraint columns) SELECT * FROM
geometry_columns WHERE type = 'POINT';

-- 1500 ms - 20,000 rows (typmod columns) SELECT * FROM  geometry_columns
WHERE type = 'LINESTRING';

-- 220 ms
SELECT * from geometry_columns
WHERE  f_table_name = 'constraint_test_linestring_1382';


SOLUTION 3
---------------------
This one is my favorite solution (wins on all contests)  but would require
putting in some conditional logic since it uses the LATERAL construct only
available in PostgreSQL 9.3 or above.  For lower we'd have to fall back on
solutions 1 or 2 (I would choose solution 2 even though it's slower for full
table list because has no extra functions and it is very similar in
structure so just requires a change to the JOIN with constraint catalog
table.

-- -- 2080ms - 2090 ms, 30,000 rows
SELECT *  FROM  geometry_columns;

-- 450 - 460 ms  (constraint columns) 5000 rows SELECT * FROM
geometry_columns WHERE type = 'POINT';

-- 1400 ms - 20,000 rows (typmod columns) SELECT * FROM  geometry_columns
WHERE type = 'LINESTRING';

-- 220 ms- 2 records
SELECT * FROM  geometry_columns
WHERE  f_table_name = 'constraint_test_linestring_1382';



Anyone have thoughts?

Thanks,
Regina
http://www.postgis.us
http://postgis.net


-





More information about the postgis-devel mailing list