[postgis-devel] geometry_columns improving performance for constraint columns

Paragon Corporation lr at pcorp.us
Sat Apr 25 12:23:45 PDT 2015


Okay I think I found a 4th solution that is just as good as the lateral one,
doesn't use any additional functions, and no new syntax.  So this one seems
to be a CLEAR winner.

IT's the obvious solution, but one I didn't expect to fair so well.  That is
to do 3 LEFT JOINs on constraint instead of 1.

I'll post to the ticket and commit if we are all in agreement, but the stats
for this are as follows:

Thanks,
Regina



-----Original Message-----
From: postgis-devel-bounces at lists.osgeo.org
[mailto:postgis-devel-bounces at lists.osgeo.org] On Behalf Of Paragon
Corporation
Sent: Saturday, April 25, 2015 2:09 PM
To: 'PostGIS Development Discussion'
Subject: Re: [postgis-devel] geometry_columns improving performance for
constraint columns

Nothing against regex.  I just used what was there before and was too lazy
to lookup the manual proper regex way to do it.  I tried to using regex
match you have but that didn't work because it's a set returning function
and when it returns no set, it just blows up.  I was going to change to use
regexp replace.

Your CTE is not quite as a performance killer as I thought it would be so
not so much against it if it makes gimmee all the tables faster.

If you can get rid of use of extra functions, then I'd go with your approach
:).  

I think for mapserver filtering is very common no?  Doesn't it look for
specific table, column? To get the srid?  But agree gimme the whole table is
very common -- probably more so with things like QGIS.


This brought up another issue I observed.  The fact we cast table name,
column name etc. to varchar(256) means we can't use an index for those.  If
I strip off the casting, I get a 6 fold increase in performance for exact
equality matches (e.g. WHERE f_table_name =
'constraint_test_linestring_1382'; )

It's too bad we are stuck with that though maybe we'd want to consider
changing that in 2.2.

Thanks,
Regina

-----Original Message-----
From: postgis-devel-bounces at lists.osgeo.org
[mailto:postgis-devel-bounces at lists.osgeo.org] On Behalf Of Paul Ramsey
Sent: Saturday, April 25, 2015 11:57 AM
To: PostGIS Development Discussion
Subject: Re: [postgis-devel] geometry_columns improving performance for
constraint columns

I could probably strip the CTE out of mine, and maybe even the functions, I
was just too intellectually lazy to try. This is why I should have left it
to the SQL master (Although apparently not the regex master? What is your
objection to regular expressions?)

I m  fine w/ whatever you like, though I think filtering is probably not as
common a use case as just  gimmee the list of tables 

P.


--
http://postgis.net
http://cleverelephant.ca


On April 24, 2015 at 10:23:36 PM, Paragon Corporation (lr at pcorp.us) wrote:
> Slight correction. My SOLUTION 2 test metrics were wrong. So solution
> 2 does much better than I had stated. Not sure if that was a fluke or 
> I was accidentally testing the old view for the POINT test.
> Also the LINESTRING test is a mix of typmod and constraint (I forgot 
> my script also creates a constraint linestring column) and alsways has 
> a tymod column.
>  
> SOLUTION 2 corrections
> -------------------------------
>  
> -- -- 84,570 - 85,371 ms, 30,000 rows (slow for some reason) SELECT * 
> FROM geometry_columns;
>  
> -- 410 - 420 ms - 5,000 rows (constraint columns) SELECT * FROM 
> geometry_columns WHERE type = 'POINT';
>  
> -- 1490ms- 1500 ms - 20,000 rows (mix of constraint and columns) 
> SELECT * FROM geometry_columns WHERE type = 'LINESTRING';
>  
> -- 220 - 240 ms
> SELECT * from geometry_columns
> WHERE f_table_name = 'constraint_test_linestring_1382';
>  
> -----Original Message-----
> From: Paragon Corporation [mailto:lr at pcorp.us]
> Sent: Saturday, April 25, 2015 12:54 AM
> To: 'PostGIS Development Discussion'
> Subject: geometry_columns improving performance for constraint columns
>  
> The geometry_columns table is very slow for large numbers of geometry 
> columns when querying constraint based columns.
> BTW this idea probably applies to raster_columns as well. Though I 
> don't think raster_columns view is used quite as much as geometry_columns.
>  
> There are 3 revisions to this view that I have tested that are a 
> significant improvement to the existing view.
>  
> Detailed here: http://trac.osgeo.org/postgis/ticket/3092
>  
> To test I ran this script that created 30,000 geometry columns (still 
> need to test on 9.3 and 9.2). I was testing on 9.4
>  
> -- do this and then restart service (can't create that many columns or 
> even drop that many tables in one transaction without this) ALTER 
> SYSTEM SET max_locks_per_transaction = 100000;
>  
>  
> -- build multigeom column tables one with typmod and one with 
> constraint -
> 30,000 columns total
>  
> DO language plpgsql
> $$
> DECLARE var_sql text;
> BEGIN
> SELECT string_agg('CREATE TABLE test.constraint_test_' ||
> lower(f.geom) || '_' || i::text || '(gid serial, geom2
> geometry(LINESTRING,2249) ); SELECT addgeometrycolumn(''test'', '
> || quote_literal('constraint_test_' || lower(f.geom) || '_' ||
> || i::text) ||
> ' , ''geom'',
> ' || f.srid::text || ',' || quote_literal(f.geom) || ', 2, false);',
> '') INTO var_sql
> FROM (VALUES ('POINT'::text, 4326), ('MULTIPOLYGON'::text, 4269), 
> ('LINESTRING'::text, 4326)) f(geom,srid) , generate_series(1,5000) As 
> i; EXECUTE var_sql; END; $$
>  
> To test I generated 15,000 tables each with 2 geometry columns (1 
> constraint and 1 typmod) for a total of 30,000 columns script futher 
> down And then ran these tests:
>  
>  
> One Paul has already committed to trunk. The 3 solutions and issues 
> with each are as follows:
>  
> SOLUTION 1
> -----------------
> One committed introduces 3 new functions (albeit they are only used by 
> the
> view) so one could argue it could go in a micro release because there 
> is no API change.
> That's a borderline argument and I hate having 3 new functions whose 
> only purpose is to be used in a view.
>  
> This solution is a mid-range
> -- 2520ms - 2830 ms - 30,000 rows
> SELECT * FROM geometry_columns;
> -- 1260 ms -- 5,000 rows
> SELECT * FROM geometry_columns WHERE type = 'POINT';
>  
> -- 1840 ms - 20,000 rows (this is the typmod row) SELECT * FROM 
> geometry_columns WHERE type = 'LINESTRING';
>  
> -- 500 - 510 ms
> SELECT * from geometry_columns
> WHERE f_table_name = 'constraint_test_linestring_1382';
>  
>  
> SOLUTION 2
> ----------------
> This one has no extra functions -- just changes the geometry_columns 
> view, so no issue with pushing to a micro (and we could deprecate the
> 3 functions we are currently using). Also doesn't use any constructs 
> we don't have available in PostgreSQL 9.0 It's slower than solution 1 
> for a full column list pull or query against constraint based columns, 
> but faster for a single column pull or typmod columns.
>  
> -- -- 85,371 ms, 30,000 rows
> SELECT * FROM geometry_columns;
>  
> -- 154,480 ms - 5,000 rows (constraint columns) SELECT * FROM 
> geometry_columns WHERE type = 'POINT';
>  
> -- 1500 ms - 20,000 rows (typmod columns) SELECT * FROM 
> geometry_columns WHERE type = 'LINESTRING';
>  
> -- 220 ms
> SELECT * from geometry_columns
> WHERE f_table_name = 'constraint_test_linestring_1382';
>  
>  
> SOLUTION 3
> ---------------------
> This one is my favorite solution (wins on all contests) but would 
> require putting in some conditional logic since it uses the LATERAL 
> construct only available in PostgreSQL 9.3 or above. For lower we'd 
> have to fall back on solutions 1 or 2 (I would choose solution 2 even 
> though it's slower for full table list because has no extra functions 
> and it is very similar in structure so just requires a change to the 
> JOIN with constraint catalog table.
>  
> -- -- 2080ms - 2090 ms, 30,000 rows
> SELECT * FROM geometry_columns;
>  
> -- 450 - 460 ms (constraint columns) 5000 rows SELECT * FROM 
> geometry_columns WHERE type = 'POINT';
>  
> -- 1400 ms - 20,000 rows (typmod columns) SELECT * FROM 
> geometry_columns WHERE type = 'LINESTRING';
>  
> -- 220 ms- 2 records
> SELECT * FROM geometry_columns
> WHERE f_table_name = 'constraint_test_linestring_1382';
>  
>  
>  
> Anyone have thoughts?
>  
> Thanks,
> Regina
> http://www.postgis.us
> http://postgis.net
>  
>  
> -
>  
>  
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-devel
>  

_______________________________________________
postgis-devel mailing list
postgis-devel at lists.osgeo.org
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-devel


_______________________________________________
postgis-devel mailing list
postgis-devel at lists.osgeo.org
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-devel





More information about the postgis-devel mailing list