[postgis-users] CPU tuning

Rémi Cura remi.cura at gmail.com
Mon Jan 9 06:21:01 PST 2017


Hey,
option 2 is the fastest one ;-)
, option 1 is simply a pure SQL rewrite of your query.
The machine I used for test is way weaker than yours (I'd say 2x weaker).


postgres 9.4 is non-parallelised, what you see is that postgres use 100% of
a core, but the core beig used is rotated,
which mean the average usage of your cpu is 100% / 4 (your number of core),
which might be the explaination for your 30%.

What you see is expected : postgres is going to use the max amount of ram,
then use the harddrive when no more ram is available.
This is not the case with option 2, as very few ram is needed.

Don't forget to create the appropriate indexes on your table t_y.

Cheers,
Rémi-C

2017-01-09 13:56 GMT+01:00 Olivier Leprêtre <o.lepretre at gmail.com>:

> Hi,
>
>
>
> Thanks for your answer and this way of calculation. I don't know which
> machine you have but on mine, Option 1 launched from pgadmin takes
> 2,564,423 ms  (about 45 mn) a bit less than my original code (2,823,272
> ms). So if this takes less than 1 second on yours, there is definitively a
> problem on my configuration or my machine which is nevertheless a very slow
> one : Hp Envy (Windows 10, i7-6500 CPU 2,5 Ghz 8go). I'm using
> postgresql/postgis 9.4 with its standard configuration.
>
>
>
> During the option 1 query, the processor used again 30/35% CPU during all
> time. Difference is about memory, it grows up to 1Gb (shared buffer is set
> to this value) and then begin using disk about 10 Mo/s.  So I'm still
> wondering why the cpu doesn't go further than 30% and why this take such a
> difference on your machine. Is it a linux OS ?
>
>
>
> Olivier
>
>
>
>
>
> *De :* postgis-users [mailto:postgis-users-bounces at lists.osgeo.org] *De
> la part de* Rémi Cura
> *Envoyé :* lundi 9 janvier 2017 10:52
> *À :* PostGIS Users Discussion
> *Objet :* Re: [postgis-users] CPU tuning
>
>
>
> Hey,
>
> I
>
> ​'m afraid you may not use the most efficient approach.
>
> Assuming you want to find for each node of table x the closest node of
> table y,
>
> it takes less than 1 second on my computer.
>
>
>
>
> DROP TABLE  IF EXISTS  test_1;
> CREATE TABLE test_1 AS
> SELECT s AS gid, ST_makePoint(random()*1000,random()*1000) AS geom
> FROM generate_series(1,12000) AS s ;
> CREATE INDEX ON test_1 USING GIST(geom) ;
> ALTER TABLE test_1 ADD PRIMARY KEY (gid) ;
>
>
> DROP TABLE  IF EXISTS  test_2;
> CREATE TABLE test_2 AS
> SELECT s AS gid, ST_makePoint( random()*1000,random()*1000) AS geom
> FROM generate_series(1,50000) AS s ;
> CREATE INDEX ON test_2 USING GIST(geom) ;
> ALTER TABLE test_2 ADD PRIMARY KEY (gid) ;
>
> -- option 1 : you have lots of ram : brute force
> DROP TABLE IF EXISTS test_results ;
> CREATE TABLE test_results AS
> SELECT DISTINCT ON (test_1.gid ) test_1.gid AS gid1, test_2.gid AS gid2--
> , ST_MakeLine(test_1.geom,test_2.geom) AS sline
> FROM test_1, test_2
> ORDER BY test_1.gid ,ST_Distance(test_1.geom,test_2.geom) ASC ;
>
>
> --option2  : you have limited ram : snipping : force index use
>
> DROP FUNCTION IF EXISTS test_sdist(geom1 geometry) ;
> CREATE OR REPLACE FUNCTION test_sdist(geom1 geometry, OUT gid2 bigint, OUT
> geom2 geometry) AS
>  $$
>     --given a point, find the closes one in test_2
>     DECLARE
>     BEGIN
>         SELECT gid, geom INTO gid2, geom2
>         FROM test_2
>         ORDER BY ST_Distance(geom2,test_2.geom) ASC
>         LIMIT 1  ;
>         RETURN ;
>     END;
> $$ LANGUAGE plpgsql;
>
> DROP TABLE IF EXISTS test_result_2 ;
> CREATE TABLE IF NOT EXISTS test_result_2 AS
> SELECT test_1.gid AS gid1, f.gid2 AS gid2
> FROM test_1, test_sdist(test_1.geom) AS f
>
> --option 3 : you could use a recursive CTE to avoid creating a dedicated
> dummy function,
>
> --           this would be a pain to write though
>
> Cheers,
>
> Rémi-C​
>
>
>
>
>
> 2017-01-08 16:36 GMT+01:00 Olivier Leprêtre <o.lepretre at gmail.com>:
>
> Hi,
>
>
>
> I have a pgsql postgis function that last about an hour on an Hp Envy
> (W10, i7-6500 CPU 2,5 Ghz 8go). This function is calculating minimum
> distance between each nodes of table x (12000 nodes) and table y (42000
> nodes)
>
>
>
> FOR row IN
>
>                    SELECT code,ST_AsEwkt(ST_StartPoint(geom)) as geom1
>  FROM t_x
>
> LOOP
>
>                    FOR row1 IN
>
>                             SELECT code, ST_Distance(ST_AsEwkt(geom)::text,row.geom1)
> FROM t_y order by ST_Distance(ST_AsEwkt(geom)::text,row.geom1) LIMIT 1
>
>
>
>                    LOOP
>
>                    END LOOP;
>
> ....
>
>
>
>
>
> What I don't understand is that postgresql server process CPU is stable
> during all that time at 30/31% with no other process running. Details of
> each processor 1 to 3 shows that all 4 uses 30%. Disk and network usage are
> 0% (database is local) Ram usage is 47% and Postgresql server uses process
> only 7,4 Mo.
>
>
>
> My question is "Why postgres process uses such a precise and constant cpu
> and doesn't grow up to more than 30% ?" I didn't find any parameter forcing
> such a limit in order to reduce duration of this calculation.
>
>
>
> Thanks for any answer,
>
>
>
> Olivier
>
>
> ------------------------------
>
> [image: Avast logo]
> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=emailclient>
>
> L'absence de virus dans ce courrier électronique a été vérifiée par le
> logiciel antivirus Avast.
> www.avast.com
> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=emailclient>
>
>
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/postgis-users
>
>
>
>
> ------------------------------
> [image: Avast logo]
> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=emailclient>
>
> L'absence de virus dans ce courrier électronique a été vérifiée par le
> logiciel antivirus Avast.
> www.avast.com
> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=emailclient>
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/postgis-users
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20170109/65588dba/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 1588 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20170109/65588dba/attachment.png>


More information about the postgis-users mailing list