[postgis-users] Help optimizing query

Wed Dec 21 07:43:15 PST 2022

> Hi all!
> 
> I have a table "elevation", defined as:
> 
> gis=# \d+ elevation
>                                        Table "public.elevation"
>   Column   |       Type       | Collation | Nullable | Default | Storage
> | Stats target | Description
>
-----------+------------------+-----------+----------+---------+---------+--
------------+----------
> ---
>  latlng    | geography        |           |          |         | main
> |              |
>  lat       | double precision |           |          |         | plain
> |              |
>  lng       | double precision |           |          |         | plain
> |              |
>  elevation | real             |           |          |         | plain
> |              |
> 
> Hier I saved data about elevation of the terrain...
> It works and I can use the data, but this is very slow if I'd like to get
the
> elevation graph of a path...
> 
> Currently I use this function:
> 
> CREATE OR REPLACE FUNCTION getElevationsOnLine(startPoint geometry,
> endPoint geometry, pointsInterval int) RETURNS TABLE (
>   lat DOUBLE PRECISION,
>   lng DOUBLE PRECISION,
>   elevation REAL,
>   dist DOUBLE PRECISION
> )
> LANGUAGE plpgsql
> AS $$
> DECLARE
>   point geometry;
> BEGIN
>   FOR point IN
>     SELECT (ST_DumpPoints(ST_Segmentize(ST_MakeLine(startPoint,
> endPoint)::geography, pointsInterval)::geometry)).geom
>   LOOP
>     RETURN QUERY
>       SELECT elevation.lat, elevation.lng, elevation.elevation,
elevation.latlng <->
> point AS dist
>       FROM elevation
>       ORDER BY dist LIMIT 1;
>   END LOOP;
> END;
> $$;
> 
> The problem is the SELECT elevation.... query.
> Using explain analyze I see:
> 
>                                                                    QUERY
PLAN
>
----------------------------------------------------------------------------
---------------------------
> ------------------------------------------
>  Limit  (cost=165202.56..165202.68 rows=1 width=28) (actual
> time=2939.661..2946.685 rows=1 loops=1)
>    ->  Gather Merge  (cost=165202.56..297446.23 rows=1133438 width=28)
> (actual time=2939.658..2946.680 rows=1 loops=1)
>          Workers Planned: 2
>          Workers Launched: 2
>          ->  Sort  (cost=164202.54..165619.33 rows=566719 width=28)
(actual
> time=2828.064..2828.066 rows=1 loops=3)
>                Sort Key: ((latlng <->
> '0101000020E6100000E17A14AE47412C40BC74931804A64940'::geography))
>                Sort Method: top-N heapsort  Memory: 25kB
>                Worker 0:  Sort Method: top-N heapsort  Memory: 25kB
>                Worker 1:  Sort Method: top-N heapsort  Memory: 25kB
>                ->  Parallel Seq Scan on elevation  (cost=0.00..161368.94
> rows=566719 width=28) (actual time=3.910..2415.377 rows=453375 loops=3)
> Planning Time: 0.199 ms  Execution Time: 2946.763 ms
> 
> the given coordinate is just to check...
> 
> I already tried to define an index over latlng, but it does not help.
> 
> Can someone help me to speed up the query?
> 
> Thanks a lot
> Luca Bertoncello
> (lucabert at lucabert.de)

Can you output 

\dS+ elevation

Your query is not using an index, but hard to tell with \dt output if you
have one.
Even if you do have an index, it wouldn't be used, because your geography is
probably being autocast to a geometry in the <-> call.  I forget how that
precedence is done.

1) Make sure you do have a gist index on latlng

2) Try rewriting your function like below, which is the same as your
original, but casting the geom back to geography when it comes out of
ST_DumpPoints.

CREATE OR REPLACE FUNCTION getElevationsOnLine(startPoint geometry, endPoint
geometry, pointsInterval int) RETURNS TABLE (
  lat DOUBLE PRECISION,
  lng DOUBLE PRECISION,
  elevation REAL,
  dist DOUBLE PRECISION
)
LANGUAGE plpgsql
AS $$
DECLARE
  point geometry;
BEGIN
  FOR point IN
    SELECT (ST_DumpPoints(ST_Segmentize(ST_MakeLine(startPoint,
endPoint)::geography, pointsInterval)::geometry)).geom::geography
  LOOP
    RETURN QUERY
      SELECT elevation.lat, elevation.lng, elevation.elevation,
elevation.latlng <-> point AS dist
      FROM elevation
      ORDER BY dist LIMIT 1;
  END LOOP;
END;
$$;