[pgpointcloud] PC_Intersects and indexing

Regina Obe lr at pcorp.us
Thu May 3 01:26:26 PDT 2018


One addition, as Eric mentioned, the function you use in the index should be
the same as what the underlying function uses.

 

That said replace  where I have PC_EnvelopeGeometry with
PC_BoundingDiagonalGeometry in the functions below.
 
>From the docs, looks like PC_BoundingDiagonalGeometry returns an ND object
whereas I think PC_EnvelopeGeometry returns a 2D. So that makes more sense
now to me why PC_Intersects is using ST_Intersects.
 
 

Anyway would be interested in seeing the different in performance of what
Eric proposed (note his is building a 2D index) vs. keeping your ND index
and using the 3D intersect functions.

 

From: Regina Obe [mailto:lr at pcorp.us] 
Sent: Thursday, May 03, 2018 1:42 AM
To: 'Lars' <laasunde at hotmail.com>; 'pgpointcloud at lists.osgeo.org'
<pgpointcloud at lists.osgeo.org>
Subject: RE: [pgpointcloud] PC_Intersects and indexing

 

Lars,

 

Looking at the definition of pc_intersects, it is piggy backing on PostGIS
ST_Intersects, which I can tell you right off the bat CAN NOT use an ND
index, thus no surprise you aren't seeing index usage.  On top of that
ST_Intersects only looks at 2D part.  To consider the Z you'd need
ST_3DIntersects.

 

PostGIS team had discussed putting &&& in the ST_3DIntersects to use the 3D
spatial operator &&& but we were worried about performance as we'd still
need the && since most folks just have 2 d indexes so haven't settled on
that yet. So it too ironically only uses && (the 2d spatial operator)

 

I'm not really sure why pc_patch opted for ST_Intersects since that doesn't
care about the 3rd dimension at all and will only use X and Y.

 

ST_3DIntersects would have seemed like a better choice.  

Try creating functions like the below and see if you do better – note I put
a U in front to distinguish it from pc patched shipped functions

 

CREATE OR REPLACE FUNCTION  upc_3dintersects(

    pcpatch,

    geometry)

  RETURNS boolean AS

$$

                              SELECT $2 &&& PC_EnvelopeGeometry($1)) AND
ST_3DIntersects($2, PC_EnvelopeGeometry($1))

$$

  LANGUAGE sql VOLATILE

  COST 100;

 

CREATE OR REPLACE FUNCTION upc_3dintersects(

    geometry,

    pcpatch)

  RETURNS boolean AS

$BODY$

                              SELECT UPC_Intersects($2, $1)

               $BODY$

  LANGUAGE sql VOLATILE

  COST 100;

 

Then run

explain analyze select id from import where UPC_3DIntersects(pa,
st_geomfromtext('POLYGON ((6.9821 25.2525......))', 4326));

 

 

 

From: pgpointcloud [mailto:pgpointcloud-bounces at lists.osgeo.org] On Behalf
Of Lars
Sent: Wednesday, May 02, 2018 9:06 AM
To: pgpointcloud at lists.osgeo.org <mailto:pgpointcloud at lists.osgeo.org> 
Subject: Re: [pgpointcloud] PC_Intersects and indexing

 

Forgot to mention that we did run vacuum analyze without seeing much
improvement.

 

Using PostgresSQL 10.3 and Postgis 2.4.4 on Windows 7.

  _____  

Fra: pgpointcloud <pgpointcloud-bounces at lists.osgeo.org
<mailto:pgpointcloud-bounces at lists.osgeo.org> > på vegne av Lars
<laasunde at hotmail.com <mailto:laasunde at hotmail.com> >
Sendt: 2. mai 2018 14:01
Til: pgpointcloud at lists.osgeo.org <mailto:pgpointcloud at lists.osgeo.org> 
Emne: [pgpointcloud] PC_Intersects and indexing 

 

Hello,

 

Used PDAL (1.7.1) to import xyz data into a PostgresSQL "pointcloud" table
called "import" with chipper capacity set to 4000. The table has about
50,000 rows with more than 2GB of data.

 

The "import" table contains column id and pa which are of type integer and
pcpatch respectively. The table is configured with btree indexing on the id
column.

 

Running the following query takes 776,952 ms and returns two records.
select id from import where PC_Intersects(pa, st_geomfromtext('POLYGON
((6.9821 25.2525......))', 4326));

 

The documentation at https://github.com/pgpointcloud/pointcloud mention
using GIST as indexing;
CREATE INDEX ON import USING GIST(PC_BoundingDiagonalGeometry(pa)
gist_geometry_ops_nd);

 

So we drop btree and add gist indexing, run the same query takes 796,384 ms
and return the two same records.

 

We expected btree on a integer to be relative slow but gist on a spatial
type to be relative quick but the actual performance appears to be quiet
similar.

 

How does PC_Intersects function work? Does it need to decode the
WellKnownBinary data inside pcpatch in order to determine the boundary? How
does it use index?

 

Running "explain analyze select id from import where PC_Intersects(pa,
st_geomfromtext('POLYGON ((6.9821 25.2525......))', 4326));"
QUERY_PLAN | Seq Scan on "import" (cost=10000000000..10000014803 rows=3539
width=4)

 

It was interesting to see that the query used sequential scan of the table.
We expected the query to use indexing and wanted to understand more about
why this is happening. 

 

Maybe the structure of our SQL query is suboptimal or we have configured
PostgresSQL poorly or the planner takes into account number of rows or
amount of data.

 

Appreciate any input

 

kind regards, Lars

 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/pgpointcloud/attachments/20180503/bd76105d/attachment-0001.html>


More information about the pgpointcloud mailing list