[pdal] Oracle PDAL queries not scaling
Oscar Martinez Rubi
o.rubi at esciencecenter.nl
Wed Aug 5 03:26:44 PDT 2015
Hi,
I ran a test to see how well Oracle with PDAL scales with bigger data
sets. I had 3 self-contained data sets with 20M, 210M and 2201M
points. I loaded them into different Oracle DBs with PDAL and laz-perf.
Then, for each of them, I ran 7 queries (via a pdal pipeline that
preselects blocks, applies a crop and then writes to a LAS file).
The results are in the attached file.
Regarding the loading, for the 20M data set I used only one core (it is
a single file), while for the others I used 16 cores, i.e. 16
simultaneous PDAL instances loading data into Oracle. I opened an issue
on GitHub because I noticed that in some of the runs the resulting size
was too large, and I do not know what caused that. The attached numbers
are from runs where everything seemed to work and the sizes were as
expected.
This message, though, is about the queries. Each query is run twice in
each DB. As you can see in the results file, for 10x more points in the
data set the queries become up to about 10x slower, at least for the
first run (with the 2201M data set the second run is much faster, but
this does not happen with the 210M one).
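
A minimal sketch (Python) of how the slowdown factors can be computed
from the first-run times in the attached results. The names first_run
and slowdown_factors are illustrative only; the numbers are copied from
the QUERY table. On these numbers the 210M to 2201M step comes out at
roughly 8x to 10x for queries 1 to 6, while the 20M to 210M step is
closer to 3x to 7x, and query 7 does not follow the pattern.

```python
# First-run (suffix _0) query times in seconds, copied from the attached
# QUERY table: (pdal20M, pdal210M, pdal2201M).
first_run = {
    "01": (0.80, 3.40, 33.23),
    "02": (1.38, 4.16, 34.22),
    "03": (0.55, 4.06, 39.15),
    "04": (2.03, 9.78, 90.38),
    "05": (0.86, 3.64, 32.30),
    "06": (2.44, 13.43, 123.03),
    "07": (1.71, 4.52, 1.41),
}


def slowdown_factors(times):
    """Return (t210M / t20M, t2201M / t210M) for one query."""
    t20, t210, t2201 = times
    return t210 / t20, t2201 / t210


for query, times in first_run.items():
    small_to_mid, mid_to_large = slowdown_factors(times)
    print(f"query {query}: 210M/20M = {small_to_mid:.1f}x, "
          f"2201M/210M = {mid_to_large:.1f}x")
```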
Also attached is one of the XML files that I used for the queries (the
example is for query1). Note that the geometry is inserted in Oracle
beforehand, so I can use it to pre-filter blocks with the query option
of the oci reader.
At first I thought that maybe the query option of the oci reader in the
XML was being ignored and that all the blocks of the data set were being
processed by PDAL (that would explain 10x slower queries for 10x more
points), but I ran the pdal pipeline for query1 in verbose mode and saw
that the crop filter "only" processed 120000 points, which makes sense
taking into account that the region of query 1 contains only 74818
points. Or maybe the crop still processes all the block extents but only
opens and decompresses the points of the overlapping ones?
Any idea what is happening?
Regards,
O.
-------------- next part --------------
LOAD
####
Approach    Total[s]  Init.[s]  Load[s]  Close[s]  Total[MB]  Index[MB]      Points  Points/s  Points/MB
----------  --------  --------  -------  --------  ---------  ---------  ----------  --------  ---------
pdal20M         36.2      0.91     34.9      0.39         82       0.23    20165862    557068     245925
pdal210M       54.61      0.72    52.86      1.03        700       0.48   210631597   3857015     300902
pdal2201M     371.24      6.05    360.2      4.99       7165       3.43  2201135689   5929145     307207
QUERY
#####
Time[s]  pdal20M  pdal210M  pdal2201M
-------  -------  --------  ---------
01_0         0.8       3.4      33.23
01_1        0.52      3.44       0.21
02_0        1.38      4.16      34.22
02_1        1.25      4.14       0.96
03_0        0.55      4.06      39.15
03_1        0.61      4.14       0.18
04_0        2.03      9.78      90.38
04_1        1.95      9.67        1.1
05_0        0.86      3.64       32.3
05_1        0.81      3.67       0.51
06_0        2.44     13.43     123.03
06_1        2.33     13.57       1.23
07_0        1.71      4.52       1.41
07_1        1.63      4.68       1.43
NumPts   pdal20M  pdal210M  pdal2201M
-------  -------  --------  ---------
01_0       74818     74818      74818
01_1       74818     74818      74818
02_0      717869    717869     717869
02_1      717869    717869     717869
03_0       34667     34667      34667
03_1       34667     34667      34667
04_0      563013    563013     563013
04_1      563013    563013     563013
05_0      182861    182861     182861
05_1      182861    182861     182861
06_0      387134    387135     387134
06_1      387134    387135     387134
07_0       45813     45813      45813
07_1       45813     45813      45813
-------------- next part --------------
A non-text attachment was scrubbed...
Name: query1.xml
Type: text/xml
Size: 948 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/pdal/attachments/20150805/dbee18ff/attachment-0001.xml>