[pdal] PDAL Oracle small benchmark
Albert Godfrind
albert.godfrind at oracle.com
Fri Apr 17 11:36:25 PDT 2015
Great tests, Oscar.
One remark: to check the actual storage space used it is best to sum the length of the values in each BLOB rather than the allocations. Those will be rounded up to the next extent (whose size may vary and get larger for large segments). Also unless you truncate the target table(s) the database will reuse existing allocations.
You get a more accurate result by getting the length of each blob (using LENGTH() ) and summing them.
Albert
--
Albert Godfrind
+33 6 09 97 27 23
Sent from my iPhone
> On 17 avr. 2015, at 18:33, Oscar Martinez Rubi <o.martinezrubi at tudelft.nl> wrote:
>
> Hi,
>
> After the latest fixs (thanks Howard, Connor, Andrew and the rest of PDAL guys!) in OCI writer and reader and the fact that I found out about laz-perf I have done this new small test with PDAL and Oracle to see how the two systems behave with different configurations. I tried with/without laz-perf, with point/dimension major, with all columns or only xyz, with/without BLOB compression and with/without offsets and scales (i.e. 64 or 32 bit per coord).
>
> There are 32 combinations, but since lazperf requires dimension major, there 24 valid "only" combinations. For each one I have loaded a single LAS file with 20M points (size 380 MB, 38MB in LAZ) with the PDAL OCI writer and I have queried a small rectangle (80000 points) with PDAL OCI reader. I have done the query twice.
>
> From the attached table is clear that using lazperf is very good in size terms (almost factor 2 compared to LAZ), loading time and query time!. The best approach is to use lazperf, scales and offsets and without BLOB compression.
>
> Regarding the loading times:
> - Adding BLOB compression generally makes the loading slower.
> - Using all columns instead of only xyz also makes it slower.
> - Adding offsets makes it faster.
> - Using lazperf does not seem to have a visible effect in loading time.
> - The point/dimension major doe snot have any visible effect
>
> Regarding the queries the only difference seems to be if using all columns or only xyz. In general by having only xyz instead of all columns the queries are faster.
>
> Regarding the size/storage the are some strange issues in the numbers of the table:
>
> - When lazperf is used and offsets are used it does not matter in storage terms whether I specify BLOB compression or not (I guess that BLOB compression just can not squeeze the data anymore, right? or maybe it is somehow ignored?)
>
> - When lazperf is used and offsets are not used the BLOB compression actually increases the size (Seems like the BLOB compression messes up what lazperf did, strange though...)
>
> - When lazperf is used it does not matter in storage terms whether I specify only x,y,z or all columns. Why is this?
>
> - When lazperf is not used, the difference in size between point and dimension orientation is only visible when using all the columns, BLOB compression and offsets
>
> - When lazperf is not used and only xyz are used the BLOB compression offsers same compression factor that using offsets but in twice the time, both 278MB. Combining both gives 212MB.
>
> - When lazperf is not used and all columns are used the BLOB compression gives better compression factor that using offsets (343MB vs 538MB).
>
> - If BLOB compression is used and offset are not used the lazperf compression adds nothing. In fact in this case and having only xyz using lazperf is actually worse than not using it (which makes me thing that in lazperf all the columns are stored even if you do not want it)
>
> I am aware that the estimation of the size in oracle can be tricky. I sum the size of the user segments related to my "blocks" table, i.e.:
>
> SELECT sum(bytes/1024/1024) size_in_MB
> FROM user_segments
> WHERE (segment_name LIKE : 'blocks%'
> OR segment_name in (
> SELECT segment_name
> FROM user_lobs
> WHERE table_name LIKE : 'blocks%'
> UNION
> SELECT index_name
> FROM user_lobs
> WHERE table_name LIKE : 'blocks%'
> )
> );
>
> Kind Regards,
>
> O.
> <RESULTS>
More information about the pdal
mailing list