[pgpointcloud] RLE and SIGBITS heuristics
Oscar Martinez Rubi
o.martinezrubi at tudelft.nl
Mon Apr 20 06:03:50 PDT 2015
Hi,
On 20-04-15 13:13, Rémi Cura wrote:
> Hey Oscar,
>
> I'm a really big fan of lidar for archaeological use, and integrating
> time into it is especially trendy and challenging. Registering all
> the point clouds together from different sources must have been
> really difficult.
That is really tricky indeed! At NLeSC we worked on an automatic
open-source alignment tool
(https://github.com/NLeSC/PattyAnalytics/blob/master/scripts/registration.py)
which works in some cases for aligning point clouds of archaeological
monuments (from photogrammetry) with a lidar dataset. For the other
cases we have a manual alignment tool: a 3D desktop viewer based on
OpenSceneGraph (which can also display meshes and pictures).
>
>
> I contacted the Potree developer a year ago to ask him whether it
> was possible to modify it to read points from a DBMS (actually
> patches with LOD). He said it was possible and not too difficult.
> I don't know how much point output you get, but we demonstrated
> around 20 kpts/s streaming to the browser (with a lot of
> serialization/deserialization). Currently the upper limit for such
> output would be a few hundred kpts/s if you send points, and a few
> million pts/s if you stream compressed patches.
Currently we are getting around 200 kpts/s using the LAS format (I
don't remember how much we got with LAZ), but we also have a
not-so-good server, so I think the same solution could give a bit
more in other setups. Anyway, if you say compressed patches in the DB
could deliver a few million pts/s, that should be more than enough!
It would be nice to try!
>
> Cheers
Regards,
O.
>
>
> 2015-04-20 11:58 GMT+02:00 Oscar Martinez Rubi
> <o.martinezrubi at tudelft.nl>:
>
> Whoa!
>
> Thanks guys for all the material! I am now busy reading it all!
>
> Rémi: I had to read your mail a few times ;) Great slides (I
> actually looked at all of them, very well done!). Very interesting
> topics you are researching!
>
> Howard: About Greyhound+S3: what storage solution do you use? It
> is not clear to me... MongoDB? I mean, where are the points
> stored? File-based? A DBMS?
>
> Paul+Nouri: The geohash tree that Nouri mentions is the ght
> compression in pgpointcloud, right? I tried it once but there was
> a limitation on the type of coordinates: they had to be lon/lat,
> so I guess there needs to be a reference system transformation in
> between, right? Is there any place where I can find an example of
> how to use this?
>
>
> At NLeSC, for the visualization of the data we are using a system
> based on the Potree visualization (so, file-based), but I am very
> interested in the stuff you guys are doing and I would love to be
> convinced that DBMS solutions can be really efficient for
> visualization as well (I think it is close now!). We chose
> file-based and Potree because of the initial lack of LoD support
> in DBMSs, the speed of the file-based approach, and the highly
> compressed LAZ storage.
>
> To see what we have done so far:
>
> https://github.com/NLeSC/PattyVis
> https://www.esciencecenter.nl/project/mapping-the-via-appia-in-3d
> (see the video from 1:40 for the potree based visualization)
>
> One of the many reasons I would love to be convinced about a
> DBMS is that we are now considering how to visualize the 640
> billion point AHN2 dataset, and with a pure file-based solution
> (like Potree) I fear that restructuring the data into an octree
> would require a number of octree nodes/files probably larger
> than what ext4 can handle! We will try it, and I'll let you know
> how that goes ;), but it would be really nice to have an
> efficient and fast DBMS-based alternative!
>
> I am very happy, though, with all the different work you are all
> doing, and excited to see how fast things improve and evolve!!
>
> Keep on like this guys!
>
> Regards,
>
> O.
>
>
>
> On 17-04-15 19:01, Sabo, Nouri wrote:
>>
>> Hi,
>>
>> Thank you for sharing these ideas. Many of them could bring
>> improvements. In the prototype we developed at RNCan, mentioned
>> in the attached paper, we have implemented some of these
>> concepts. For example, in the prototype we sort points according
>> to the Morton pattern before creating blocks, so each block is
>> composed only of points that are spatially close, thereby
>> improving the level of compression. We also use the properties
>> of the Morton curve (Z pattern) to do spatial queries using a
>> Geohash as the bounding box. Usually, in a Geohash-based system,
>> the more the Geohash prefixes of two points resemble one
>> another, the closer the points are to each other spatially.
>> Unfortunately, this property does not always hold for two points
>> located on either side of a subdivision line. For this reason we
>> implemented a neighbourhood-based strategy to allow spatial
>> queries based on the hash string.
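>>
>> To make the subdivision-line issue concrete, here is a rough
>> Python sketch of a minimal Geohash encoder (purely illustrative,
>> not code from our prototype): two points only ~20 cm apart, but
>> on opposite sides of the equator, share no hash prefix at all,
>> which is exactly why the neighbourhood strategy is needed.
>>
>>     BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"
>>
>>     def geohash(lat, lon, precision=8):
>>         # Standard Geohash: alternately bisect longitude and
>>         # latitude, one bit per bisection, 5 bits per base-32
>>         # character.
>>         lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
>>         out, ch, bits, even = [], 0, 0, True
>>         while len(out) < precision:
>>             rng, val = (lon_rng, lon) if even else (lat_rng, lat)
>>             mid = (rng[0] + rng[1]) / 2.0
>>             ch = (ch << 1) | (val >= mid)
>>             rng[0 if val >= mid else 1] = mid
>>             even = not even
>>             bits += 1
>>             if bits == 5:
>>                 out.append(BASE32[ch])
>>                 ch, bits = 0, 0
>>         return "".join(out)
>>
>>     # The two hashes differ from the very first character:
>>     print(geohash(0.000001, 5.0), geohash(-0.000001, 5.0))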
>>
>> Also, to improve compression and performance we can change the
>> encoding of the Geohash. Currently, the hashes are encoded as
>> base-32 strings, which causes a lot of overhead (5 bits are
>> inflated into the 8 bits of a character). Unfortunately, the
>> current libght does not include all the concepts of GeoHashTree.
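>>
>> As a small sketch of what a tighter encoding could look like
>> (the helper below is just an assumed illustration, not part of
>> libght), packing the 5-bit symbols directly into bytes already
>> saves 37.5% over the ASCII form:
>>
>>     BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"
>>
>>     def pack_geohash(hash_str):
>>         # Concatenate the 5-bit symbol values into one integer,
>>         # then serialize: 5 bits per symbol instead of the 8
>>         # bits each ASCII character costs.
>>         value = 0
>>         for ch in hash_str:
>>             value = (value << 5) | BASE32.index(ch)
>>         nbits = 5 * len(hash_str)
>>         return value.to_bytes((nbits + 7) // 8, "big")
>>
>>     # An 8-character hash: 8 bytes as text, 5 bytes packed.
>>     assert len(pack_geohash("u4pruydq")) == 5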
>>
>> Oscar, I will read your paper and get back to you so we can
>> continue the exchange.
>>
>> Kind regards!
>>
>> Nouri,
>>
>> From: Paul Ramsey [mailto:pramsey at cleverelephant.ca]
>> Sent: 17 April 2015 06:56
>> To: pgpointcloud at lists.osgeo.org; Peter van Oosterom; Oscar
>> Martinez Rubi; Howard Butler; Rémi Cura
>> Cc: Sabo, Nouri
>> Subject: Re: [pgpointcloud] RLE and SIGBITS heuristics
>>
>> Hi Oscar,
>>
>> This sounds like a slightly more sophisticated version of the
>> work done at Natural Resources Canada for what they call “geohash
>> tree”. They did find that they got pretty good compression (even
>> with the simple ascii-based key!) using the scheme, and it did
>> allow easy random access to subsets of the data.
>>
>> http://2013.foss4g.org/conf/programme/presentations/60/
>>
>> The downside was of course the cost of sorting things in the
>> first place, but for a one-time cost on frequently accessed data,
>> it’s not a bad thing. The “libght” soft dependency in
>> pgpointcloud is to a (not so great) implementation of the scheme
>> that I did for them a couple years ago. As a scheme, I think it
>> cuts against the idea of having small patches that is core to the
>> pgpointcloud concept. It makes more and more sense the larger
>> your file is, in that it gets greater and greater leverage for
>> random access.
>>
>> ATB,
>>
>> P.
>>
>> --
>> Paul Ramsey
>> http://cleverelephant.ca
>>
>> http://postgis.net
>>
>> On April 17, 2015 at 11:02:47 AM, Oscar Martinez Rubi
>> (o.martinezrubi at tudelft.nl) wrote:
>>
>> Hi,
>>
>> About the XYZ binding for better compression: in our research
>> at the NL eScience Center and TU Delft we have been thinking
>> about (though not yet testing) one possible approach for this.
>>
>> It is based on space-filling curves. Once you have the points
>> that go into a block, you could compute the Morton/Hilbert code
>> of the XYZ. Since all the points are close together, such codes
>> will be extremely similar, so one could store only the
>> increments, which fit in very few bits. We have not tested or
>> compared this with any of the other compressions, but we wanted
>> to share it with you in case you find it useful!
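>>
>> As a rough sketch of the idea in Python (just illustrative; it
>> assumes the coordinates have already been scaled and offset to
>> non-negative integers):
>>
>>     def morton3d(x, y, z, bits=21):
>>         # Interleave the bits of three non-negative integers
>>         # into a single Morton (Z-order) code.
>>         code = 0
>>         for i in range(bits):
>>             code |= ((x >> i) & 1) << (3 * i)
>>             code |= ((y >> i) & 1) << (3 * i + 1)
>>             code |= ((z >> i) & 1) << (3 * i + 2)
>>         return code
>>
>>     def encode_block(points):
>>         # points: iterable of (x, y, z) integer triples that all
>>         # fall inside one block. Because the points are spatially
>>         # close, the sorted codes are near each other, so the
>>         # first code plus small increments needs very few bits.
>>         codes = sorted(morton3d(x, y, z) for x, y, z in points)
>>         base = codes[0]
>>         increments = [c - base for c in codes]
>>         return base, increments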
>>
>> An additional improvement would be to sort the points within
>> the blocks according to the Morton code. Then, when doing
>> crop/filter operations on the blocks, one can use the Morton
>> codes for the queries, similarly to what we presented in our
>> papers with the flat table (without blocks); I attach one of
>> them (see section 5.2). In a nutshell: you convert the query
>> region into a set of quadtree/octree nodes, which can also be
>> converted to Morton code ranges (thanks to the relation between
>> the Morton/Hilbert curve and a quadtree/octree). You scale the
>> ranges down to increments (like you did when storing the points
>> of the block) and then you simply do range queries on the
>> sorted data with a binary search. In this way you avoid
>> decompressing the Morton code for most of the block. This
>> filtering is equivalent to a bbox filter, so it still requires
>> a point-in-polygon check for some of the points.
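>>
>> A minimal sketch of that query path (again Python and again
>> just illustrative; the Morton ranges are assumed to come from
>> the quadtree/octree decomposition of the query region):
>>
>>     from bisect import bisect_left, bisect_right
>>
>>     def query_block(base, increments, ranges):
>>         # increments: the sorted per-block increments from the
>>         # sketch above. ranges: list of (lo, hi) Morton code
>>         # ranges covering the query region. Binary search finds
>>         # each range without decoding the Morton codes of most
>>         # of the block.
>>         hits = []
>>         for lo, hi in ranges:
>>             i = bisect_left(increments, lo - base)
>>             j = bisect_right(increments, hi - base)
>>             hits.extend(range(i, j))
>>         return hits  # candidates; exact filter still needed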
>>
>> Kind Regards,
>>
>> Oscar.
>>
>> On 16-04-15 18:15, Rémi Cura wrote:
>>
>> Epic fail! I had avoided HTML just for you.
>>
>>
>> Dataset       | Subset size   | Compressing     | Decompressing
>>               | (million pts) | (million pts/s) | (million pts/s)
>> Lidar         | 473.3         | 4.49            | 4.67
>> 21 attributes | 105.7         | 1.11            | 2.62
>> Stereo        | 70            | 2.44            | 7.38
>>
>> Cheers
>>
>> 2015-04-16 17:42 GMT+02:00 Sandro Santilli
>> <strk at keybit.net>:
>>
>> On Thu, Apr 16, 2015 at 05:30:12PM +0200, Rémi Cura wrote:
>> > Oups
>> >
>> > Dataset       | Subset size   | Compressing     | Decompressing
>> >               | (million pts) | (million pts/s) | (million pts/s)
>> > Lidar         | 473.3         | 4.49            | 4.67
>> > 21 attributes | 105.7         | 1.11            | 2.62
>> > Stereo        | 70            | 2.44            | 7.38
>>
>> These tables aren't really readable here.
>> Could you make sure to use a fixed-width font to write those
>> tables and to keep lines within 70 columns at most?
>>
>> --strk;
>>
>>
>>
>>
>>
>
>