[pgpointcloud] RLE and SIGBITS heuristics
Oscar Martinez Rubi
o.martinezrubi at tudelft.nl
Mon Apr 20 06:03:50 PDT 2015
Hi,
On 20-04-15 13:13, Rémi Cura wrote:
> Hey Oscar,
>
> I'm a really big fan of lidar for archaeological use, and integrating
> time into it is especially trendy and challenging. Registering all
> the point clouds together from different sources must have been
> really difficult.
That is really tricky indeed! At NLeSC we worked on an automatic
open-source alignment tool
(https://github.com/NLeSC/PattyAnalytics/blob/master/scripts/registration.py)
which works in some cases for aligning point clouds of archaeological
monuments (from photogrammetry) with a lidar dataset. For the other
cases we have a manual alignment tool: a 3D desktop viewer based on
OpenSceneGraph (which can also display meshes and pictures).
>
>
> I contacted the Potree developer a year ago to ask him whether it
> was possible to modify it to read points from a DBMS (actually
> patches with LOD). He said it was possible and not too difficult.
> I don't know how much point output you get, but we demonstrated
> around 20 kpts/s streaming to the browser (with a lot of
> serialization/deserialization). Currently the upper limit for such
> output would be a few hundred kpts/s if you send points, and a few
> million pts/s if you stream compressed patches.
Currently we are getting around 200 kpts/s using the LAS format (I
don't remember how much we got with LAZ), but we also have a
not-so-good server, so I think the same solution could give a bit
more in other setups. Anyway, if you say compressed patches in the DB
could deliver a few million pts/s, that should be more than enough!
It would be nice to try!
>
> Cheers
Regards,
O.
>
>
> 2015-04-20 11:58 GMT+02:00 Oscar Martinez Rubi
> <o.martinezrubi at tudelft.nl>:
>
> Whoa!
>
> Thanks guys for all the material! I am now busy reading it all!
>
> Rémi: I had to read your mail a few times ;) Great slides (I
> actually looked at all of them, very well done!). Very interesting
> topics you are researching!
>
> Howard: About Greyhound+S3: what storage solution do you use? It
> is not clear to me... MongoDB? I mean, where are the points
> stored? File-based? A DBMS?
>
> Paul+Nouri: The geohash tree that Nouri mentions is the ght
> compression in pgpointcloud, right? I tried it once but there was
> a limitation on the type of coordinates: they had to be lon/lat,
> so I guess there needs to be a reference system transformation in
> between, right? Is there any place where I can find an example of
> how to use this?
>
>
> At NLeSC, for the visualization of the data we are using a system
> based on the Potree visualization (so, file-based), but I am very
> interested in the stuff you guys are doing and I would love to be
> convinced that DBMS solutions can be really efficient for
> visualization as well (I think it is close now!). We chose
> file-based and Potree because of the initial lack of LoD support
> in DBMSs, the speed of the file-based approach, and the highly
> compressed LAZ storage.
>
> To see what we have done so far:
>
> https://github.com/NLeSC/PattyVis
> https://www.esciencecenter.nl/project/mapping-the-via-appia-in-3d
> (see the video from 1:40 for the potree based visualization)
>
> One of the many reasons I would love to be convinced about a
> DBMS is that we are now considering how to visualize the 640
> billion point AHN2 dataset, and with a pure file-based solution
> (like Potree) I fear that restructuring the data into an octree
> would require a number of octree nodes/files probably larger
> than what ext4 can handle! We will try it, and I'll let you know
> how that goes ;), but it would be really nice to have an
> efficient and fast DBMS-based alternative!
>
> I am very happy, though, with all the different work you are all
> doing, and excited to see how fast things improve and evolve!!
>
> Keep on like this guys!
>
> Regards,
>
> O.
>
>
>
> On 17-04-15 19:01, Sabo, Nouri wrote:
>>
>> Hi,
>>
>> Thank you for sharing these ideas. Many of them could bring
>> improvements. In the prototype we developed at RNCan, mentioned
>> in the attached paper, we have implemented some of these
>> concepts. For example, in the prototype we sort points according
>> to the Morton pattern before creating blocks, so each block is
>> composed only of points that are spatially close, thereby
>> improving the level of compression. We also use the properties
>> of the Morton curve (Z pattern) to do spatial queries using a
>> Geohash as the bounding box. Usually, in a Geohash-based system,
>> the more the Geohash prefixes of two points resemble one
>> another, the closer the points are to each other spatially.
>> Unfortunately, this property does not always hold for two points
>> located on either side of a subdivision line. For this reason we
>> implemented a neighbourhood-based strategy to allow spatial
>> queries based on the hash string.
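>>
>> To make the subdivision-line issue concrete, here is a rough
>> Python sketch of a minimal Geohash encoder (purely illustrative,
>> not code from our prototype): two points only ~20 cm apart, but
>> on opposite sides of the equator, share no hash prefix at all,
>> which is exactly why the neighbourhood strategy is needed.
>>
>>     BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"
>>
>>     def geohash(lat, lon, precision=8):
>>         # Standard Geohash: alternately bisect longitude and
>>         # latitude, one bit per bisection, 5 bits per base-32
>>         # character.
>>         lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
>>         out, ch, bits, even = [], 0, 0, True
>>         while len(out) < precision:
>>             rng, val = (lon_rng, lon) if even else (lat_rng, lat)
>>             mid = (rng[0] + rng[1]) / 2.0
>>             ch = (ch << 1) | (val >= mid)
>>             rng[0 if val >= mid else 1] = mid
>>             even = not even
>>             bits += 1
>>             if bits == 5:
>>                 out.append(BASE32[ch])
>>                 ch, bits = 0, 0
>>         return "".join(out)
>>
>>     # The two hashes differ from the very first character:
>>     print(geohash(0.000001, 5.0), geohash(-0.000001, 5.0))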
>>
>> Also, to improve compression and performance we can change the
>> encoding of the Geohash. Currently, the hashes are encoded as
>> base-32 strings, which causes a lot of overhead (5 bits are
>> inflated into the 8 bits of a character). Unfortunately, the
>> current libght does not include all the concepts of GeoHashTree.
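>>
>> As a small sketch of what a tighter encoding could look like
>> (the helper below is just an assumed illustration, not part of
>> libght), packing the 5-bit symbols directly into bytes already
>> saves 37.5% over the ASCII form:
>>
>>     BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"
>>
>>     def pack_geohash(hash_str):
>>         # Concatenate the 5-bit symbol values into one integer,
>>         # then serialize: 5 bits per symbol instead of the 8
>>         # bits each ASCII character costs.
>>         value = 0
>>         for ch in hash_str:
>>             value = (value << 5) | BASE32.index(ch)
>>         nbits = 5 * len(hash_str)
>>         return value.to_bytes((nbits + 7) // 8, "big")
>>
>>     # An 8-character hash: 8 bytes as text, 5 bytes packed.
>>     assert len(pack_geohash("u4pruydq")) == 5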
>>
>> Oscar, I will read your paper and get back to you so we can
>> continue the exchange.
>>
>> Kind regards!
>>
>> Nouri,
>>
>> From: Paul Ramsey [mailto:pramsey at cleverelephant.ca]
>> Sent: 17 April 2015 06:56
>> To: pgpointcloud at lists.osgeo.org; Peter van Oosterom; Oscar
>> Martinez Rubi; Howard Butler; Rémi Cura
>> Cc: Sabo, Nouri
>> Subject: Re: [pgpointcloud] RLE and SIGBITS heuristics
>>
>> Hi Oscar,
>>
>> This sounds like a slightly more sophisticated version of the
>> work done at Natural Resources Canada for what they call “geohash
>> tree”. They did find that they got pretty good compression (even
>> with the simple ascii-based key!) using the scheme, and it did
>> allow easy random access to subsets of the data.
>>
>> http://2013.foss4g.org/conf/programme/presentations/60/
>>
>> The downside was of course the cost of sorting things in the
>> first place, but for a one-time cost on frequently accessed data,
>> it’s not a bad thing. The “libght” soft dependency in
>> pgpointcloud is to a (not so great) implementation of the scheme
>> that I did for them a couple years ago. As a scheme, I think it
>> cuts against the idea of having small patches that is core to the
>> pgpointcloud concept. It makes more and more sense the larger
>> your file is, in that it gets greater and greater leverage for
>> random access.
>>
>> ATB,
>>
>> P.
>>
>> --
>> Paul Ramsey
>> http://cleverelephant.ca
>>
>> http://postgis.net
>>
>> On April 17, 2015 at 11:02:47 AM, Oscar Martinez Rubi
>> (o.martinezrubi at tudelft.nl) wrote:
>>
>> Hi,
>>
>> About the XYZ binding for better compression: in our research
>> at the NL eScience Center and TU Delft we have been thinking
>> about (though not yet testing) one possible approach for this.
>>
>> It is based on space-filling curves. Once you have the points
>> that go into a block, you could compute the Morton/Hilbert code
>> of the XYZ. Since all the points are close together, such codes
>> will be extremely similar, so one could store only the
>> increments, which fit in very few bits. We have not tested or
>> compared this with any of the other compressions, but we wanted
>> to share it with you in case you find it useful!
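>>
>> As a rough sketch of the idea in Python (just illustrative; it
>> assumes the coordinates have already been scaled and offset to
>> non-negative integers):
>>
>>     def morton3d(x, y, z, bits=21):
>>         # Interleave the bits of three non-negative integers
>>         # into a single Morton (Z-order) code.
>>         code = 0
>>         for i in range(bits):
>>             code |= ((x >> i) & 1) << (3 * i)
>>             code |= ((y >> i) & 1) << (3 * i + 1)
>>             code |= ((z >> i) & 1) << (3 * i + 2)
>>         return code
>>
>>     def encode_block(points):
>>         # points: iterable of (x, y, z) integer triples that all
>>         # fall inside one block. Because the points are spatially
>>         # close, the sorted codes are near each other, so the
>>         # first code plus small increments needs very few bits.
>>         codes = sorted(morton3d(x, y, z) for x, y, z in points)
>>         base = codes[0]
>>         increments = [c - base for c in codes]
>>         return base, increments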
>>
>> An additional improvement would be to sort the points within
>> the blocks according to the Morton code. Then, when doing
>> crop/filter operations on the blocks, one can use the Morton
>> codes for the queries, similarly to what we presented in our
>> papers with the flat table (without blocks); I attach one of
>> them (see section 5.2). In a nutshell: you convert the query
>> region into a set of quadtree/octree nodes, which can also be
>> converted to Morton code ranges (thanks to the relation between
>> the Morton/Hilbert curve and a quadtree/octree). You scale the
>> ranges down to increments (like you did when storing the points
>> of the block) and then you simply do range queries on the
>> sorted data with a binary search. In this way you avoid
>> decompressing the Morton code for most of the block. This
>> filtering is equivalent to a bbox filter, so it still requires
>> a point-in-polygon check for some of the points.
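>>
>> A minimal sketch of that query path (again Python and again
>> just illustrative; the Morton ranges are assumed to come from
>> the quadtree/octree decomposition of the query region):
>>
>>     from bisect import bisect_left, bisect_right
>>
>>     def query_block(base, increments, ranges):
>>         # increments: the sorted per-block increments from the
>>         # sketch above. ranges: list of (lo, hi) Morton code
>>         # ranges covering the query region. Binary search finds
>>         # each range without decoding the Morton codes of most
>>         # of the block.
>>         hits = []
>>         for lo, hi in ranges:
>>             i = bisect_left(increments, lo - base)
>>             j = bisect_right(increments, hi - base)
>>             hits.extend(range(i, j))
>>         return hits  # candidates; exact filter still needed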
>>
>> Kind Regards,
>>
>> Oscar.
>>
>> On 16-04-15 18:15, Rémi Cura wrote:
>>
>> Epic fail! I had avoided HTML just for you.
>>
>>
>> Dataset       | Subset size   | Compressing     | Decompressing
>>               | (million pts) | (million pts/s) | (million pts/s)
>> Lidar         | 473.3         | 4.49            | 4.67
>> 21 attributes | 105.7         | 1.11            | 2.62
>> Stereo        | 70            | 2.44            | 7.38
>>
>> Cheers
>>
>> 2015-04-16 17:42 GMT+02:00 Sandro Santilli
>> <strk at keybit.net>:
>>
>> On Thu, Apr 16, 2015 at 05:30:12PM +0200, Rémi Cura wrote:
>> > Oups
>> >
>> > Dataset       | Subset size   | Compressing     | Decompressing
>> >               | (million pts) | (million pts/s) | (million pts/s)
>> > Lidar         | 473.3         | 4.49            | 4.67
>> > 21 attributes | 105.7         | 1.11            | 2.62
>> > Stereo        | 70            | 2.44            | 7.38
>>
>> These tables aren't really readable here.
>> Could you make sure to use a fixed-width font to write those
>> tables and to keep lines within 70 columns at most?
>>
>> --strk;
>>
>>
>>
>>
>>
>
>