<div dir="ltr">> <span style="font-size:12.8000001907349px">Howard: About the Greyhoud+s3, but what storage solution do you use, it is not clear...mongodb? I mean where are the points stored? file-based, dbms? </span><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">The storage for <a href="http://iowalidar.com" target="_blank">iowalidar.com</a> is S3 - but could be any key/value storage system (filesystem, database, a back-end web server supporting PUT/GET, etc.).</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">There is a single "base" chunk containing some well-known number (N) of compressed points, which is stored as key "0". </span><span style="font-size:12.8000001907349px">After that, there is a well-known chunk size (C). So the next key after "0" is "N", containing C compressed points, so the subsequent keys are stringified integers following the form "N + C*x" for x >= 0. The key for each chunk is the ID of the first point of that chunk, and the value is the compressed binary point data starting at that ID.</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">Currently the chunk size is determined by the ending level of the base depth split into quadrants. For example say the base contains depth levels [0, 8), non-inclusive. Then the chunk size will be level 8 of the 'tree' split into 4 chunks, so (4^8)/4 points. Those chunks contain the four quadrants of the bounds of the entire set, at a single level of detail. Continuing on with that chunk size creates a store where each subsequent tre</span><span style="font-size:12.8000001907349px">e level is split into 4 times the number of chunks as the previous level.</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">From there, given a point ID, we can easily figure out the ID of its chunk and fetch it from S3 - typically the entire chunk will be used in a query since the client is traversing its own virtual tree by splitting the bounds and climbing upward in tree depth. We are running Greyhound on an EC2 instance so the fetches from S3 are very fast. The client specifies its queries via a bounding box and a depth level (or range of levels), from which we can, in parallel, fetch all chunks selected by this query and start streaming out the points within range. A client could also do a bit more work and fetch directly by chunk ID, but we like the abstraction of using a depth level and bounding box to decouple the physical storage from the client queries.</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">- Connor</span></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Apr 20, 2015 at 9:03 AM, Oscar Martinez Rubi <span dir="ltr"><<a href="mailto:o.martinezrubi@tudelft.nl" target="_blank">o.martinezrubi@tudelft.nl</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
Hi,<span class=""><br>
<br>
<div>On 20-04-15 13:13, Rémi Cura wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default" style="font-family:monospace,monospace">Hey Oscar,<br>
<br>
</div>
<div class="gmail_default" style="font-family:monospace,monospace">I'm a really big fan
of Lidar for archeological use, and integrating time into it
is especially trendy and challenging. Registetring all point
cloud together from different sources must have been really
difficult.<br>
</div>
</div>
</blockquote>
<br></span>
That is really tricky indeed! At NLeSC we worked on an automatic
open-source alignment tool
(<a href="https://github.com/NLeSC/PattyAnalytics/blob/master/scripts/registration.py" target="_blank">https://github.com/NLeSC/PattyAnalytics/blob/master/scripts/registration.py</a>)
which works for some cases when aligning point clouds from
archaeological monuments (from photogrametry) with a lidar dataset.
For other cases we have a manual alignment tool that is a 3D desktop
viewer using based on OpenSceneGraph (where also meshes and pictures
can be displayed).<span class=""><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default" style="font-family:monospace,monospace"><br>
<br>
</div>
<div class="gmail_default" style="font-family:monospace,monospace">I contacted Potree
developper one year ago to ask him if it was possible to
modify it to read points in a DBMS (actually patch with LOD).<br>
</div>
<div class="gmail_default" style="font-family:monospace,monospace">He said it was
possible and not too difficult.<br>
</div>
<div class="gmail_default" style="font-family:monospace,monospace">I don't know how much
point output you get, but we demonstrated around 20kpts/s
streaming to browser (with a lot of
serialization/deserialization). Currently the upper limit for
such output would be in the few hundred kpts/s if you send
points, and in the few Million pts/s if you stream compressed
patches.<br>
</div>
</div>
</blockquote>
<br></span>
Currently we are getting around 200kpoints/sec using LAS format (not
remember how much we got with LAZ) but we also have a no-so-good
server...so I think same solution could give a bit more in other
situations. Anyway if you say compressed patches in DB could deliver
few millions/sec that should be more than enough! And would be nice
to try! <br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default" style="font-family:monospace,monospace"><br>
</div>
<div class="gmail_default" style="font-family:monospace,monospace">Cheers<br>
</div>
</div>
</blockquote>
Regards,<br>
<br>
O.<div><div class="h5"><br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default" style="font-family:monospace,monospace"><br>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">2015-04-20 11:58 GMT+02:00 Oscar
Martinez Rubi <span dir="ltr"><<a href="mailto:o.martinezrubi@tudelft.nl" target="_blank">o.martinezrubi@tudelft.nl</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> Whoa!<br>
<br>
Thanks guys for all the material! I am now busy reading it
all!<br>
<br>
Remi: I had to read your mail a few times ;) Great slides
(I actually looked at all of them, very well done!) Very
interesting topics you are researching!<br>
<br>
Howard: About the Greyhoud+s3, but what storage solution
do you use, it is not clear...mongodb? I mean where are
the points stored? file-based, dbms? <br>
<br>
Paul+Nouri: The geohash tree that Noury mentions is the
ght compression in pgpointclouid, right? I tried it once
but there was some limitation with the type of
coordinates, they had to be long and lat so I guess there
need to be a reference system transformation in between,
right? Any place where I can find an example on how to use
this?<br>
<br>
<br>
At NLeSC for the visualization of data we are using a
system based on the potree visualization (so, file-based)
but I am very very interested on the stuff you are guys
doing and I would love to be convinced that DBMS solutions
can be really efficient for visualization as well (i think
it is close now!). We choose file-based and potree because
of the initial lack of LoD support in DBMS, the speed the
file-based approach and the super compressed LAZ storage.<br>
<br>
To see what we have done so far:<br>
<br>
<a href="https://github.com/NLeSC/PattyVis" target="_blank">https://github.com/NLeSC/PattyVis</a><br>
<a href="https://www.esciencecenter.nl/project/mapping-the-via-appia-in-3d" target="_blank">https://www.esciencecenter.nl/project/mapping-the-via-appia-in-3d</a><br>
(see the video from 1:40 for the potree based
visualization)<br>
<br>
One of the many reasons I would loved to be convinced that
DBMS is that now we are considering how to visualize the
640B AHN2 dataset, and in a pure file-based solution (like
the potree) I fear that when restructuring the data to
octree we would need a number of octree nodes/files
probably larger than what ext4 can handle!. We will try, I
let you know how that goes ;), but it would be really nice
to have a efficient and fast DBMS-based alternative!<br>
<br>
I am very happy though with all the different work you are
all doing and excited to see how fast things improve and
evolve!! <br>
<br>
Keep on like this guys!<br>
<br>
Regards,<br>
<br>
O.
<div>
<div><br>
<br>
<br>
<div>On 17-04-15 19:01, Sabo, Nouri wrote:<br>
</div>
<blockquote type="cite">
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Hi,</span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Thank
you for sharing these ideas. Many of the ideas
can make improvements. In the prototype we
have developed at RNCan and that we mentioned
in the paper in attachment we have implemented
some of these concepts. For example, in the
prototype we are sorting points according to
the Morton pattern before creating blocks. </span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"" lang="EN-CA">And each block is composed only
of points that are spatially close, thereby
improving the level of compression. We also
use the properties of the Morton curve (Z
pattern) to do spatial queries using Geohash
as BBox. </span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Usually,
in Geohash based system the more the Geohash
prefixes for two points resemble one another,
the more they are spatially close to each
other. Unfortunately, this property is not
always complied with two points located on
either side of a subdivision line. </span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"" lang="EN-CA">For this reason we implemented a
neighbourhood based strategy to allow spatial
query based on the hash string. </span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"" lang="EN-CA">Also to improve the compression
and performance we can change the encoding of
Geohash. </span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Currently,
the hashes are encoded as base 32 strings,
which causes a lot of overhead (5 bits are
inflated in 8 bits of character). </span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"" lang="EN-CA">Unfortunately, the current libght
does not include all the concepts of
GeoHashTree. </span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"" lang="EN-CA">Oscar, I will read your paper and
get you back so we could continue to exchange.</span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Kind
regards!</span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Nouri,</span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" lang="EN-CA"> </span></p>
<p class="MsoNormal"><span lang="EN-CA"> </span></p>
<div>
<div style="border:none;border-top:solid #b5c4df 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"" lang="EN-US">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"" lang="EN-US"> Paul Ramsey [<a href="mailto:pramsey@cleverelephant.ca" target="_blank">mailto:pramsey@cleverelephant.ca</a>]
<br>
<b>Sent:</b> 17 avril 2015 06:56<br>
<b>To:</b> <a href="mailto:pgpointcloud@lists.osgeo.org" target="_blank">pgpointcloud@lists.osgeo.org</a>;
Peter van Oosterom; Oscar Martinez Rubi;
Howard Butler; Rémi Cura<br>
<b>Cc:</b> Sabo, Nouri<br>
<b>Subject:</b> Re: [pgpointcloud] RLE and
SIGBITS heuristics</span></p>
</div>
</div>
<p class="MsoNormal"> </p>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Hi
Oscar, </span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">This
sounds like a slightly more sophisticated
version of the work done at Natural
Resources Canada for what they call “geohash
tree”. They did find that they got pretty
good compression (even with the simple
ascii-based key!) using the scheme, and it
did allow easy random access to subsets of
the data.</span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""><a href="http://2013.foss4g.org/conf/programme/presentations/60/" target="_blank">http://2013.foss4g.org/conf/programme/presentations/60/</a></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">The
downside was of course the cost of sorting
things in the first place, but for a
one-time cost on frequently accessed data,
it’s not a bad thing. The “libght” soft
dependency in pgpointcloud is to a (not so
great) implementation of the scheme that I
did for them a couple years ago. As a
scheme, I think it cuts against the idea of
having small patches that is core to the
pgpointcloud concept. It makes more and more
sense the larger your file is, in that it
gets greater and greater leverage for random
access.</span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">ATB,</span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">P.</span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">-- <br>
Paul Ramsey<br>
<a href="http://cleverelephant.ca" target="_blank">http://cleverelephant.ca</a></span></p>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""><a href="http://postgis.net" target="_blank">http://postgis.net</a> </span></p>
</div>
</div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
<p><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif";color:black">On
April 17, 2015 at 11:02:47 AM, Oscar Martinez
Rubi (<a href="mailto:o.martinezrubi@tudelft.nl" target="_blank">o.martinezrubi@tudelft.nl</a>)
wrote:</span></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Hi,<br>
<br>
About the XYZ binding for better
compression. In our research in the NL
escience center and TU Delft we have
been thinking (not testing yet though)
about one possible approach for this.<br>
<br>
It is based on using space filling
curves. So, once you have the points
that go in a block you could compute the
morton/hilbert code of the XYZ. Since
all the points are close together such
codes will be extremely similar, so one
could store only the increments which
could fit in many few bits. We have not
tested or compared this with any of the
other compressions but we just wanted to
share it with you just in case you find
it useful!<br>
<br>
An additional improvement would be to
sort the points within the blocks
according to the morton code. Then, when
doing crop/filter operations in the
blocks one can use the morton codes for
the queries similarly to what we
presented in our papers with the flat
table (without blocks), I attach one of
them (see section 5.2). In a nutshell:
You convert the query region into a set
of quadtree/octree nodes which can be
also converted to morton code ranges
(thanks to relation between
morton/hilbert curve and a
quadtree/octree). You scale down the
ranges to increments (like you did when
storing the point of the block) and then
you simply do range queries in sorted
data with a binary algorithm. In this
way you avoid the decompression of the
morton code for most of the block. This
filtering is equivalent to a bbox filter
so it still requires a point in polygon
check for some of the points.<br>
<br>
Kind Regards,<br>
<br>
Oscar.<br>
<br>
</span></p>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">On
16-04-15 18:15, Rémi Cura wrote:</span></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal"><span>epic fail !
I had avoided html just for you</span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""></span></p>
</div>
<div>
<p class="MsoNormal"><span><br>
Dataset |subset size |
compressing | decompressing |<br>
|(Million
pts)|(Million pts/s)|(Million
pts/s)|<br>
Lidar | 473.3 |
4,49 | 4,67 |</span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""></span></p>
</div>
<p class="MsoNormal"><span>21-atributes
| 105.7 | 1,11 |
2,62 |</span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""></span></p>
<div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt"><span>Stereo
| 70 | 2,44
| 7,38 |</span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""></span></p>
</div>
<div>
<p class="MsoNormal"><span>Cheers</span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""></span></p>
</div>
</div>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">2015-04-16
17:42 GMT+02:00 Sandro Santilli
<<a href="mailto:strk@keybit.net" target="_blank">strk@keybit.net</a>>:</span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">On
Thu, Apr 16, 2015 at 05:30:12PM
+0200, Rémi Cura wrote:<br>
> OUps<br>
><br>
> Dataset | subset
size(Million pts) | compressing
(Million pts/s) |<br>
> decompressing (Million pts/s)<br>
> Lidar |
473.3 |
4,49<br>
> |
__4,67__<br>
> 21 attributes |
105.7 |<br>
> 1,11 |
2,62<br>
> Stereo |
70 |
2,44<br>
> |
7,38<br>
<br>
These tables aren't really
readable here.<br>
Could you make sure to use a
fixed-width font to write those
tables<br>
and to keep lines within 70
columns at most ?<br>
<br>
--strk;</span></p>
</div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
</div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""><br>
<br>
<br>
</span></p>
_______________________________________________
pgpointcloud mailing list
pgpointcloud@lists.osgeo.org
http://lists.osgeo.org/cgi-bin/mailman/listinfo/pgpointcloud