<div dir="ltr">> <span style="font-size:12.8000001907349px">Howard: About the Greyhound+S3, but what storage solution do you use, it is not clear... MongoDB? I mean, where are the points stored? File-based, DBMS? </span><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">The storage for <a href="http://iowalidar.com" target="_blank">iowalidar.com</a> is S3 - but it could be any key/value storage system (filesystem, database, a back-end web server supporting PUT/GET, etc.).</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">There is a single "base" chunk containing some well-known number (N) of compressed points, which is stored as key "0".  </span><span style="font-size:12.8000001907349px">After that, there is a well-known chunk size (C).  So the next key after "0" is "N", containing C compressed points, and the subsequent keys are stringified integers of the form "N + C*x" for x >= 0.  The key for each chunk is the ID of the first point of that chunk, and the value is the compressed binary point data starting at that ID.</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">Currently the chunk size is determined by splitting the level at the end of the base depth range into quadrants.  For example, say the base contains depth levels [0, 8), i.e. levels 0 through 7.  Then the chunk size will be level 8 of the 'tree' split into 4 chunks, so (4^8)/4 points.  Those chunks contain the four quadrants of the bounds of the entire set, at a single level of detail.  
Continuing with that chunk size creates a store where each subsequent tree</span><span style="font-size:12.8000001907349px"> level is split into 4 times as many chunks as the previous level.</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">From there, given a point ID, we can easily figure out the ID of its chunk and fetch it from S3 - typically the entire chunk will be used in a query since the client is traversing its own virtual tree by splitting the bounds and climbing upward in tree depth.  We are running Greyhound on an EC2 instance so the fetches from S3 are very fast.  The client specifies its queries via a bounding box and a depth level (or range of levels), from which we can, in parallel, fetch all chunks selected by this query and start streaming out the points within range.  A client could also do a bit more work and fetch directly by chunk ID, but we like the abstraction of using a depth level and bounding box to decouple the physical storage from the client queries.</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">- Connor</span></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Apr 20, 2015 at 9:03 AM, Oscar Martinez Rubi <span dir="ltr"><<a href="mailto:o.martinezrubi@tudelft.nl" target="_blank">o.martinezrubi@tudelft.nl</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
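The key arithmetic Connor describes can be sketched as follows. This is a minimal illustration, not Greyhound's actual code: the function and variable names are made up here, and the base point count for levels [0, 8) is assumed to be the sum of 4^d for d &lt; 8.

```python
def chunk_key(point_id, base_count, chunk_size):
    """Return the storage key for the chunk containing point_id.

    The base chunk holds points [0, base_count) under key "0"; every
    later chunk holds chunk_size points and is keyed by the ID of its
    first point, i.e. base_count + chunk_size * x for x >= 0.
    """
    if point_id < base_count:
        return "0"
    offset = point_id - base_count
    return str(base_count + chunk_size * (offset // chunk_size))

# Example: base depth levels [0, 8) -> 4^0 + ... + 4^7 points in the base,
# and the chunk size is level 8 split into quadrants: (4^8)/4 points.
base_count = sum(4 ** d for d in range(8))  # 21845
chunk_size = 4 ** 8 // 4                    # 16384
print(chunk_key(100, base_count, chunk_size))    # "0"
print(chunk_key(30000, base_count, chunk_size))  # "21845"
```

With fixed N and C, a client can map any point ID to its S3 key with pure arithmetic, which is what makes the parallel fetches cheap.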
  
    
  
  <div bgcolor="#FFFFFF" text="#000000">
    Hi,<span class=""><br>
    <br>
    <div>On 20-04-15 13:13, Rémi Cura wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="ltr">
        <div class="gmail_default" style="font-family:monospace,monospace">Hey Oscar,<br>
          <br>
        </div>
        <div class="gmail_default" style="font-family:monospace,monospace">I'm a really big fan
          of lidar for archaeological use, and integrating time into it
          is especially trendy and challenging. Registering all the point
          clouds together from different sources must have been really
          difficult.<br>
        </div>
      </div>
    </blockquote>
    <br></span>
    That is really tricky indeed! At NLeSC we worked on an automatic
    open-source alignment tool
    (<a href="https://github.com/NLeSC/PattyAnalytics/blob/master/scripts/registration.py" target="_blank">https://github.com/NLeSC/PattyAnalytics/blob/master/scripts/registration.py</a>)
    which works in some cases for aligning point clouds of
    archaeological monuments (from photogrammetry) with a lidar dataset.
    For other cases we have a manual alignment tool: a 3D desktop
    viewer based on OpenSceneGraph (which can also display meshes and
    pictures).<span class=""><br>
    <br>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_default" style="font-family:monospace,monospace"><br>
           <br>
        </div>
        <div class="gmail_default" style="font-family:monospace,monospace">I contacted the Potree
          developer one year ago to ask him whether it was possible to
          modify it to read points from a DBMS (actually patches with LOD).<br>
        </div>
        <div class="gmail_default" style="font-family:monospace,monospace">He said it was
          possible and not too difficult.<br>
        </div>
        <div class="gmail_default" style="font-family:monospace,monospace">I don't know how much
          point output you get, but we demonstrated around 20 kpts/s
          streaming to the browser (with a lot of
          serialization/deserialization). Currently the upper limit for
          such output would be a few hundred kpts/s if you send
          points, and a few million pts/s if you stream compressed
          patches.<br>
        </div>
      </div>
    </blockquote>
    <br></span>
    Currently we are getting around 200 kpoints/sec using the LAS format (I
    don't remember how much we got with LAZ), but we also have a
    not-so-good server... so I think the same solution could give a bit
    more in other situations. Anyway, if you say compressed patches in the
    DB could deliver a few million points/sec, that should be more than
    enough! And it would be nice to try! <br>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_default" style="font-family:monospace,monospace"><br>
        </div>
        <div class="gmail_default" style="font-family:monospace,monospace">Cheers<br>
        </div>
      </div>
    </blockquote>
    Regards,<br>
    <br>
    O.<div><div class="h5"><br>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_default" style="font-family:monospace,monospace"><br>
        </div>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">2015-04-20 11:58 GMT+02:00 Oscar
          Martinez Rubi <span dir="ltr"><<a href="mailto:o.martinezrubi@tudelft.nl" target="_blank">o.martinezrubi@tudelft.nl</a>></span>:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div bgcolor="#FFFFFF" text="#000000"> Whoa!<br>
              <br>
              Thanks guys for all the material! I am now busy reading it
              all!<br>
              <br>
              Remi: I had to read your mail a few times ;) Great slides
              (I actually looked at all of them, very well done!) Very
              interesting topics you are researching!<br>
              <br>
              Howard: About the Greyhound+S3 setup, what storage solution
              do you use? It is not clear... MongoDB? I mean, where are
              the points stored? File-based, DBMS? <br>
              <br>
              Paul+Nouri: The geohash tree that Nouri mentions is the
              ght compression in pgpointcloud, right? I tried it once
              but there was some limitation with the type of
              coordinates: they had to be lon and lat, so I guess there
              needs to be a reference system transformation in between,
              right? Is there any place where I can find an example of
              how to use this?<br>
              <br>
              <br>
              At NLeSC, for the visualization of data we are using a
              system based on the potree visualization (so, file-based),
              but I am very interested in the stuff you guys are
              doing and I would love to be convinced that DBMS solutions
              can be really efficient for visualization as well (I think
              it is close now!). We chose file-based and potree because
              of the initial lack of LoD support in DBMS, the speed of
              the file-based approach and the super-compressed LAZ storage.<br>
              <br>
              To see what we have done so far:<br>
              <br>
              <a href="https://github.com/NLeSC/PattyVis" target="_blank">https://github.com/NLeSC/PattyVis</a><br>
              <a href="https://www.esciencecenter.nl/project/mapping-the-via-appia-in-3d" target="_blank">https://www.esciencecenter.nl/project/mapping-the-via-appia-in-3d</a><br>
              (see the video from 1:40 for the potree-based
              visualization)<br>
              <br>
              One of the many reasons I would love to be convinced about
              DBMS solutions is that we are now considering how to
              visualize the 640-billion-point AHN2 dataset, and in a pure
              file-based solution (like potree) I fear that when
              restructuring the data into an octree we would need a number
              of octree nodes/files probably larger than what ext4 can
              handle! We will try, and I will let you know how that
              goes ;), but it would be really nice to have an efficient
              and fast DBMS-based alternative!<br>
              <br>
              I am very happy, though, with all the different work you
              are all doing and excited to see how fast things improve
              and evolve!! <br>
              <br>
              Keep on like this guys!<br>
              <br>
              Regards,<br>
              <br>
              O.
              <div>
                <div><br>
                  <br>
                  <br>
                  <div>On 17-04-15 19:01, Sabo, Nouri wrote:<br>
                  </div>
                  <blockquote type="cite">
                    <div>
                      <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Hi,</span></p>
                      <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Thank
                          you for sharing these ideas. Many of them
                          could yield improvements. In the prototype we
                          developed at RNCan, which we mentioned
                          in the attached paper, we have implemented
                          some of these concepts. For example, in the
                          prototype we sort points according to
                          the Morton pattern before creating blocks. </span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"" lang="EN-CA">Each block is then composed only
                          of points that are spatially close, thereby
                          improving the level of compression. We also
                          use the properties of the Morton curve (Z
                          pattern) to do spatial queries using a Geohash
                          as the BBox. </span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Usually,
                          in a Geohash-based system, the more the Geohash
                          prefixes of two points resemble one another,
                          the closer the two points are spatially.
                          Unfortunately, this property does not always
                          hold for two points located on
                          either side of a subdivision line. </span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"" lang="EN-CA">For this reason we implemented a
                          neighbourhood-based strategy to allow spatial
                          queries based on the hash string. </span></p>
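The Morton ordering described here - interleaving coordinate bits so that spatially close points sort near each other - can be sketched as follows. This is an illustrative 2D sketch, not the RNCan prototype's code; the 16-bit coordinate width is an assumption.

```python
def morton2d(x, y, bits=16):
    """Interleave the bits of x and y into a single Morton (Z-order) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # x bits at even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # y bits at odd positions
    return code

# Sorting points by Morton code groups spatial neighbours together,
# which is what improves per-block compression.
points = [(5, 9), (1000, 7), (6, 8), (999, 6)]
points.sort(key=lambda p: morton2d(*p))
print(points)  # [(5, 9), (6, 8), (999, 6), (1000, 7)]
```

Note how the two pairs of nearby points end up adjacent after the sort, even though the input interleaved them.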
                      <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"" lang="EN-CA">Also, to improve compression
                          and performance, we can change the encoding of
                          the Geohash. </span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Currently,
                          the hashes are encoded as base-32 strings,
                          which causes a lot of overhead (every 5 bits
                          are inflated into an 8-bit character). </span><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"" lang="EN-CA">Unfortunately, the current libght
                          does not include all the concepts of
                          GeoHashTree. </span></p>
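The encoding overhead mentioned here is easy to quantify: each base-32 character carries 5 bits of hash but is stored as an 8-bit byte, so bit-packing the hash saves roughly a third. A quick sketch (the 30-bit hash length is just an example):

```python
def geohash_sizes(hash_bits):
    """Bytes used by a base-32 geohash string vs. the packed raw bits."""
    string_bytes = hash_bits // 5        # one 8-bit char per 5 hash bits
    packed_bytes = (hash_bits + 7) // 8  # raw bits, byte-packed
    return string_bytes, packed_bytes

print(geohash_sizes(30))  # (6, 4): a 6-char geohash packs into 4 bytes
```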
                      <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"" lang="EN-CA">Oscar, I will read your paper and
                          get back to you so we can continue the exchange.</span></p>
                      <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Kind

                          regards!</span></p>
                      <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
                      <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Nouri,</span></p>
                      <p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" lang="EN-CA"> </span></p>
                      <p class="MsoNormal"><span lang="EN-CA"> </span></p>
                      <div>
                        <div style="border:none;border-top:solid #b5c4df 1.0pt;padding:3.0pt 0cm 0cm 0cm">
                          <p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"" lang="EN-US">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"" lang="EN-US"> Paul Ramsey [<a href="mailto:pramsey@cleverelephant.ca" target="_blank">mailto:pramsey@cleverelephant.ca</a>]
                              <br>
                              <b>Sent:</b> 17 avril 2015 06:56<br>
                              <b>To:</b> <a href="mailto:pgpointcloud@lists.osgeo.org" target="_blank">pgpointcloud@lists.osgeo.org</a>;
                              Peter van Oosterom; Oscar Martinez Rubi;
                              Howard Butler; Rémi Cura<br>
                              <b>Cc:</b> Sabo, Nouri<br>
                              <b>Subject:</b> Re: [pgpointcloud] RLE and
                              SIGBITS heuristics</span></p>
                        </div>
                      </div>
                      <p class="MsoNormal"> </p>
                      <div>
                        <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Hi

                            Oscar, </span></p>
                      </div>
                      <div>
                        <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">This

                            sounds like a slightly more sophisticated
                            version of the work done at Natural
                            Resources Canada for what they call “geohash
                            tree”. They did find that they got pretty
                            good compression (even with the simple
                            ascii-based key!) using the scheme, and it
                            did allow easy random access to subsets of
                            the data.</span></p>
                      </div>
                      <div>
                        <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
                      </div>
                      <div>
                        <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""><a href="http://2013.foss4g.org/conf/programme/presentations/60/" target="_blank">http://2013.foss4g.org/conf/programme/presentations/60/</a></span></p>
                      </div>
                      <div>
                        <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
                      </div>
                      <div>
                        <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">The

                            downside was of course the cost of sorting
                            things in the first place, but for a
                            one-time cost on frequently accessed data,
                            it’s not a bad thing. The “libght” soft
                            dependency in pgpointcloud is to a (not so
                            great) implementation of the scheme that I
                            did for them a couple years ago. As a
                            scheme, I think it cuts against the idea of
                            having small patches that is core to the
                            pgpointcloud concept. It makes more and more
                            sense the larger your file is, in that it
                            gets greater and greater leverage for random
                            access.</span></p>
                      </div>
                      <div>
                        <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">ATB,</span></p>
                      </div>
                      <div>
                        <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">P.</span></p>
                      </div>
                      <div>
                        <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
                      </div>
                      <div>
                        <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">-- <br>
                            Paul Ramsey<br>
                            <a href="http://cleverelephant.ca" target="_blank">http://cleverelephant.ca</a></span></p>
                        <div>
                          <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""><a href="http://postgis.net" target="_blank">http://postgis.net</a> </span></p>
                        </div>
                      </div>
                      <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
                      <p><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif";color:black">On

                          April 17, 2015 at 11:02:47 AM, Oscar Martinez
                          Rubi (<a href="mailto:o.martinezrubi@tudelft.nl" target="_blank">o.martinezrubi@tudelft.nl</a>)
                          wrote:</span></p>
                      <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
                        <div>
                          <div>
                            <p class="MsoNormal" style="margin-bottom:12.0pt"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">Hi,<br>
                                <br>
                                About the XYZ binding for better
                                compression: in our research at the NL
                                eScience Center and TU Delft we have
                                been thinking (though not yet testing)
                                about one possible approach for this.<br>
                                <br>
                                It is based on using space-filling
                                curves. Once you have the points
                                that go in a block, you could compute the
                                Morton/Hilbert code of the XYZ. Since
                                all the points are close together, such
                                codes will be extremely similar, so one
                                could store only the increments, which
                                fit in very few bits. We have not
                                tested or compared this with any of the
                                other compressions, but we wanted to
                                share it with you in case you find
                                it useful!<br>
                                <br>
                                An additional improvement would be to
                                sort the points within the blocks
                                according to the Morton code. Then, when
                                doing crop/filter operations on the
                                blocks, one can use the Morton codes for
                                the queries, similarly to what we
                                presented in our papers with the flat
                                table (without blocks); I attach one of
                                them (see section 5.2). In a nutshell:
                                you convert the query region into a set
                                of quadtree/octree nodes, which can also
                                be converted to Morton code ranges
                                (thanks to the relation between the
                                Morton/Hilbert curve and a
                                quadtree/octree). You scale down the
                                ranges to increments (as you did when
                                storing the points of the block) and then
                                you simply do range queries on the sorted
                                data with a binary search. In this
                                way you avoid decompressing the
                                Morton code for most of the block. This
                                filtering is equivalent to a bbox filter,
                                so it still requires a point-in-polygon
                                check for some of the points.<br>
                                <br>
                                Kind Regards,<br>
                                <br>
                                Oscar.<br>
                                <br>
                              </span></p>
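Oscar's two ideas above - storing only increments of sorted Morton codes, and answering a quadtree node's Morton range as a binary-search range scan over the sorted codes - can be sketched together. This is a minimal illustration under those assumptions, not the implementation from the paper:

```python
import bisect

def delta_encode(sorted_codes):
    """Store the first code plus small increments (the 'few bits' idea)."""
    deltas = [sorted_codes[0]]
    for prev, cur in zip(sorted_codes, sorted_codes[1:]):
        deltas.append(cur - prev)
    return deltas

def delta_decode(deltas):
    codes, total = [], 0
    for d in deltas:
        total += d
        codes.append(total)
    return codes

def range_query(sorted_codes, lo, hi):
    """Indices of codes in [lo, hi), e.g. one quadtree node's Morton range."""
    return range(bisect.bisect_left(sorted_codes, lo),
                 bisect.bisect_left(sorted_codes, hi))

codes = [147, 148, 150, 400, 402, 1033]   # Morton codes, already sorted
deltas = delta_encode(codes)
print(deltas)                              # [147, 1, 2, 250, 2, 631]
print(list(range_query(codes, 148, 403)))  # [1, 2, 3, 4]
```

Because the increments are small, they compress well, and the binary search touches only the sorted code array, so most of the block never needs to be decoded for a bbox filter.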
                            <div>
                              <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">On

                                  16-04-15 18:15, Rémi Cura wrote:</span></p>
                            </div>
                            <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
                              <div>
                                <div>
                                  <p class="MsoNormal"><span>epic fail! I had avoided html just for you</span></p>
                                </div>
                                <div>
<pre>Dataset       | subset size   | compressing     | decompressing   |
              | (Million pts) | (Million pts/s) | (Million pts/s) |
Lidar         |     473.3     |      4.49       |      4.67       |
21-attributes |     105.7     |      1.11       |      2.62       |
Stereo        |      70       |      2.44       |      7.38       |</pre>
                                </div>
                                <div>
                                  <p class="MsoNormal"><span>Cheers</span></p>
                                </div>
                              </div>
                              <div>
                                <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
                                <div>
                                  <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">2015-04-16

                                      17:42 GMT+02:00 Sandro Santilli
                                      <<a href="mailto:strk@keybit.net" target="_blank">strk@keybit.net</a>>:</span></p>
                                  <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">On

                                      Thu, Apr 16, 2015 at 05:30:12PM
                                      +0200, Rémi Cura wrote:<br>
                                      > OUps<br>
                                      ><br>
                                      > Dataset        |  subset
                                      size(Million pts) | compressing
                                      (Million pts/s) |<br>
                                      > decompressing (Million pts/s)<br>
                                      > Lidar           |           
                                      473.3                |           
                                         4,49<br>
                                      >               |           
                                       __4,67__<br>
                                      > 21 attributes |         
                                       105.7                 |<br>
                                      > 1,11                     |   
                                               2,62<br>
                                      > Stereo         |             
                                      70                  |             
                                        2,44<br>
                                      >                |           
                                       7,38<br>
                                      <br>
                                      These tables aren't really
                                      readable here.<br>
                                      Could you make sure to use a
                                      fixed-width font to write those
                                      tables<br>
                                      and to keep lines within 70
                                      columns at most ?<br>
                                      <br>
                                      --strk;</span></p>
                                </div>
                                <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
                              </div>
                              <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""><br>
                                  <br>
                                  <br>
                                </span></p>
                              <pre>_______________________________________________</pre>
                              <pre>pgpointcloud mailing list</pre>
                              <pre><a href="mailto:pgpointcloud@lists.osgeo.org" target="_blank">pgpointcloud@lists.osgeo.org</a></pre>
                              <pre><a href="http://lists.osgeo.org/cgi-bin/mailman/listinfo/pgpointcloud" target="_blank">http://lists.osgeo.org/cgi-bin/mailman/listinfo/pgpointcloud</a></pre>
                            </blockquote>
                            <p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif""> </span></p>
                            <div class="MsoNormal" style="text-align:center" align="center"><span style="font-size:10.0pt;font-family:"Helvetica","sans-serif"">
                                <hr align="center" size="2" width="100%">
                              </span></div>
                          </div>
                        </div>
                      </blockquote>
                    </div>
                  </blockquote>
                  <br>
                </div>
              </div>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  </div></div></div>

<br>_______________________________________________<br>
pgpointcloud mailing list<br>
<a href="mailto:pgpointcloud@lists.osgeo.org">pgpointcloud@lists.osgeo.org</a><br>
<a href="http://lists.osgeo.org/cgi-bin/mailman/listinfo/pgpointcloud" target="_blank">http://lists.osgeo.org/cgi-bin/mailman/listinfo/pgpointcloud</a><br></blockquote></div><br></div>