<html>
<head>
<title></title>
<meta name="GENERATOR" content="MSHTML 8.00.6001.18852"></meta>
<meta name="SKYPE_FRAMEID" content="UKSDOXEVME"></meta>
<meta id="skype_v3_tb_marker_id" name="SKYPE_PARSING_HAS_FINISHED" content="metacontent"></meta>
</head>
<body>Yes, I think you are right that my mistake was to use malloc.
<div align="left">/Nicklas<br />
<br />
2009-12-01 Paul Ramsey wrote:<br />
<br />
Nick,<br />
><br />
>You asked if maybe you did something naive and the answer is "possibly<br />
>yes". For example, if you did your allocations using malloc() you got<br />
>a much more expensive heap allocation than if you used palloc() (or<br />
>lwalloc() in the context of a pgsql build).<br />
><br />
>But anyhow, yes, knowing the number of edges before-hand, and knowing<br />
>our tree building scheme, we could easily allocate the whole segment<br />
>in one go at the start (~1.5 * npoints * sizeof(node)).<br />
><br />
>For 2.0, I've been thinking about the way to make this stuff "real",<br />
>and I'm pretty sure the answer is to stick an extra handle on<br />
>POINTARRAY for holding these optional structures. Then, an<br />
>lwgeom_index(LWGEOM *g) function would walk into an LWGEOM, ensure all<br />
>the *bbox pointers are filled in, and build the trees on all the<br />
>POINTARRAYs. Then the whole thing could be neatly cached, if<br />
>necessary, etc.<br />
><br />
>P.<br />
><br />
>On Tue, Dec 1, 2009 at 10:16 AM, Nicklas Avén<br />
>
<nicklas.aven@jordogskog.no></nicklas.aven@jordogskog.no> wrote:<br />
>> aul<br />
>> If it is as I suspect that the number of memory allocations is the most<br />
>> costly part of this, wouldn't it be possible to allocate all of it once and<br />
>> for all. If you allocate one list of nodes for each level, the whole<br />
>> structure of the tree is given. You know where the children and parents is<br />
>> related to the start of the "level lists" the only thing missing is the min<br />
>> and max x any y for each level.<br />
>><br />
>> /Nicklas<br />
>><br />
>> 2009-12-01 Nicklas Avén wrote:<br />
>><br />
>> :-) ><br />
>><br />
>><br />
>>><br />
>> I think I see what you are doing<br />
>>><br />
>> Had to take another look...<br />
>>><br />
>><br />
>>><br />
>> The comarasion part is the same as I used for subgeometries and as you say<br />
>> (I saw the blog too) the natural ordering of the edges should make the tree<br />
>> very effective.<br />
>>><br />
>><br />
>>><br />
>> But I think you have to cache the tree to beat my calculation :-)<br />
>>><br />
>> That belive comes from some trying I did when writing the code, before I had<br />
>> found the qsort function. Then I build a tree as described in that C-bible I<br />
>> can't remember the name, just to get the measure-values ordered. I think it<br />
>> was the number of memory-allocations that showed to be very expensive. I<br />
>> might have done som elementary mistake but when I tried that tree with one<br />
>> node per vertex on the largest polygon of Alaska (about 70000 vertexes) it<br />
>> used almost 7 seconds just to build the tree and read it ordered. The whole<br />
>> calculation for distance between the largest polygons in Alaska and Texas<br />
>> now uses less than 100 ms. But it is quite likely that I had done som<br />
>> mistake in my treebuilding.<br />
>>><br />
>> (That calculation Alaska Texas is extremly fast because Texas has that sharp<br />
>> edge aginst Alaska which is easy for my algoritm to catch.)<br />
>>><br />
>><br />
>>><br />
>> /Nicklas<br />
>>><br />
>>><br />
>><br />
>>><br />
>><br />
>>><br />
>>> 2009-12-01 Nicklas Avén wrote:<br />
>>><br />
>>> ><br />
>>><br />
>> wow, I think postgis will be hard to beat in the long run :-)<br />
>>> ><br />
>><br />
>>> ><br />
>> I'm not used to study those tree things, but it will be very interesting to<br />
>> follow this and try to understand it.<br />
>>> ><br />
>> Will it work on overlapping bboxes too?<br />
>>> ><br />
>><br />
>>> ><br />
>> One thing I think is nice about "my" way of doing it, is that there is no<br />
>> indexes included. The gain is that it can be nested in subqueries and CTE's<br />
>> without loosing the speed. But I haven't really understood when that problem<br />
>> occur of not using the indexes.<br />
>>> ><br />
>><br />
>>> ><br />
>> I have been thinking some in 3D terms too, totally without knowledge about<br />
>> reality. But it shouldn't be too impossible to do the same tricks in 3D and<br />
>> calculating point to surface instead of point to segment. But that would<br />
>> need a 3D type. I don't know how that use to be built, but I have imagined a<br />
>> list of 3D points and a related list with pointidentifiers in groups of<br />
>> three, defining triangels. Then it would be quite fast to do like my<br />
>> distance trick on the points and finding the surfaces to calculate distances<br />
>> to with some index on the pointidentifiers list...<br />
>>> ><br />
>><br />
>>> ><br />
>> /Nicklas<br />
>>> ><br />
>><br />
>>> ><br />
>><br />
>>> ><br />
>>> ><br />
>>> > 2009-12-01 Paul Ramsey wrote:<br />
>>> ><br />
>>> > Niklas,<br />
>>> > ><br />
>>> > >Your cunning sorting trick got into my head and I thought and thought<br />
>>> > >on it and ended up borrowing the magic tree idea from Mark's prepared<br />
>>> > >p-i-p code, and added another dimension to that, so the trees are<br />
>>> > >actually in 2d space, and then starting thinking about how, given two<br />
>>> > >of these trees you could quickly find things like intersections or<br />
>>> > >even (*gasp*) distances...<br />
>>> > ><br />
>>> > >geog=# select distance(buffer('POINT(3 3)', 2),buffer('POINT(0 0)',<br />
>>> > > 2));<br />
>>> > > distance<br />
>>> > >-------------------<br />
>>> > > 0.242640687119285<br />
>>> > >(1 row)<br />
>>> > ><br />
>>> > >geog=# select npoints(buffer('POINT(3 3)', 2));<br />
>>> > > npoints<br />
>>> > >---------<br />
>>> > > 33<br />
>>> > >(1 row)<br />
>>> > ><br />
>>> > >geog=# select 33*33;<br />
>>> > > ?column?<br />
>>> > >----------<br />
>>> > > 1089<br />
>>> > >(1 row)<br />
>>> > ><br />
>>> > >These two circles have 33 points each, so the brute force method<br />
>>> > >requires 1089 distance calculations to come up with an answer. The<br />
>>> > >magic trees took 29. It's also possible to *cache* the trees and<br />
>>> > >re-use them, both for distance calculations and intersections and<br />
>>> > >containments.<br />
>>> > ><br />
>>> > >Very exciting stuffs!<br />
>>> > ><br />
>>> > >However, we're almost to 1.5 and adding all the features you already<br />
>>> > >have in your code (maximums, points and lines of shortness, etc) would<br />
>>> > >take way too long. And I'm not sure my code is actually faster. Just<br />
>>> > >that it's much faster than the brute force ways.<br />
>>> > ><br />
>>> > >It's all in lwtree.c in liblwgeom if you want to look.<br />
>>> > ><br />
>>> > >P.<br />
>>> > >_______________________________________________<br />
>>> > >postgis-devel mailing list<br />
>>> > >postgis-devel@postgis.refractions.net<br />
>>> > >postgis.refractions.net/mailman/listinfo/postgis-devel<br />
>>> > ><br />
>>> > ><br />
>> _______________________________________________<br />
>> postgis-devel mailing list<br />
>> postgis-devel@postgis.refractions.net<br />
>> http://postgis.refractions.net/mailman/listinfo/postgis-devel<br />
>><br />
>><br />
>_______________________________________________<br />
>postgis-devel mailing list<br />
>postgis-devel@postgis.refractions.net<br />
>http://postgis.refractions.net/mailman/listinfo/postgis-devel<br />
><br />
></div>
</body>
</html>