[postgis-users] Crosses Performance

Tue Sep 20 12:51:03 PDT 2005

I think the key point here is your first one:

"As the lines are long and the area of interest is relatively small,  
the && and geometry index don't do much"

So you have these long lines with big bounding rectangles that  
intersect lots and lots of your cells, even though the lines  
themselves only intersect a few.

How about this:

Take your lines table and bust it up into a table with a lot more  
rows, and cut up all your lines into two-vertex lines, with an id  
that links back to the parent line.

So if you had a line (call it 1) and it went from point A to B to C  
to D, in your new table you would have three rows, A to B, B to C,  
and C to D, and each row would still reference back to the line 1  
identifier.  Now you can still quickly get the answer, what cells  
does line 1 cross, but your bounding boxes are much more efficient,  
so your query should use the index more effectively.  As you noted  
above, you could go so far as reduce the problem to just the points,  
though you would miss a few cells that are crossed by lines but do  
not include end points.

Paul

On 20-Sep-05, at 10:35 AM, David Bitner wrote:

> I'm looking at optimizing a query of mine that takes a uniform set  
> of grid cells and determines how many aircraft flight tracks cross  
> each grid cell.  As the lines are long and the area of interest is  
> relatively small, the && and geometry index don't do much, most of  
> the index speed comes from filtering the flight tracks based on  
> time (as of now there are over half a million flight track records  
> and this is likely to increase to over 2 million).
>
> I have two approaches that I have thought of that might speed up  
> these queries that I wanted to see if anyone could comment on  
> whether they (or any other ideas) are likely to result in any  
> significant performance gain.
>
> The first approach:
>
> The flight track data points that make up the line are taken by  
> radar at an interval of 4 seconds, as most tracks are fairly  
> straight, a simplify operation would decrease the size drastically,  
> but I really need to maintain the high accuracy of the crossing.   
> My idea is to maintain two geometry columns, the original and a  
> majorly simplified geometry.  I would then do a first pass using  
> the distance(simple_geom,grid)<tolerance where tolerance is the  
> value that I used for simplification.  Then from those results do a  
> distance(first_result_geom,grid)=0.  This adds a whole other step,  
> but I believe the distance query agains a much simpler (I think  
> between one tenth and one third the number of points) might be fast  
> enough to jusitify the additional step.
>
> The second approach:
>
> Maintain all of the points from the lines as a point dataset as  
> well and just do a count on the unique id's of points that fall  
> within the time period and the grid geometry.
>
> The point of this query is to provide a density map of flight  
> tracks over an area during a period of time for display in  
> Mapserver so performance is critical, right now I limit the time  
> period to a single day and I make my grid cells rather large to  
> keep my draw time reasonably fast, but I am really tring to get a  
> nice surface to display for time periods orders of magnitude longer.
>
> Thanks for any pointers or advice,
> David
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>