[gdal-dev] Fast Pixel Access

Even Rouault even.rouault at mines-paris.org
Sun Feb 9 04:36:06 PST 2014


On Saturday 01 February 2014 15:04:46, David Baker (Geoscience) wrote:
> Even,
> 
> I am not sure how to profile as I do not have access to the code to
> profile.  I did do a timing test...
> 
> vrt file = 22,970 KB
> bil file = 35,180 KB * 55,501
> 
> I piped five locations from the loc.txt file:
> -96.0 36.0
> -98.0 37.0
> -100.0 38.0
> -99.0 39.0
> -101.0 35.0
> 
> gdallocationinfo -valonly -geoloc intermap.vrt < loc.txt
> 189.841857910156        25.5 sec
> 384.857452392578        22.6 sec
> 762.015930175781        22.9 sec
> 550.719116210938        23.6 sec
> 883.637023925781        22.9 sec
> 
> Note: I used a lap timer on my iPhone to capture the split times as the
> results appeared in the console window.  Does this give any insight?

Wow, I agree that's extremely slow! When you mentioned slow, I thought it was 
more on the order of 0.1 second! We can already rule out the parsing time of 
the VRT, since all the queries run in the same gdallocationinfo session and 
the VRT is parsed only once.
And I can't believe that the intersection test for 55,000 rectangles takes ~ 
20 seconds, unless you have an old i386 at 5 MHz ;-)
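
To put the intersection cost in perspective, here is a rough sketch of a 
brute-force point-in-rectangle scan over about 55,000 tile extents. This is 
not the VRT driver's actual code: the grid origin, the 300 x 185 layout and 
the function names build_tiles and find_tiles are assumptions made up for 
illustration. Only the 7.5 minute (0.125 degree) tile size comes from the 
thread.

```python
import time


def build_tiles(cols=300, rows=185, size=0.125, x0=-125.0, y0=25.0):
    """Build a hypothetical regular grid of 0.125-degree tile extents.

    Each extent is (xmin, ymin, xmax, ymax). The origin and grid shape are
    invented; they only give roughly the tile count mentioned in the thread.
    """
    return [
        (x0 + c * size, y0 + r * size, x0 + (c + 1) * size, y0 + (r + 1) * size)
        for r in range(rows)
        for c in range(cols)
    ]


def find_tiles(tiles, x, y):
    """Linear scan: return the indices of all extents that contain (x, y)."""
    return [
        i
        for i, (xmin, ymin, xmax, ymax) in enumerate(tiles)
        if xmin <= x < xmax and ymin <= y < ymax
    ]


if __name__ == "__main__":
    tiles = build_tiles()
    start = time.perf_counter()
    hits = find_tiles(tiles, -96.0, 36.0)
    elapsed = time.perf_counter() - start
    print(f"{len(tiles)} rectangles scanned, {len(hits)} hit(s), {elapsed:.4f} s")
```

Even an unindexed scan like this, in an interpreted language, should finish 
far faster than the 20+ seconds per query reported above, which is why the 
intersection test alone looks like an unlikely culprit.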
My usual way of profiling something that takes more than about a second is to 
run it under gdb, break with Ctrl+C, display the stack trace (bt), continue 
the run, break again, display the stack trace, etc. If you keep ending up 
in the same function, then you've found the bottleneck.
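
If you collect more than a handful of samples, tallying them by hand gets 
tedious. A minimal sketch of the bookkeeping, assuming you have copied the 
function names from each gdb backtrace into a list (innermost frame first). 
The helper names tally_samples and innermost_hotspot are made up for this 
example, and the frame names in the usage note are placeholders, not a real 
trace from GDAL.

```python
from collections import Counter


def tally_samples(samples):
    """Count in how many sampled stacks each function appears.

    Each sample is a list of function names, innermost frame first.
    A function is counted at most once per sample, even if it recurses.
    """
    counts = Counter()
    for stack in samples:
        counts.update(set(stack))
    return counts


def innermost_hotspot(samples):
    """Return (function, share of samples) for the most common innermost frame."""
    top = Counter(stack[0] for stack in samples if stack)
    name, hits = top.most_common(1)[0]
    return name, hits / len(samples)
```

For example, with the placeholder samples [["readdir", "open_tile", "main"], 
["readdir", "open_tile", "main"], ["parse_xml", "main"]], innermost_hotspot 
reports "readdir" as the innermost frame in two samples out of three.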

I see now that in that thread GDAL_DISABLE_READDIR_ON_OPEN = TRUE was 
suggested and seems to improve things significantly. Perhaps we should try to 
cache the result of the initial readdir so that later attempts can benefit 
from it, but I haven't checked how easily that could be implemented. Or 
perhaps we should just change the default value of 
GDAL_DISABLE_READDIR_ON_OPEN, since it causes problems from time to time.
But generally filesystems don't behave very well when there are a lot of files 
in the same directory. You'd be better off organizing your tiles in 
subdirectories.
But still, 1 to 3 seconds sounds a bit slow to me. It would be great if you 
could try the gdb approach above to identify where the time is spent.
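
For scripted use, the option can be set as an environment variable for the 
child process. A minimal sketch, assuming gdallocationinfo is on the PATH. 
The helper names readdir_disabled_env and query_locations are invented for 
this example, and the command line is the one David used above.

```python
import os
import subprocess


def readdir_disabled_env(base_env=None):
    """Return a copy of the environment with the readdir-on-open scan disabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["GDAL_DISABLE_READDIR_ON_OPEN"] = "TRUE"
    return env


def query_locations(vrt_path, locations):
    """Query (lon, lat) pairs in a single gdallocationinfo session.

    All locations are piped on stdin, so the VRT is opened only once.
    Returns the values printed by -valonly, parsed as floats.
    """
    stdin_text = "".join(f"{lon} {lat}\n" for lon, lat in locations)
    result = subprocess.run(
        ["gdallocationinfo", "-valonly", "-geoloc", vrt_path],
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,
        env=readdir_disabled_env(),
    )
    return [float(value) for value in result.stdout.split()]
```

Something like query_locations("intermap.vrt", [(-96.0, 36.0), (-98.0, 37.0)]) 
would then repeat the test above with the option turned on.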

Even

> 
> David
> 
> -----Original Message-----
> From: gdal-dev-bounces at lists.osgeo.org
> [mailto:gdal-dev-bounces at lists.osgeo.org] On Behalf Of Even Rouault Sent:
> Saturday, February 01, 2014 1:28 AM
> To: Brian Case
> Cc: gdal-dev at lists.osgeo.org
> Subject: Re: [gdal-dev] Fast Pixel Access
> 
> On Saturday 01 February 2014 00:23:13, Brian Case wrote:
> > evenr
> > 
> > 
> > what about the use of a tileindex?
> 
> You really mean a tileindex as produced by gdaltindex ? Well, that's not
> exactly the same beast as a VRT, but yes if it was recognized as a GDAL
> dataset then you could potentially save the cost of XML parsing. One could
> imagine that the VRT driver would accept a tileindex as an alternate
> connection string.
> 
> Anyway it would be interesting to first profile where the time is spent in
> David's use case. If it's in the XML parsing, then I can't see what could be
> easily improved in that area. If it's the intersection, then there's
> potential for improvement.
> 
> > seems an intersection with a set of
> > polys first would be quick
> > 
> > 
> > 
> > brian
> > 
> > On Fri, 2014-01-31 at 19:30 +0100, Even Rouault wrote:
> > > On Friday 31 January 2014 17:15:53, David Baker (Geoscience) wrote:
> > > > Dev's,
> > > > 
> > > > I have a set of 55,501 bil files in a single directory.  They are
> > > > DEM data that cover the US in 7.5 minute tiles.  I would like to
> > > > randomly access elevations at given lat/lon locations from the whole
> > > > dataset.  I created a vrt file from the directory of bil files, and
> > > > have been able to access the elevation at a given lat/lon using
> > > > gdallocationinfo, but because of the size of the dataset, this
> > > > operation is somewhat slow. Can the vrt be indexed?
> > > 
> > > No, it isn't currently, although I think it could be improved to have
> > > an in-memory index with moderate effort.
> > > 
> > > But are you sure the slowness is due to the lack of index ? 55,000 is a
> > > big number, but not that big. Maybe the slowness just comes from the
> > > opening time (XML parsing) of such a big VRT. That would need to be
> > > profiled to be sure where the bottleneck is.
> > > 
> > > > Or, is there a faster, better way to access the pixels?  I would
> > > > first like to do this with the utilities before diving into code
> > > > (C#). The files are regularly named based on their location within a
> > > > 1 arc-second grid.
> > > > 
> > > > Thanks,
> > > > David
> > > > 
> > > > David M. Baker
> > > > Senior Advisor - Geoscience Technology
> > > > Chesapeake Energy Corporation
> > > > david.m.baker at chk.com<mailto:david.m.baker at chk.com>
> > > > 
> > > > 
> > > > ________________________________
> > > > 
> > > > This email (and attachments if any) is intended only for the use of
> > > > the individual or entity to which it is addressed, and may contain
> > > > information that is confidential or privileged and exempt from
> > > > disclosure under applicable law. If the reader of this email is not
> > > > the intended recipient, or the employee or agent responsible for
> > > > delivering this message to the intended recipient, you are hereby
> > > > notified that any dissemination, distribution or copying of this
> > > > communication is strictly prohibited. If you have received this
> > > > communication in error, please notify the sender immediately by
> > > > return email and destroy all copies of the email (and attachments if
> > > > any).
> 
> --
> Geospatial professional services
> http://even.rouault.free.fr/services.html

-- 
Geospatial professional services
http://even.rouault.free.fr/services.html

