[gdal-dev] Request for comments on Rasterio documentation

Even Rouault even.rouault at spatialys.com
Fri Sep 9 02:40:36 PDT 2016


Sean,

I actually have a question prompted by Howard's tweet yesterday about the mutual 
incompatibility between osgeo.gdal and rasterio. From my reading of the 
doc and a look at the rasterio code ( 
https://github.com/mapbox/rasterio/blob/master/rasterio/_drivers.pyx and 
https://github.com/mapbox/rasterio/blob/master/rasterio/env.py ), it seems the 
potential issues would be about driver registration, config options and error 
handlers, right?

But as rasterio.Env() seems to take care of restoring the environment to 
what it was before, I would have thought that folks could safely do things like

1. do osgeo.gdal stuff
2. with rasterio.Env(): do rasterio stuff
3. do osgeo.gdal stuff
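
To make the intended save/restore behaviour of those three steps concrete, 
here is a toy model in plain Python. It does not touch GDAL or rasterio: the 
_config dict and the scoped_options() helper are hypothetical stand-ins for 
the global config options and for what rasterio.Env() appears to do.

```python
from contextlib import contextmanager

# Stand-in for GDAL's global config options; NOT the real GDAL state.
_config = {}


@contextmanager
def scoped_options(**options):
    """Set options for the duration of the block, then restore the old state."""
    saved = dict(_config)            # snapshot taken on entry
    _config.update(options)
    try:
        yield _config
    finally:
        _config.clear()
        _config.update(saved)        # restored on exit, even after an exception


# Step 1: code outside the block sets an option globally.
_config["GDAL_NUM_THREADS"] = "2"

# Step 2: the scoped block overrides one option and adds another.
with scoped_options(GDAL_NUM_THREADS="ALL_CPUS", CPL_DEBUG="ON"):
    inside = dict(_config)

# Step 3: the state from step 1 is back.
after = dict(_config)
```

In this model, step 3 sees exactly what step 1 left behind, which is the 
property that would make mixing the two libraries safe for config options.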

Actually, taking a look at rasterio/_drivers.pyx, it seems that the error 
handler isn't restored, so that could be an issue if folks in 1. had installed 
a custom global one. I guess the reason for not restoring the previous handler 
is for workflows like

with rasterio.Env():
     src = rasterio.open(...)
do stuff with src

where you cannot restore the error handler at a precise point.

To address this, a possibility would be for rasterio to use 
CPLPushErrorHandler() / CPLPopErrorHandler() around each call, or group of 
calls, it makes to GDAL.
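
A minimal sketch of that push/pop pattern, written so it runs without GDAL: 
scoped_error_handler() is a hypothetical helper, and a plain Python list 
stands in for GDAL's error handler stack.

```python
from contextlib import contextmanager


@contextmanager
def scoped_error_handler(push, pop, handler):
    """Install `handler` via push() and always undo it via pop()."""
    push(handler)
    try:
        yield
    finally:
        pop()                        # runs even if the wrapped call raises


# Toy handler stack; the bottom entry plays the role of a handler that the
# user installed earlier (step 1).
stack = ["user_handler"]

# One "call, or group of calls" made under a temporary handler.
with scoped_error_handler(stack.append, stack.pop, "rasterio_handler"):
    active_inside = stack[-1]

active_after = stack[-1]
```

With the SWIG bindings one could presumably pass gdal.PushErrorHandler and 
gdal.PopErrorHandler as push and pop; rasterio itself would make the 
equivalent C calls from its Cython code.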

GDALEnv.start() calls GDALAllRegister(), so potentially if folks had 
deregistered a driver in 1., they will find it registered again in step 3., but 
that's a pretty uncommon use case (I'm not aware of anyone using 
driver.Deregister() / driver.Register() except in the GDAL autotest suite, to 
make sure the right JPEG2000 driver is used).

Apart from that, it isn't obvious to me why things couldn't be mixed in the 
same process. I guess that even in 2. you could call osgeo.gdal stuff in a 
reasonably safe way (with the environment & error handler set by rasterio of 
course)

Reading again, I'm not sure that "Additionally, gdal and Rasterio register 
conflicting error handlers and thus the propagation of exceptions and warnings 
may depend on which module was imported last." is true. Importing osgeo.gdal 
only calls GDALAllRegister() and, as far as I understand, importing rasterio is 
a no-op regarding calls to the GDAL API.


Regarding the example 'with rasterio.Env(GDAL_CACHEMAX=512):', there's a 
potential pitfall in that the GDAL_CACHEMAX config option is read *only* the 
first time GDALGetCacheMax()/GDALGetCacheMax64() is called, so changing it 
afterwards will have no effect.

So if you do things like

with rasterio.Env(GDAL_CACHEMAX=512):
    rasterio.open(...)

with rasterio.Env(GDAL_CACHEMAX=1024):
    rasterio.open(...)  # <-- still only sees the effect of GDAL_CACHEMAX=512

If you want to actively change the value you must call GDALSetCacheMax() / 
GDALSetCacheMax64().
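
To illustrate the read-once behaviour, here is a toy model in plain Python. 
ToyCache and its methods are hypothetical names, the default value is an 
arbitrary placeholder, and sizes are kept in megabytes for simplicity; none of 
this is GDAL's actual implementation.

```python
class ToyCache:
    """Toy model of a setting read from config only on first access."""

    def __init__(self, config, default_mb=64):
        self._config = config        # shared dict standing in for config options
        self._default_mb = default_mb  # arbitrary placeholder, not GDAL's default
        self._max_mb = None          # not initialised until first get_max()

    def get_max(self):
        if self._max_mb is None:     # the config option is consulted only once
            self._max_mb = int(self._config.get("GDAL_CACHEMAX", self._default_mb))
        return self._max_mb

    def set_max(self, value_mb):
        self._max_mb = value_mb      # explicit setter bypasses the config option


config = {"GDAL_CACHEMAX": "512"}
cache = ToyCache(config)
first = cache.get_max()              # reads the option: 512

config["GDAL_CACHEMAX"] = "1024"     # changing the option afterwards...
second = cache.get_max()             # ...has no effect: still 512

cache.set_max(1024)                  # the explicit setter does take effect
third = cache.get_max()
```

Note that the real GDALSetCacheMax() takes a size in bytes, not megabytes.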

On a rasterio design choice: I see that the master representation used 
in rasterio for a CRS is the proj.4 string (or +init=epsg:XXXX when detected), 
right? So "src.crs.wkt" will not give you back the original WKT string coming 
from GDAL but one that is reconstructed by using OSRImportFromProj4() (or 
OSRImportFromEPSG()) and then OSRExportToWkt(). This might work well in most 
use cases where the CRS is not too exotic, but you might lose for example the 
actual datum name, which could be a problem if folks want to select another 
datum shift than the one proposed by default, or think about CRSs for imagery 
of other planets.
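
A toy illustration of why such a round trip is lossy, in plain Python. The 
record layout, the made-up datum name and the to_proj4() / from_proj4() 
helpers are all hypothetical; this is not how OSR represents or converts a 
CRS, only a model of "export keeps the parameters, not the name".

```python
# Toy CRS record; field names and values are made up for illustration.
original = {
    "datum_name": "Some_Local_Datum",
    "ellps": "GRS80",
    "towgs84": "0,0,0",
}


def to_proj4(crs):
    """Export only the numeric parameters; the datum name has no slot here."""
    return "+ellps={ellps} +towgs84={towgs84}".format(**crs)


def from_proj4(text):
    """Rebuild a record from the parameters; the datum name must be invented."""
    params = dict(tok.lstrip("+").split("=", 1) for tok in text.split())
    return {"datum_name": "unknown", **params}


round_tripped = from_proj4(to_proj4(original))
```

Here the ellipsoid and shift parameters survive, but round_tripped carries a 
placeholder datum name rather than the original one.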


Even

> Hi all,
> 
> I've been working on an open source project called Rasterio,
> https://github.com/mapbox/rasterio, which is an attempt to unite the
> unique, best parts of GDAL with modern Python language features. Rasterio
> is a package of Python C extension modules. It loads libgdal like the SWIG
> bindings do, but does not import osgeo.gdal at all.
> 
> As a part of my push to release Rasterio 1.0 this fall, I'm writing a lot
> of documentation. One particular document is called "Switching from GDAL’s
> Python bindings."
> 
> https://mapbox.github.io/rasterio/switch.html
> 
> It's not a manifesto for why a developer should switch, but a comparison of
> the osgeo.gdal and rasterio modules and a summary of the gotchas and
> lifelines. I would be very grateful for comments on the doc. Does it
> explain the similarities between the SWIG bindings and Rasterio adequately?
> Does it answer any questions that you've had about Rasterio?
> 
> Comments may be left at https://github.com/mapbox/rasterio/issues/872,
> addressed to me personally, or here on gdal-dev as long as they're on topic
> (primarily about the GDAL data model and implementation).
> 
> Thanks very much in advance,

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com

