Fwd: Re: [gdal-dev] FWTools and GDAL 1.7.0

Jason Roberts jason.roberts at duke.edu
Thu Jan 6 10:25:00 EST 2011


Tamas,

 

Apologies in advance for repeating stuff below that you already know.

 

Assume the scenario of a Windows Python programmer who wants to use GDAL
from Python like he uses other Python packages, and is not interested in
running GDAL command-line utilities or accessing GDAL by other means. All he
needs is the GDAL Python bindings coupled with the GDAL DLLs and associated
supporting files. He doesn't really care about those DLLs and supporting
files; he just wants the Python bindings to work with a minimum of steps.
Ideally, he would just be able to start the GDAL installer, click the Next
button four times, accepting all the default choices like he does with any
other Python package installer, watch the progress bar, and then be able to
import the osgeo modules from Python when it is complete.

 

In this scenario, Python is already installed. Multiple versions of it may
be installed. Python supports side-by-side installation of any number and
combination of versions. The default installation location is C:\PythonXY
where XY are the major and minor version numbers. If the user accepted the
defaults when he installed Python itself, that's where it will be. (That
somewhat violates Microsoft's best practice of storing programs in
C:\Program Files. I'm sure there's a long thread on that somewhere in the
Python mailing list archives, but I've never seen it.) Certain applications
that embed Python may choose a different location. For example, ArcGIS 10
includes a copy of Python 2.6 and ESRI decided to store it in
C:\Python26\ArcGIS10.0. I'm not sure why that was done, but it doesn't
really matter.

 

Python's installer maintains a list of which versions are installed and the
installation locations in the Windows Registry (a system-wide database of
config info). When a package installation program runs, assuming it was
built with the standard Python distutils technology I mentioned, it prompts
the user with a list of installed Python instances and asks which one he
wants to install the package to. The user picks one and the package files are
installed to the "site-packages" directory within that Python instance. This
will be, for example, C:\Python26\Lib\site-packages.
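
As an aside, the registry entries described above can be read from Python
itself. The sketch below is only an illustration. It assumes Python 3 (where
the module is named winreg) and the conventional
SOFTWARE\Python\PythonCore\<version>\InstallPath key layout; the function
name installed_pythons is made up for this example. On non-Windows platforms
it simply returns an empty dict.

```python
import sys


def installed_pythons():
    """Map Python version strings to install directories (Windows only).

    Hypothetical helper: reads SOFTWARE\\Python\\PythonCore from the
    per-machine and per-user registry hives as seen by this process.
    """
    if sys.platform != "win32":
        return {}
    import winreg  # only importable on Windows

    found = {}
    for hive in (winreg.HKEY_LOCAL_MACHINE, winreg.HKEY_CURRENT_USER):
        try:
            core = winreg.OpenKey(hive, r"SOFTWARE\Python\PythonCore")
        except OSError:
            continue  # no Python registered in this hive
        with core:
            index = 0
            while True:
                try:
                    version = winreg.EnumKey(core, index)
                except OSError:
                    break  # no more subkeys
                index += 1
                try:
                    with winreg.OpenKey(core, version + r"\InstallPath") as key:
                        # The key's default value holds the install directory.
                        path, _ = winreg.QueryValueEx(key, "")
                        found[version] = path
                except OSError:
                    pass  # version key without an InstallPath entry
    return found


for version, path in sorted(installed_pythons().items()):
    print(version, "->", path)
```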

 

If a Python package includes extension modules, i.e. Python modules written
in C/C++ rather than Python, they are compiled to .pyd files when the
package is built. These are actually DLLs; they just have the extension .pyd
rather than .dll and must contain a certain entry point that Python can call
after loading them. GDAL includes several of these, such as _gdal.pyd,
_gdal_array.pyd, and so on. These modules "implicitly" link to the GDAL DLLs
such as
gdal17.dll. Therefore, in order for Python to successfully load _gdal.pyd
when the Python program imports the osgeo.gdal package, gdal17.dll has to be
locatable by Windows, along with the other modules that gdal17.dll itself
depends on.
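
When gdal17.dll or one of its dependencies cannot be located, the problem
surfaces in Python as an ImportError at import time (on Windows the message
typically mentions that a DLL load failed). A minimal sketch of catching and
reporting that, where try_import is a hypothetical helper name:

```python
import importlib


def try_import(module_name):
    """Try to import a module; return (module, None) or (None, error text)."""
    try:
        return importlib.import_module(module_name), None
    except ImportError as exc:
        # Raised both when the package is missing and when a .pyd file's
        # dependent DLLs cannot be loaded.
        return None, str(exc)


module, error = try_import("osgeo.gdal")
if module is None:
    print("import failed:", error)
```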

 

There are several ways that gdal17.dll might be locatable. Here is what
Windows does:
http://msdn.microsoft.com/en-us/library/7d83bc18%28v=vs.80%29.aspx.
Unfortunately, none of those are optimal for GDAL's Python bindings. Under
the first option, the executable module will typically but not always be
C:\PythonXY\python.exe. If an embedding application loads the Python
interpreter, it will be whatever executable that program is (e.g. C:\Program
Files\ArcGIS\bin\ArcMap.exe). So this is not a good choice.

 

The second choice, the current directory, might be something to try. The
GDAL Python bindings, e.g. gdal.py, could be modified to call a Python
function to change the current directory to wherever the GDAL DLLs are, then
import the _gdal.pyd, then change directory back.
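
A rough sketch of that idea follows. The names working_directory and
import_with_cwd are hypothetical, and the directory passed in stands for
wherever the GDAL DLLs were installed. Note that Python 3.8 and later no
longer search the current directory when resolving an extension module's DLL
dependencies, so this shows the technique as described here rather than
something to rely on with current interpreters.

```python
import contextlib
import importlib
import os


@contextlib.contextmanager
def working_directory(path):
    """Temporarily change the current directory, restoring it on exit."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Restore the caller's directory even if the import fails.
        os.chdir(previous)


def import_with_cwd(module_name, dll_dir):
    """Import module_name while dll_dir is the current directory."""
    with working_directory(dll_dir):
        return importlib.import_module(module_name)
```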

 

The Windows system directory (C:\Windows\system32) and Windows directory
(C:\Windows) are probably not good. You guys don't want to put all of your
DLLs in there.

 

The directories listed in the PATH environment variable are the typical
solution. This is probably what most Python programmers do these days, based
on instructions from the GDAL team. They put GDAL in C:\gdal17 or something
and use the Windows Control Panel to modify the system PATH variable, so
that all Windows processes will have C:\gdal17 in the PATH. But in our
scenario, the Python programmer doesn't really care about the GDAL DLLs.
He's not planning to build something in another language that links to them.
He just wants his Python stuff to work with a minimum of fuss. So this is
less than optimal from his point of view.
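
To see whether the PATH step would find a given DLL on a particular machine,
a small helper can walk the PATH entries in order. This is only a sketch:
find_on_path is a made-up name, and it mimics just the PATH portion of the
search, not the earlier steps (executable directory, current directory,
system directories).

```python
import os


def find_on_path(filename, path=None):
    """Return the first PATH entry containing filename, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing separator
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


print(find_on_path("gdal17.dll"))
```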

 

Here are some possible alternatives:

 

1.    The minimum case: build an installation package for the Python
bindings but still require the user to manually store the GDAL binaries
someplace and set PATH, PROJ_LIB, GDAL_DATA, etc. This will at least give a
GUI installer to get the bindings installed, even if they still have to
manually install GDAL.

 

2.    Build an installation package as above. Have it install the GDAL DLLs
as a subdirectory of the osgeo directory, e.g.
C:\PythonXY\Lib\site-packages\osgeo\bin. Modify gdal.py to set
os.environ['PATH'] = os.environ['PATH'] + ';' + gdalInstallDir to modify the
PATH to include that directory prior to importing _gdal.pyd. The PATH will
be modified for the running process only, for the duration of that process.

 

3.    Same as #2 but rather than modifying gdal.py to set the PATH variable,
instead create a new Python extension module called _gdal_dll_helper.pyd.
The job of this C extension module is simply to get gdal17.dll and other DLLs
loaded without resorting to modifying the system PATH which can sometimes
have weird consequences (I can explain more if needed). The extension module
would call the Windows SetDllDirectory
<http://msdn.microsoft.com/en-us/library/ms686203%28v=vs.85%29.aspx>
function, call LoadLibrary to explicitly load gdal17.dll into the current
process, then call SetDllDirectory again to set the DLL directory back to
what it was previously. Then, when gdal.py wants to load _gdal.pyd,
gdal17.dll is already loaded and the binding succeeds.

 

4.    Statically link all necessary code to _gdal.pyd, _gdal_array.pyd, etc.
This would eliminate the need to store the GDAL DLLs on the machine at all,
as all code would be in the bindings .pyd DLLs. This would likely be a big
job, and probably not even possible given the variety of libraries that GDAL
leverages.
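
For illustration, the PATH change in #2 might look roughly like the
following. The helper name add_to_process_path and the location of the bin
subdirectory are assumptions taken from the example path in #2, not existing
GDAL code. Only the environment of the running process is touched. On Python
3.8 and later, PATH is no longer consulted for an extension module's DLL
dependencies and os.add_dll_directory serves that purpose instead.

```python
import os


def add_to_process_path(directory, environ=os.environ):
    """Append directory to PATH for this process only, without duplicates."""
    current = environ.get("PATH", "")
    entries = [entry for entry in current.split(os.pathsep) if entry]
    if directory not in entries:
        entries.append(directory)
    environ["PATH"] = os.pathsep.join(entries)
    return environ["PATH"]


# Hypothetical use near the top of gdal.py, before _gdal is imported:
#   gdal_install_dir = os.path.join(os.path.dirname(__file__), "bin")
#   add_to_process_path(gdal_install_dir)
```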

 

I know #2 and #3 sound scary but they can be done cleanly. I currently use a
variation of #3 in my own project that embeds GDAL and its Python bindings.
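
The sequence in #3 can also be approximated from pure Python with ctypes,
which may help show the idea. This is not the C extension module described
in #3 and not code from any existing project. The name preload_dll is
hypothetical, and passing NULL to restore the default search order is a
simplification of "set it back to what it was previously".

```python
import ctypes
import os
import sys


def preload_dll(dll_dir, dll_name):
    """Load dll_dir/dll_name into the current process (Windows only)."""
    if sys.platform != "win32":
        raise OSError("SetDllDirectory is only available on Windows")

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.SetDllDirectoryW.argtypes = [ctypes.c_wchar_p]
    kernel32.SetDllDirectoryW.restype = ctypes.c_int  # Win32 BOOL

    # Add dll_dir to the search order so dependent DLLs are found too.
    if not kernel32.SetDllDirectoryW(dll_dir):
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        # Creating the WinDLL object loads the library into this process.
        return ctypes.WinDLL(os.path.join(dll_dir, dll_name))
    finally:
        # Simplification: NULL restores the default DLL search order.
        kernel32.SetDllDirectoryW(None)


# Hypothetical use before importing the bindings:
#   preload_dll(r"C:\Python26\Lib\site-packages\osgeo\bin", "gdal17.dll")
```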

 

You also asked:

 

The individual packages may specify the required minimum version of the
referred packages loaded by the .NET runtime. Is this something that can be
done with the python environment as well?

 

As far as I know, there is not really a clean, standardized way to do that.
There are some innovations related to the "easy_install" thing I mentioned
earlier, but nothing has become standard in Python itself, at least with the
2.x releases.
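
Absent a standard mechanism, a package can at least compare a dependency's
version string by hand when it is imported. The helpers below are
hypothetical and assume plain dotted numeric versions such as "1.7.0"; where
the installed version string comes from is left to the caller.

```python
def parse_version(text):
    """Turn '1.7.0' into (1, 7, 0); parsing stops at the first odd part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    have, need = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(need))
    # Pad with zeros so that '1.7' and '1.7.0' compare as equal.
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


print(meets_minimum("1.7.0", "1.6"))
```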

 

Jason

 

 

From: Tamas Szekeres [mailto:szekerest at gmail.com] 
Sent: Thursday, January 06, 2011 7:24 AM
To: Jason Roberts
Cc: gdal-dev at lists.osgeo.org; Christopher Barker
Subject: Re: Fwd: Re: [gdal-dev] FWTools and GDAL 1.7.0

 

Jason,

I appreciate the expertise all of you have brought to this thread; I have
already gathered quite a lot of useful information from it. I must mention
that my programming practice in Python can be considered zero, which is the
main reason my issues may have trivial solutions for the hardcore pythonists
but are not trivial to me. Apologies for this inconvenience :-)

Getting back to the original topic, you mention that the gdal binaries
should be installed somewhere, with PATH, GDAL_DATA, PROJ_LIB and
GDAL_DRIVER_PATH set systemwide. This is where my problems start. Modifying
the PATH globally is bad practice in 99% of the cases. The only case I'm
aware of where it may not be a problem is when we make sure that only one
version of such files (DLLs and executables) will ever be installed on a
particular system. But this is not the case with the gdal binaries, as I
would expect at least a development and a stable version (and their x86/x64
variants) to coexist and be used by the same user. The same problem may
arise when we would like to install multiple versions to the site-packages
directory: how are the versions of the files maintained by the python
runtime? In this regard I could mention what has been done in the .NET
framework, with multiple versions of the packages installed simultaneously
in the global assembly cache. The individual packages may specify the
required minimum version of the referred packages loaded by the .NET
runtime. Is this something that can be done with the python environment as
well?

(As opposed to this, the dumb solution of having a starting script that
opens a command prompt (and sets PYTHONPATH properly) would allow multiple
versions to be used at the same time, since those settings are applied to
the cmd process only.)

Best regards,

Tamas


