[gdal-dev] VRT derived band pixel functions written in Python

Mon Sep 12 08:08:45 PDT 2016

Hi Even,

On Mon, Sep 12, 2016 at 2:31 PM, Even Rouault <even.rouault at spatialys.com>
wrote:

> Hi,
>
> I wanted to mention a new (and I think pretty cool) feature I've added in
> trunk: the possibility to define pixel functions in Python in a VRT derived
> band.
>
> ...
>
> There are obvious security concerns in allowing Python code to be run when
> getting the content of a vrt file. The GDAL_VRT_ENABLE_PYTHON config option
> = IF_SAFE / NO / YES can be set to control the behaviour. The default is
> IF_SAFE (this can be changed at compilation time by defining e.g.
> -DGDAL_VRT_ENABLE_PYTHON_DEFAULT="NO", and Python code execution can be
> completely disabled with -DGDAL_VRT_DISABLE_PYTHON). Safe must be
> understood as: the code will not read, write, delete... files, spawn
> external code, do network activity, etc. Said otherwise, the code will only
> do "maths". But infinite looping is something definitely possible in the
> safe mode. The heuristics of the IF_SAFE mode are rather basic and I'd be
> grateful if people could point out ways of breaking them. If any of the
> following strings - "import" (unless it is "import numpy" / "from numpy
> import ...", "import math" / "from math import ..." or "from numba import
> jit"), "eval", "compile", "open", "load", "file", "input", "save",
> "memmap", "DataSource", "genfromtxt", "getattr", "ctypeslib", "testing",
> "dump", "fromregex" - is found anywhere in the code, then the code is
> considered unsafe (there are interestingly a lot of methods in numpy to do
> file I/O. Hopefully I've captured them with the previous filters). Another
> 'unsafe' pattern is when the pixel function references an external module
> like my my_lib.hillshade example above (who knows if there will not be
> some day a shutil.reformat_your_hard_drive function with the right
> prototype...)
>
> This new capability isn't yet documented in the VRT doc, although this
> message
> will be a start.
>
> I'm interested in feedback you may have.
>

I found http://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html
to be a good intro to the risks of eval'ing untrusted Python code.
Mentioned in there is a notable attempt to make a secure subset of Python
called "pysandbox", but its developer has since declared it "broken by
design": https://lwn.net/Articles/574215/. I'm not knowledgeable enough
about sandboxing (OS or otherwise) to say if that's right.
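
To make the discussion concrete, here is how I read the IF_SAFE check
described above, as a rough Python sketch. The names FORBIDDEN,
ALLOWED_IMPORT and looks_safe are hypothetical, and this is not GDAL's
actual implementation; it only encodes the string list from the quoted
message, and it assumes the allowed imports must appear on a line of their
own.

```python
import re

# Substrings that, per the quoted description, mark the code as unsafe.
FORBIDDEN = (
    "eval", "compile", "open", "load", "file", "input", "save", "memmap",
    "DataSource", "genfromtxt", "getattr", "ctypeslib", "testing", "dump",
    "fromregex",
)

# Import statements the description lists as exceptions (assumption: the
# statement occupies the whole line).
ALLOWED_IMPORT = re.compile(
    r"^\s*(import numpy|from numpy import .+|import math"
    r"|from math import .+|from numba import jit)\s*$"
)


def looks_safe(code):
    """Return True if the code passes the string-based checks sketched here."""
    for line in code.splitlines():
        # Any "import" outside the allowed forms makes the code unsafe.
        if "import" in line and not ALLOWED_IMPORT.match(line):
            return False
    # Any forbidden substring anywhere in the code makes it unsafe.
    return not any(word in code for word in FORBIDDEN)
```

A check of this kind rejects "import os" and anything mentioning "open",
but it accepts an expression such as ().__class__.__base__.__subclasses__(),
the sort of introspection the blog post discusses, because that expression
contains none of the listed strings.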

I see that in GDAL 2.0+ we can set options in the VRT XML itself. Is it
possible to set GDAL_VRT_ENABLE_PYTHON=YES in a VRT and thus override the
reader's own trust policies? My ignorance of how GDAL separates "open"
options from "config" options might be on display in this question.

My $.02 is that since "is safe" will be hard to guarantee (sandboxing
untrusted Python is a long-standing unsolved problem in the Python
community), removing "IF_SAFE" from the options would be a good thing, and
the default for GDAL_VRT_ENABLE_PYTHON should be "NO".
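
Independently of what the default ends up being, a reader can opt out
explicitly. A minimal sketch, assuming GDAL picks up config options from
the process environment (as it does for config options in general) and
that the osgeo Python bindings may or may not be installed:

```python
import os

# Set the option before any dataset is opened, so that a VRT read later in
# this process is not allowed to run embedded Python code.
os.environ["GDAL_VRT_ENABLE_PYTHON"] = "NO"

try:
    from osgeo import gdal
except ImportError:
    gdal = None  # bindings not installed; the environment variable is still set

if gdal is not None:
    # Same setting through the config-option API of the Python bindings.
    gdal.SetConfigOption("GDAL_VRT_ENABLE_PYTHON", "NO")
```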

-- 
Sean Gillies