[gdal-dev] Merging a large number of non-overlapping rasters with PixelFunctionType

Even Rouault even.rouault at spatialys.com
Fri Feb 10 03:18:21 PST 2023


Dave,

Most use cases of VRTDerivedRasterBand I'm aware of involve 
non-mosaicing scenarios where all sources have the same extent, which is 
the one of the VRT, and thus are all used. So there's no optimization to 
remove sources that don't intersect to the requested area. But that is 
something definitely reasonable to fix/optimize. Please file a ticket 
about that at https://github.com/OSGeo/gdal/issues

Even

Le 10/02/2023 à 11:20, Dave a écrit :
> As part of our processing pipeline, we are trying to use GDAL to merge a large (potentially >2000) number of rasters.
> In most cases, the rasters are processed from large sources split into tiles of arbitrary coordinates and are mostly non-overlapping, but because of reprojection etc, some small overlap could be expected. In other cases, some tiles might generate more than one product, and we would have some significant overlap.
>
> In order to control the way these overlaps are handled (we potentially need to apply our own mean/stdev/etc functions), we are trying to use a VRTDataset with a Python PixelFunctionType (using Numba).
>
> Things generally work OK, when the number of tiles is “low” (< 1000), however, we ran into an issue where Numba will crash above 1000 tiles, as GDAL tries to pass a 1000+ long tuple as `in_arr` argument to the PixelFunctionType function.
>
> What I do not understand, is why the tuple passed to the Python function, contains *all* the sources, when only a fraction (most of the time: 1) overlap with DstRect.
> This happens regardless of whether I try processing the VRT file with gdalwarp or gdal_translate.
>
> I guess I am missing something in the way VRT/VRTDerivedRasterBand/ComplexSource are supposed to work, but can’t figure out why from the VRT documentation page. Is there a better way to achieve this?
>
> Below is a small excerpt from a typical VRT file. When run through gdalwarp (/gdal_translate), the `in_arr` argument sent to the python function invariably contains as many arrays as there are ComplexSource, regardless of whether they overlap (non-overlapping one will contain all nodata/0):
>
> <VRTDataset rasterXSize="58786" rasterYSize="9222">
>    <SRS dataAxisToSRSAxisMapping="1,2”>[…]</SRS>
>    <GeoTransform> -7.9902911092382642e+06,  4.6331271652776735e+02,  0.0000000000000000e+00,  2.2618926820882889e+06,  0.0000000000000000e+00, -4.6331271652773665e+02</GeoTransform>
>    <VRTRasterBand subClass="VRTDerivedRasterBand" dataType="Float32" band="1" NoDataValue="0" blockXSize="128" blockYSize="128">
>      <PixelFunctionType>test.test</PixelFunctionType><PixelFunctionArguments nodata="0" /><PixelFunctionLanguage>Python</PixelFunctionLanguage>
>      <NoDataValue>0</NoDataValue>
>      <HideNoDataValue>1</HideNoDataValue>
>      <ColorInterp>Gray</ColorInterp>
>      <ComplexSource>
>        <SourceFilename relativeToVRT="1">././test_vrt/0a06211d1ceccd06aff9a5a50f5e70b5.tif</SourceFilename>
>        <SourceBand>1</SourceBand>
>        <SourceProperties RasterXSize="568" RasterYSize="544" DataType="Float32" BlockXSize="256" BlockYSize="256" />
>        <SrcRect xOff="0" yOff="0" xSize="568" ySize="544" />
>        <DstRect xOff="52791" yOff="4881" xSize="568" ySize="544" />
>        <NODATA>0</NODATA>
>      </ComplexSource>
>      <ComplexSource>
>        <SourceFilename relativeToVRT="1">././test_vrt/0a120c177147927bc0bd8c766907e65f.tif</SourceFilename>
>        <SourceBand>1</SourceBand>
>        <SourceProperties RasterXSize="556" RasterYSize="544" DataType="Float32" BlockXSize="256" BlockYSize="256" />
>        <SrcRect xOff="0" yOff="0" xSize="556" ySize="544" />
>        <DstRect xOff="24236" yOff="3796" xSize="556" ySize="544" />
>        <NODATA>0</NODATA>
>      </ComplexSource>
> […]
>
> Thanks in advance!
>> Dave dV
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/gdal-dev

-- 
http://www.spatialys.com
My software is free, but my time generally not.



More information about the gdal-dev mailing list