[gdal-dev] Splitting and projecting a large .sid image
Brent Fraser
bfraser at geoanalytic.com
Thu Mar 19 14:43:53 EDT 2009
Steve,
My general strategy when doing this (on terabytes of raster data) is:
1. Create a tileindex of input data.
Since you've got multiple zones of raster data, you'll need to create one tileindex for each zone (using gdaltindex), de-project them to geographic, and concatenate them (use ogr2ogr or PostGIS).
2. Create a geographic tileindex of output files (via your own script)
The size, extents, and amount of overlap are up to you. It doesn't have to be a tileindex; it could be just a list of extents for use in steps 3 and 5 below.
3. Do a spatial select for each output tile on the input tileindex to create a "mini" tileindex for each output tile.
I use a hacked version of shputils to do this, but I expect PostGIS or ogr2ogr + SQL would be a better way to go.
4. Convert all the mini tileindexes to vrt files (see the new gdalbuildvrt app!)
5 Use gdal_translate with the extents from step 2 to extract the output tile pixels from its corresponding vrt.
Convoluted yes, but very "scalable". And no Null pixels in the output tiles.
Brent Fraser
GeoAnalytic Inc.
Stephen Woodbridge wrote:
> Hi all,
>
> This is my annual wake-up got to do something with rasters and forgot
> the finer details of the last time I did a problem similar to it.
>
> I have a bunch of LandSat mrsid imagery, they are in multiple UTM zone
> projections and I want to convert them to geographic projection, WGS84,
> and into tiff files. The files are too large to convert into a single
> tif, so I plan to quarter each file into 4 pieces.
>
> So I wrote a script the reads the files and collects info about them,
> then generates 4 gdalwarp commands to reproject each quarter of the file
> (perl included below). A typical command looks like:
>
> gdalwarp -srcnodata 0 -dstnodata 0 -s_srs +init=epsg:32642 -t_srs
> EPSG:4326 -te 68.9999676729652 24.9707453133203 72.7715338115736
> 27.4799559407707 -rb -wm 250 --config GDAL_ONE_BIG_READ ON -co
> "TILED=YES" ./GeoCover2000/n-42-25.sid ./af/n-42-25_1-0.tif
>
>
> The reprojection seems to have worked fine, but the bounds seem to be
> off because of the black (nodata) areas.
>
> You can see my poor results here:
> http://imaptools.com/downloads/raster.gif
>
> Is there a better way to do this? I seem to remember a tool to generate
> tiles.
> Does this need to be done in two steps? How?
> Did I mess up my logic below? I've been over it multiple times and do
> not see any errors.
>
> Any help would be appreciated.
>
> Thanks,
> -Steve
>
> $ gdalinfo -nomd ./GeoCover2000/n-42-25.sid
> Driver: MrSID/Multi-resolution Seamless Image Database (MrSID)
> Files: ./GeoCover2000/n-42-25.sid
> Size is 51078, 39090
> Coordinate System is `'
> Origin = (136066.125000000000000,3323577.375000000000000)
> Pixel Size = (14.250000000000000,-14.250000000000000)
> Corner Coordinates:
> Upper Left ( 136066.125, 3323577.375)
> Lower Left ( 136066.125, 2766544.875)
> Upper Right ( 863927.625, 3323577.375)
> Lower Right ( 863927.625, 2766544.875)
> Center ( 499996.875, 3045061.125)
> Band 1 Block=1024x128 Type=Byte, ColorInterp=Red
> Minimum=0.000, Maximum=226.000, Mean=109.610, StdDev=39.202
> Overviews: 25539x19545, 12770x9773, 6385x4887, 3193x2444, 1597x1222,
> 799x611, 400x306, 200x153
> Band 2 Block=1024x128 Type=Byte, ColorInterp=Green
> Minimum=0.000, Maximum=229.000, Mean=125.833, StdDev=31.828
> Overviews: 25539x19545, 12770x9773, 6385x4887, 3193x2444, 1597x1222,
> 799x611, 400x306, 200x153
> Band 3 Block=1024x128 Type=Byte, ColorInterp=Blue
> Minimum=0.000, Maximum=227.000, Mean=107.092, StdDev=37.230
> Overviews: 25539x19545, 12770x9773, 6385x4887, 3193x2444, 1597x1222,
> 799x611, 400x306, 200x153
>
>
> $ cat ./GeoCover2000/n-42-25.prj
> Projection UTM
> Datum WGS84
> Zunits NO
> Units METERS
> Zone 42
> Xshift 0.000000
> Yshift 0.000000
> Parameters
>
>
>
> Perl snippet:
>
> foreach my $sid (@files) {
>
> next unless $wanted && $sid =~ /$wanted/i;
>
> if (! -r $sid) {
> warn "SKIPPING: $sid is not readable!\n";
> next;
> }
>
> my $gdalinfo = `gdalinfo -nomd $sid`;
>
> # parse out the strings that we need
>
> $gdalinfo =~ /Upper Left \((\s*[^,]+),\s*([^\)]+)/s || warn
> "Upper Left parse error\n";
> my $ul = [$1, $2];
> $gdalinfo =~ /Lower Left \((\s*[^,]+),\s*([^\)]+)/s || warn
> "Lower Left parse error\n";
> my $ll = [$1, $2];
> $gdalinfo =~ /Upper Right \((\s*[^,]+),\s*([^\)]+)/s || warn
> "Upper Right error\n";
> my $ur = [$1, $2];
> $gdalinfo =~ /Lower Right \((\s*[^,]+),\s*([^\)]+)/s || warn
> "Lower Right error\n";
> my $lr = [$1, $2];
>
> my $outf = $sid;
> $outf =~ s/\.sid//;
> $outf =~ m/\/([^\/]+)$/;
> $outf = $1;
>
> my $prj = $sid;
> $prj =~ s/\.sid/.prj/;
> if (! -r $prj) {
> warn "SKIPPING: $prj is not readable!\n";
> next;
> }
> my $projinfo = `grep Zone $prj`;
> if ($projinfo =~ /^Zone\s*([-+]?\d+)/) {
> my $znum = $1;
> if ($znum < 0) {
> $epsg = 32700 - $znum;
> } else {
> $epsg = 32600 + $znum;
> }
> }
> else {
> warn "SKIPPING: could not find zone or epsg in $prj\n";
> next;
> }
>
> my $proj = sprintf("+init=epsg:%d", $epsg);
> print "Using projection: $proj for $projinfo" if $DEBUG;
>
> my $pj = Geo::Proj4->new($proj);
> if (! $pj) {
> warn "SKIPPING: Geo::Proj4 error: " . Geo::Proj4->error . "\n";
> next;
> }
>
> $ul = [reverse $pj->inverse(@$ul)];
> $ll = [reverse $pj->inverse(@$ll)];
> $ur = [reverse $pj->inverse(@$ur)];
> $lr = [reverse $pj->inverse(@$lr)];
>
> # this is probably a parallelgram in latlong
> if ($DEBUG) {
> printf("ul = %11.6f, %11.6f\n", @$ul);
> printf("ll = %11.6f, %11.6f\n", @$ll);
> printf("ur = %11.6f, %11.6f\n", @$ur);
> printf("lr = %11.6f, %11.6f\n", @$lr);
> }
>
> # Get the bbox of the parallelgram
> my $xmin = $ul->[0]<$ll->[0]?$ul->[0]:$ll->[0];
> my $xmax = $ur->[0]>$lr->[0]?$ur->[0]:$lr->[0];
> my $ymin = $ll->[1]<$lr->[1]?$ll->[1]:$lr->[1];
> my $ymax = $ul->[1]>$ur->[1]?$ul->[1]:$ur->[1];
>
> my $tiled = '-co "TILED=YES"';
> $tiled = '' if -e $outf;
>
> my $nparts = 2;
> my $dx = ($xmax - $xmin) / $nparts;
> my $dy = ($ymax - $ymin) / $nparts;
> my ($x0, $x1, $y0, $y1);
> for (my $i=0, $x0=$xmin; $i<$nparts; $i++, $x0 += $dx) {
> $x1 = $x0 + $dx;
> for (my $j=0, $y0=$ymin; $j<$nparts; $j++, $y0 += $dy) {
> $y1 = $y0 + $dy;
> my $tif = sprintf("%s/%s/%s_%d-%d.tif", $work, $outd,
> $outf, $i, $j);
> my $cmd = sprintf("gdalwarp -srcnodata 0 -dstnodata 0
> -s_srs +init=epsg:%d -t_srs EPSG:4326 -te $x0 $y0 $x1 $y1 -rb -wm 250
> --config GDAL_ONE_BIG_READ ON $tiled $sid $tif\n", $epsg);
>
> print "$cmd\n";
> my $rc = system($cmd);
> print "SYSTEM returned: " . sprintf("%lx\n", ($rc &
> 0xffff))
> if $DEBUG
> }
> }
> }
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/gdal-dev
>
More information about the gdal-dev
mailing list