[postgis-devel] [PostGIS] #590: [raster] Two rasters version of ST_MapAlgebra
PostGIS
trac at osgeo.org
Fri Oct 28 07:36:11 PDT 2011
#590: [raster] Two rasters version of ST_MapAlgebra
----------------------------+-----------------------------------------------
Reporter: pracine | Owner: dustymugs
Type: task | Status: assigned
Priority: critical | Milestone: PostGIS 2.0.0
Component: postgis raster | Version: trunk
Keywords: |
----------------------------+-----------------------------------------------
Comment(by bnordgren):
Replying to [comment:16 pracine]:
> Right. This is why I planned that ST_MapAlgebra would implicitly
> resample. However, nothing prevents us from forcing users to make this
> resampling explicit:
>
> SELECT ST_MapAlgebra(rast1, ST_Resample(rast2, rast1), expr,
>        'INTERSECTION')
> FROM cov1, cov2
> WHERE ST_Intersects(rast1, rast2)
>
> Are we losing performance here?
There's another axis here. You've mentioned "explicit" vs. "implicit", but
not "on-demand" vs. "precomputed tiles". While I'm generally not a fan of
implicit operations, I do not see a performance issue on that axis.
However, "on-demand" resampling could well increase performance. In your
example above, you have to cache four tiles to get good performance:
rast1, rast2, ST_Resample(rast2, rast1), and the result. With on-demand
resampling, you need to cache only three, because the resampled copy of
rast2 is never materialized. This means that your tiles can be larger
before they cause cache misses.
Now mapalgebra is a special case where this "precomputed tiles" resampling
strategy works well. Your biggest performance hit will come with other
queries, particularly queries which join a raster coverage with a vector
coverage. Users can quickly find themselves causing an entire tile to be
resampled in order to look up one point. Then the tile is resampled again
to look up the next point. Offering on-demand resampling would do much to
keep users out of trouble, or eliminate the need for them to explicitly
create, populate, use, and destroy a temporary table containing the
resampled raster.
This is sort of a shameless plug for the spatial collection framework I
wrote as part of the raster iterator, as it allows for on-demand
operations. And clearly, one can use on-demand operations to produce a
precomputed raster, simply by iterating over all of the cells in the
raster. Currently, however, the spatial collection framework is not
exposed to the user. It's a toolbox for C programmers to use. It could be
exposed to the user if there were a need for it.
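
To make the on-demand vs. precomputed distinction concrete, here is a minimal Python sketch. It is not the spatial collection framework's API (that is C and is not shown in this thread); Grid, Raster, CountingSource and materialize are hypothetical names invented for illustration, and the grids are assumed to be north-up with square pixels and a shared SRID. It shows a nearest-neighbour lookup evaluated per point, and a precomputed tile built by running that same lookup over every cell of a target grid.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Grid:
    """North-up grid with square pixels (hypothetical, for illustration)."""
    x0: float    # x of the upper-left corner
    y0: float    # y of the upper-left corner
    size: float  # pixel size in map units
    width: int
    height: int


@dataclass
class Raster:
    grid: Grid
    cells: List[List[Optional[float]]]  # cells[row][col]


class CountingSource:
    """Wraps a raster and counts how many source cells are read."""

    def __init__(self, raster: Raster) -> None:
        self.raster = raster
        self.reads = 0

    def value_at(self, x: float, y: float) -> Optional[float]:
        """On-demand nearest-neighbour lookup at a map coordinate."""
        g = self.raster.grid
        col = int((x - g.x0) // g.size)
        row = int((g.y0 - y) // g.size)
        if not (0 <= col < g.width and 0 <= row < g.height):
            return None  # outside the raster extent
        self.reads += 1
        return self.raster.cells[row][col]


def materialize(source: CountingSource, target: Grid) -> Raster:
    """Precompute a tile by evaluating the on-demand lookup at the
    centre of every cell of the target grid."""
    out = []
    for row in range(target.height):
        y = target.y0 - (row + 0.5) * target.size
        out.append([
            source.value_at(target.x0 + (col + 0.5) * target.size, y)
            for col in range(target.width)
        ])
    return Raster(target, out)
```

With an 8x8 target grid, materialize performs 64 lookups before a single value can be read back, whereas value_at answers a one-point query with one read. That is the cost difference a raster/vector join would see if each point lookup triggered a full tile resample.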
> PostGIS in general assumes that all the geometries in one table have the
> same SRID. PostGIS DOES NOT reproject geometries having different SRIDs
> in spatial relationship functions (like ST_Intersection). I think it is
> wise to follow the same rule (we do not reproject rasters when performing
> a spatial operation between two rasters having different SRIDs).
It is rare for global vector datasets to be distributed in more than one
SRID. It is common for global raster datasets to be distributed in more
than one SRID. While I agree that following PostGIS's lead is a good
default position, I would suggest that fundamental differences like this
represent a solid justification for departing from a behavior which was
designed to accommodate vector data.
> The question we have to solve, since this is specific to raster, is
> "Should we resample implicitly when needed?"
In spite of my misgivings about implicit operations, I'd vote yes.
Caveats: the resampling should be predictable and shouldn't add a whole
bunch of controls to all functions which may need to resample. For
instance, the second argument should always be resampled to the first
argument; or the user should be able to specify an empty raster which
defines the desired grid. If the user wants something other than nearest
neighbor, they can explicitly resample.
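
A rough Python sketch of the rule proposed above, under stated assumptions: this is not ST_MapAlgebra's implementation, the function names and the tuple layout for a grid are hypothetical, and both rasters are assumed to be north-up with square pixels and the same SRID. By default the output uses the first raster's grid and the second raster is sampled onto it by nearest neighbour; an optional target grid plays the role of the "empty raster" that defines the desired grid. Cells where either input has no value are left empty, which loosely mirrors an 'INTERSECTION' extent.

```python
from typing import Callable, List, Optional, Tuple

# (x0, y0, pixel_size, width, height) of a north-up grid; hypothetical layout.
GridSpec = Tuple[float, float, float, int, int]
Cells = List[List[Optional[float]]]


def nearest(grid: GridSpec, cells: Cells, x: float, y: float) -> Optional[float]:
    """Nearest-neighbour value at map coordinate (x, y), or None if outside."""
    x0, y0, size, width, height = grid
    col = int((x - x0) // size)
    row = int((y0 - y) // size)
    if 0 <= col < width and 0 <= row < height:
        return cells[row][col]
    return None


def map_algebra(
    grid1: GridSpec,
    cells1: Cells,
    grid2: GridSpec,
    cells2: Cells,
    expr: Callable[[float, float], float],
    target: Optional[GridSpec] = None,
) -> Cells:
    """Two-raster map algebra on the first raster's grid, or on `target`
    when one is given. Inputs are sampled at output cell centres."""
    x0, y0, size, width, height = grid1 if target is None else target
    out: Cells = []
    for row in range(height):
        y = y0 - (row + 0.5) * size
        line: List[Optional[float]] = []
        for col in range(width):
            x = x0 + (col + 0.5) * size
            a = nearest(grid1, cells1, x, y)
            b = nearest(grid2, cells2, x, y)
            line.append(None if a is None or b is None else expr(a, b))
        out.append(line)
    return out
```

The only resampling control in the signature is the optional target grid, which keeps the behaviour predictable. A caller who wants bilinear or cubic resampling would resample explicitly first and then pass in rasters that already share a grid.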
--
Ticket URL: <http://trac.osgeo.org/postgis/ticket/590#comment:18>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.