[postgis-devel] [PostGIS] #590: [raster] Two rasters version of ST_MapAlgebra

PostGIS trac at osgeo.org
Fri Oct 28 07:36:11 PDT 2011


#590: [raster] Two rasters version of ST_MapAlgebra
----------------------------+-----------------------------------------------
 Reporter:  pracine         |       Owner:  dustymugs    
     Type:  task            |      Status:  assigned     
 Priority:  critical        |   Milestone:  PostGIS 2.0.0
Component:  postgis raster  |     Version:  trunk        
 Keywords:                  |  
----------------------------+-----------------------------------------------

Comment(by bnordgren):

 Replying to [comment:16 pracine]:
 > Right. This is why I planned that ST_MapAlgebra would implicitly
 resample. However, nothing prevents us from forcing users to make this
 resampling explicit:
 >
 > SELECT ST_Mapalgebra(rast1, ST_Resample(rast2, rast1), expr,
 'INTERSECTION')
 > FROM cov1, cov2
 > WHERE ST_Intersects(rast1, rast2)
 >
 > Are we losing performance here?

 There's another axis here. You've mentioned "explicit" vs. "implicit". You
 didn't mention "on-demand" vs. "precomputed tiles". While I'm generally not
 a fan of implicit operations, I do not see a performance issue on that
 axis. However, "on-demand" resampling could well increase performance. In
 your example above, you have to cache four tiles to get good performance:
 rast1, rast2, ST_Resample(rast2, rast1), and the result. With on-demand
 resampling, you need to cache only three. This means that your tiles can
 be larger before they cause cache misses.
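
 To put rough numbers on the four-versus-three tile argument, here is a
 minimal Python sketch. The 1 MiB cache budget, the square tiles, and the
 4 bytes per pixel are all assumptions chosen for illustration; none of
 them come from PostGIS.

```python
import math


def max_tile_side(cache_bytes, tiles_in_flight, bytes_per_pixel=4):
    """Largest square tile side (in pixels) such that `tiles_in_flight`
    tiles of that size fit in `cache_bytes` together."""
    pixels_per_tile = cache_bytes // (tiles_in_flight * bytes_per_pixel)
    return math.isqrt(pixels_per_tile)


CACHE_BYTES = 1 << 20  # hypothetical 1 MiB budget, not a PostGIS setting

# Precomputed: rast1, rast2, ST_Resample(rast2, rast1), and the result.
precomputed_side = max_tile_side(CACHE_BYTES, tiles_in_flight=4)

# On-demand: rast1, rast2, and the result.
on_demand_side = max_tile_side(CACHE_BYTES, tiles_in_flight=3)

print(precomputed_side, on_demand_side)
```

 Under these assumptions the on-demand case fits a tile side roughly 15%
 larger in the same budget.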

 Now mapalgebra is a special case where this "precomputed tiles" resampling
 strategy works well. Your biggest performance hit will come with other
 queries, particularly queries which join a raster coverage with a vector
 coverage. Users can quickly find themselves causing an entire tile to be
 resampled in order to look up one point. Then the tile is resampled again
 to look up the next point. Offering on-demand resampling would do much to
 keep users out of trouble, or eliminate the need for them to explicitly
 create, populate, use, and destroy a temporary table containing the
 resampled raster.
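
 The cost difference for a single point lookup can be sketched in Python.
 This is an illustration only: the Grid class, its read counter, and the
 toy 4x4 source are hypothetical and are not part of the spatial
 collection framework or of any PostGIS API. Resampling is nearest
 neighbor, taken at the center of each target cell.

```python
class Grid:
    """North-up grid; (x0, y0) is the upper-left corner (hypothetical type)."""

    def __init__(self, x0, y0, pixel, width, height, values=None):
        self.x0, self.y0, self.pixel = x0, y0, pixel
        self.width, self.height = width, height
        self.values = values
        self.reads = 0  # number of source cells fetched

    def cell_of(self, x, y):
        """(row, col) of the cell containing world point (x, y)."""
        return int((self.y0 - y) // self.pixel), int((x - self.x0) // self.pixel)

    def center(self, row, col):
        """World coordinates of the center of cell (row, col)."""
        return (self.x0 + (col + 0.5) * self.pixel,
                self.y0 - (row + 0.5) * self.pixel)

    def sample(self, x, y):
        """Nearest-neighbor value at (x, y); counts one read."""
        self.reads += 1
        row, col = self.cell_of(x, y)
        return self.values[row][col]


def resample_tile(src, target):
    """Precomputed strategy: fill every cell of the target grid."""
    values = [[src.sample(*target.center(r, c)) for c in range(target.width)]
              for r in range(target.height)]
    return Grid(target.x0, target.y0, target.pixel,
                target.width, target.height, values)


def lookup_on_demand(src, target, x, y):
    """On-demand strategy: resample only the target cell holding (x, y)."""
    row, col = target.cell_of(x, y)
    return src.sample(*target.center(row, col))


src = Grid(0.0, 4.0, 1.0, 4, 4, [[r * 4 + c for c in range(4)] for r in range(4)])
target = Grid(0.5, 3.5, 1.5, 2, 2)  # the grid we want answers on

tile = resample_tile(src, target)
row, col = target.cell_of(2.5, 1.5)
precomputed_value, precomputed_reads = tile.values[row][col], src.reads

src.reads = 0
on_demand_value = lookup_on_demand(src, target, 2.5, 1.5)
on_demand_reads = src.reads
```

 Both strategies return the same value for the point, but the precomputed
 one touches every cell of the target tile while the on-demand one touches
 a single source cell. With realistic tile sizes the gap grows with the
 number of cells per tile.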

 This is sort of a shameless plug for the spatial collection framework I
 wrote as part of the raster iterator, as it allows for on-demand
 operations. And clearly, one can use "on demand" operations to produce a
 precomputed raster, simply by iterating over all of the cells in the
 raster.  Currently, however, the spatial collection framework is not
 exposed to the user. It's a toolbox for C programmers to use. It could be
 exposed to the user if there were a need for it.

 > PostGIS in general assumes that all the geometries in one table are of
 the same SRID. PostGIS does NOT reproject geometries having different SRIDs
 in spatial relationship functions (like ST_Intersection). I think it is
 wise to follow the same rule (we do not reproject rasters when performing
 a spatial operation between two rasters having different SRIDs).

 It is rare for global vector datasets to be distributed in more than one
 SRID. It is common for global raster datasets to be distributed in more
 than one SRID. While I agree that following PostGIS's lead is a good
 default position, I would suggest that fundamental differences like this
 represent a solid justification for departing from a behavior which was
 designed to accommodate vector data.

 > The question we have to solve, since this is specific to raster, is
 "Should we resample implicitly when needed?"

 In spite of my misgivings about implicit operations, I'd vote yes.
 Caveats: the resampling should be predictable and shouldn't add a whole
 bunch of controls to all functions which may need to resample. For
 instance, the second argument should always be resampled to the first
 argument; or the user should be able to specify an empty raster which
 defines the desired grid. If the user wants something other than nearest
 neighbor, they can explicitly resample.
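
 As a rough illustration of the "second argument is resampled to the
 first" rule, here is a minimal Python sketch. The tuple layout
 (x0, y0, pixel, rows), the function names, and the use of None as a
 stand-in for nodata are all assumptions made for this example; this is
 not how ST_MapAlgebra is implemented. The second raster is sampled on
 demand with nearest neighbor at each cell center of the first, so no
 intermediate resampled tile is built.

```python
def nearest(raster, x, y):
    """Nearest-neighbor value of `raster` at (x, y), or None if outside."""
    x0, y0, pixel, rows = raster
    row = int((y0 - y) // pixel)
    col = int((x - x0) // pixel)
    if 0 <= row < len(rows) and 0 <= col < len(rows[0]):
        return rows[row][col]
    return None


def map_algebra(rast1, rast2, expr):
    """Apply expr(v1, v2) on rast1's grid, sampling rast2 on demand."""
    x0, y0, pixel, rows = rast1
    out = []
    for r, line in enumerate(rows):
        out_line = []
        for c, v1 in enumerate(line):
            # Center of the rast1 cell in world coordinates.
            cx = x0 + (c + 0.5) * pixel
            cy = y0 - (r + 0.5) * pixel
            v2 = nearest(rast2, cx, cy)
            out_line.append(None if v2 is None else expr(v1, v2))
        out.append(out_line)
    return (x0, y0, pixel, out)


rast1 = (0.0, 2.0, 1.0, [[1, 2], [3, 4]])   # 2x2 grid, defines the output grid
rast2 = (1.0, 2.0, 1.0, [[10], [20]])       # covers only the right-hand column

result = map_algebra(rast1, rast2, lambda a, b: a + b)
print(result[3])
```

 The output always lands on the grid of the first argument, which keeps
 the behavior predictable without adding resampling controls to the
 function signature.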

-- 
Ticket URL: <http://trac.osgeo.org/postgis/ticket/590#comment:18>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of the PostGIS project.