[mapserver-dev] Server-side Simplication Speed-up

Mon Apr 9 14:35:10 EDT 2012

There's no common denominator per se:

Besides its default ROUND, GD could also handle SIMPLIFY, DECIMATE, FULL
and <snap-to-grid>.
AGG and CAIRO can only handle SIMPLIFY, DECIMATE, FULL and
<snap-to-grid>. ROUND will cause issues with them, as it passes
degenerate geometries down to the renderer.
KML only handles NONE (you can only put lon/lat vertices in a kml file).

We can/should also keep the TRUE value for TRANSFORM, which would make
the renderer use its default transformation mode for the LAYER.
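
To make the list above concrete, here is a rough Python sketch of how a TRANSFORM value could be resolved per renderer. It only restates the renderer/mode pairs from this thread; SUPPORTED, DEFAULTS, resolve_transform and the "SNAP" label are names made up for illustration and are not MapServer API. Existing TRANSFORM values other than TRUE are not modelled.

```python
# Hypothetical tables restating the renderer/mode pairs from this thread.
# "SNAP" stands in for the <snap-to-grid> floating point value.
SUPPORTED = {
    "GD": {"ROUND", "SIMPLIFY", "DECIMATE", "FULL", "SNAP"},
    "AGG": {"SIMPLIFY", "DECIMATE", "FULL", "SNAP"},
    "CAIRO": {"SIMPLIFY", "DECIMATE", "FULL", "SNAP"},
    "KML": {"NONE"},
}
DEFAULTS = {"GD": "ROUND", "AGG": "SIMPLIFY", "CAIRO": "SIMPLIFY", "KML": "NONE"}


def resolve_transform(renderer, value):
    """Return the mode a renderer would use for a given TRANSFORM value."""
    renderer = renderer.upper()
    value = str(value).upper()
    if value == "TRUE":
        # TRUE keeps its meaning: use the renderer's own default mode.
        return DEFAULTS[renderer]
    try:
        float(value)
        mode = "SNAP"  # a number selects snap-to-grid
    except ValueError:
        mode = value
    if mode not in SUPPORTED[renderer]:
        raise ValueError("%s cannot handle TRANSFORM %s" % (renderer, value))
    return mode
```

With these tables, resolve_transform("AGG", "TRUE") gives "SIMPLIFY", while resolve_transform("AGG", "ROUND") raises an error rather than sending degenerate geometries to the renderer.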

--
thomas

On Mon, Apr 9, 2012 at 18:50, Lime, Steve D (DNR)
<Steve.Lime at state.mn.us> wrote:
> Hmmm... Which APPROXIMATION_SCALE values would have meaning for all drivers?
>
> -----Original Message-----
> From: mapserver-dev-bounces at lists.osgeo.org [mailto:mapserver-dev-bounces at lists.osgeo.org] On Behalf Of thomas bonfort
> Sent: Monday, April 09, 2012 5:12 AM
> To: Paul Ramsey
> Cc: mapserver-dev
> Subject: Re: [mapserver-dev] Server-side Simplication Speed-up
>
> I went back to look at our decimation strategy to see if it could be optimized and it turned out it could be a bit more, although the gain is marginal in real-world cases. I also added a "discarding" decimator that rejects polygonal features smaller than a pixel. This is all in my "decimator" branch for the curious.
>
> My main point is that since 6.0 (or maybe 5.6?) the strategy used for converting geographical coordinates to pixels is configurable at the layer level with a processing entry. This has never really been documented as the syntax I came up with seemed kludgy. So here's a plea for help for finding something that's fit for being documented:
>
> The actual processing key is called APPROXIMATION_SCALE; its possible values are:
>
> - SIMPLIFY: default used by the AGG and CAIRO renderers, maps to floating point pixel coordinates. Discards vertices closer than a pixel to the previous one. For lines, always keeps the first and last vertex. For polygons, keeps the first two and last two vertices to ensure the transformed polygon has at least 3 vertices.
> - DECIMATE: (new in 6.2), same as SIMPLIFY, except polygons are discarded if they end up having fewer than 3 vertices (i.e. doesn't try to keep the second and the last two vertices).
> - ROUND: default for GD. Maps geographical coordinates to integer pixel values; can produce degenerate geometries, as GD doesn't mind rendering a single-vertex polygon or line.
> - NONE: default for KML. no transformation, coordinates are left in geographical space.
> - FULL: maps to floating point pixel values, without any simplification.
> - <floating point>: snaps transformed points to a grid. Specifying "1" here is basically the same as the ROUND value. This one is experimental and hasn't really been tested. The rendered output is of poor quality with AGG.
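
The SIMPLIFY and DECIMATE descriptions above can be sketched roughly as follows in Python. This is not the MapServer code: the function names, the Euclidean distance test, and the (minx, maxy, cellsize) mapping to pixel space are assumptions made for illustration.

```python
import math


def to_pixels(points, minx, maxy, cellsize):
    """Map geographical (x, y) to floating point pixel coordinates (assumed mapping)."""
    return [((x - minx) / cellsize, (maxy - y) / cellsize) for x, y in points]


def _thin(pixels, keep_head, keep_tail):
    """Drop vertices closer than one pixel to the last kept vertex, while
    always keeping the first keep_head and the last keep_tail vertices."""
    n = len(pixels)
    if n <= keep_head + keep_tail:
        return list(pixels)
    kept = list(pixels[:keep_head])
    for p in pixels[keep_head:n - keep_tail]:
        q = kept[-1]
        if math.hypot(p[0] - q[0], p[1] - q[1]) >= 1.0:
            kept.append(p)
    kept.extend(pixels[n - keep_tail:])
    return kept


def simplify_line(pixels):
    # SIMPLIFY on lines: first and last vertex always survive.
    return _thin(pixels, 1, 1)


def simplify_polygon(ring):
    # SIMPLIFY on polygons: first two and last two vertices always survive.
    return _thin(ring, 2, 2)


def decimate_polygon(ring):
    # DECIMATE: no forced vertices beyond the first; discard the polygon
    # (None) if fewer than 3 vertices remain.
    kept = _thin(ring, 1, 0)
    return kept if len(kept) >= 3 else None
```

For a ring that covers less than a pixel, simplify_polygon still returns a small polygon, whereas decimate_polygon returns None so the feature is skipped.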
>
> Ok, now that I have written this, I'm thinking that the correct way is to extend the layer-level TRANSFORM to accept more than TRUE|FALSE?
>
> cheers,
>
> thomas
>
> On Sun, Apr 8, 2012 at 14:03, Paul Ramsey <pramsey at opengeo.org> wrote:
>> You are probably right. I can test this without mucking with vtables.
>> I think the invalid/error problem would not be a big deal, because the
>> idea is to drop vertices below the pixel level. However, talking
>> with strk, it's possible that MapNik actually lacks a generic
>> decimation routine in their pipeline, so his optimization not only
>> reduced the amount of data crossing the pipe from the database, but
>> also cut down the amount hitting the renderer, which would have been
>> the big win. In which case for us (who already have a decimator
>> further up the pipeline) decimating at the datastore level would yield
>> much lower returns. Andrea Aime says he already tried data-store level
>> simplification in GeoServer and saw no speed-up (GeoServer also has a
>> generic decimator already).
>>
>> So, anyway, I'll run the experiment at some point and report back, but
>> no urgency here.
>>
>> P.
>>
>> On Sun, Apr 8, 2012 at 3:05 AM, thomas bonfort <thomas.bonfort at gmail.com> wrote:
>>> I'm somewhat suspicious as to the real-world usage of such a feature:
>>>
>>> Quality-wise, st_simplify produces some invalid geometries that have
>>> to be filtered out, resulting in corrupt output when you are
>>> visualizing adjacent features. If you have a look at this map
>>> http://t.co/KVaLpOV1 (which I suppose has this feature incorporated,
>>> if not my reasoning is wrong) and zoom out far enough you'll see that
>>> all the discarded features make for a corrupt output. Basically
>>> it's correct to filter out a tiny feature when it's on its own, but
>>> not when it is topologically linked to surrounding features.
>>>
>>> Performance-wise, you only get a speedup when the data you are
>>> rendering is far more detailed than the scale at which you are
>>> rendering it calls for. When rendering openstreetmap data,
>>> this happens on the first zoom levels, and so this 75% speedup
>>> concerns only a fraction of the renderings that occur when seeding a
>>> complete tileset. Openstreetmap data imported with Imposm would
>>> benefit much less from this feature, as it generates simplified
>>> geometries that can be used to avoid using out-of-scale data.
>>>
>>> For MapServer, we already have a simplification routine happening,
>>> (ticket https://github.com/mapserver/mapserver/issues/2381 discusses
>>> the issue, and links to the nabble discussion that happened on -dev).
>>> It works at the pixel level (once the features have been transformed
>>> from geometry space to pixel space), and as such does not need to
>>> know at what scale the data is being rendered in (basically, a
>>> feature that's smaller than a pixel, or two successive vertices
>>> closer to each other than a pixel, are candidates for
>>> simplification). Supposing postgis's st_simplify isn't more efficient
>>> than mapserver's simplification algorithm (which I hope it is not,
>>> mapserver's one is really stupid :) ), the only gain we could hope
>>> for with this feature is the time taken to send the data down the
>>> wire from postgis to mapserver, and possibly the overhead of
>>> allocating a larger number of points than what will actually be
>>> rendered. Unless my reasoning is flawed or someone can show me the
>>> hard numbers that we have something to gain from this, I think that
>>> changing our input data vtables is too big a cost for the hypothetical
>>> speedup that would occur in only a small percentage of the actual renders.
>>>
>>> Best regards,
>>>
>>> thomas
>>>
>>> On Thu, Mar 22, 2012 at 20:59, Paul Ramsey <pramsey at opengeo.org> wrote:
>>>> Sandro working on MapNik says that this change
>>>>
>>>> https://github.com/mapnik/mapnik/issues/1136
>>>>
>>>> running an "appropriate" st_simplify on the geometries before
>>>> sending them over the wire to the renderer, gave up to 75% speedups
>>>> on rendering complicated items.
>>>>
>>>> In order to try it out on MapServer, the PostGIS driver needs to
>>>> know two things:
>>>> - what's the resolution being rendered to? this I can pick up out of
>>>> the mapObj easily I think, tracing back up from the layerObj passed
>>>> into the driver
>>>> - is this a rendering call to the driver, or a feature access call?
>>>> this is harder... I need to distinguish between calls to the driver
>>>> that are going to use the features for rendering and those that are
>>>> going to send them back to the requester as data. Is there an
>>>> obvious global way to figure this out from the driver level?
>>>>
>>>> P.
>>>> _______________________________________________
>>>> mapserver-dev mailing list
>>>> mapserver-dev at lists.osgeo.org
>>>> http://lists.osgeo.org/mailman/listinfo/mapserver-dev