[mapserver-dev] Server-side Simplification Speed-up

Mon Apr 9 06:12:15 EDT 2012

I went back over our decimation strategy to see whether it could be
optimized. It can be, slightly, although the gain is marginal in
real-world cases. I also added a "discarding" decimator that rejects
polygonal features smaller than a pixel. This is all in my "decimator"
branch for the curious.

My main point is that since 6.0 (or maybe 5.6?) the strategy used for
converting geographical coordinates to pixels is configurable at the
layer level with a processing entry. This has never really been
documented, as the syntax I came up with seemed kludgy. So here's a
plea for help in finding something that's fit to be documented:

The actual processing key is called APPROXIMATION_SCALE. Its
possible values are:

- SIMPLIFY: default for the AGG and CAIRO renderers. Maps to
floating point pixel coordinates and discards any vertex closer than a
pixel to the previous one. For lines, always keeps the first and
last vertex. For polygons, keeps the first two and last two vertices
to ensure the transformed polygon has at least 3 vertices.
- DECIMATE: (new in 6.2) same as SIMPLIFY, except that polygons are
discarded if they end up with fewer than 3 vertices (i.e. it doesn't
try to keep the second and the last two vertices).
- ROUND: default for GD. Maps geographical coordinates to integer
pixel values. Can produce degenerate geometries, as GD doesn't mind
rendering a single-vertex polygon or line.
- NONE: default for KML. No transformation, coordinates are left in
geographical space.
- FULL: maps to floating point pixel values, without any simplification.
- <floating point>: snaps transformed points to a grid. Specifying "1"
here is basically the same as the ROUND value. This one is
experimental and hasn't really been tested. The rendered output is of
poor quality with AGG.
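
As an illustration only, the modes listed above can be sketched in
Python as follows. This is not the MapServer C implementation: every
function name, the `tol` parameter, and the assumption that the grid
size is expressed in pixels are hypothetical, chosen just to mirror
the descriptions in the list.

```python
import math

def to_pixels(points, minx, maxy, cellsize):
    """FULL: geographic (x, y) -> floating point pixel coordinates."""
    return [((x - minx) / cellsize, (maxy - y) / cellsize) for x, y in points]

def round_pixels(pixels):
    """ROUND: integer pixel values; duplicates are not removed, so the
    result can be degenerate."""
    return [(int(math.floor(px + 0.5)), int(math.floor(py + 0.5)))
            for px, py in pixels]

def snap_to_grid(pixels, grid):
    """<floating point>: snap to a grid (grid size assumed to be in pixels)."""
    return [(grid * math.floor(px / grid + 0.5),
             grid * math.floor(py / grid + 0.5)) for px, py in pixels]

def _filter(pixels, keep_head, keep_tail, tol=1.0):
    """Drop vertices closer than `tol` pixels to the last kept vertex,
    while always keeping `keep_head` leading and `keep_tail` trailing ones."""
    n = len(pixels)
    if n <= keep_head + keep_tail:
        return list(pixels)
    kept = list(pixels[:keep_head])
    for p in pixels[keep_head:n - keep_tail]:
        dx, dy = p[0] - kept[-1][0], p[1] - kept[-1][1]
        if dx * dx + dy * dy >= tol * tol:
            kept.append(p)
    kept.extend(pixels[n - keep_tail:])
    return kept

def simplify(pixels, is_polygon):
    """SIMPLIFY: lines keep first and last vertex; polygons keep the
    first two and last two, so at least 3 vertices survive."""
    ends = 2 if is_polygon else 1
    return _filter(pixels, ends, ends)

def decimate(pixels, is_polygon):
    """DECIMATE: like SIMPLIFY for lines; polygons only force the first
    vertex and are discarded (None) if fewer than 3 vertices remain."""
    if not is_polygon:
        return simplify(pixels, False)
    kept = _filter(pixels, 1, 0)
    return kept if len(kept) >= 3 else None

# A polygon smaller than a pixel: SIMPLIFY keeps it, DECIMATE drops it.
tiny = [(0, 0), (0.3, 0), (0.3, 0.3), (0, 0.3), (0, 0)]
print(simplify(tiny, True))
print(decimate(tiny, True))
```

With the sub-pixel polygon in the example, `simplify` still returns a
ring (the forced head and tail vertices), whereas `decimate` returns
`None`, which is the behavioural difference described for the two
modes.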

OK, now that I have written this, I'm thinking that the correct way is
to extend the layer-level TRANSFORM to accept more than TRUE|FALSE?

cheers,

thomas

On Sun, Apr 8, 2012 at 14:03, Paul Ramsey <pramsey at opengeo.org> wrote:
> You are probably right. I can test this without mucking with vtables.
> I think the invalid/error problem would not be a big deal, because the
> idea is to drop vertices at below the pixel level. However, talking
> with strk, it's possible that MapNik actually lacks a generic
> decimation routine in their pipeline, so his optimization not only
> reduced the amount of data crossing the pipe from the database, but
> also cut down the amount hitting the renderer, which would have been
> the big win. In which case for us (who already have a decimator
> further up the pipeline) decimating at the datastore level would yield
> much lower returns. Andrea Aime says he already tried data-store level
> simplification in GeoServer and saw no speed-up (GeoServer also has a
> generic decimator already).
>
> So, anyway, I'll run the experiment at some point and report back, but
> no urgency here.
>
> P.
>
> On Sun, Apr 8, 2012 at 3:05 AM, thomas bonfort <thomas.bonfort at gmail.com> wrote:
>> I'm somewhat suspicious as to the real-world usage of such a feature:
>>
>> Quality-wise, st_simplify produces some invalid geometries that have
>> to be filtered out, resulting in corrupt output when you are
>> visualizing adjacent features. If you have a look at this map
>> http://t.co/KVaLpOV1 (which I suppose has this feature incorporated,
>> if not my reasoning is wrong) and zoom out far enough you'll see that
>> all the discarded features make up for a corrupt output. Basically
>> it's correct to filter out a tiny feature when it's on its own, but
>> not when it is topologically linked to surrounding features.
>>
>> Performance-wise, you are getting a speedup when the scale of the data
>> you are rendering is grossly out of scale compared to the scale at
>> which you are rendering it. When rendering openstreetmap data, this
>> happens on the first zoom levels, and so this 75% speedup concerns
>> only a fraction of the renderings that occur when seeding a complete
>> tileset. Openstreetmap data imported with Imposm would benefit much
>> less from this feature, as it generates simplified geometries that can
>> be used to avoid using out-of-scale data.
>>
>> For MapServer, we already have a simplification routine happening,
>> (ticket https://github.com/mapserver/mapserver/issues/2381 discusses
>> the issue, and links to the nabble discussion that happened on -dev) .
>> It works at the pixel level (once the features have been transformed
>> from geometry space to pixel space), and as such does not need to know
>> at what scale the data is being rendered in (basically, a feature
>> that's smaller than a pixel, or two successive vertices closer to each
>> other than a pixel, are candidates for simplification). Supposing
>> postgis's st_simplify isn't more efficient than mapserver's
>> simplication algorithm (which I hope it is not, mapserver's one is
>> really stupid :) ), the only gain we could hope for with this feature
>> is the time taken to send the data down the wire from postgis to
>> mapserver, and eventually the overhead of allocating a larger number
>> of points than what will actually be rendered. Unless my reasoning is
>> flawed or someone can show me the hard numbers that we have something
>> to gain from this, I think that changing our input data vtables is too
>> big a cost for the hypothetic speedup that would occur in only a minor
>> percent of the actual renders.
>>
>> Best regards,
>>
>> thomas
>>
>> On Thu, Mar 22, 2012 at 20:59, Paul Ramsey <pramsey at opengeo.org> wrote:
>>> Sandro working on MapNik says that this change
>>>
>>> https://github.com/mapnik/mapnik/issues/1136
>>>
>>> running an "appropriate" st_simplify on the geometries before sending
>>> them over the wire to the renderer, gave up to 75% speedups on
>>> rendering complicated items.
>>>
>>> In order to try it out on MapServer, the PostGIS driver needs to know
>>> two things:
>>> - what's the resolution being rendered to? this I can pick up out of
>>> the mapObj easily I think, tracing back up from the layerObj passed
>>> into the driver
>>> - is this a rendering call to the driver, or a feature access call?
>>> this is harder... I need to distinguish between calls to the driver
>>> that are going to use the features for rendering and those that are
>>> going to send them back to the requester as data. Is there an obvious
>>> global way to figure this out from the driver level?
>>>
>>> P.
>>> _______________________________________________
>>> mapserver-dev mailing list
>>> mapserver-dev at lists.osgeo.org
>>> http://lists.osgeo.org/mailman/listinfo/mapserver-dev