[Qgis-developer] Potentially serious performance regression in new geometry - should 2.10 be delayed?

Tue Jun 23 05:03:09 PDT 2015

On 23 June 2015 at 20:07, Matthias Kuhn <matthias at opengis.ch> wrote:

> So the question is, is this something that we can fix later on, or would
> fixing it require an API break and therefore we need to wait for 3.0 if
> we don't fix it now?
>

As I mentioned in an earlier reply, could we mark this component of
the API as non-stable for 2.10? That would give us the possibility of
later revision.

IIRC it wouldn't be the first time this has been done - I think the
labeling API was initially marked as experimental and non stable
too...

Nyall

> Regards,
> Matthias
>
> On 06/23/2015 11:37 AM, Marco Hugentobler wrote:
>> Hi Martin
>>
>> The new geometries uses more memory for backward compatibility. In the
>> mid/long term, the cached wkb will be removed, so QgsAbstractGeometry
>> will be the only represention. The provider can feed wkb (e.g.
>> database providers) or construct the geometry by feeding QgsPointV2s.
>>
>> I'm however surprised that this should have such an impact with the
>> attached dataset. Because the shp has size 588MByte, you could load
>> that in memory several times (and we don't load everything into memory
>> in QGIS in case the dataset is even bigger). Looking on the attached
>> file, I think the problem is not the memory, but rather some round
>> trips in the closestVertex function.
>>
>> But it is far from something that would justify a release delay or
>> even a geometry rollback.
>>
>> Regards,
>> Marco
>>
>> On 23.06.2015 05:42, Martin Dobias wrote:
>>> Hi
>>>
>>> Agreed with Nyall. Apart from the above mentioned problems, there are
>>> newly introduced performance regressions with snapping: the cached
>>> geometries in 2.10 take double the amount of memory they used in 2.8,
>>> and snapping got much slower with more complex layers (compared to
>>> 2.8) - e.g. with [1]. The extra memory consumption is caused by the
>>> fact that providers provide geometry in WKB representation, which is
>>> then converted to new representation and WKB is kept. We should use
>>> just one representation and drop the usage of the other one - that
>>> means stop using WKB in providers and common code paths, so the WKB
>>> representation does not even get created (and cached).
>>>
>>> One heretic idea at the end - what others think about postponing the
>>> release of the new geometry architecture to 2.12 so that there is more
>>> time to address the current issues (fix performance, fix high memory
>>> consumption, clean up API, write unit tests). It seems to me that some
>>> of the issues would be difficult to address even if the release of
>>> 2.10 is moved by another week or so.
>>>
>>> Regards
>>> Martin
>>>
>>> [1] http://www.iucnredlist.org/spatial-data/MAMMALS_TERRESTRIAL.zip
>>>
>>>
>>> On Tue, Jun 23, 2015 at 10:09 AM, Nyall Dawson
>>> <nyall.dawson at gmail.com> wrote:
>>>> Hi all,
>>>>
>>>> Unfortunately, we've become aware of a serious performance
>>>> regression caused
>>>> by the new geometry engine. Basically, the situation is that for all
>>>> geometry operations which rely on geos (think buffers, splits, spatial
>>>> relation operations such as intersects and within,... ) the geometry
>>>> now
>>>> needs to be converted into a geos representation with *every*
>>>> operation. In
>>>> the old geometry engine this conversion was done once, and the
>>>> result stored
>>>> so that follow up operations would not need to recalculate it. This
>>>> potentially has huge impacts on the performance of common tasks such as
>>>> selecting all features which intersect a geometry.
>>>>
>>>> I've had a look, and unfortunately it's not trivial to fix this. I
>>>> think the
>>>> correct solution to this is to:
>>>>
>>>> - make QgsGeometryEngine accept and return QgsGeometry containers, not
>>>> QgsAbstractGeometryV2
>>>> - store the generated geos representation of geometries within
>>>> QgsGeometryPrivate inside the QgsGeometry container. This way it
>>>> will be
>>>> reusable between different geometry operations, and shared when
>>>> QgsGeometry
>>>> objects are copied. This will also have the benefit that if a
>>>> geometry is
>>>> prepared using geos then subsequent geos operations performed on that
>>>> QgsGeometry and its shared copies will be much faster.
>>>> - make QgsGeometry a friend class of QgsGeo, so that it can access
>>>> QgsGeometryPrivate to retrieve or set the geos representation of the
>>>> geometry as required
>>>>
>>>> An alternative (short term) solution would be to just cache the geos
>>>> representation when geometry operations are called through the older
>>>> QgsGeometry modification/relationship operations. This would be
>>>> easier, but
>>>> means that the API of QgsGeometryEngine will be stuck with the current
>>>> design, and we won't be able to properly fix this until breaking the
>>>> api for
>>>> 3.0.
>>>>
>>>> Either way, I doubt this can be addressed within the remaining 3
>>>> days we
>>>> have until release. Should we delay to address this? Release with the
>>>> regression? Or am I missing something and there's an easier solution we
>>>> could implement? Or even possibly this additional cost of
>>>> recalculating the
>>>> geos representation is trivial and can be ignored (maybe someone
>>>> could test
>>>> this with a little repeated intersection script)?
>>>>
>>>> Thoughts?
>>>>
>>>> Nyall
>>>>
>>>>
>>>> _______________________________________________
>>>> Qgis-developer mailing list
>>>> Qgis-developer at lists.osgeo.org
>>>> http://lists.osgeo.org/mailman/listinfo/qgis-developer
>>> _______________________________________________
>>> Qgis-developer mailing list
>>> Qgis-developer at lists.osgeo.org
>>> http://lists.osgeo.org/mailman/listinfo/qgis-developer
>>
>>
>
> _______________________________________________
> Qgis-developer mailing list
> Qgis-developer at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/qgis-developer