[Qgis-developer] Potentially serious performance regression in new geometry - should 2.10 be delayed?

Marco Hugentobler marco.hugentobler at sourcepole.ch
Tue Jun 23 07:44:12 PDT 2015


Hi Nyall

>I'm still a bit unclear as to where
>you are taking these classes though. Is the intention that eventually
>ALL operations will be done using QgsGeos?

In the end, yes, everything should go through QgsGeometryEngine 
interface. There are two possibilities:

1. client code creates QgsGeos, calls prepare() on it and makes repeaded 
intersection(), etc. directly on QgsGeometryEngine. This can be done 
with master branch as is.

2. client code uses QgsGeometry and calls intersection() on it. To 
benefit from prepared geometries, we would need to enhance QgsGeometry 
to cache QgsGeos and call prepare() the first time it is used.

Thinking about it, version 1 might be better. We cannot know if the 
cached QgsGeos is really needed a second time and it might use more 
memory if not. In the first case, the client code knows if it is 
intended to do repeaded operations on the same geometry.

What do you think?

>Could we put a giant "NOT STABLE API AND MAY CHANGE" for
>all these classes, so that we can address any unforeseen issues in
>2.12?

That would be ok.

Regards,
Marco



On 23.06.2015 13:59, Nyall Dawson wrote:
> On 23 June 2015 at 16:15, Marco Hugentobler
> <marco.hugentobler at sourcepole.ch> wrote:
>> Hi Nyall
>>
>>> This potentially has huge impacts on the performance of common tasks such
>>> as selecting all features which intersect a geometry.
>> Are you really sure the impact is huge? If you select features on the map,
>> you get a new one from provider each time. So also with the old geometry
>> class, it was necessary to convert to geos for each intersect.
> That's why the subject line says "potentially" ;)
>
>> But yes, if several geos operations are done in sequence on really the same
>> geometry instance, there is an overhead now. However, I would assume the
>> geos conversion takes far less time than the geos operation itself. Did you
>> observe cases where the geos conversion really takes very long time?
> My biggest concern was operations such as the "Select by location"
> operations, where a single feature is compared against all the other
> features in a second layer. In this case there would be benefit from
> retaining the geos representation between each comparison. However,
> I've just run some tests and there's no appreciable difference in the
> time required for selecting intersecting features from large layers
> using this algorithm between 2.8->2.10, so you're correct in that the
> geos conversion isn't very expensive. (Unfortunately, as Martin has
> pointed out, memory usage in 2.10 was up by 50%).
>
>> Lets assume the geos conversion indeed takes a long time. Couldn't we just
>> store the pointer to QgsGeos (or better QgsGeometryEngine class) inside
>> QgsGeometry? This would be straightforward to do. Furthermore, it is
>> possible to prepare QgsGeos. A prepared geos geometry can do _much_ faster
>> intersects/touches/ etc. operations if called repeatedly. We can cache a
>> prepared QgsGeos now with the new engine, but we couldn't do that with the
>> old QgsGeometry class. So for repeated predicate calculations, the new
>> geometry is much better, also performance wise.
> Agreed - when PostGIS flipped their intersects operation to utilise
> prepared geometries the difference was huge. I'm excited to see this
> taken advantage of in QGIS too! I'm still a bit unclear as to where
> you are taking these classes though. Is the intention that eventually
> ALL operations will be done using QgsGeos? Because if not, then I
> can't see anyway to actually utilise the prepared geometries as it
> stands now - none of the methods in QgsGeometry utilise these.
>
> Please don't take my reservations as attacking this work. I think it's
> great stuff, and generally has been quite painless since it landed,
> which is surprising given the extent of the work. I just want to make
> sure 2.10 is in the best shape it can be in, and I'm doubtful that we
> are release ready yet.
>
> My biggest concern is honestly having the current QgsGeometryV2 API
> locked in while there are still questions regarding how it's
> implemented. Could we put a giant "NOT STABLE API AND MAY CHANGE" for
> all these classes, so that we can address any unforeseen issues in
> 2.12?
>
> Nyall
>
>
>
>>
>> Regards,
>> Marco
>>
>>
>> On 23.06.2015 04:09, Nyall Dawson wrote:
>>
>> Hi all,
>>
>> Unfortunately, we've become aware of a serious performance regression caused
>> by the new geometry engine. Basically, the situation is that for all
>> geometry operations which rely on geos (think buffers, splits, spatial
>> relation operations such as intersects and within,... ) the geometry now
>> needs to be converted into a geos representation with *every* operation. In
>> the old geometry engine this conversion was done once, and the result stored
>> so that follow up operations would not need to recalculate it. This
>> potentially has huge impacts on the performance of common tasks such as
>> selecting all features which intersect a geometry.
>>
>> I've had a look, and unfortunately it's not trivial to fix this. I think the
>> correct solution to this is to:
>>
>> - make QgsGeometryEngine accept and return QgsGeometry containers, not
>> QgsAbstractGeometryV2
>> - store the generated geos representation of geometries within
>> QgsGeometryPrivate inside the QgsGeometry container. This way it will be
>> reusable between different geometry operations, and shared when QgsGeometry
>> objects are copied. This will also have the benefit that if a geometry is
>> prepared using geos then subsequent geos operations performed on that
>> QgsGeometry and its shared copies will be much faster.
>> - make QgsGeometry a friend class of QgsGeo, so that it can access
>> QgsGeometryPrivate to retrieve or set the geos representation of the
>> geometry as required
>>
>> An alternative (short term) solution would be to just cache the geos
>> representation when geometry operations are called through the older
>> QgsGeometry modification/relationship operations. This would be easier, but
>> means that the API of QgsGeometryEngine will be stuck with the current
>> design, and we won't be able to properly fix this until breaking the api for
>> 3.0.
>>
>> Either way, I doubt this can be addressed within the remaining 3 days we
>> have until release. Should we delay to address this? Release with the
>> regression? Or am I missing something and there's an easier solution we
>> could implement? Or even possibly this additional cost of recalculating the
>> geos representation is trivial and can be ignored (maybe someone could test
>> this with a little repeated intersection script)?
>>
>> Thoughts?
>>
>> Nyall
>>
>>
>>
>> _______________________________________________
>> Qgis-developer mailing list
>> Qgis-developer at lists.osgeo.org
>> http://lists.osgeo.org/mailman/listinfo/qgis-developer
>>
>>
>>
>> --
>> Dr. Marco Hugentobler
>> Sourcepole -  Linux & Open Source Solutions
>> Weberstrasse 5, CH-8004 Zürich, Switzerland
>> marco.hugentobler at sourcepole.ch http://www.sourcepole.ch
>> Technical Advisor QGIS Project Steering Committee
>>
>>
>> _______________________________________________
>> Qgis-developer mailing list
>> Qgis-developer at lists.osgeo.org
>> http://lists.osgeo.org/mailman/listinfo/qgis-developer


-- 
Dr. Marco Hugentobler
Sourcepole -  Linux & Open Source Solutions
Weberstrasse 5, CH-8004 Zürich, Switzerland
marco.hugentobler at sourcepole.ch http://www.sourcepole.ch
Technical Advisor QGIS Project Steering Committee



More information about the Qgis-developer mailing list