[postgis-devel] topology : new index on node table

Rémi Cura remi.cura at gmail.com
Wed Jul 30 03:36:17 PDT 2014


OK,
is it the timing with
CREATE INDEX ON node (containing_face);
?
OR is it without index?

Maybe a little vacuum analyze would help?

Cheers,
Rémi-C


2014-07-30 12:24 GMT+02:00 Sandro Santilli <strk at keybit.net>:

> On Wed, Jul 30, 2014 at 12:09:05PM +0200, Rémi Cura wrote:
> > The ticket :
> > http://trac.osgeo.org/postgis/ticket/2860
> >
> > If fitting in memory, I would expect the 160k geom of the benchmark to be
> > imported in less than 30 sec in grass, cgal and geos.
>
> Sum of memory size used to hold those 160k geometries in PostGIS memory
> is 26,592,360 (~26MB).
>
> > 300 sec for Postgis Topology batch mode would be a good goal (reachable
> and
> > nice achievement).
>
> Current best time (in that table) was 4984 secs (~2h20m)
> Still, we don't know exactly how much time went in triggers checking,
> how much in disk writing, how much in toasting/detoasting values etc.
>
> Speaking of which, another postgresql-level optimization that could
> be taken at population time could be setting the storage type, and
> setting the fill factor for tables should also make a difference.
>
> A lot to play with, having the time...
>
> Unfortunately running the population query for 5000 geometries right
> now gives me a longer time than I was getting back when I took the
> times for that test, and I'm not sure why. Maybe it's because I enabled
> gathering statistics about the called user functions, or because I changed
> some parameters about available memory work memory and such. I suspect
> those configurations change a lot the experience.
>
> Anyway, having stats on, will dump here the ones for loading 5000
> geometries
> from that dataset (ordered by gid):
>
> strk=# select * from pg_stat_user_functions  order by total_time desc
> limit 10;
>  funcid | schemaname |       funcname        | calls | total_time |
> self_time
>
> --------+------------+-----------------------+-------+------------+-----------
>   54535 | topology   | topogeo_addpolygon    |  5000 |  55761.337 |
>  1369.537
>   54533 | topology   | topogeo_addlinestring |  5004 |  48302.105 |
> 4038.88
>   54503 | topology   | st_addedgemodface     |  5004 |   33837.97 |
> 10328.094
>   62973 | topology   | _st_addfacesplit      | 10008 |  22461.719 |
>  17372.12
>   54534 | topology   | topogeo_addpoint      | 10008 |   9712.522 |
>  3570.827
>   54508 | topology   | st_addisonode         |  5004 |   6032.572 |
>  4861.636
>   54491 | topology   | getringedges          | 10008 |   4876.065 |
>  4876.065
>   54514 | topology   | st_getfacegeometry    | 11921 |   4674.631 |
>  3712.201
>   52455 | public     | st_pointonsurface     |  9591 |   1267.424 |
>  1267.424
>   52275 | public     | st_buildarea          | 11921 |     962.43 |
>  962.43
> (10 rows)
>
> You'll note that _ST_AddFaceSplit is called 10008 times, and at every
> call the number of points should be at least 1 more than the previous,
> with last call finding ~5000 nodes in the table.
>
> For the record, my current timing for loading those 5000 rows is 55822.181
> ms
> whereas my last best was 36134.786 ms (just to get an idea about how much
> database configuration could change times, as I think the machine was
> really the same).
>
> --strk;
>
> > 2014-07-30 11:56 GMT+02:00 Rémi Cura <remi.cura at gmail.com>:
> >
> > > Almost a 10x factor, really nice !
> > > Cheers,
> > > Rémi-C
> > >
> > >
> > > 2014-07-30 11:38 GMT+02:00 Sandro Santilli <strk at keybit.net>:
> > >
> > > On Wed, Jul 30, 2014 at 11:29:00AM +0200, Sandro Santilli wrote:
> > >>
> > >> > There's a reference dataset produced some time ago by a user who
> > >> resulted
> > >> > in a fix making topology creation much faster:
> > >> >
> http://lists.osgeo.org/pipermail/postgis-devel/2014-January/024078.html
> > >> >
> > >> > I still have a couple of diagrams showing population time
> before/after
> > >> > that are waiting for a blog post that I never find the time to
> write...
> > >>
> > >> Just a sneak preview of the blog post that might never be written
> > >>
> > >> Queries:
> > >>
> > >>  SELECT ST_CreateTopoGeo('million_poly_topo1', ST_Collect(geom))
> > >>  FROM ( SELECT geom from million_poly_topo1 order by gid limit
> 160000) as
> > >> f;
> > >>
> > >>  SELECT TopoGeo_addPolygon('million_poly_topo1', ST_GeometryN(geom,1))
> > >>  FROM ( SELECT geom from million_poly_topo1 order by gid limit
> 160000) as
> > >> f;
> > >>
> > >> Output topology has:
> > >>
> > >>  160167 nodes, 160167 edges, 160167 faces
> > >>
> > >> Times:
> > >>
> > >>  a: before starting
> > >>  b: Ensure face splitting algorithm uses the edge index (#2610)
> > >>  c: Drop all calls to geometry::text during topology population
> (#2616)
> > >>
> > >>       ST_CreateTopoGeo     TopoGeo_addPolygon
> > >>     +-------------------+--------------------+
> > >>  a  |    38352283.535   |    38375126.950    |
> > >>  b  |    12088554.776   |     9644736.277    |
> > >>  c  |    11963757.923   |     4984225.402    |
> > >>
> > >> I've a table with more numbers, with limits from 5000 to 160000.
> > >> The index change would make another configuration (d, I guess).
> > >>
> > >> Please please please file the ticket, maybe referencing this thread
> > >> and #2610 and #2616 as I'd really love to grow the optimizations
> further
> > >>
> > >> --strk;
> > >> _______________________________________________
> > >> postgis-devel mailing list
> > >> postgis-devel at lists.osgeo.org
> > >> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-devel
> > >>
> > >
> > >
>
> > _______________________________________________
> > postgis-devel mailing list
> > postgis-devel at lists.osgeo.org
> > http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-devel
>
>
> --
>
>  ()  ASCII ribbon campaign  --  Keep it simple !
>  /\  http://strk.keybit.net/rants/ascii_mails.txt
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-devel
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20140730/3e89fca5/attachment.html>


More information about the postgis-devel mailing list