[postgis-devel] creating topology

Paul Goldstein paulgoldenstein0 at gmail.com
Thu Jun 27 06:24:08 PDT 2013


Hey list,
I don't know if this is the right list for just a use-case posting. If
it is not, give me a hint.

I'm creating a topology of about 70k of geometry-segments. The
smallest of them is just a few meters, the longest is about 10 km
long.
Added  to create a topology, my geometries have a z coordinate
(altitude) but that doesn't matter here.

The first idea of creating a topology was like this:

update public.segment set topogeom =
topology.totopogeom(st_force_2d(geom), 'segment_topo', 1, 0);
(geom is the geometry column; topogeom the topography-column;
'segment_topo' is the topology)

The first time I tryed to create the topology with this statement was
overnight. To test how long it runs, I started again and aborted after
3,5 hours.

# part 1 #
With help of strk in postgis-irc (thanks again!) it took about 3
hours. It was made by divide all geometries in chunks by their id. One
chunk were 1000 pieces.
Primary goal was to not hold all pieces in ram and commiting after
updating 70k geometries.
Statement looked like this:

update public.segment
  set topogeom = topology.totopogeom(st_force_2d(geom), 'segment_topo', 1, 0)
  where id >= ? and id < ?;

# part 2 #
I tried to switch off autovacuum (autovacuum_enabled = false) for all
tables in topology-scheme and for my segment-table. This was a massive
speed-improvement cause it took "only" 2 hours.
But strangely there was a knee after 42k of updates where speed
raises. - I don't know why.

# part 3 #
I thought, maybe because my database had too less shared memory
(shared_buffers) and i increased it to 6 times (96MB). But curve
looked like before.

# part 4 #
strk asked me to measure another idea: creating chunks not on their id
instead use their geographic coordinates. so one part is e.g. (0 0, 1
1) -> (1 0, 2 1) -> (2 0, 3 1) ---> (0 1, 1 2) -> (1 1, 2 2) -> (2 1,
3 2) ---> ... - i hope its clear what i meant.
thats my appropriate statement:

update public.segment
  set topogeom = topology.totopogeom(st_force_2d(geom), 'segment_topo', 1, 0)
  where (st_intersects(geom, st_envelope(st_geomfromewkt(?))) = true
    or st_contains(geom, st_envelope(st_geomfromewkt(?))) = true)
  and topogeom is null;

where ? is this: "srid=4326;linestring(%f %f,%f %f)"

This took very too long. I aborted it after 1,5 hours, because it was too slow.


Now my questions:
Does anybody know how to improve the speed to create a topology?
Or is it possible to not creating my topology every night if something
changes in geometry. Instead alter topology in an
after-statement-trigger when something happens with geometries.
strk said when totopogeom() is called TopoGeo_addLinestring() is
entered many times. Is it possible to improve that function? (Maybe
writing it in C - instead of plpgsql :)


Here you can see some graphs for the different parts:
http://privatepaste.com/download/ef41296b0d
blue is part 1: chunks of ids
red is part 2: chunks of ids but without autovacuum
yellow is part 3: as red but 96MB of shared_buffer
green is part 4: chunks of squared geography elements

Thanks for any help or suggestions,
Paul



More information about the postgis-devel mailing list