[postgis-users] A function for “Esri union” union on big tables on github.

Lars Aksel Opsahl Lars.Opsahl at nibio.no
Wed Feb 10 00:11:41 PST 2016


There have been several mails about this topic lately. We have now added the code we use to GitHub, and hopefully somebody can pick up some ideas from it or just use the function as it is.

The basic idea is that you call this function with two tables as input. The following happens in the function:

  *   Builds up a content-based grid

  *   Computes the result

  *   Removes the grid lines from the result

  *   Returns the name of a table containing the union of these two tables. For areas that intersect you get attributes from both tables, and for areas that only exist in one of the tables you only get attributes from that table.
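
To make the attribute rule in the last step concrete, here is a minimal sketch in Python. It is not the code from the repository: it uses 1D intervals as a simplified stand-in for polygons, and the names union_overlay and attr_at, as well as the sample attribute values, are made up for illustration. It also assumes that intervals within one table do not overlap each other.

```python
from typing import List, Optional, Tuple

# (start, end, attribute); a 1D stand-in for one polygon row with one attribute.
Interval = Tuple[float, float, str]
# (start, end, attribute from table a, attribute from table b)
Piece = Tuple[float, float, Optional[str], Optional[str]]


def attr_at(table: List[Interval], lo: float, hi: float) -> Optional[str]:
    """Return the attribute of the interval that covers [lo, hi], if any."""
    for start, end, attr in table:
        if start <= lo and hi <= end:
            return attr
    return None


def union_overlay(a: List[Interval], b: List[Interval]) -> List[Piece]:
    """Cut both tables at every boundary and tag each piece with the
    attributes of whichever input table(s) cover it."""
    cuts = sorted({x for start, end, _ in a + b for x in (start, end)})
    pieces: List[Piece] = []
    for lo, hi in zip(cuts, cuts[1:]):
        attr_a = attr_at(a, lo, hi)
        attr_b = attr_at(b, lo, hi)
        # Keep only pieces covered by at least one of the inputs.
        if attr_a is not None or attr_b is not None:
            pieces.append((lo, hi, attr_a, attr_b))
    return pieces


table_a = [(0, 4, "forest")]
table_b = [(2, 6, "municipality_X")]
result = union_overlay(table_a, table_b)
for piece in result:
    print(piece)
```

The overlapping part (2 to 4) carries attributes from both inputs, while the parts covered by only one input carry a single attribute and None for the other, which is the same pattern the result table has for polygons.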

The code is found at https://github.com/larsop/esri_union

About performance: the code added now runs in a single thread, but we have a slightly modified version that runs in parallel using GNU parallel, which can increase performance many times over depending on how many CPUs you have on your server. Here is an example running with 20 threads.

               num points   num polygons   table size
Table 1          40435700        1088614       637 MB
Table 2         933145431        7924019     10127 MB
Result table   2042294001       43668256        30 GB

The time used to do the intersection was 152 minutes. I will add the parallel code later when I have time to make the code ready.
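
Until the parallel code is published, the general pattern can be sketched as follows. This is only an illustration under several assumptions: the grid is built here by splitting a bounding box into quadrants until each cell holds at most max_points points, the work is split into one job per grid cell, and Python's ThreadPoolExecutor stands in for GNU parallel. None of these details are confirmed by the actual code, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def in_box(p: Point, box: Box) -> bool:
    """Half-open test so a point belongs to exactly one cell."""
    xmin, ymin, xmax, ymax = box
    return xmin <= p[0] < xmax and ymin <= p[1] < ymax


def content_grid(points: List[Point], box: Box, max_points: int,
                 depth: int = 8) -> List[Box]:
    """Assumed grid rule: split into quadrants while a cell is too dense."""
    inside = [p for p in points if in_box(p, box)]
    if len(inside) <= max_points or depth == 0:
        return [box]
    xmin, ymin, xmax, ymax = box
    xmid, ymid = (xmin + xmax) / 2, (ymin + ymax) / 2
    quadrants = [
        (xmin, ymin, xmid, ymid), (xmid, ymin, xmax, ymid),
        (xmin, ymid, xmid, ymax), (xmid, ymid, xmax, ymax),
    ]
    cells: List[Box] = []
    for quad in quadrants:
        cells.extend(content_grid(inside, quad, max_points, depth - 1))
    return cells


def process_cell(cell: Box, points: List[Point]) -> int:
    """Placeholder for the per-cell overlay; here it only counts points."""
    return sum(1 for p in points if in_box(p, cell))


def run_parallel(points: List[Point], cells: List[Box], workers: int) -> List[int]:
    """Run one job per cell on a pool of workers, keeping cell order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: process_cell(c, points), cells))


points = [(1, 1), (1, 2), (2, 1), (2, 2), (9, 9)]
cells = content_grid(points, (0, 0, 16, 16), max_points=2)
counts = run_parallel(points, cells, workers=4)
print(len(cells), counts)
```

In a real setup each worker would presumably open its own database session and run the overlay for its cell there, so the heavy work happens in PostgreSQL and the workers mostly wait. The dense corner of the sample data ends up in smaller cells than the sparse areas, which is what keeps the jobs roughly the same size.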

