[postgis-users] Grid of points inside Polygon

James David Smith james.david.smith at gmail.com
Fri Nov 15 02:42:54 PST 2013


Hi Remi,

Thanks for your continued guidance with this big data problem.  I
wasn't in the office yesterday so am looking at it again now. I've
just ran the below below query running, and will let you know how many
rows there are once it's finished.

CREATE TABLE lines_for_each_road AS
WITH all_lines AS (
SELECT *
FROM all_lines
)
SELECT row_number() over() AS id--,*
FROM ukmajrdbuffer, all_lines
WHERE ST_Intersects(ukmajrdbuffer.geom, all_lines.the_geom)=TRUE;

In the meantime however, I have noticed something else that is a
worry. I did this:

SELECT st_isvalid(geom) FROM ukmajrdbuffer;

NOTICE:  Ring Self-intersection at or near point 367541.09074241668
311890.25257437676
NOTICE:  Ring Self-intersection at or near point 209901.50000000186
706890.50000000373
NOTICE:  Ring Self-intersection at or near point 444961.46434461698
161217.66612910852
NOTICE:  Ring Self-intersection at or near point 559818.64931676909
103089.14606037363
NOTICE:  Ring Self-intersection at or near point 420114.82424666081
298567.38722311892
 st_isvalid
------------
 f
 f
 t
 f
 f
 f
 t
 t
(8 rows)

I guess that I should try to fix these geometries first before we
attempt to continue any further right?

Thanks

James







On 13 November 2013 18:04, Rémi Cura <remi.cura at gmail.com> wrote:
> All lines should be in one table, you can add a column to differentiate
> between line going SN and EW,
> this is what the query I wrote does.
>
> Not changing your code much, it gives for the line table :
>
> CREATE TABLE all_lines tablespace jamestabs AS
> ( SELECT st_setsrid(st_makeline(st_makepoint(x, 7780), st_makepoint(x,
> 1226580)),27700) as the_geom, 'SN' AS direction
>
> FROM generate_series(53320::int, 667380::int, 20) as x)
> UNION
> ( SELECT st_setsrid(st_makeline(st_makepoint(53320, y),
> st_makepoint(667380, y)),27700) as the_geom, 'EW' AS direction
>
> FROM generate_series(7780::int, 1226580::int, 20) as y)
>
> then building index on it.
>
>
> "-- Merge the road buffer polygons into one big polygon
> -- Should maybe consider leaving them as they are, but am merging for now.
> SELECT st_union(geom) as geom INTO union_roads FROM ukmajrdbuffer;"
> You must not do it !
> As I told you "blabla index blabla".
>
> instead, create an index on table ukmajrdbuffer, and use it without
> unioning.
>
> Then you query is not going to work at all ... (or *very very* slowly)
> "FROM east_to_west, north_to_south, buff_union_roads" : it potentially
> generates 1000*50*1000*50 * number_of_road rows to filter, very bad idea,
> you'll get the same problem as with points...
>
> (please see some example on internet to understand WITH, it is simple ,
> useful, and makes query much more clear)
>
> First you could try :
>
> CREATE TABLE lines_for_each_road tablespace jamestabs AS
> WITH all_lines AS ( --this subquerry gets all the lines
>
> SELECT *
> FROM all_lines
>
> ),
> SELECT row_number() over() AS id--,*
> FROM ukmajrdbuffer, all_lines
> WHERE ST_Intersects(ukmajrdbuffer.the_geom, all_lines.the_geom)=TRUE;
>
> This will give you (if you uncomment the "--,*", ), for every road, all the
> lines intersecting the road
> example :
> road_1 | line_2_EW
> road_1 | line_3_EW
> road_1 | line_78_SN
> road_1 | line_7_EW
> road_2 ...
>
> How long is it (don't uncomment plz), and how many rows do you get (select
> count(*) from lines_for_each_road )?
>
> If it is not too long, we will go on and "cut" the lines so that for every
> road, we keep the part of the lines that are inside the road_buffer.
> (I have to leave now, tomorrow)
>
> Cheers,
> Rémi-C
>
>
>
>
>
> 2013/11/13 James David Smith <james.david.smith at gmail.com>
>>
>> Hey Remi,
>>
>> I don't quite get what the query you gave does to be honest. But
>> working with some of the ideas and notions you gave me I have put the
>> below together. The final bit of the code is the key query. The tables
>> of NS and EW lines were done very quickly. Nice. But now doing the
>> 'intersects within polygon' is taking some time so I'll see how it
>> gets on when I get in to work tomorrow morning:
>>
>> DROP TABLE IF EXISTS north_to_south;
>> DROP TABLE IF EXISTS east_to_west;
>>
>> -- Make a table of north to south lines
>> CREATE TABLE north_to_south tablespace jamestabs AS
>> ( SELECT st_setsrid(st_makeline(st_makepoint(x, 7780), st_makepoint(x,
>> 1226580)),27700) as the_geom
>> FROM generate_series(53320::int, 667380::int, 20) as x);
>>
>> -- Add an index
>> CREATE INDEX north_to_south_geom_index ON north_to_south USING GIST (
>> the_geom );
>>
>> -- Make a table of east to west lines
>> CREATE TABLE east_to_west tablespace jamestabs AS
>> ( SELECT st_setsrid(st_makeline(st_makepoint(53320, y),
>> st_makepoint(667380, y)),27700) as the_geom
>> FROM generate_series(7780::int, 1226580::int, 20) as y);
>>
>> -- Add an index
>> CREATE INDEX east_to_west_geom_index ON east_to_west USING GIST ( the_geom
>> );
>>
>> -- Import the Road Buffer file using the pg_loader interface
>>
>> -- Update the geometry field as the loader didn't seem to do this.
>> SELECT UpdateGeometrySRID('ukmajrdbuffer','geom',27700);
>>
>> -- Merge the road buffer polygons into one big polygon
>> -- Should maybe consider leaving them as they are, but am merging for now.
>> SELECT st_union(geom) as geom INTO union_roads FROM ukmajrdbuffer;
>>
>> -- The unioned roads file isn't valid geom. So I simplify it a little
>> to make it valid.
>> -- We simplify with a 70 metres factor. Geom is now valid.
>> SELECT ST_SimplifyPreserveTopology(geom, 70) AS geom INTO
>> buff_union_roads FROM union_roads;
>>
>> -- Now calculate the points where the lines intersect and also where
>> they are within the polygon.
>> CREATE TABLE gridpoints tablespace jamestabs
>> AS (SELECT ST_Intersection(east_to_west.the_geom, north_to_south.the_geom)
>> FROM east_to_west, north_to_south, buff_union_roads
>> WHERE st_within(buff_union_roads.geom,
>> ST_Intersection(east_to_west.the_geom, north_to_south.the_geom)) =
>> TRUE);
>>
>> Thanks
>>
>> James
>>
>>
>>
>>
>> On 13 November 2013 13:33, Rémi Cura <remi.cura at gmail.com> wrote:
>> > Using the lines you exactly do the same as with points.
>> > The lines will be spaced exactly by 20 meters, end and start being
>> > exactly
>> > on the 20 meters grid, line being either perfectly veritcla or perfectly
>> > horizontal.
>> > So intersections are guaranteed to be on the 20 meter grid, because this
>> > is
>> > the definition of a 20 meter grid =) .
>> >
>> >
>> > Here is a not optimal way to do it :
>> > For instance if UK is enclosed in a bouding box (xmin,ymin,xmax,ymax),
>> > where
>> > xmin-max and ymin-max are rounded to the nearest 20 meters multiple (ie
>> > x or
>> > y % 20 = 0)
>> > (Please note that this is not clever because we generate too much SN
>> > lines,
>> > GB being only 500 km wide at most, so we would need to filter, but you
>> > don't
>> > care, following would take less than a minute anyway, and you need to do
>> > it
>> > only once).
>> >
>> > WITH minmax AS (
>> > SELECT xmin,ymin,xmax,ymax, bbox_of_england
>> > ),
>> > WITH series AS (
>> > SELECT s
>> > FROM generate_series(0,1000*(1000/20)::bigint,20) AS s --casting to
>> > bigint
>> > to be safe against going over int limit
>> > ),
>> > WITH lines AS (
>> > SELECT ST_MakeLine (ST_MakePoint(xmin+s,ymin),ST_MakePoint(xmin+s,ymax))
>> > AS
>> > lineSN,
>> >  ST_MakeLine(ST_MakePoint(xmin, ymin+s), ST_MakePoint(xmax,ymin+s) AS
>> > lineEW
>> > FROM series, minmax
>> > ),
>> > unioned_lines AS (
>> >
>> > SELECT lineSN AS line, 'SN' as direction
>> > FROM lines,minmax
>> > WHERE ST_Interesects(lineSN,bbox_of_england)= TRUE
>> >
>> > UNION
>> >
>> > SELECT lineEW AS line, 'EW' AS direction
>> > FFROM lines,minmax
>> > WHERE ST_Interesects(lineEW,bbox_of_england)= TRUE
>> >
>> > )
>> > SELECT row_number() over() AS uid, line, direction
>> > FROM unioned_line
>> >
>> > then creating table and index on geom and on direction (and possibly on
>> > uid,
>> > but better create a primary key then if you use it)
>> >
>> > Of course I didn't test this query, but you should be able to use it
>> > easily.
>> >
>> > Cheers,
>> > Rémi-C
>> >
>> >
>> >
>> >
>> > 2013/11/13 James David Smith <james.david.smith at gmail.com>
>> >>
>> >> Hi Remi,
>> >>
>> >> Apologies.  One more thing which I am not sure about. We need the
>> >> final grid of points that we make to be multiples of 20. For example:
>> >>
>> >> 53320, 7780 = Yes
>> >> 53323, 7780 = No
>> >> 53320, 7794 = No
>> >> 53321, 7754 = No
>> >>
>> >> That is why in the original function that we wrote, we put the numbers
>> >> in as below.
>> >>
>> >> SELECT ST_Collect(st_setsrid(ST_POINT(x,y),$3)) FROM
>> >> generate_series(53320::int, 667380::int,$2) as x
>> >> ,generate_series(7780::int, 1226580::int,$2) as y
>> >> where st_intersects($1,st_setsrid(ST_POINT(x,y),$3))'
>> >>
>> >> We did this because the points in the grid have to 'align' with the
>> >> values of mutliple 20.
>> >>
>> >> If we do the N-S and E-W lines solution that you suggest, I don't
>> >> think that this will work will it?
>> >>
>> >> Thanks
>> >>
>> >> James
>> >>
>> >>
>> >> On 13 November 2013 11:49, James David Smith
>> >> <james.david.smith at gmail.com> wrote:
>> >> > Hi Remi,
>> >> >
>> >> > Thanks so much for this detailed response. Your idea about creating
>> >> > the lines and the only storing where they intersect and are within
>> >> > the
>> >> > polygons is a great idea. I'm going to give that a go now.
>> >> >
>> >> > Thanks again,
>> >> >
>> >> > James
>> >> >
>> >> > On 13 November 2013 11:41, Rémi Cura <remi.cura at gmail.com> wrote:
>> >> >> This would be an improvement but still non efficient.
>> >> >>
>> >> >> you have 2 possibilities, supposing that what you want is points 20
>> >> >> meter
>> >> >> spaced in all road_buf  :
>> >> >>
>> >> >> either you compute for each road_buffer the points inside, one road
>> >> >> at
>> >> >> a
>> >> >> time ( figuratively ).
>> >> >>
>> >> >> This means you write a function which generates points 20-meter
>> >> >> spaced
>> >> >> in
>> >> >> the bounding box of the road you are working on, and keep those in
>> >> >> the
>> >> >> real
>> >> >> road buffer, and group result points as a multipoints (for cosmetic
>> >> >> purpose)
>> >> >> .
>> >> >>
>> >> >> You would then use it like this :
>> >> >>
>> >> >> SELECT road_id, points_insidea_road(the_geom) AS
>> >> >> my_points_inside_the_road
>> >> >> FROM road_polygons_table
>> >> >>
>> >> >> You would have as output a line per road with a multipoint
>> >> >> containing
>> >> >> all
>> >> >> the point 20 meter spaced inside the road.
>> >> >>
>> >> >> Or (what you wrote) you generate all points and keep those
>> >> >> intersecting
>> >> >> one
>> >> >> or many road.
>> >> >>
>> >> >> The first one is mandatory, because it avoids to manipulate
>> >> >> (incredibly) big
>> >> >> table of all points spaced by 20 meters for UK (around 500 * 10^6
>> >> >> points ! )
>> >> >> Even with indexes it's not a good idea to use such number of rows.
>> >> >>
>> >> >> That's the first point (write a function working for one road, then
>> >> >> use
>> >> >> it
>> >> >> for all road).
>> >> >>
>> >> >> The second point is the way you compute is very inefficient. If your
>> >> >> road is
>> >> >> going from south-West to NorthEast, you will generate a very big
>> >> >> number
>> >> >> of
>> >> >> points, and very few will be in the road_buffer. This is problematic
>> >> >> as
>> >> >> for
>> >> >> a road of only 20 kms, you may generate as many as 100k points and
>> >> >> keep
>> >> >> only
>> >> >> few of them. If you want to process hundreds of ks or roads it will
>> >> >> become
>> >> >> very problematic. Also you would have to generate points each time.
>> >> >>
>> >> >> So here is what  I suggest you : change your strategy :
>> >> >>  instead of generating all point in bounding box and then keeping
>> >> >> only
>> >> >> those
>> >> >> in road_buffer,
>> >> >> generate a line every 20 meters going North south and an line every
>> >> >> 20
>> >> >> meters going East-West , then use the function ST_Intersection to
>> >> >> keep
>> >> >> only
>> >> >> part of this lines being inside the road_polygon, then you have the
>> >> >> points
>> >> >> inside road_polygons as the intersections of these EW lines with the
>> >> >> SN
>> >> >> lines.
>> >> >>
>> >> >> It will be very efficient because you can create a table with all
>> >> >> the
>> >> >> lines
>> >> >> going East-West and South-North for great britain (about 25k + 50k
>> >> >> lines),
>> >> >> and build index on it (index on geom and on the column saying if it
>> >> >> is
>> >> >> SN or
>> >> >> EW).
>> >> >>
>> >> >> The trick is the number of lines is around 500km * 50 line/km +
>> >> >> 1000km
>> >> >> * 50
>> >> >> line/km , where the number of points is  500km * 50 line/km * 1000km
>> >> >> *
>> >> >> 50
>> >> >> line/km
>> >> >>
>> >> >> Hope it helps,
>> >> >>
>> >> >> Cheers,
>> >> >> Rémi-C
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >> 2013/11/13 James David Smith <james.david.smith at gmail.com>
>> >> >>>
>> >> >>> Hey Remi,
>> >> >>>
>> >> >>> Thanks for your reply. So in your mind you think we should have a
>> >> >>> database of say 300 polygons, and then we run a command like this
>> >> >>> right?
>> >> >>>
>> >> >>> SELECT
>> >> >>> ST_Collect(st_setsrid(ST_POINT(x,y),27700))
>> >> >>> FROM
>> >> >>> generate_series(53320::int, 667380::int,20) as x,
>> >> >>> generate_series(7780::int, 1226580::int,20) as y,
>> >> >>> road_polygons_table
>> >> >>> WHERE
>> >> >>> st_intersects(road_polygons_table.the_geom,
>> >> >>> st_setsrid(ST_POINT(x,y),27700))
>> >> >>>
>> >> >>> What do you think?
>> >> >>>
>> >> >>> Thanks
>> >> >>>
>> >> >>> James
>> >> >>>
>> >> >>>
>> >> >>>
>> >> >>>
>> >> >>> On 11 November 2013 14:51, Rémi Cura <remi.cura at gmail.com> wrote:
>> >> >>> > Hey,
>> >> >>> > the whole point on using a sgbds like postgis is using index.
>> >> >>> >
>> >> >>> > If you have one line you don't use indexes...
>> >> >>> >
>> >> >>> > So in short, don't make one polygon with a buffer of all the
>> >> >>> > road,
>> >> >>> > but a
>> >> >>> > table with a line for the buffer for every road, then do you
>> >> >>> > computation
>> >> >>> > to
>> >> >>> > create grid of points inside of polygons, then union the result
>> >> >>> > of
>> >> >>> > points!
>> >> >>> >
>> >> >>> > And it s always a bad idea to run a function on big data when you
>> >> >>> > have
>> >> >>> > not
>> >> >>> > tested it fully (including scaling behavior) on small data.
>> >> >>> >
>> >> >>> >
>> >> >>> > Cheers
>> >> >>> > Rémi-C
>> >> >>> >
>> >> >>> >
>> >> >>> > 2013/11/11 James David Smith <james.david.smith at gmail.com>
>> >> >>> >>
>> >> >>> >> Hi all,
>> >> >>> >>
>> >> >>> >> Would appreciate some advice on the best way to accomplish this
>> >> >>> >> please.
>> >> >>> >>
>> >> >>> >> Our situation is that we have a single polygon which has been
>> >> >>> >> created
>> >> >>> >> by buffering all of the major roads in the UK. Projection is
>> >> >>> >> OSGB36
>> >> >>> >> (27700). Obviously it's quite a big polygon.
>> >> >>> >>
>> >> >>> >> -->  SELECT st_area(geom) FROM roadbufferunion;
>> >> >>> >>      st_area
>> >> >>> >> ------------------
>> >> >>> >>  77228753220.8271
>> >> >>> >>
>> >> >>> >> What we now want to do is create a regular grid of 20 metre x 20
>> >> >>> >> metre
>> >> >>> >> points instead the polygon area. So we wrote this function
>> >> >>> >> (based
>> >> >>> >> on
>> >> >>> >> some googling, apologies for not being able to recall the exact
>> >> >>> >> person
>> >> >>> >> who originally wrote it):
>> >> >>> >>
>> >> >>> >> CREATE OR REPLACE FUNCTION makegrid(geometry, integer, integer)
>> >> >>> >> RETURNS geometry AS
>> >> >>> >> 'SELECT ST_Collect(st_setsrid(ST_POINT(x,y),$3)) FROM
>> >> >>> >>   generate_series(53320::int, 667380::int,$2) as x
>> >> >>> >>   ,generate_series(7780::int, 1226580::int,$2) as y
>> >> >>> >> where st_intersects($1,st_setsrid(ST_POINT(x,y),$3))'
>> >> >>> >> LANGUAGE sql
>> >> >>> >>
>> >> >>> >> and we then run this by doing the following:
>> >> >>> >>
>> >> >>> >> SELECT st_x((ST_Dump(makegrid(geom, 20, 27700))).geom) as x,
>> >> >>> >> st_y((ST_Dump(makegrid(geom, 20, 27700))).geom) as y INTO
>> >> >>> >> grid_points
>> >> >>> >> from roadbufferunion;
>> >> >>> >>
>> >> >>> >> However after over 2 days of the query running on a pretty
>> >> >>> >> powerful
>> >> >>> >> linux cluster, we still have no result.  I'm not sure if it is
>> >> >>> >> actually running or not to be honest.
>> >> >>> >>
>> >> >>> >> Does the query look right?
>> >> >>> >> Any ideas how we can make it run quicker?
>> >> >>> >>
>> >> >>> >> Thanks
>> >> >>> >>
>> >> >>> >> James
>> >> >>> >> _______________________________________________
>> >> >>> >> postgis-users mailing list
>> >> >>> >> postgis-users at lists.osgeo.org
>> >> >>> >> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>> >> >>> >
>> >> >>> >
>> >> >>> >
>> >> >>> > _______________________________________________
>> >> >>> > postgis-users mailing list
>> >> >>> > postgis-users at lists.osgeo.org
>> >> >>> > http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>> >> >>> _______________________________________________
>> >> >>> postgis-users mailing list
>> >> >>> postgis-users at lists.osgeo.org
>> >> >>> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>> >> >>
>> >> >>
>> >> >>
>> >> >> _______________________________________________
>> >> >> postgis-users mailing list
>> >> >> postgis-users at lists.osgeo.org
>> >> >> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>> >> _______________________________________________
>> >> postgis-users mailing list
>> >> postgis-users at lists.osgeo.org
>> >> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>> >
>> >
>> >
>> > _______________________________________________
>> > postgis-users mailing list
>> > postgis-users at lists.osgeo.org
>> > http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at lists.osgeo.org
>> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users


More information about the postgis-users mailing list