[postgis-users] Generating new random points throughout an update
Regina Obe
lr at pcorp.us
Sat Nov 18 14:09:33 PST 2023
I’m guessing the reason why your example doesn’t work is because the planner is doing some serious short-circuiting cause it sees two subselects that are using no variables from the updated table.
I suspect this is a bug or some intentional stuff that makes no sense to me. Cause I’m pretty sure I’ve done something like UPDATE sometable SET geom = ST_GeneratePoints(somestaticgeom,1);
And gotten different answers. So I suspect it’s the subselect throwing it off or it is intentionally treating like a constant because that subselect doesn’t involve the table being updated.
So the trick I see is to incorporate some value from your events table into your routine.
Try computing first and then updating like so the below gives me different answers for each row.
WITH a AS (
SELECT e.id, ST_Makeline(
ST_GeometryN(
ST_GeneratePoints(
ST_Buffer(
ST_POINTN(s.std_track,1),
0.01),
1),
1), ST_GeometryN(
ST_GeneratePoints(
ST_Buffer(
ST_POINTN(s.std_track,2),
0.01),
1),
1)
) AS geom
FROM events AS e, std_tow AS s
)
UPDATE events
set jittered = a.geom
FROM a
WHERE a.id = events.id;
select ST_AsText(jittered) from events;
the other way to do it, making the planner realize that just cause the code is exactly the same and doesn’t involve the table being updated, doesn’t mean you want all your values to be the same, is to incorporate your event id in your randomize like so
update events
set jittered = ST_Makeline(
(select ST_GeometryN(
ST_GeneratePoints(
ST_Buffer(
ST_POINTN(std_track,1),
0.01),
1,
(random()*1000)::int + events.id),
1)
from std_tow),
(select ST_GeometryN(
ST_GeneratePoints(
ST_Buffer(
ST_POINTN(std_track,2),
0.01),
1,
(random()*1000)::int + events.id),
1)
from std_tow));
select ST_AsText(jittered) from events;
From: Brent Wood <pcreso at yahoo.com>
Sent: Saturday, November 18, 2023 4:35 PM
To: Regina Obe <lr at pcorp.us>; PostGIS Users Discussion <postgis-users at lists.osgeo.org>
Subject: Re: [postgis-users] Generating new random points throughout an update
Thanks for your time & advice Regina, I appreciate it.
I still can't get this to work as I think it should, so have included actual SQL's to show what I'm doing, using ST_GeneratePoints() this time...
I create a db & add the postgis extension....
Then create the two tables to test, inserting 3 empty geometries in one & a simple linestring in the other:
create table events (id integer,
jittered geometry(LINESTRING,4326));
insert into events (id) values (1);
insert into events (id) values (2);
insert into events (id) values (3);
create table std_tow (id integer,
std_track geometry(LINESTRING,4326));
insert into std_tow values (1, ST_SetSRID(
ST_MakeLine(
ST_MakePoint(176,-47),
ST_MakePoint(177,-48)
),
4326));
I want to update the empty linestrings in one table (events) with slightly randomised versions of the linestring in the other (std_tow).
ST_GeneratePoints() supposedly generates random points (in a polygon created by buffering the vertices in the standard linestring) without a seed, so I run it with no seed & view the results:
update events
set jittered = ST_Makeline(
(select ST_GeometryN(
ST_GeneratePoints(
ST_Buffer(
ST_POINTN(std_track,1),
0.01),
1),
1)
from std_tow),
(select ST_GeometryN(
ST_GeneratePoints(
ST_Buffer(
ST_POINTN(std_track,2),
0.01),
1),
1)
from std_tow));
select ST_AsText(jittered) from events;
LINESTRING(175.99658281229873 -46.99893493622685,177.0081812507064 -47.99873318845546)
LINESTRING(175.99658281229873 -46.99893493622685,177.0081812507064 -47.99873318845546)
LINESTRING(175.99658281229873 -46.99893493622685,177.0081812507064 -47.99873318845546)
(3 rows)
I get three identical linestrings.
I figured I'd use a different integer random seed (between 0 and 1000) in ST_GeneratePoints() to force a different result each time:
update events
set jittered = ST_Makeline(
(select ST_GeometryN(
ST_GeneratePoints(
ST_Buffer(
ST_POINTN(std_track,1),
0.01),
1,
(random()*1000)::int),
1)
from std_tow),
(select ST_GeometryN(
ST_GeneratePoints(
ST_Buffer(
ST_POINTN(std_track,2),
0.01),
1,
(random()*1000)::int),
1)
from std_tow));
select ST_AsText(jittered) from events;
LINESTRING(175.9943248467802 -46.996045972449906,176.9919097521138 -48.00102135929174)
LINESTRING(175.9943248467802 -46.996045972449906,176.9919097521138 -48.00102135929174)
LINESTRING(175.9943248467802 -46.996045972449906,176.9919097521138 -48.00102135929174)
(3 rows)
I get different vertices from the first attempt, but all 3 records are still the same values - despite supposedly having a different seed.
I figure the Postgres query optimiser must be reusing the result from the subqueries rather than recalculating it each time, but am not sure, and the optimiser cannot be turned off.
What am I doing wrong??? (or how can I do it right!!)
Much appreciated,
Brent
On Sunday, November 19, 2023 at 06:44:16 AM GMT+13, Regina Obe <lr at pcorp.us <mailto:lr at pcorp.us> > wrote:
Well when I run random() I do get a different answer for each run so random behaves as I would expect. I didn’t look that closely at your query with random.
e.g.
SELECT random()
FROM generate_series(1,100);
Even if within the same row, the random numbers are different:
SELECT random(), random()
FROM generate_series(1,10);
If you were doing random()::integer as input into ST_GeneratePoints, I thought maybe that was a typo on your end. Then your random number would only be 0 or 1, which is not that random.
So if you really were doing ST_GeneratePoints(geom, random()::integer) then that would explain why you got much less than random results with ST_GeneratePoints.
From: Brent Wood <pcreso at yahoo.com <mailto:pcreso at yahoo.com> >
Sent: Saturday, November 18, 2023 1:29 AM
To: Regina Obe <lr at pcorp.us <mailto:lr at pcorp.us> >; PostGIS Users Discussion <postgis-users at lists.osgeo.org <mailto:postgis-users at lists.osgeo.org> >
Subject: Re: [postgis-users] Generating new random points throughout an update
Hi Regina,
The seed was an int generated from random(), so I'd expected to generate a different result every time. This didn't happen.
Do I understand that if I omit the seed, I'll get a different point each time by default?
Thanks,
Brent
On Saturday, November 18, 2023 at 06:01:37 PM GMT+13, Regina Obe <lr at pcorp.us <mailto:lr at pcorp.us> > wrote:
If you want the answer different each time, you don’t want to feed a seed to ST_GeneratePoints.
The seed argument was added because some people wanted to generate the same answer for each run.
https://postgis.net/docs/ST_GeneratePoints.html (note the sentence: The optional seed is used to regenerate a deterministic sequence of points, and must be greater than zero.)
From: postgis-users <postgis-users-bounces at lists.osgeo.org <mailto:postgis-users-bounces at lists.osgeo.org> > On Behalf Of Brent Wood via postgis-users
Sent: Friday, November 17, 2023 11:53 PM
To: PostGIS Users Discussion <postgis-users at lists.osgeo.org <mailto:postgis-users at lists.osgeo.org> >
Cc: Brent Wood <pcreso at yahoo.com <mailto:pcreso at yahoo.com> >
Subject: [postgis-users] Generating new random points throughout an update
Hopefully someone can help with a problem I'm having.
I have a table with simple linestrings that I need to create a randomly modified version of.
The linestrings represent vessel tracks. I can identify a set of "similar" tracks & create a single "average" linestring that is somewhat representative.
Many of the records don't have a linestring, but for statistical purposes I need to assign a linestring to each - by creating a jittered version of the average linestring so they are not all identical.
The simplest approach I have tried is to use an update with ST_Project() given a random() distance & random() direction applied to each vertex in the average line.
I use the first two vertices with ST_Makeline(), then append a vertex for the third point, as in the SQL below.
My problem is that every new line is identical. From some Googled hints, I figure the optimiser has decided to run random() once & re-use the value instead of running the function for every iteration (but I could be wrong!).
Any suggestions as to how I can force a different random result for each record that is updated?
I also tried using ST_GeneratePoints() in a buffer around each point, but need to use something like (random()::int as the seed, and this seems to do exactly the same - valid linestrings are generated, but they are identical, so I'm assuming the seed is not being recalculated for each record.
update events
set jittered = ST_MakeLine(
(select ST_Project(
ST_POINTN(std_track,1),
(random()*5000),
radians(random()*360))::geometry
from std_tow),
(select ST_Project(
ST_PointN(std_track,2),
(random()*5000),
radians(random()*360))::geometry
from std_tow)
);
Thanks,
Brent Wood
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20231118/2babb32d/attachment.htm>
More information about the postgis-users
mailing list