[postgis-users] par_psql code for queries
Regina Obe
lr at pcorp.us
Tue Nov 14 21:47:14 PST 2023
Didn’t notice you were talking about this - https://github.com/gbb/par_psql
New to me, but yes that as a driver should work fine.
I’d still use a procedure and just use par_sql to drive the runs of that procedure.
Generate the driver query something like
SELECT 'CALL clip_rasters(' || i::text || ', ' || (i + j - 1)::text || '); -- &'
FROM (SELECT 10) AS f(j), generate_series((SELECT min(id) FROM gfdl_03_temp ), (SELECT max(id) FROM gfdl_03_temp ), j) AS I;
And then copy that into your par_sql script.
Replace the number 10 with whatever iteration you prefer to use.
From: Regina Obe <lr at pcorp.us>
Sent: Tuesday, November 14, 2023 11:07 PM
To: 'PostGIS Users Discussion' <postgis-users at lists.osgeo.org>
Cc: 'Manaswini Ganjam' <manu.ganjam at gmail.com>
Subject: RE: [postgis-users] par_psql code for queries
I’m guessing your intention is to run this in separate connections cause otherwise sad to say you can’t do this in parallel without having some sort of background processor.
Something also seems incorrectly shuffled in your query.
First of all, you want a COMMIT before your end LOOP otherwise not much point to doing this in a loop since it wouldn’t only commit when all is done.
Also may be okay, but looks like you are relying on filename only appearing once in gfdl_03_temp? Might be safe, but just in case, I’d add the ID in your routine
I also think it would be better to just have this in a stored procedure instead of a DO command, so that you can easily call it in separate connections.
So here is my revised
CREATE TABLE clipped_rasters(
id serial PRIMARY KEY,
clipped_raster raster,
filename text, id_tile integer);
CREATE INDEX ix_clipped_rasters_id_tile ON clipped_rasters(id_tile);
CREATE OR REPLACE PROCEDURE clip_rasters(param_start integer, param_end integer) AS
$body$
DECLARE
min_lon numeric := -98; -- Minimum Longitude
raster_row record;
min_lat numeric := 23; -- Minimum Latitude
max_lon numeric := -74; -- Maximum Longitude
crs integer := 4326; -- Coordinate Reference System (CRS)
max_lat numeric := 41; -- Maximum Latitude
BEGIN
FOR raster_row IN (SELECT id, filename, rast FROM gfdl_03_temp AS t WHERE id BETWEEN param_start AND param_end
AND NOT EXISTS(SELECT 1 FROM clipped_rasters AS n WHERE n.id_tile = t.id ) )
LOOP
RAISE NOTICE 'Starting filename: %, row: %, time: %', raster_row.file_name, raster_row.id, clock_timestamp();
INSERT INTO clipped_rasters (filename, clipped_raster_f, id_tile)
SELECT raster_row.filename, ST_Clip(rast, ST_MakeEnvelope(min_lon, min_lat, max_lon, max_lat, crs)), raster_row.id
FROM gfdl_03_temp
WHERE id = raster_row.id;
COMMIT;
RAISE NOTICE 'Ending filename: %, row: %, time: %', raster_row.file_name, raster_row.id, clock_timestamp();
END LOOP;
RETURN;
END $body$ language plpgsql;
-- then to run for tiles 1 to 10 do below. You want to run a different range in different sessions so they don’t run the risk of conflicting each other.
CALL clip_rasters(1,10);
From: postgis-users <postgis-users-bounces at lists.osgeo.org <mailto:postgis-users-bounces at lists.osgeo.org> > On Behalf Of Manaswini Ganjam via postgis-users
Sent: Tuesday, November 14, 2023 1:44 PM
To: postgis-users at lists.osgeo.org <mailto:postgis-users at lists.osgeo.org>
Cc: Manaswini Ganjam <manu.ganjam at gmail.com <mailto:manu.ganjam at gmail.com> >
Subject: [postgis-users] par_psql code for queries
Hi, I want to do parallel processing for this code below:
CREATE TABLE clipped_rasters_f (
id serial PRIMARY KEY,
clipped_raster raster
filename text,
DECLARE
DO $$
);
min_lon numeric := -98; -- Minimum Longitude
raster_row record;
min_lat numeric := 23; -- Minimum Latitude
max_lon numeric := -74; -- Maximum Longitude
crs integer := 4326; -- Coordinate Reference System (CRS)
max_lat numeric := 41; -- Maximum Latitude
BEGIN
FOR raster_row IN (SELECT id, filename, rast FROM gfdl_03_temp)
LOOP
INSERT INTO clipped_rasters (filename, clipped_raster_f)
SELECT raster_row.filename, ST_Clip(rast, ST_MakeEnvelope(min_lon, min_lat, max_lon, max_lat, crs))
FROM gfdl_03_temp
WHERE id = raster_row.id <http://raster_row.id> ;
END LOOP;
END $$; --&
Can someone guide me on understanding the mandatory edits in changing this code or any postgis functions used in a query to convert to a par_psql compatible query?
Thank you,
Manaswini Ganjam
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20231115/9c0f5895/attachment.htm>
More information about the postgis-users
mailing list