[postgis-users] par_psql code for queries

Regina Obe lr at pcorp.us
Tue Nov 14 21:47:14 PST 2023


Didn’t notice you were talking about this -  https://github.com/gbb/par_psql

 

New to me, but yes that as a driver should work fine.

 

I’d still use a procedure  and just use par_sql to drive the runs of that procedure.

 

Generate the driver query something like

 

SELECT 'CALL clip_rasters(' || i::text || ', ' || (i + j - 1)::text || '); -- &'

FROM (SELECT 10) AS f(j), generate_series((SELECT min(id) FROM gfdl_03_temp ), (SELECT max(id) FROM gfdl_03_temp ), j) AS I;

 

And then copy that into your par_sql script.

 

Replace the number 10 with whatever iteration you prefer to use.

 

 

 

From: Regina Obe <lr at pcorp.us> 
Sent: Tuesday, November 14, 2023 11:07 PM
To: 'PostGIS Users Discussion' <postgis-users at lists.osgeo.org>
Cc: 'Manaswini Ganjam' <manu.ganjam at gmail.com>
Subject: RE: [postgis-users] par_psql code for queries

 

I’m guessing your intention is to run this in separate connections cause otherwise sad to say you can’t do this in parallel without having some sort of background processor.

 

Something also seems incorrectly shuffled in your query.

First of all, you want a COMMIT before your end LOOP otherwise not much point to doing this in a loop since it wouldn’t only commit when all is done.

 

Also may be okay, but looks like you are relying on filename only appearing once in gfdl_03_temp?  Might be safe, but just in case, I’d add the ID in your routine

 

I also think it would be better to just have this in a stored procedure instead of a DO command, so that you can easily call it in separate connections.

 

So here is my revised

 

CREATE TABLE clipped_rasters(

id serial PRIMARY KEY,

clipped_raster raster,

filename text, id_tile integer);

 

CREATE INDEX ix_clipped_rasters_id_tile ON clipped_rasters(id_tile);

 

CREATE OR REPLACE PROCEDURE clip_rasters(param_start integer, param_end integer) AS

$body$

DECLARE 

min_lon numeric := -98;  -- Minimum Longitude

raster_row record;

min_lat numeric := 23;   -- Minimum Latitude

max_lon numeric := -74;  -- Maximum Longitude

crs integer := 4326;     -- Coordinate Reference System (CRS)

max_lat numeric := 41;   -- Maximum Latitude

 

BEGIN

FOR raster_row IN (SELECT id, filename, rast FROM gfdl_03_temp AS t WHERE id BETWEEN param_start AND param_end 

                AND NOT EXISTS(SELECT 1 FROM clipped_rasters AS n WHERE n.id_tile = t.id ) )

LOOP

RAISE NOTICE 'Starting filename: %, row: %, time: %', raster_row.file_name, raster_row.id, clock_timestamp();

INSERT INTO clipped_rasters (filename, clipped_raster_f, id_tile)

SELECT raster_row.filename, ST_Clip(rast, ST_MakeEnvelope(min_lon, min_lat, max_lon, max_lat, crs)), raster_row.id

FROM gfdl_03_temp

WHERE id = raster_row.id;

COMMIT;

RAISE NOTICE 'Ending filename: %, row: %, time: %', raster_row.file_name, raster_row.id, clock_timestamp();

END LOOP;

RETURN;

END $body$ language plpgsql;

 

-- then to run for tiles 1 to 10 do below. You want to run a different range in different sessions so they don’t run the risk of conflicting each other.

 

CALL clip_rasters(1,10);

 

From: postgis-users <postgis-users-bounces at lists.osgeo.org <mailto:postgis-users-bounces at lists.osgeo.org> > On Behalf Of Manaswini Ganjam via postgis-users
Sent: Tuesday, November 14, 2023 1:44 PM
To: postgis-users at lists.osgeo.org <mailto:postgis-users at lists.osgeo.org> 
Cc: Manaswini Ganjam <manu.ganjam at gmail.com <mailto:manu.ganjam at gmail.com> >
Subject: [postgis-users] par_psql code for queries

 

Hi, I want to do parallel processing for this code below:

CREATE TABLE clipped_rasters_f (
id serial PRIMARY KEY,
clipped_raster raster
filename text,
DECLARE
DO $$
);
min_lon numeric := -98;  -- Minimum Longitude
raster_row record;
min_lat numeric := 23;   -- Minimum Latitude
max_lon numeric := -74;  -- Maximum Longitude
crs integer := 4326;     -- Coordinate Reference System (CRS)
max_lat numeric := 41;   -- Maximum Latitude

BEGIN
FOR raster_row IN (SELECT id, filename, rast FROM gfdl_03_temp)

LOOP
INSERT INTO clipped_rasters (filename, clipped_raster_f)
SELECT raster_row.filename, ST_Clip(rast, ST_MakeEnvelope(min_lon, min_lat, max_lon, max_lat, crs))
FROM gfdl_03_temp
WHERE id = raster_row.id <http://raster_row.id> ;

END LOOP;

END $$; --&

Can someone guide me on understanding the mandatory edits in changing this code or any postgis functions used in a query to convert to a par_psql compatible query?

 

Thank you,

Manaswini Ganjam

 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20231115/9c0f5895/attachment.htm>


More information about the postgis-users mailing list