[gdal-dev] [EXTERNAL] Re: Seeking SQL wizardry
Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC]
jesse.r.meyer at nasa.gov
Wed Jan 8 14:12:12 PST 2025
Hi Patrick,
I’m working with a GeoPackage database file, using the SQLITE dialect, and submitting the query through the ExecuteSQL interface on the database object in Python. This makes the art of crafting SQL queries a bit tricky because ontop of SQL concerns there’s also the abstraction OGR must also maintain.
Looks like:
delete from layer_name where gdal_get_pixel_value(…) < 600
Gets me further along, with your advice, thanks! I originally put the centroid determination in a subexpression because I hesitated with the thought that the SQL engine couldn’t / wouldn’t cache the result for the parameters (and our geometries can be quite dense). But I’ll worry about that later, and I think SQLite can optimize this case. That sent me down a needlessly complicated path I think.
Best,
Jesse
Lead Computer Scientist
Science Systems and Applications, Inc.
Dr Compton Tucker Team
NASA Goddard Space Flight Center
From: P O'Toole <P.OToole at uwyo.edu>
Date: Wednesday, January 8, 2025 at 4:49 PM
To: gdal-dev at lists.osgeo.org <gdal-dev at lists.osgeo.org>, Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] <jesse.r.meyer at nasa.gov>
Subject: [EXTERNAL] Re: Seeking SQL wizardry
CAUTION: This email originated from outside of NASA. Please take care when clicking links or opening attachments. Use the "Report Message" button to report suspicious messages to the NASA SOC.
Hi, Jesse,
I assume you're working in SQLAlchemy or something, so there may be more to this than I'm thinking, but if the parsing errors are actually coming from the Postgres server(?), there should be plenty of ways around your issue, one way or another. Before you try to get especially fancy, I would try something simple like this, cutting down on the amount of aliasing and cross-referencing you're doing to just find the rows to kill:
DELETE FROM some_table WHERE some_table.row_id = ( SELECT rowid FROM some_table WHERE gdal_get_pixel_value('raster.tif', ndvi_band_num, 'georef', ST_X(ST_Centroid(geom)), ST_Y(ST_Centroid(geom)) < 600);
I may have a few details wrong because I don't have your SQL schema in front of me, but you essentially just want to find a simple SELECT-query that will return the desired row-IDs. You'll then just slap that into subquery inside the WHERE-clause of your DELETE statement. You can fool around with the SELECT statement on its own until you know that's right first. If you're doing additional processing on these rows before you delete, I'd consider using a transaction so any analysis you do happens on the exact same rows as those you're deleting.
Hope that helps.
- Patrick O'Toole
Full-Stack Developer
Wyoming Natural Diversity Database
University of Wyoming
________________________________
From: gdal-dev <gdal-dev-bounces at lists.osgeo.org> on behalf of Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] via gdal-dev <gdal-dev at lists.osgeo.org>
Sent: Wednesday, January 8, 2025 1:53 PM
To: gdal-dev at lists.osgeo.org <gdal-dev at lists.osgeo.org>
Subject: [gdal-dev] Seeking SQL wizardry
◆ This message was sent from a non-UWYO address. Please exercise caution when clicking links or opening attachments from external sources.
BTW If this question is too specific to SQL and I should seek aid elsewhere, please let me know.
I am able to return a layer of rowids associated with features whose geotiff band value at the centroid location is above a particular threshold. I then turn back around and delete all such features manually in Python. But it would be nice to do this with a delete from query.
The working select query is: select rowid, gdal_get_pixel_value(path/to/raster.tif, ndvi_band_num, ‘georef’, ST_X(cnt), ST_Y(cent)) as ndvi from (select ST_Centroid(geom) as cnt from layer_name) where ndvi < 600
So, collect feature centroids as a subexpression so gdal_get_pixel_value() has the inputs it needs.
Unfortunately, attempts to mutate this into a delete from query has failed.
sql_query = f"delete from {label_name} where ndvi < 600 from \
(gdal_get_pixel_value('{raster_fp}', {ndvi_band_num}, 'georef', ST_X(cnt), ST_Y(cnt)) as ndvi from \
(select ST_Centroid(geom) as cnt from {label_name}))"
Per tradition, the SQL parser doesn’t provide much insight, merely “near "from": syntax error”.
Simpler queries like “delete from layer_name where ST_Area(GEOMETRY)” just work, without needing to pass through rowids around from subexpressions. So I hope it is just a matter of SQL spellcraft on my side that’s lacking, but I do know some database engines don’t allow deleting from tables that source that table in subexpressions, so AFAIK this may just not be possible.
Ideas?
Best,
Jesse
Lead Computer Scientist
Science Systems and Applications, Inc.
Dr Compton Tucker Team
NASA Goddard Space Flight Center
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/gdal-dev/attachments/20250108/6c4e7f3c/attachment-0001.htm>
More information about the gdal-dev
mailing list