<div dir="ltr"><div>Hello Manswini, <br>Hello Regina, <br>I have read your e-mails with great interest. Many of the described problems seem familiar to me. I have been working with temporally and spatially resolved (weather) raster data for over 10 years. <br><br>Here are a few comments and recommendations: <br><br>1. After some missteps I stopped using multiband-raster. Why? </div><div><ul><li>What happens if individual days (or hours) are missing? </li><li>Can I always trust the order of the bands? </li></ul></div><div>2. I work with 1-band grids. My database structure looks like this:</div><div><img src="cid:ii_lp2bd4zu0" alt="image.png" width="338" height="222"> <br><br>3. The import is using raster2pgsql. Here is snippet: </div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> def raster2pgsql(outGtiffPath, dbSchema, dbTable, dbHost=conf.dbHost, dbPort=conf.dbPort, dbName=conf.dbName, dbUser=conf.dbUser):<br> try:<br> tableNotEmptySQL = f"SELECT EXISTS (SELECT * FROM {dbSchema}.{dbTable} LIMIT 1);"<br> if executeSQL(tableNotEmptySQL, fetchone=True):<br> cmd = f"raster2pgsql -a -F -n filename -s 4326 {outGtiffPath} {dbSchema}.{dbTable} | psql -h {dbHost} -p {dbPort} -d {dbName} -U {dbUser}"<br> else:<br> cmd = f"raster2pgsql -a -C -F -n filename -s 4326 {outGtiffPath} {dbSchema}.{dbTable} | psql -h {dbHost} -p {dbPort} -d {dbName} -U {dbUser}"</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br> subprocess.call(cmd, shell=True)</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br> return True</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br> except Exception as e:<br> print(f"Importing '{outGtiffPath}' to postgres db failed.")<br> print(f"{e}\n")<br> with open(conf.failedRaster2pgsql, 'a') as file:<br> file.write(f"{datetime.now().strftime('%Y-%m-%d %X')}\t{outGtiffPath}\t{e}\n")<br> return False </blockquote><div><br>4. I read the parameters "scale" and "offset" via Python and write them to the database. <br><br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> hourlyRaster = gdal.Open(str(outGtiffPath))<br> hourlyRasterScale = hourlyRaster.GetRasterBand(1).GetScale()<br> hourlyRasterOffset = hourlyRaster.GetRasterBand(1).GetOffset()<br> hourlyRaster = None</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> insertTimestampSQL = f"SET TIMEZONE = 'UTC'; UPDATE {dataset}.{dbTable} SET timestamp_w_tz = TO_TIMESTAMP('{timeStamp}', 'YYYY-MM-DD HH24:MI'), rast_scale_factor = {hourlyRasterScale}, rast_offset = {hourlyRasterOffset} WHERE filename = '{outGtiff}';" </blockquote><br>5. The query: </div><div><br></div><div><div style="color:rgb(8,8,8);font-family:"JetBrains Mono",monospace;white-space:pre"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span style="color:rgb(0,51,179)">select </span><span style="color:rgb(135,16,148)">timestamp_w_tz<br></span><span style="color:rgb(135,16,148)"> </span>, st_value(<span style="color:rgb(135,16,148)">rast</span>, st_setsrid(st_makepoint(<span style="color:rgb(23,80,235)">15</span>, <span style="color:rgb(23,80,235)">50</span>), <span style="color:rgb(23,80,235)">4326</span>)) * <span style="color:rgb(135,16,148)">rast_scale_factor </span>+ <span style="color:rgb(135,16,148)">rast_offset </span><span style="color:rgb(0,51,179)">as </span><span style="color:rgb(0,0,0)">value_temperature<br></span><span style="color:rgb(0,51,179)">from </span><span style="color:rgb(0,0,0)">era5</span>.<span style="color:rgb(0,0,0)">temperature_2m<br></span><span style="color:rgb(0,51,179)">where </span><span style="color:rgb(135,16,148)">timestamp_w_tz </span>>= <span style="color:rgb(6,125,23)">'1982-09-18 00:00'<br></span><span style="color:rgb(6,125,23)"> </span><span style="color:rgb(0,51,179)">and </span><span style="color:rgb(135,16,148)">timestamp_w_tz </span>< <span style="color:rgb(6,125,23)">'1982-09-19 00:00'<br></span></blockquote></div></div><div><br></div><div>6. The result (temperature in Kelvin. Of course.)<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><font face="monospace">+---------------------------------+------------------+<br>|timestamp_w_tz |value_temperature |<br>+---------------------------------+------------------+<br>|1982-09-18 00:00:00.000000 +00:00|287.70160804659935|<br>|1982-09-18 01:00:00.000000 +00:00|287.47021164940014|<br>|1982-09-18 02:00:00.000000 +00:00|286.0640335433434 |<br>|1982-09-18 03:00:00.000000 +00:00|285.56675378590086|<br>|1982-09-18 04:00:00.000000 +00:00|285.41434365889944|<br>|1982-09-18 05:00:00.000000 +00:00|285.3075453217306 |<br>|1982-09-18 06:00:00.000000 +00:00|286.3343668343021 |<br>|1982-09-18 07:00:00.000000 +00:00|287.2543900097047 |<br>|1982-09-18 08:00:00.000000 +00:00|290.88219602540977|<br>|1982-09-18 09:00:00.000000 +00:00|292.3985099166719 |<br>|1982-09-18 10:00:00.000000 +00:00|297.1844104010519 |<br>|1982-09-18 11:00:00.000000 +00:00|297.37019500841853|<br>|1982-09-18 12:00:00.000000 +00:00|297.58267920007745|<br>|1982-09-18 13:00:00.000000 +00:00|297.85078752567847|<br>|1982-09-18 14:00:00.000000 +00:00|298.1689575718274 |<br>|1982-09-18 15:00:00.000000 +00:00|297.20666005462874|<br>|1982-09-18 16:00:00.000000 +00:00|296.9697012440353 |<br>|1982-09-18 17:00:00.000000 +00:00|294.35870439679223|<br>|1982-09-18 18:00:00.000000 +00:00|292.8045660944494 |<br>|1982-09-18 19:00:00.000000 +00:00|292.6032067295789 |<br>|1982-09-18 20:00:00.000000 +00:00|289.9332483003572 |<br>|1982-09-18 21:00:00.000000 +00:00|289.59839101402565|<br>|1982-09-18 22:00:00.000000 +00:00|287.79616907430096|<br>|1982-09-18 23:00:00.000000 +00:00|287.87181789646223|<br>+---------------------------------+------------------+</font></blockquote> </div><div>I know that my example does not match your problem 100%. It is also time-consuming for you to establish the 1-band grid approach. It requires you to change your database structure. Sorry about that. <br><br>Hopefully the comments will help you. You are also welcome to contact me personally. <br><br>Best regards<br>Tobias</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Am Fr., 17. Nov. 2023 um 05:32 Uhr schrieb Manaswini Ganjam via postgis-users <<a href="mailto:postgis-users@lists.osgeo.org">postgis-users@lists.osgeo.org</a>>:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Regina, <div>Thank you for the suggestions, the code looks clean and the computation was much better, but I still have the same issue, the values are unrealistic, this happened to me before as well (when I tried to run st_value for one raster band, with some x and y). Find the enclosed screenshot.</div><div><img src="cid:ii_lp2444m70" alt="image.png" width="484" height="400" style="margin-right: 0px;"><br></div><div>I have test run the code for one prcp file and the prcp values are negative, and the no data value for this dataset is -32768 (most of the values are -32767 or -32750) and I have not preprocessed this data. Do we need to manually unpack this data? If that is the case, how do I access the scale_factor and add_offset attributes packed with netcdf? </div><div><br></div><div>This information is from the climate data readme file:<span style="color:rgb(0,0,0)">The data has been packed into short integers (2 bytes instead of 4 byte reals) to save space. You must unpack that data to get the correct floating point representation of the data. Each netCDF variable that has been packed has an add_offset and scale_factor attribute associated with it. Some software automatically unpacks the data when it is read.</span></div><div><span style="color:rgb(0,0,0)">The formula to unpack the data is:</span></div><div><span style="color:rgb(0,0,0)">unpacked value = add_offset + ( (packed value) * scale_factor )</span></div><div><br></div><div>Thank you,</div><div>Manaswini</div><div><br><br><br><br></div><div><br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 16 Nov 2023 at 22:02, Regina Obe <<a href="mailto:lr@pcorp.us" target="_blank">lr@pcorp.us</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div lang="EN-US"><div><p class="MsoNormal">One more optimization. I think you can get rid of the FOREACH band_number too and reduce all that to this query<u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal">So all this goes<u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal"> band_numbers := ARRAY(SELECT generate_series(1, 366));<br><br> -- Loop through all bands within the file<br> FOREACH band_number IN ARRAY band_numbers LOOP<u></u><u></u></p><p class="MsoNormal">:<u></u><u></u></p><p class="MsoNormal">LOOP<u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal">Use ST_NumBands instead of relying on your rasters all having 366 bands <a href="https://postgis.net/docs/RT_ST_NumBands.html" target="_blank">https://postgis.net/docs/RT_ST_NumBands.html</a><u></u><u></u></p><p class="MsoNormal">So replace all the above with this:<u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal"> INSERT INTO extracted_data_bbox (year, year_day, lat, lon, prcp, tmax, tmin, observation_time)<br> SELECT year_from_filename, n.band_number, ST_Y(sc.geom), ST_X(sc.geom), CASE WHEN variable_name = 'prcp' THEN sc.val ELSE NULL END,<br> CASE WHEN variable_name = 'tmax' THEN sc.val ELSE NULL END,<br> CASE WHEN variable_name = 'tmin' THEN sc.val ELSE NULL END, observation_time<u></u><u></u></p><p class="MsoNormal"> FROM generate_series(1, ST_NumBands(raster_record.clipped_raster) ) AS n(band_number), ST_PixelAsCentroids(raster_record.clipped_raster, n.band_number, true) AS sc;<u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal">That will solve the issue of you going over the band count of your raster, which might be an issue you are running into <u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal"><u></u> <u></u></p><div style="border-top:none;border-right:none;border-bottom:none;border-left:1.5pt solid blue;padding:0in 0in 0in 4pt"><div><div style="border-right:none;border-bottom:none;border-left:none;border-top:1pt solid rgb(225,225,225);padding:3pt 0in 0in"><p class="MsoNormal"><b>From:</b> Regina Obe <<a href="mailto:lr@pcorp.us" target="_blank">lr@pcorp.us</a>> <br><b>Sent:</b> Thursday, November 16, 2023 9:46 PM<br><b>To:</b> 'Manaswini Ganjam' <<a href="mailto:manu.ganjam@gmail.com" target="_blank">manu.ganjam@gmail.com</a>><br><b>Cc:</b> 'PostGIS Users Discussion' <<a href="mailto:postgis-users@lists.osgeo.org" target="_blank">postgis-users@lists.osgeo.org</a>><br><b>Subject:</b> RE: [postgis-users] Extracting variable information from netcdf, imported as raster to a table<u></u><u></u></p></div></div><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal">I don’t know much about netCDF but I would assume GDAL would handle the packing unraveling behind the scenes.<u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal">Some things I notice wrong<u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal"><a href="https://postgis.net/docs/RT_ST_PixelAsCentroids.html" target="_blank">https://postgis.net/docs/RT_ST_PixelAsCentroids.html</a> is a set returning function, so you can’t just stuff it in a record variable. It has to go in a table.<u></u><u></u></p><p class="MsoNormal">So I’d get rid of this line<u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal" style="margin-bottom:12pt"> SELECT * FROM ST_PixelAsCentroids(raster_record.clipped_raster, band_number, true) INTO pixel_data;<u></u><u></u></p><p class="MsoNormal">I’d also scrap this, because ultimately you want to work on all centroids not the first one that falls out of the tree<u></u><u></u></p><p class="MsoNormal" style="margin-bottom:12pt">EXECUTE format('SELECT ST_Value($1, $2, $3, true)', raster_record.clipped_raster, band_number, centroid_point) INTO variable_value USING raster_record.clipped_raster, band_number, centroid_point;<u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal">I’d change this<u></u><u></u></p><p class="MsoNormal"><br> -- Insert the extracted data into the extracted_data_bbox table<br> INSERT INTO extracted_data_bbox (year, year_day, lat, lon, prcp, tmax, tmin, observation_time)<br> VALUES (year_from_filename, band_number, ST_Y(centroid_point), ST_X(centroid_point), CASE WHEN variable_name = 'prcp' THEN variable_value ELSE NULL END,<br> CASE WHEN variable_name = 'tmax' THEN variable_value ELSE NULL END,<br> CASE WHEN variable_name = 'tmin' THEN variable_value ELSE NULL END, observation_time);<u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal">To this:<u></u><u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal"><b> INSERT INTO extracted_data_bbox (year, year_day, lat, lon, prcp, tmax, tmin, observation_time)<br> SELECT year_from_filename, band_number, ST_Y(sc.geom), ST_X(sc.geom), CASE WHEN variable_name = 'prcp' THEN sc.val ELSE NULL END,<br> CASE WHEN variable_name = 'tmax' THEN sc.val ELSE NULL END,<br> CASE WHEN variable_name = 'tmin' THEN sc.val ELSE NULL END, observation_time<u></u><u></u></b></p><p class="MsoNormal"><b> FROM ST_PixelAsCentroids(raster_record.clipped_raster, band_number, true) AS sc;<u></u><u></u></b></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal"><u></u> <u></u></p><div style="border-top:none;border-right:none;border-bottom:none;border-left:1.5pt solid blue;padding:0in 0in 0in 4pt"><div><div style="border-right:none;border-bottom:none;border-left:none;border-top:1pt solid rgb(225,225,225);padding:3pt 0in 0in"><p class="MsoNormal"><b>From:</b> Manaswini Ganjam <<a href="mailto:manu.ganjam@gmail.com" target="_blank">manu.ganjam@gmail.com</a>> <br><b>Sent:</b> Thursday, November 16, 2023 1:19 PM<br><b>To:</b> Regina Obe <<a href="mailto:lr@pcorp.us" target="_blank">lr@pcorp.us</a>><br><b>Cc:</b> PostGIS Users Discussion <<a href="mailto:postgis-users@lists.osgeo.org" target="_blank">postgis-users@lists.osgeo.org</a>><br><b>Subject:</b> Re: [postgis-users] Extracting variable information from netcdf, imported as raster to a table<u></u><u></u></p></div></div><p class="MsoNormal"><u></u> <u></u></p><div><p class="MsoNormal">Thank you, Regina,<u></u><u></u></p><div><p class="MsoNormal"><u></u> <u></u></p></div><div><p class="MsoNormal">I apologize, I have shared the older version of my code, I have tried ST_pixelascentroid and ST_value and the issue was I messed up the parameters of centroids and that caused the errors now I corrected that as well. I think the code below is a good representation of my approach, I would like to iterate for every raster in a table, for every band and for every lat and lon and extract st value, and insert it to a table, <u></u><u></u></p></div><div><p class="MsoNormal"><u></u> <u></u></p></div><div><p class="MsoNormal">psql:/home/manaswini/MEGA/bbgc_uw_rpackage/rproject/raster2pgsql/extract_table_queryyyy.sql:66: NOTICE: Invalid band index (must use 1-based). Returning empty set for all the coordinates and bands. What could be the issue. <u></u><u></u></p></div><div><p class="MsoNormal"><u></u> <u></u></p></div><div><p class="MsoNormal">Is this because my data is packed? the readme file of the data has this information: <span style="color:black">Details about the data packing:</span><u></u><u></u></p></div><pre><span style="color:black">The data has been packed into short integers (2 bytes instead of 4 byte reals) to save space. You must unpack that data to get the correct floating point representation of the data. Each netCDF variable that has been packed has an add_offset and scale_factor attribute associated with it. Some software automatically unpacks the data when it is read.<u></u><u></u></span></pre><pre><span style="color:black"><u></u> <u></u></span></pre><pre><span style="color:black">The formula to unpack the data is:<u></u><u></u></span></pre><pre><span style="color:black"><u></u> <u></u></span></pre><pre><span style="color:black">unpacked value = add_offset + ( (packed value) * scale_factor )<u></u><u></u></span></pre><pre><span style="color:black"><u></u> <u></u></span></pre><pre><span style="color:black">For more information see here:<u></u><u></u></span></pre><pre><span style="color:black"><a href="https://www.unidata.ucar.edu/software/netcdf/workshops/2010/bestpractices/Packing.html" target="_blank">https://www.unidata.ucar.edu/software/netcdf/workshops/2010/bestpractices/Packing.html</a><u></u><u></u></span></pre><pre><span style="color:black"><u></u> <u></u></span></pre><pre><span style="color:black">There's also another attribute called "missing_value". In this case all the -32768 values you see are missing. Only the grid points outside the downscaling domain is given the missing data value.<u></u><u></u></span></pre><pre><span style="color:black"><u></u> <u></u></span></pre><pre><span style="color:black">The packing saves a lot of space, that is why the data is packed.<u></u><u></u></span></pre><p class="MsoNormal"><u></u> <u></u></p><div><p class="MsoNormal"><u></u> <u></u></p></div><div><p class="MsoNormal">-- Create the precipitation_temperature_data table<br>DROP TABLE IF EXISTS extracted_data_bbox;<br>CREATE TABLE extracted_data_bbox (<br> id SERIAL PRIMARY KEY,<br> year integer,<br> year_day integer,<br> lat double precision,<br> lon double precision,<br> prcp double precision,<br> tmax double precision,<br> tmin double precision,<br> observation_time timestamp<br>);<br><br>-- Loop through all records in gfdl_03_bbox<br>DO $$<br>DECLARE<br> raster_record RECORD;<br> year_from_filename integer;<br> observation_time timestamp;<br> centroid_point geometry;<br> variable_name text;<br> variable_value double precision;<br> band_number integer;<br> band_numbers integer[];<br> pixel_data RECORD;<br>BEGIN<br> FOR raster_record IN (SELECT * FROM gfdl_03_bbox) LOOP<br> SELECT regexp_replace(raster_record.filename, '.*_(\d{4})[^0-9]+', '\1')::integer INTO year_from_filename;<br><br> -- Determine the variable name based on the filename<br> IF raster_record.filename ~ 'prcp' THEN<br> variable_name := 'prcp';<br> ELSIF raster_record.filename ~ 'tmax' THEN<br> variable_name := 'tmax';<br> ELSIF raster_record.filename ~ 'tmin' THEN<br> variable_name := 'tmin';<br> ELSE<br> RAISE EXCEPTION 'Unknown variable in filename: %', raster_record.filename;<br> END IF;<br><br> band_numbers := ARRAY(SELECT generate_series(1, 366));<br><br> -- Loop through all bands within the file<br> FOREACH band_number IN ARRAY band_numbers LOOP<br> -- Print the band number to the PostgreSQL log<br> RAISE NOTICE 'Band Number: %', band_number;<br><br> observation_time := MAKE_DATE(year_from_filename, 1, 1) + (band_number - 1) * interval '1 day';<br><br> SELECT * FROM ST_PixelAsCentroids(raster_record.clipped_raster, band_number, true) INTO pixel_data;<br><br> -- Extract the centroid point from pixel_data<br> centroid_point := pixel_data.geom;<br><br> -- Extract the variable value based on the variable name<br> EXECUTE format('SELECT ST_Value($1, $2, $3, true)', raster_record.clipped_raster, band_number, centroid_point) INTO variable_value USING raster_record.clipped_raster, band_number, centroid_point;<br><br> -- Insert the extracted data into the extracted_data_bbox table<br> INSERT INTO extracted_data_bbox (year, year_day, lat, lon, prcp, tmax, tmin, observation_time)<br> VALUES (year_from_filename, band_number, ST_Y(centroid_point), ST_X(centroid_point), CASE WHEN variable_name = 'prcp' THEN variable_value ELSE NULL END,<br> CASE WHEN variable_name = 'tmax' THEN variable_value ELSE NULL END,<br> CASE WHEN variable_name = 'tmin' THEN variable_value ELSE NULL END, observation_time);<br> END LOOP;<br> END LOOP;<br>END;<br>$$;<br><br>I have tried the following query to check values for a raster band : the output is again null psql:/home/manaswini/MEGA/bbgc_uw_rpackage/rproject/raster2pgsql/test_query_fff.sql:22: NOTICE: Band: 20, X: 1, Y: 135, Value: <NULL><u></u><u></u></p></div><p class="MsoNormal">:<u></u><u></u></p><div><p class="MsoNormal">DO $$<br>DECLARE<br> band_number integer := 20; -- Replace with the desired band number<br> x_coord integer;<br> y_coord integer;<br> pixel_value double precision;<br> srid integer := 4326; -- Replace with the correct SRID for your data<br>BEGIN<br> -- Loop through all x and y coordinates in the raster band<br> FOR x_coord IN (SELECT generate_series(1, ST_Width(rast.clipped_raster)) FROM gfdl_03_bbox AS rast WHERE rast.filename = '<a href="http://prcp_03_2034.nc" target="_blank">prcp_03_2034.nc</a>') LOOP<br> FOR y_coord IN (SELECT generate_series(1, ST_Height(rast.clipped_raster)) FROM gfdl_03_bbox AS rast WHERE rast.filename = '<a href="http://prcp_03_2034.nc" target="_blank">prcp_03_2034.nc</a>') LOOP<br> -- Get the pixel value at the current x and y coordinates<br> SELECT ST_Value(ST_SetSRID(rast.clipped_raster, srid), band_number, ST_SetSRID(ST_Point(x_coord, y_coord), srid)) INTO pixel_value<br> FROM gfdl_03_bbox AS rast<br> WHERE rast.filename = '<a href="http://prcp_03_2034.nc" target="_blank">prcp_03_2034.nc</a>';<br><br> -- Print or use the pixel value as needed<br> RAISE NOTICE 'Band: %, X: %, Y: %, Value: %', band_number, x_coord, y_coord, pixel_value;<br> END LOOP;<br> END LOOP;<br>END;<br><br>Thank you,<u></u><u></u></p></div><div><p class="MsoNormal">Manaswini<u></u><u></u></p><div><p class="MsoNormal"><u></u> <u></u></p></div></div></div><p class="MsoNormal"><u></u> <u></u></p><div><div><p class="MsoNormal">On Wed, 15 Nov 2023 at 20:42, Regina Obe <<a href="mailto:lr@pcorp.us" target="_blank">lr@pcorp.us</a>> wrote:<u></u><u></u></p></div><blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0in 0in 0in 6pt;margin:5pt 0in 5pt 4.8pt"><div><div><div><p class="MsoNormal">I didn’t realize that netCDF also has a vector component.<u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal"><a href="https://gdal.org/drivers/vector/netcdf.html#vector-netcdf" target="_blank">https://gdal.org/drivers/vector/netcdf.html#vector-netcdf</a><u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal">To read the vector component, you’d use the ogr_fdw extension to read that rather than postgis_raster extension.<u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal">Details here - <a href="https://github.com/pramsey/pgsql-ogr-fdw" target="_blank">https://github.com/pramsey/pgsql-ogr-fdw</a><u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal">So perhaps that is what your python is doing reading the vector component. I don’t know if netcdf mixes those in one data set or not since I have no experience using netcdf.<u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal">I recall you said you are using the OSGeo Live 16 distribution. I just checked the OSGeoLive 16 does include ogr_fdw <u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal">So do <u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal">CREATE EXTENSION ogr_fdw;<u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal">The list of supported formats you can see with this query:<u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal">SELECT name FROM unnest(ogr_fdw_drivers()) AS f(name) ORDER BY name;<u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal">Which for osgeolive 16, I see netCDF listed<u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><div style="border-top:none;border-right:none;border-bottom:none;border-left:1.5pt solid blue;padding:0in 0in 0in 4pt"><div><div style="border-right:none;border-bottom:none;border-left:none;border-top:1pt solid rgb(225,225,225);padding:3pt 0in 0in"><p class="MsoNormal"><b>From:</b> Regina Obe <<a href="mailto:lr@pcorp.us" target="_blank">lr@pcorp.us</a>> <br><b>Sent:</b> Wednesday, November 15, 2023 6:19 PM<br><b>To:</b> 'PostGIS Users Discussion' <<a href="mailto:postgis-users@lists.osgeo.org" target="_blank">postgis-users@lists.osgeo.org</a>><br><b>Cc:</b> 'Manaswini Ganjam' <<a href="mailto:manu.ganjam@gmail.com" target="_blank">manu.ganjam@gmail.com</a>><br><b>Subject:</b> RE: [postgis-users] Extracting variable information from netcdf, imported as raster to a table<u></u><u></u></p></div></div><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal">Just confirming some stuff, since I’m not completely following:<u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal">Raster_record.rast column is of type raster correct? IF so ST_X and ST_Y won’t work since those are for geometry types.<u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal">Also ST_Value(raster_record.rast, band_number), won’t work either since that expects as input a geometry or x,y on the raster you want the value you.<u></u><u></u></p><p class="MsoNormal">I would think you would have gotten an error with that, which makes me feel I’m missing something critical.<u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal">If you want to extract all the pixels in a raster, you’d do something like <a href="https://postgis.net/docs/RT_ST_PixelAsPoints.html" target="_blank">https://postgis.net/docs/RT_ST_PixelAsPoints.html</a><u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal">SELECT pp.x, pp.y, pp.val, ST_X(pp.geom) AS lon, ST_Y(pp.geom) AS lat<u></u><u></u></p><p class="MsoNormal">FROM raster_record, <u></u><u></u></p><p class="MsoNormal">ST_PixelAsPoints(raster_record.rast, 1) AS pp<u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><p class="MsoNormal"><br> <u></u><u></u></p><p class="MsoNormal"> <u></u><u></u></p><div style="border-top:none;border-right:none;border-bottom:none;border-left:1.5pt solid blue;padding:0in 0in 0in 4pt"><div><div style="border-right:none;border-bottom:none;border-left:none;border-top:1pt solid rgb(225,225,225);padding:3pt 0in 0in"><p class="MsoNormal"><b>From:</b> postgis-users <<a href="mailto:postgis-users-bounces@lists.osgeo.org" target="_blank">postgis-users-bounces@lists.osgeo.org</a>> <b>On Behalf Of </b>Manaswini Ganjam via postgis-users<br><b>Sent:</b> Wednesday, November 15, 2023 2:01 PM<br><b>To:</b> <a href="mailto:postgis-users@lists.osgeo.org" target="_blank">postgis-users@lists.osgeo.org</a><br><b>Cc:</b> Manaswini Ganjam <<a href="mailto:manu.ganjam@gmail.com" target="_blank">manu.ganjam@gmail.com</a>><br><b>Subject:</b> [postgis-users] Extracting variable information from netcdf, imported as raster to a table<u></u><u></u></p></div></div><p class="MsoNormal"> <u></u><u></u></p><div><p class="MsoNormal">Hi, <u></u><u></u></p><div><div><p class="MsoNormal">I have been trying to download s3 cloud stored gridded climate data and generate tables with variables, lat, lon and timestamp (year, yearday). To achieve this I used raster2pgsql and imported multiple netcdf files into a database table. <u></u><u></u></p><div><p class="MsoNormal"> <u></u><u></u></p></div><div><p class="MsoNormal">Question: How to achieve the extraction of variables using postgis? I tried using ST_value, ST_pixelaspoints but I was getting errors, mainly due to the format in which netcdfs are stored in the database (the error says can't load some characters like 00E30100082...), I even tried changing the datatype to float but still did not work. I mean it is probably not simple like selecting a variable from the netcdf. I have enclosed my sql query below: <u></u><u></u></p></div><div><p class="MsoNormal"> <u></u><u></u></p></div><div><p class="MsoNormal"> -- Iterate through all raster files in the table<br> FOR raster_record IN (SELECT * FROM gfdl_03_prcp) LOOP<br> -- Determine the year from the raster file name, assuming the format is '<a href="http://prcp_03_1950.nc" target="_blank">prcp_03_1950.nc</a>'<br> SELECT<br> regexp_replace(raster_record.filename, '.*_(\d{4})\.nc', '\1')::integer<br> INTO<br> year;<br> <br> -- Calculate the start date of the year<br> year_start := (year || '-01-01')::date;<br> <br> -- Determine if the year is a leap year<br> is_leap_year := EXTRACT(ISODOW FROM (year_start + interval '1 year')) = 7;<br> <br> -- Set the number of bands for the year (365 for non-leap years, 366 for leap years)<br> FOR band_number IN 1..(CASE WHEN is_leap_year THEN 366 ELSE 365 END) LOOP<br> -- Calculate the observation_time using the year and band number<br> observation_time := year_start + (band_number - 1) * interval '1 day';<br> <br> -- Extract X (lon) and Y (lat) coordinates from the raster<br> SELECT<br> ST_X(raster_record.rast) AS lon,<br> ST_Y(raster_record.rast) AS lat<br> INTO<br> lon,<br> lat;<br> <br> -- Insert the lat, lon, prcp, and observation_time into the extracted_values table<br> INSERT INTO extracted_values (lat, lon, prcp, observation_time)<br> VALUES<br> (lat, lon, ST_Value(raster_record.rast, band_number), observation_time);<br> <br> -- Increment the counter<br> counter := counter + 1;<br> <br> -- Commit the transaction periodically in batches<br> IF counter % batch_size = 0 THEN<br> COMMIT;<br> END IF;<br> END LOOP;<br> END LOOP;<br> <u></u><u></u></p></div><div><p class="MsoNormal"> <u></u><u></u></p></div><div><p class="MsoNormal">The metadata for the two files is as follows: <u></u><u></u></p></div><div><p class="MsoNormal"> <u></u><u></u></p></div><div><p class="MsoNormal">File from database:<u></u><u></u></p></div><div><div><div><div><div><div><pre>{'NC_GLOBAL#Conventions': 'CF-1.5',<u></u><u></u></pre><pre> 'NC_GLOBAL#GDAL': 'GDAL 3.6.4, released 2023/04/17',<u></u><u></u></pre><pre> 'NC_GLOBAL#history': 'Wed Nov 15 13:32:13 2023: GDAL CreateCopy( <a href="http://not_clipped_prcp.nc" target="_blank">not_clipped_prcp.nc</a>, ... )'}<u></u><u></u></pre></div></div></div></div></div><div><pre>File before loading into the database:<u></u><u></u></pre><pre>{'lat#units': 'degrees_north',<u></u><u></u></pre><pre> 'lon#units': 'degrees_east',<u></u><u></u></pre><pre> 'NC_GLOBAL#title': 'Daily statistically downscaled CMIP5 data for the United States and southern Canada east of the Rocky Mountains, version 1.0, realization 1, 0.1x0.1 degree spatial resolution.',<u></u><u></u></pre><pre> 'NETCDF_DIM_EXTRA': '{time}',<u></u><u></u></pre><pre> 'NETCDF_DIM_time_DEF': '{366,4}',<u></u><u></u></pre><pre> 'NETCDF_DIM_time_VALUES': '{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14....362,363,364,365}',<u></u><u></u></pre><pre> 'prcp#add_offset': '819.17499',<u></u><u></u></pre><pre> 'prcp#long_name': 'daily precipitation accumulation',<u></u><u></u></pre><pre> 'prcp#missing_value': '-32768',<u></u><u></u></pre><pre> 'prcp#scale_factor': '0.025',<u></u><u></u></pre><pre> 'prcp#units': 'mm',<u></u><u></u></pre><pre> 'prcp#_FillValue': '-32768',<u></u><u></u></pre><pre> 'time#units': 'days since 1952-1-1 0:0:0.0'}<u></u><u></u></pre><pre> <u></u><u></u></pre><pre><span style="font-family:Arial,sans-serif">In case this information is useful: Previously I used python to extract variable information and generate a csv or table using this variable information, and the code is enclosed below. In the code I extracted variable values using lon = dataset.variables['lon'][:] and iterated for loops to write them all in csv. </span><u></u><u></u></pre></div><div><div><p class="MsoNormal">Python code:<u></u><u></u></p></div><div><p class="MsoNormal">import netCDF4 as nc<br><br># Step 1: Read the NetCDF file<br>filename = "/home/manaswini/<a href="http://prcp_03_1950.nc" target="_blank">prcp_03_1950.nc</a>"<br>dataset = nc.Dataset(filename)<br>dataset.set_auto_mask(False)<br>dataset.set_auto_scale(True)<u></u><u></u></p></div><div><p class="MsoNormal"> <u></u><u></u></p></div><div><p class="MsoNormal">lon = dataset.variables['lon'][:]<br>lat = dataset.variables['lat'][:]<br>time = dataset.variables['time'][:]<br>prcp = dataset.variables['prcp'][:]<u></u><u></u></p></div><div><p class="MsoNormal"> <u></u><u></u></p></div><div><p class="MsoNormal">import numpy as np<br>import csv<br><br># csv_buffer<br>csv_buffer = open('output.csv', 'w', newline='')<br>csv_writer = csv.writer(csv_buffer)<br><br># Iterate through grid points and write to CSV buffer<br>for i in enumerate(lon):<br> for j in enumerate(lat):<br> for k in enumerate(time):<br> csv_writer.writerow([lat[j], lon[i], prcp[i][j][k], year[k], yearday[k]])<br><br><br># Close the CSV buffer<br>csv_buffer.close()<u></u><u></u></p></div><div><p class="MsoNormal"> <u></u><u></u></p></div><div><p class="MsoNormal">Thank you, <u></u><u></u></p></div><div><p class="MsoNormal"><span style="font-family:"Trebuchet MS",sans-serif">Manaswini Ganjam</span><u></u><u></u></p></div></div></div></div></div></div></div></div></div></div></div></blockquote></div><p class="MsoNormal"><br clear="all"><u></u><u></u></p><div><p class="MsoNormal"><u></u> <u></u></p></div><p class="MsoNormal"><span>-- </span><u></u><u></u></p><div><div><p class="MsoNormal"><span style="font-size:10pt;font-family:"Trebuchet MS",sans-serif">Manaswini Ganjam</span><u></u><u></u></p></div></div></div></div></div></div></div></blockquote></div><br clear="all"><div><br></div><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature"><div dir="ltr"><font size="2" face="trebuchet ms, sans-serif">Manaswini Ganjam</font></div></div>
_______________________________________________<br>
postgis-users mailing list<br>
<a href="mailto:postgis-users@lists.osgeo.org" target="_blank">postgis-users@lists.osgeo.org</a><br>
<a href="https://lists.osgeo.org/mailman/listinfo/postgis-users" rel="noreferrer" target="_blank">https://lists.osgeo.org/mailman/listinfo/postgis-users</a><br>
</blockquote></div>