[postgis-users] Raster elevation x/y lookup (was: Splitting/merging linework into equal interval linestrings)
Dane Springmeyer
blake at hailmail.net
Thu Aug 14 15:31:33 PDT 2008
Postgis-users,
Another note on using GDAL's Python bindings to update a PostGIS table
with raster elevation values.
Soon after getting the script below working, I realized how expensive it is
to issue an UPDATE against the database for every feature inside the raster
lookup loop.
In this case I am trying to calculate the high and low point for the start
and end point of each of 26,000 linestrings, based on a 1.7 GB 10 m DEM.
I've found that it is much faster to build a Python list of values, use it
to generate a SQL insert script on the filesystem, and run that with
'psql -f python_generated_insert.sql'. Then, back inside the database, a
single UPDATE statement can populate the original table by joining to the
newly inserted table.
This new method takes a total of about 52 seconds:
  6.2 seconds to query for the lookup coordinates, loop through them,
  fetch the elevation values with GDAL, and write an insert script to
  the filesystem
  22 seconds to run the insert script
  24 seconds to update the original table from the newly inserted
  table's values
The revised script that uses this method can be found here: http://code.google.com/p/dbsgeo/source/browse/trunk/postgis/hydro_model/scripts/elevation_lookup.py
Compare that to the script below, which takes at least 7 seconds to process
just 50 records and over 40 minutes to process all 26,000. Ouch.
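
The shape of the batch approach, as a rough sketch (the staging table name
'elevation_staging' is made up here for illustration; the linked script above
is the real version):

# Rough sketch only: 'elevation_staging' is a hypothetical staging table.
# rows would be filled as (oid, lowpoint, highpoint) tuples inside the
# GDAL lookup loop.
rows = []

out = open('python_generated_insert.sql', 'w')
out.write('CREATE TABLE elevation_staging '
          '(oid integer, lowpoint integer, highpoint integer);\n')
for oid, lowpoint, highpoint in rows:
    out.write('INSERT INTO elevation_staging VALUES (%s, %s, %s);\n'
              % (oid, lowpoint, highpoint))
out.close()

# After 'psql -f python_generated_insert.sql', one join-based UPDATE back
# inside the database does the rest:
#
#   UPDATE sshiap_splits
#      SET lowpoint = e.lowpoint, highpoint = e.highpoint
#     FROM elevation_staging e
#    WHERE sshiap_splits.oid = e.oid;
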
Cheers,
Dane
On Jul 15, 2008, at 6:20 PM, Dane Springmeyer wrote:
> For anyone interested, I've worked up Regina's great Python snippet into a
> full script.
>
> The script pulls the x/y values of the upstream and downstream nodes of
> each equal-length linestring in a river hydrography network. It then
> iterates through each record and updates the table with the elevation
> values for the low point and high point of each river segment. Stream
> gradient is then easy to calculate back inside the database.
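>
> As a sketch of that last step, assuming a 'gradient' column exists on the
> table (lowpoint, highpoint and geometry are the columns used by the script
> below), it could be as simple as:
>
> import psycopg2
> connection = psycopg2.connect("dbname='beaver' user='postgres' host='localhost'")
> sql = connection.cursor()
> # 'gradient' is an assumed column; lowpoint, highpoint and geometry are
> # the columns used by the script below.
> sql.execute("""UPDATE sshiap_splits
>                   SET gradient = (highpoint - lowpoint) / ST_Length(geometry)
>                 WHERE ST_Length(geometry) > 0;""")
> connection.commit()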
>
> I'm sure there are ways to improve or optimize this script, so if anyone
> has any pointers, don't hesitate to send them along.
>
> Thanks again to Regina for the great push forward.
>
> Cheers,
> Dane
>
>
> ###########
>
> from osgeo import gdal
> import math
> import struct
>
> import psycopg2
> import psycopg2.extras
>
> gdal.SetCacheMax(15)
>
> connection = psycopg2.connect("dbname='beaver' user='postgres' host='localhost'")
> sql = connection.cursor(cursor_factory=psycopg2.extras.DictCursor)
>
> dataset = gdal.Open('/Volumes/Wren/Users/spring/projects/beaver/data/elevation/uw/tiff/methow-tiled.tif', gdal.GA_ReadOnly)
> geotransform = dataset.GetGeoTransform()
> # pixel (0,0) is the top-left corner of the tif, so the origin is (min_x, max_y)
> min_x = geotransform[0]
> max_y = geotransform[3]
> res_x = abs(geotransform[1])
> res_y = abs(geotransform[5])
>
> sql.execute("""SELECT oid,
>                ST_X(ST_StartPoint(geometry)) as down_x, ST_Y(ST_StartPoint(geometry)) as down_y,
>                ST_X(ST_EndPoint(geometry)) as up_x, ST_Y(ST_EndPoint(geometry)) as up_y
>                FROM sshiap_splits ORDER BY oid;""")
>
> query_records = float(sql.rowcount) - 1.0
>
> print 'Beginning lookup of %s features...' % query_records
>
> for row in sql.fetchall():
>     oid = row['oid']
>     down_x = row['down_x']
>     down_y = row['down_y']
>     up_x = row['up_x']
>     up_y = row['up_y']
>     # convert map coordinates to raster row/col offsets
>     down_row = int(math.ceil((max_y - down_y)/res_y))
>     down_col = int(math.ceil((down_x - min_x)/res_x))
>     up_row = int(math.ceil((max_y - up_y)/res_y))
>     up_col = int(math.ceil((up_x - min_x)/res_x))
>     if (down_row > 0 and down_col > 0 and up_row > 0 and up_col > 0
>             and dataset.RasterYSize > down_row and dataset.RasterXSize > down_col
>             and dataset.RasterYSize > up_row and dataset.RasterXSize > up_col):
>         # read a single pixel at each location; band 1 holds the elevation
>         down_raster_scan = dataset.ReadRaster(down_col - 1, down_row - 1, 1, 1)
>         down_raster_area = struct.unpack('f' * 1, down_raster_scan)
>         lowpoint = int(down_raster_area[0])
>         up_raster_scan = dataset.ReadRaster(up_col - 1, up_row - 1, 1, 1)
>         up_raster_area = struct.unpack('f' * 1, up_raster_scan)
>         highpoint = int(up_raster_area[0])
>         sql.execute("""UPDATE sshiap_splits SET lowpoint = %s, highpoint = %s WHERE oid = %s""" % (lowpoint, highpoint, oid))
>         # progress readout (only fires when oid/query_records hits one of
>         # these fractions exactly)
>         if oid/query_records in [.1, .2, .5, .10, .15, .20, .25, .30, .35, .40, .45,
>                                  .50, .55, .60, .65, .70, .75, .80, .85, .90, .95, 1.0]:
>             print '%s%s completed' % (oid/query_records*100, '%')
>     else:
>         print 'error ---------------------------------------------------------------------------------'
>
> print 'Complete'
> connection.commit()
>
>
>
>
> On Jul 10, 2008, at 11:38 AM, Dane Springmeyer wrote:
>
>> Regina and Brent:
>>
>> Wow. Thanks so much for the pointers.
>>
>> I've now got a single query that is working excellently so far.
>> Without indexes it runs on the entire dataset in about 5 minutes
>> when splitting lines to ~ 150 m, which is just fine.
>>
>> I was able to use the ST_Reverse function to make sure that any
>> fragments at the end of the linestrings would occur at the upstream
>> end, which I think handles my minimum requirement (for now).
>>
>> Below is a slight reworking of the example from Regina's second email.
>>
>> Thanks!
>>
>> Dane
>>
>>
>> ------
>>
>> drop table if exists test2;
>> create table test2 (oid serial, strahler varchar, geometry geometry);
>> INSERT INTO test2 (strahler, geometry)
>> SELECT strahler,
>>     -- take a substring if the length is greater than 152.4 meters,
>>     -- otherwise take the remainder
>>     ST_Line_Substring(geometry, 152.4*n/length,
>>         CASE
>>             WHEN 152.4*(n+1) < length THEN 152.4*(n+1)/length
>>             ELSE 1
>>         END) As geometry
>> FROM
>>     (SELECT strahler,
>>         -- reverse the vertex order so that the upstream end of each
>>         -- linestring will be the remainder
>>         ST_LineMerge(ST_Reverse(ts.geometry)) AS geometry,
>>         ST_Length(ts.geometry) As length
>>      FROM sshiap_lines ts
>>     ) t
>>     CROSS JOIN generate_series(0,10000) n
>> WHERE n*152.4/length < 1;
>>
>>
>>
>> On Jul 9, 2008, at 10:21 PM, Paragon Corporation wrote:
>>
>>> Dane,
>>>
>>> I recall doing something along the lines of 2 and 3 before, and I did it
>>> in Python (although I was interested in getting a measure every x distance
>>> along the line, so I was collecting points every x meters along the way
>>> rather than breaking up the lines; my translation of what I did to what
>>> you want may therefore have a lot of logical errors).
>>>
>>> For 2 - I think your logic would look something like
>>>
>>> 2) SELECT gid, length*startfrac As dist,
>>>        ST_Line_Substring(the_geom, startfrac, endfrac) As the_geom
>>>    FROM (SELECT gid, 500*n/length As startfrac,
>>>              CASE WHEN 500*(n+1) < length THEN 500*(n+1)/length ELSE 1 END As endfrac,
>>>              length, the_geom
>>>          FROM (SELECT ts.gid, ST_LineMerge(ts.the_geom) As the_geom,
>>>                    ST_Length(ts.the_geom) As length
>>>                FROM line_segments ts
>>>               ) t CROSS JOIN generate_series(0,10000) n
>>>          WHERE n*500/length < 1) As segments_500
>>>
>>>
>>>
>>> The above basically assumes your data is in some meter projection and that
>>> no line segment breaks into more than 10000 pieces of 500 units each.
>>> If you know for sure there is no way you have a line segment longer than
>>> 500*10000, then you can safely reduce that generate_series number and get
>>> better performance.
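>>>
>>> As a rough sketch, one way to pick that bound from the data itself
>>> (placeholder connection string; table and column names are from the query
>>> above):
>>>
>>> import psycopg2
>>> conn = psycopg2.connect("dbname='mydb'")   # placeholder connection string
>>> cur = conn.cursor()
>>> # Derive the generate_series bound from the longest line instead of
>>> # hard-coding 10000.
>>> cur.execute("SELECT ceil(max(ST_Length(the_geom)) / 500) FROM line_segments;")
>>> n_max = int(cur.fetchone()[0])
>>> # ...then build the query above with generate_series(0, n_max)
>>> # instead of generate_series(0, 10000).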
>>>
>>> I did do this in Python, by the way, and my logic is much more complicated
>>> than the above, but hopefully I cut out enough for you to get the basic
>>> idea.
>>>
>>> 3) For this I used Python and the gdal and psycopg libraries. The basic
>>> logic is to open up a tile and then loop over the lookup points. (Taking
>>> the centroid of each segment, I actually limited the query to only those
>>> line segments that intersected the tile I was loading, so I processed one
>>> tile at a time; in other words, my above query had an extra join against
>>> the tile extent. A sketch of that extent filter follows the snippet below.)
>>>
>>> It looks something like the following, where r holds the centroid points
>>> returned by the above query:
>>>
>>> # (Fragment: assumes osgeo.gdal, math and struct are imported, GA_ReadOnly
>>> # is in scope, numpoints is initialized, and r holds the query results.)
>>> dataset = osgeo.gdal.Open('%(atilefolder)s/%(atilefile)s' %
>>>                           {'atilefolder': atilefolder, 'atilefile': atilefile},
>>>                           GA_ReadOnly)
>>> geot = dataset.GetGeoTransform()
>>> # row/col (0,0) is at the top-left corner of the tif, so we need the
>>> # min_x, max_y that corresponds to position (0,0)
>>> min_x = geot[0]
>>> max_y = geot[3]
>>> res_x = abs(geot[1])
>>> res_y = abs(geot[5])
>>> print 'Origin of tile ', atilefile, '= (', min_x, ',', max_y, ')'
>>> print 'Pixel Size = (', res_x, ',', res_y, ')'
>>> print 'Size is ', dataset.RasterXSize, 'x', dataset.RasterYSize, 'x', dataset.RasterCount
>>> for pt in r:
>>>     # for each point figure out the row/col position in the tiff
>>>     gid = pt[0]
>>>     x = pt[2]
>>>     y = pt[3]
>>>     #print max_y - y
>>>     row = int(math.ceil((max_y - y)/res_y))
>>>     col = int(math.ceil((x - min_x)/res_x))
>>>
>>>     # If the point falls in the grid, proceed to determine elevation from
>>>     # the tiff. Some line segments may have measures that span multiple
>>>     # tiles, so we need to skip the measurement points that don't fall in
>>>     # this tile.
>>>     if row > 0 and col > 0 and dataset.RasterYSize > row and dataset.RasterXSize > col:
>>>         numpoints = numpoints + 1
>>>         pt_dist = pt[1]
>>>
>>>         # grab 1 pixel at the row/col position in the tif and return band 1,
>>>         # which is the elevation
>>>         rd_scan = dataset.ReadRaster(col - 1, row - 1, 1, 1)
>>>         rd_area = struct.unpack('f' * 1, rd_scan)
>>>         elev = int(rd_area[0])
>>>         # your update statement goes here
>>>
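>>> And the extent filter mentioned above, as a very rough sketch (the table
>>> and column names measure_points, pt, gid and dist, the SRID 26910 and the
>>> cursor cur are all placeholders; min_x, max_y, res_x, res_y and dataset
>>> come from the snippet above):
>>>
>>> # Placeholder names throughout: restrict the lookup points to the current
>>> # tile's bounding box before looping over them. cur is an assumed open
>>> # psycopg2 cursor.
>>> tile_max_x = min_x + res_x * dataset.RasterXSize
>>> tile_min_y = max_y - res_y * dataset.RasterYSize
>>> cur.execute("""SELECT gid, dist, ST_X(pt), ST_Y(pt)
>>>                FROM measure_points
>>>                WHERE pt && ST_SetSRID(ST_MakeBox2D(ST_MakePoint(%s, %s),
>>>                                                    ST_MakePoint(%s, %s)), 26910);""",
>>>             (min_x, tile_min_y, tile_max_x, max_y))
>>> r = cur.fetchall()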
>>>
>>> Hope that helps,
>>> Regina
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: postgis-users-bounces at postgis.refractions.net
>>> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf
>>> Of Dane
>>> Blakely Springmeyer
>>> Sent: Wednesday, July 09, 2008 11:25 PM
>>> To: postgis-users at postgis.refractions.net
>>> Subject: [postgis-users] Splitting/merging linework into equal interval linestrings
>>>
>>> PostGIS users,
>>>
>>> I have 1:24k hydrographic linework that I need to process in several
>>> successive steps using a PostGIS workflow.
>>>
>>> I'm stuck at step 2 of this overall workflow:
>>>
>>> 1) separate all linestrings into distinct Strahler order groups (done)
>>> 2) split all linework into linestrings of 500 ft (while avoiding any
>>>    linestring fragments smaller than 500 ft)
>>> 3) perform an elevation lookup against raster data to calculate the
>>>    gradient of each 500 ft segment, and
>>> 4) combine all adjacent linestrings which fall into similar gradient
>>>    classes.
>>>
>>> For step 2 I'm investigating using ST_Segmentize() and/or
>>> ST_Line_substring(), but I'd really appreciate some help with how
>>> to best
>>> approach the problem.
>>>
>>> ST_Segmentize seems to only insert more nodes/points into the already very
>>> high resolution linework (i.e. nodes already exist at intervals much more
>>> frequent than 500 ft), so extracting linestrings from the result of
>>> ST_Segmentize clearly isn't as simple as using MakeLine() between each
>>> segment.
>>>
>>> ST_Line_Substring works on percentages, which is smart, but I have yet to
>>> wrap my mind around how to measure each individual line so that I can
>>> translate percentage distances along a linestring into equal distance
>>> intervals.
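>>>
>>> (As a toy sketch of the arithmetic: with a made-up 1800 ft line, the
>>> fractions are just multiples of 500/length, capped at 1.)
>>>
>>> # Toy example only; the 1800 ft length is made up.
>>> length = 1800.0
>>> step = 500.0
>>> n = 0
>>> fractions = []
>>> while n * step / length < 1:
>>>     start = n * step / length
>>>     end = min((n + 1) * step / length, 1.0)
>>>     fractions.append((start, end))
>>>     n += 1
>>> # fractions is now roughly [(0.0, 0.278), (0.278, 0.556), (0.556, 0.833),
>>> # (0.833, 1.0)]: three full 500 ft pieces plus a 300 ft remainder, and
>>> # each pair could be fed to ST_Line_Substring.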
>>>
>>> Anyone have examples or guidance?
>>>
>>> Thanks,
>>>
>>> Dane