[GRASS-dev] [GRASS GIS] #2131: Terrible performance from v.what.rast due to per-iteration db_execute

GRASS GIS trac at osgeo.org
Tue Nov 12 02:26:04 PST 2013


#2131: Terrible performance from v.what.rast due to per-iteration db_execute
-------------------------------------+--------------------------------------
 Reporter:  hamish                   |       Owner:  grass-dev@…              
     Type:  defect                   |      Status:  new                      
 Priority:  major                    |   Milestone:  6.4.4                    
Component:  Database                 |     Version:  svn-develbranch6         
 Keywords:  v.what.rast, db_execute  |    Platform:  Linux                    
      Cpu:  x86-64                   |  
-------------------------------------+--------------------------------------
 Hi,

 I'm running v.what.rast for 175k query points in 6.x. It's taking a
 horribly long time.
 With debug at level 1 it shows that it gets through the query processing
 and on to the "Updating db table" stage in less than 1 second. Over an
 *hour later* I'm still waiting for the dbf process, which is running at
 99% cpu! This is a fast workstation too.

 v.out.ascii's columns= option was suffering from the same trouble last
 time I tried, to the point where it became unusable with more than ~10k
 vector points.

 The v.colors, v.in.garmin, and v.in.gpsbabel scripts /used to/ suffer from
 the same thing, but we sped that up by writing all the sql commands to a
 temp file and then just running db.execute once. It seems that opening and
 closing the database has non-trivial overhead associated with it, and when
 you do that for every single cat it adds up in a pretty impressive way.
 Even if another DB backend is faster to start+write+stop, I doubt it would
 be more than ~20% different, max. The slowdown doesn't look linear either:
 100k points seem to take much, much longer than just 10x the time for a
 10k point vector map.
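
 For illustration, the temp-file approach used in those scripts might look
 roughly like the following. This is only a sketch, not the actual code
 from v.colors or the v.in.* scripts: the table and column names are
 borrowed from the demo below, the three cat/value pairs are made-up
 stand-ins for real query results, and the db.execute call is skipped
 (the SQL is just printed) when run outside a GRASS session.

```shell
#!/bin/sh
# Sketch: collect one UPDATE per category in a temp file, then hand the
# whole batch to db.execute in a single call.
# TABLE, COLUMN and the sample data are placeholders for illustration.
TABLE="test_100k_pts"
COLUMN="elev"
SQLFILE="$(mktemp)"
trap 'rm -f "$SQLFILE"' EXIT

# Stand-in for the real query results: "cat value" pairs, one per line.
printf '%s\n' "1 102.5" "2 98.1" "3 110.0" |
while read -r cat_id value; do
    echo "UPDATE $TABLE SET $COLUMN = $value WHERE cat = $cat_id;"
done > "$SQLFILE"

# One database open/close for the whole batch instead of one per cat.
if command -v db.execute >/dev/null 2>&1; then
    db.execute input="$SQLFILE"
else
    # Not inside a GRASS session: just show what would have been executed.
    cat "$SQLFILE"
fi
```

 The point is only that the per-cat work is reduced to appending a line of
 text; the database is opened and closed once, however many points there
 are.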

 demo:
 {{{
 g.region rast=elevation
 v.random out=test_100k_pts n=100000
 v.db.addtable test_100k_pts column='cat integer, elev double'   # gets slow too!
 time v.what.rast vect=test_100k_pts rast=elevation column=elev
 }}}


 My current workaround is to add a flag to v.what.rast to optionally print
 the result to stdout instead of writing it to a db column. (Done locally;
 I'm still testing some other interpolation improvements so haven't
 committed anything yet.)
 With that -p flag, the module takes 0.5 seconds to complete when stdout is
 redirected to /dev/null.

 Any thoughts on the idea of writing the sql commands to a tempfile or
 pipe, then running db_execute_immediate() just once for all of them?

 (Maybe the per-iteration bsearch() in the loop is inefficient too, but
 `top` shows that 'dbf' is the thing eating all the cpu time.)


 In trunk it takes about 6 seconds to complete the 100k random points. I'm
 not seeing anything obvious in the module changelog, so I guess something
 in the libraries got fixed? Any hints?


 thanks,
 Hamish

-- 
Ticket URL: <https://trac.osgeo.org/grass/ticket/2131>
GRASS GIS <http://grass.osgeo.org>


