[GRASS-dev] [GRASS GIS] #3361: v.select: very slow on within (GEOS) operator

GRASS GIS trac at osgeo.org
Mon Jun 19 07:48:54 PDT 2017


#3361: v.select: very slow on within (GEOS) operator
---------------------------------------+-------------------------
 Reporter:  mlennert                   |      Owner:  grass-dev@…
     Type:  enhancement                |     Status:  new
 Priority:  normal                     |  Milestone:  7.4.0
Component:  Vector                     |    Version:  svn-trunk
 Keywords:  v.select GEOS within slow  |        CPU:  Unspecified
 Platform:  Unspecified                |
---------------------------------------+-------------------------
 I have not run similar tests with the other operators, but v.select is
 very slow with the within operator.

 First I create a buffer around the NC railroads map:


 {{{
 v.buffer railroads dist=5000 out=rail5000
 }}}

 Then v.select:

 {{{
 time v.select ain=boundary_municp bin=rail5000 op=within out=select
 real   2m13.989s
 user   1m57.888s
 sys    0m15.956s
 }}}

 Using the following script, I get the identical result much faster (maybe
 using v.distance is another option, but I haven't tried that):

 {{{
 g.copy vect=boundary_municp,munic
 v.db.addcolumn munic col="totalarea double precision"
 v.to.db munic op=area col=totalarea
 v.overlay ain=munic bin=rail5000 op=and out=munic_and_buffer
 v.db.addcolumn munic_and_buffer col="area double precision"
 v.to.db munic_and_buffer op=area col=area
 sleep 1
 v.extract boundary_municp \
   cat=$(db.select -c sql="select a_cat from munic_and_buffer where round(area,1)/round(a_totalarea,1)=1" \
     | awk '{printf "%s,", $1}') \
   output=select_bis
 }}}

 Time for running entire script:


 {{{
 real    0m14.611s
 user    0m6.084s
 sys     0m5.084s
 }}}
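
 The selection criterion used in the script (a municipality counts as
 "within" when its overlap area equals its total area, both rounded to one
 decimal) can be tried out without GRASS. The following is only a sketch:
 the sample rows and the function name select_within_cats are made up for
 illustration, and awk's rounding may differ slightly from the database's
 round(). It also joins the categories without a trailing comma.

```shell
# Hypothetical sample rows in "db.select -c" style: a_cat|area|a_totalarea
rows='1|100.04|100.04
2|50.0|120.3
3|75.5|75.5'

# Print the categories whose overlap area equals the total area
# (both rounded to one decimal), joined by commas.
select_within_cats() {
    awk -F'|' '
        { a = sprintf("%.1f", $2); t = sprintf("%.1f", $3) }
        a == t { printf "%s%s", sep, $1; sep = "," }
        END { if (sep != "") print "" }
    '
}

cats=$(printf '%s\n' "$rows" | select_within_cats)
echo "$cats"   # prints 1,3
```

 In the real script the rows would come from db.select on munic_and_buffer
 rather than from a shell variable.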

 I stumbled across this because a student had a within operation that kept
 on running for hours and hours, and using an equivalent of the above
 script we were able to get the same result within minutes.

 I suspect that by going through GEOS we lose the spatial index, or that
 there are other significant overheads, and that this causes the
 slowdown. The difference is so large, however, that I wonder whether
 there is anything we could do to optimize v.select's GEOS operators. Or
 is the only solution to implement the same operators natively? Maybe a
 nice GSoC project?
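
 To illustrate why an index-style pre-check matters: a feature can only be
 within another feature if its bounding box lies inside the other's
 bounding box, so a cheap box comparison can discard most candidates
 before the expensive exact test. The sketch below shows that idea only.
 It is not how v.select or GEOS is implemented, and the data layout and
 the function name bbox_prefilter are hypothetical.

```shell
# Hypothetical candidate bounding boxes, one per line: id west south east north
candidates='a 10 10 20 20
b 0 0 50 50
c 30 30 45 41'

# Hypothetical bounding box of the containing feature: west south east north
bbox_b='5 5 40 40'

# Print the ids whose bounding box lies entirely inside the box given as $1.
# Only these survivors would need the exact (expensive) within test.
bbox_prefilter() {
    awk -v box="$1" '
        BEGIN { split(box, b, " ") }
        $2 >= b[1] && $3 >= b[2] && $4 <= b[3] && $5 <= b[4] { print $1 }
    '
}

survivors=$(printf '%s\n' "$candidates" | bbox_prefilter "$bbox_b")
echo "$survivors"   # prints a
```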

 I'm classifying this as an enhancement, but I'm close to considering it
 a bug that the operation takes this long as soon as there is a
 significant amount of data.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3361>
GRASS GIS <https://grass.osgeo.org>


