[GRASS-dev] [GRASS GIS] #438: v.distance -a uses too much memory

GRASS GIS trac at osgeo.org
Fri Jan 16 09:20:52 EST 2009


#438: v.distance -a uses too much memory
------------------------------------------+---------------------------------
 Reporter:  mlennert                      |       Owner:  grass-dev at lists.osgeo.org
     Type:  defect                        |      Status:  new                      
 Priority:  major                         |   Milestone:  7.0.0                    
Component:  Vector                        |     Version:  svn-trunk                
 Keywords:  v.distance memory allocation  |    Platform:  Unspecified              
      Cpu:  Unspecified                   |  
------------------------------------------+---------------------------------
 Not sure whether this should be considered a bug or an enhancement
 request... choosing bug for now, as it makes the module useless with
 large files.

 When trying to calculate a distance matrix between 20 000 points with
 v.distance -a, I get:

 ERROR:G_realloc: unable to allocate 1985321728 bytes at main.c:568

 As the machine only has 1 GB of RAM, this is normal, but v.distance should
 be rewritten to not keep everything in memory, at least when dealing with
 the -a flag, and to only allocate memory for data really requested.

 Currently, it allocates memory for a large number of NEAR structures
 (3 x int + 10 x double, i.e., for example, 3x4 + 10x8 = 92 bytes per
 structure) which contain space for all the potential upload options
 (lines 447-8 of vector/v.distance/main.c):


 {{{
         anear = 2 * nfrom;
         Near = (NEAR *) G_calloc(anear, sizeof(NEAR));
 }}}
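
 To make the arithmetic above concrete, here is a minimal C sketch. The
 struct near_like and its field names are hypothetical stand-ins (the real
 NEAR is defined in vector/v.distance/main.c), and matrix_bytes() is an
 illustrative helper, not GRASS code. Note that sizeof may report more than
 the plain sum of the member sizes if the compiler pads the struct for
 alignment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for NEAR: 3 ints and 10 doubles, as described
 * in the ticket. Field names are invented for illustration only. */
typedef struct {
    int i[3];
    double d[10];
} near_like;

/* Sum of the member sizes, ignoring any alignment padding. */
static size_t near_payload_bytes(void)
{
    return 3 * sizeof(int) + 10 * sizeof(double);
}

/* Bytes needed to keep one record per from/to pair in memory
 * (the full matrix requested by -a). */
static unsigned long long matrix_bytes(unsigned long long nfrom,
                                       unsigned long long nto,
                                       unsigned long long rec_size)
{
    assert(rec_size > 0);
    return nfrom * nto * rec_size;
}
```

 Assuming a 4-byte int and an 8-byte double, near_payload_bytes() gives 92,
 and matrix_bytes(20000, 20000, 92) gives 36800000000, i.e. roughly 36.8 GB
 if the same 20 000 points are used on both sides of the matrix.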


 It then goes on to add, if necessary, memory space for the entire From x To
 matrix (lines 566-8 of vector/v.distance/main.c) inside the loop over the
 "to" objects (count = total number of distances calculated after each
 iteration):

 {{{

         if (anear <= count) {
             anear += 10 + nfrom / 10;
             Near = (NEAR *) G_realloc(Near, anear * sizeof(NEAR));
         }
 }}}


 I'm not sure I completely understand this last part, as it seems to create
 huge jumps in allocation, i.e. when the count of distances reaches
 nfrom*2 (or a later value of anear), it reallocates Near to hold
 10 + nfrom/10 more NEAR structures. In my case, when count reaches 40000,
 anear = 40000 + 10 + 20000/10 = 42010, i.e. space is added for 2010 new
 NEAR structures, without knowing (AFAICT) how many are still to come...
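
 As a rough check of the growth rule quoted above, the following C sketch
 reproduces the increment and counts how many reallocations it would take
 to reach a given number of records. next_capacity() and growth_steps() are
 hypothetical helper names, not functions from v.distance.

```c
#include <assert.h>

/* One growth step, mirroring `anear += 10 + nfrom / 10`. */
static unsigned long long next_capacity(unsigned long long anear,
                                        unsigned long long nfrom)
{
    return anear + 10 + nfrom / 10;
}

/* Number of growth steps needed before the buffer can hold `total`
 * records, starting from the initial allocation of 2 * nfrom. */
static unsigned long long growth_steps(unsigned long long nfrom,
                                       unsigned long long total)
{
    unsigned long long anear = 2 * nfrom;      /* initial capacity */
    unsigned long long step = 10 + nfrom / 10; /* fixed increment */
    unsigned long long steps;

    if (total <= anear)
        return 0;                              /* already large enough */

    steps = (total - anear + step - 1) / step; /* ceiling division */
    assert(anear + steps * step >= total);     /* capacity now suffices */
    return steps;
}
```

 With the numbers from this ticket, next_capacity(40000, 20000) returns
 42010, and growth_steps(20000, 400000000) returns 198986 (assuming a full
 20 000 x 20 000 matrix). Each step adds room for 2010 records, so reaching
 the full matrix would take on the order of 200 000 reallocations.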

 But, as I said, I don't understand the code well enough to make a definite
 judgement. It would seem, however, that it might be better to calculate
 each distance and update the table immediately, or maybe write the
 necessary queries to a temp file to be able to launch the query at the end
 in one run, but without keeping everything in memory.
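
 The "calculate each distance and write it out immediately" idea could look
 roughly like the C sketch below. Everything in it is an assumption made for
 illustration: the pt type, sq_dist() (squared Euclidean distance, used only
 to keep the sketch free of extra libraries) and stream_distances() are
 hypothetical and do not use the GRASS vector or database API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef struct { double x, y; } pt;      /* hypothetical 2D point */

typedef double (*dist_fn)(const pt *, const pt *);

/* Example metric: squared Euclidean distance. */
static double sq_dist(const pt *a, const pt *b)
{
    double dx = a->x - b->x, dy = a->y - b->y;
    return dx * dx + dy * dy;
}

/* Write one "from to value" line per pair straight to `out`, so that no
 * per-pair record is kept in memory. Returns the number of lines
 * written, or -1 on a write error. */
static long stream_distances(FILE *out, const pt *from, size_t nfrom,
                             const pt *to, size_t nto, dist_fn dist)
{
    long written = 0;
    size_t i, j;

    assert(out != NULL && dist != NULL);
    for (i = 0; i < nfrom; i++) {
        for (j = 0; j < nto; j++) {
            if (fprintf(out, "%lu %lu %.15g\n", (unsigned long)i,
                        (unsigned long)j, dist(&from[i], &to[j])) < 0)
                return -1;
            written++;
        }
    }
    return written;
}
```

 Here the extra memory used does not grow with nfrom x nto; `out` could be a
 temp file whose contents are turned into update queries in one run at the
 end, as suggested above.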

-- 
Ticket URL: <http://trac.osgeo.org/grass/ticket/438>
GRASS GIS <http://grass.osgeo.org>

