[GRASS-user] Estimating import time for large data file

Moritz Lennert mlennert at club.worldonline.be
Fri May 29 02:20:17 PDT 2020


On 28/05/20 17:25, Rich Shepard wrote:
> I'm trying to import (using v.in.ogr) a rather large file (1.22G *.shp and
> 4.5M *.dbf). I see a lot of system disk reads, low cpu usage, but after a
> half-hour it appears to have stopped processing.

Importing such a large file can take a lot of time because of the 
cleaning needed to bring it into GRASS GIS' topological format.
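As a sketch (file paths and layer names here are hypothetical), the import is typically launched with v.in.ogr; a small snap threshold, in map units, can sometimes reduce the number of boundary errors the cleaning step has to resolve:

```shell
# Hypothetical paths; v.in.ogr builds topology after reading the
# features, and on large shapefiles the cleaning step usually
# dominates the run time. snap= sets a snapping threshold (map
# units) applied to boundary vertices before cleaning.
v.in.ogr input=/data/wetlands.shp output=wetlands snap=1e-6
```

Whether snapping helps (or distorts the data) depends entirely on the coordinate precision of the source file, so test on a subset first.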

> 
> I'm running Slackware-14.2/x86_64 on a desktop with an AMD Ryzen7 2700 CPU
> (8 cores/16 threads) and 32G DDR4 memory. GRASS-7.9.dev is configured to
> enable largefile and use openmp.
> 
> 1. Can I do more to facilitate import of this large, statewide wetlands
> data set?
> 
> 2. Is there a way to estimate the time import would take? If so, I'll start
> it using screen and let it run in the background, overnight if necessary.

I think it really depends on the specific file, i.e. on its geometry 
complexity and cleanliness, so it is difficult to predict. I regularly 
launch large import jobs overnight.
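Since you mention screen: a minimal sketch of an overnight run (the database path and map names are placeholders, and the GRASS start script may be named differently for your 7.9.dev build):

```shell
# Start a logged screen session so the module's progress output
# is captured to screenlog.0 even while detached:
screen -L -S wetlands-import

# Inside the session, run the import non-interactively via --exec:
grass79 /path/to/grassdata/location/mapset --exec \
    v.in.ogr input=/data/wetlands.shp output=wetlands

# Detach with Ctrl-a d; reattach the next morning with:
#   screen -r wetlands-import
```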

> 
> 3. Is there a realtime progress monitor that informs me grass is chewing on
> the import or stuck somewhere?
> 
> The GUI layer manager shows the map needs cleaning for many (most? all?)
> polygons when it eventually is imported, but that's to be addressed later.

You should see information about its progress through the different 
cleaning steps (note that some cleaning steps are repeated).

If you are sure you do not want any polygon cleaning during import, 
you can use the -c flag. But you will then have to handle the result 
with care.
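For example (again with hypothetical paths), you could skip cleaning at import time and run v.clean on the result later, once you have inspected it:

```shell
# -c: do not clean polygons after import. The resulting areas may
# overlap or leave gaps, so treat the map as raw geometry only.
v.in.ogr -c input=/data/wetlands.shp output=wetlands_raw

# Later, clean manually; break intersecting boundaries and remove
# duplicate geometries (a common minimal combination, not the only one):
v.clean input=wetlands_raw output=wetlands_clean tool=break,rmdupl
```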

Moritz
