[GRASS-stats] Loading a point-vector table with 466 columns

Wed May 27 12:01:25 EDT 2009

On Wed, 27 May 2009, Roger Bivand wrote:

> On Wed, 27 May 2009, Hamish wrote:
>
>> 
>> Roger wrote:
>>> Next script in R generating increasing NR and NC cases through
>>> writeVECT6() to test plugin=FALSE/plugin=TRUE ratios?
>> 
>> 
>> Does R have any built-in profiling tools? As GRASS is just a collection
>> of small C programs, the normal ones work fine with it:
>
> Yes, at the R level, so they won't help here. readVECT6() branches on plugin - 
> if TRUE, it just calls readOGR() on the GRASS driver, if FALSE, it does 
> (something like) v.out.ogr with shapefile driver to a temporary file and 
> readOGR() on the shapefile. The C/C++ level code is in the (same) GDAL shared 
> object for v.out.ogr and readOGR(), so I think the only difference is in the 
> use or not of the plugin.
>
> My second post (testing v.out.ogr to shapefile against ogr2ogr from GRASS 
> plugin to shapefile for a many-column vector) should reveal where the problem 
> is - my present feeling is that the plugin and v.out.ogr handle access to the 
> GRASS vector and its attribute data differently in one way or another.

The outcome was that for the 250 by 250 case, v.out.ogr took about 1 sec, 
and ogr2ogr using the plugin about 0.2 secs. In readVECT6(), which takes 
20 secs on the same data, the culprit is the C/C++ ogrDataFrame() function 
called by readOGR() in the rgdal package: it accounts for almost all of the 
20 secs with the GRASS driver, but < 1 sec with the shapefile driver on the 
same data. I'll try to investigate further - there is an interaction that 
I don't understand. It is possible that ogrDataFrame() is inefficient in 
that it reads by column, making the driver step through every feature once 
for each field. I'll look at alternatives.
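
To make that suspicion concrete, here is a toy model in Python. It is not 
the rgdal or GDAL code: SequentialLayer, read_by_column() and 
read_by_feature() are names made up for this sketch, and I am assuming NR 
is the number of features and NC the number of fields. It only counts how 
often a driver that hands out features one at a time gets asked for a 
feature under each access pattern.

```python
# Toy model only: a layer whose driver can only step through features in order.
class SequentialLayer:
    def __init__(self, n_features, n_fields):
        # One row of attribute values per feature.
        self.rows = [[f * n_fields + c for c in range(n_fields)]
                     for f in range(n_features)]
        self.fetches = 0  # how many times a feature was requested

    def features(self):
        for row in self.rows:
            self.fetches += 1
            yield row


def read_by_column(layer, n_fields):
    # One full pass over all features for every field.
    return [[feat[c] for feat in layer.features()] for c in range(n_fields)]


def read_by_feature(layer, n_fields):
    # A single pass over the features, filling every column as we go.
    columns = [[] for _ in range(n_fields)]
    for feat in layer.features():
        for c in range(n_fields):
            columns[c].append(feat[c])
    return columns


NR, NC = 250, 250  # assumed: features by fields, as in the 250 by 250 case
by_col = SequentialLayer(NR, NC)
by_feat = SequentialLayer(NR, NC)
cols_a = read_by_column(by_col, NC)
cols_b = read_by_feature(by_feat, NC)
print(by_col.fetches, by_feat.fetches)  # 62500 250
```

Both patterns return the same columns, but the by-column one asks for a 
feature NR * NC times instead of NR times. If fetching a feature is cheap 
for one driver and expensive for another, that alone could give the kind of 
gap seen here; whether it is the actual cause remains to be checked.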

Roger

>
> Roger
>
>>
>>  http://grass.osgeo.org/wiki/Bugs#Using_a_profiling_tool
>> 
>> or use a profiling tool at the command line while running ogr2ogr with
>> input=grass and output=shapefile?
>> 
>> 
>> Nikos, wrt your 10hr script:
>> v.db.update must open and close the DB every time you call it. That
>> is very slow and inefficient. It is better to write all SQL update
>> commands to a file (end each line with a ';'), then use that file as
>> input to db.execute so it opens the DB, updates all fields, and closes
>> the DB again in a single step. See the db.execute man page and the
>> v.in.garmin script for an example (where it made a huge improvement).
>> 
>> 
>> 
>> Hamish
>> 
>> 
>> 
>> 
>> 
>> _______________________________________________
>> grass-stats mailing list
>> grass-stats at lists.osgeo.org
>> http://lists.osgeo.org/mailman/listinfo/grass-stats
>> 
>
>
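
On Hamish's db.execute suggestion quoted above, a minimal sketch of the 
batching step in Python. The table name "points", the columns "elev" and 
"cat", and the helper write_update_batch() are hypothetical, chosen only 
for illustration; the point is one UPDATE per line, each ending in ';', 
collected in a single file.

```python
import os
import tempfile


def write_update_batch(path, table, column, key_column, new_values):
    """Write one SQL UPDATE per line, each terminated by ';'.

    new_values maps a key (e.g. a category number) to the new numeric value.
    """
    with open(path, "w") as fh:
        for key, value in new_values.items():
            fh.write(
                f"UPDATE {table} SET {column} = {value} "
                f"WHERE {key_column} = {key};\n"
            )
    return len(new_values)


# Hypothetical data: category number -> new attribute value.
new_values = {1: 10.5, 2: 11.0, 3: 9.75}
sql_path = os.path.join(tempfile.mkdtemp(), "updates.sql")
n_written = write_update_batch(sql_path, "points", "elev", "cat", new_values)
print(n_written, "statements written to", sql_path)
```

The resulting file is then given to db.execute as its input in one call, so 
the DB is opened and closed once rather than once per update. The exact 
invocation is on the db.execute man page.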

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no