[GRASS-stats] Loading a point-vector table with 466 columns
Roger Bivand
Roger.Bivand at nhh.no
Wed May 27 12:01:25 EDT 2009
On Wed, 27 May 2009, Roger Bivand wrote:
> On Wed, 27 May 2009, Hamish wrote:
>
>>
>> Roger wrote:
>>> Next script in R generating increasing NR and NC cases through
>>> writeVECT6() to test plugin=FALSE/plugin=TRUE ratios?
>>
>>
>> Does R have any built-in profiling tools? As GRASS is just a collection
>> of small C programs, the normal ones work fine with it:
>
> Yes, at the R level, so they won't help here. readVECT6() branches on plugin -
> if TRUE, it just calls readOGR() on the GRASS driver, if FALSE, it does
> (something like) v.out.ogr with shapefile driver to a temporary file and
> readOGR() on the shapefile. The C/C++ level code is in the (same) GDAL shared
> object for v.out.ogr and readOGR(), so I think the only difference is in the
> use or not of the plugin.
>
> My second post (testing v.out.ogr to shapefile against ogr2ogr from GRASS
> plugin to shapefile for a many-column vector) should reveal where the problem
> is - my present feeling is that the plugin and v.out.ogr handle access to the
> GRASS vector and its attribute data differently in one way or another.
The outcome was that for the 250 by 250 case, v.out.ogr ran in about 1
sec, and ogr2ogr using the plugin in 0.2 secs. In readVECT6(), which takes
20 secs on the same data, the culprit is the C/C++ ogrDataFrame() function
called by readOGR() in the rgdal package. It takes almost all of the 20
secs with the GRASS driver, but < 1 sec with the shapefile driver on the
same data. I'll try to investigate further - there is an interaction that
I don't understand. It is possible that ogrDataFrame() is inefficient in
that it reads by column, making the driver step through every feature
once for each field. I'll look at alternatives.
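
To illustrate the suspected access pattern, here is a toy sketch in Python.
It is not the rgdal or GDAL code: CountingLayer, read_by_column() and
read_by_feature() are hypothetical names, and the "layer" is just a list of
rows that counts how often a feature is fetched. It only shows why reading
by column costs NR * NC fetches where reading by feature costs NR.

```python
class CountingLayer:
    """Toy stand-in for a vector layer (assumption, not the OGR API)."""

    def __init__(self, n_features, n_fields):
        self.n_fields = n_fields
        # One row of attribute values per feature.
        self.rows = [[f * n_fields + c for c in range(n_fields)]
                     for f in range(n_features)]
        self.fetches = 0  # how many times a feature was requested

    def get_feature(self, fid):
        self.fetches += 1
        return self.rows[fid]


def read_by_column(layer):
    """For each field, walk all features again (the suspected slow path)."""
    n = len(layer.rows)
    return [[layer.get_feature(fid)[col] for fid in range(n)]
            for col in range(layer.n_fields)]


def read_by_feature(layer):
    """Fetch each feature once and distribute its values to the columns."""
    columns = [[] for _ in range(layer.n_fields)]
    for fid in range(len(layer.rows)):
        feature = layer.get_feature(fid)
        for col, value in enumerate(feature):
            columns[col].append(value)
    return columns


by_col = CountingLayer(250, 250)
by_feat = CountingLayer(250, 250)
same = read_by_column(by_col) == read_by_feature(by_feat)
print(same, by_col.fetches, by_feat.fetches)
```

Both functions return the same columns; only the number of fetches differs.
If a fetch is cheap for the shapefile driver but expensive for the GRASS
driver, that alone could account for a gap of this kind, though this is
still a guess.
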
Roger
>
> Roger
>
>>
>> http://grass.osgeo.org/wiki/Bugs#Using_a_profiling_tool
>>
>> or use a profiling tool at the command line while running ogr2ogr with
>> input=grass and output=shapefile?
>>
>>
>> Nikos wrt your 10hr script:
>> v.db.update must open and close the DB for every time you call it. That
>> is very slow and inefficient. Better is to write all SQL update commands
>> to a file (end each line with a ';') then use that file as input to
>> db.execute so it opens the DB, updates all fields, closes DB again in
>> a single step. see the db.execute man page and v.in.garmin script for an
>> example (where it made a huge improvement).
>>
>>
>>
>> Hamish
>>
>>
>>
>>
>>
>> _______________________________________________
>> grass-stats mailing list
>> grass-stats at lists.osgeo.org
>> http://lists.osgeo.org/mailman/listinfo/grass-stats
>>
>
>
--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no