[GRASS-dev] single vs. multiple sqlite db files [was v.db.calc]

Dylan Beaudette debeaudette at ucdavis.edu
Mon May 11 14:03:55 EDT 2009


On Friday 08 May 2009, C Michael Barton wrote:
> On 5/7/09 12:23 PM, "Dylan Beaudette" <debeaudette at ucdavis.edu> wrote:
>
> On Thursday 07 May 2009, Moritz Lennert wrote:
> > On 07/05/09 14:04, Hamish wrote:
> > >>> todo: for sqlite decide if we want per-map DB files
> > >>> like with DBF or single DB file per mapset.
> > >
> > > ...
> > >
> > >> I would vote for both approaches. User can decide what he
> > >> prefers via environmental variable, e.g. GRASS_SQLITE_PER_MAPSET.
> > >
> > > I think it would have to be a g.gisenv variable, or perhaps stored in
> > > the $MAPSET/VAR file.
> > >
> > > which way would be the default?
> >
> > I would plead for per mapset as this allows for more complete database
> > approach (i.e. joins, views, etc).
> >
> > Moritz
>
> I also agree on this matter. I think that one thing that is making this
> decision particularly difficult is that we are lacking a robust interchange
> format for complex vector data. Dumping from GRASS vectors --> shapefiles
> --> GRASS can sometimes lead to data loss. I wonder if further development
> of the SQLite spatial datatypes would help address this problem.
>
> Dylan

Hi Michael and others,

> Note that one database per vector file, as we now have with the more
> limited dbf format, does not preclude joins, views, schemas, etc. 

That may be true, but what mechanism is 'doing' the joining? I realize that 
with the DBF interface all 'SQL-like' operations are somewhat of an 
abomination, but with something like SQLite at least we know that a robust 
system can deal with most of the SQL thrown at it. This is especially 
important when trying to work with subqueries or LEFT joins. I am not sure 
how these operations would be possible between different (SQLite) database 
files. This could be due to my limited experience with SQLite...

> The only 
> thing it does is prevent joins between tables actively linked to different
> vectors. That is, if you have a vector file of towns and a vector file of
> soil sampling points, you can't join tables from these two different vector
> files without exporting a table from one and importing it into the other.
> (We could probably make this a relatively painless procedure in GRASS).

This is more useful than it appears-- as it allows for arbitrary transfer of 
attributes from one feature to another via a join.

> I can imagine situations where one might want to do such joins between
> different vectors, but question whether it is really all that common. In
> many (perhaps most) cases, different vector files represent different kinds
> of information such that their attribute tables cannot be joined in a
> meaningful way. And are joins between different vector files sufficiently
> common to warrant the loss of portability that happens by combining all
> attribute tables from all vector files in a mapset into a single SQL
> database?

I am not sure how common this is, but I can tell you that I do this type of 
operation all of the time when working with soils and land-use information. 
Sometimes the join condition is spatial, but more often it is based on 
attributes and look-up tables.

One potential down-side to a single SQLite db file that I recently encountered 
is locking-- I was unable to perform some actions on vector data when a 
long-running vector operation was in progress. It seems to be related to how 
locking of the attribute tables works-- I am not sure if it is possible to 
use table-level locking rather than file-level locking.

Either way, I am excited to see the DBF backend replaced by something like 
SQLite.

Cheers,
Dylan


>
> Michael





More information about the grass-dev mailing list