[GRASS-user] FW: Working with large vector files

Jonathan Greenberg jgreenberg at arc.nasa.gov
Thu Oct 5 19:00:29 EDT 2006


I sent a similar question about large vector files to a listserv I moderate
(starserv), and one of the users made the comment below, indicating that dbf
files themselves can't be larger than 2GB.  Can other types of databases be
used as the backend for vector files?  Is this statement not true?  How would
this behavior affect things like v.in.ascii (which I noticed uses a process
called dbf for most of the importing process)?

--j

-- 
Jonathan A. Greenberg, PhD
NRC Research Associate
NASA Ames Research Center
MS 242-4
Moffett Field, CA 94035-1000
Office: 650-604-5896
Cell: 415-794-5043
AIM: jgrn307
MSN: jgrn307 at hotmail.com

------ Forwarded Message
From: Richard Pollock <pollock at pcigeomatics.com>
Reply-To: <starserv at ucdavis.edu>
Date: Thu, 5 Oct 2006 18:40:54 -0400
To: <starserv at ucdavis.edu>
Conversation: Working with large vector files
Subject: RE: Working with large vector files

The file format can also be an issue. The maximum size of a .DBF file is
2GB. That is because the file has to contain offsets to various other
locations within the file, and those offsets are based on 32-bit integers.
No software that is writing to a .DBF file can get around this. Ideally, the
software should detect when it has written as much data as the output
file format can accommodate, refuse to write any more, close the file, and
inform the user of the situation. If the software keeps writing, then all it
will do is convert a maximum-sized file that is at least usable into a
corrupt file that may contain more data but is unusable because of messed-up
offset values.
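
As a rough illustration of the limit described above, the following Python
sketch estimates the on-disk size of a .DBF file from its field widths and
record count, and compares it with the largest value a signed 32-bit integer
can hold. It assumes the common dBASE III-style layout (32-byte file header,
one 32-byte descriptor per field, a header terminator byte, one deletion-flag
byte per record, and a trailing end-of-file byte) and assumes the offsets are
signed, as the 2GB figure suggests. The function names and the example field
widths are made up for illustration.

```python
# Sketch only: size estimate for a .DBF file, assuming a dBASE III-style layout.
MAX_SIGNED_32 = 2**31 - 1  # largest value a signed 32-bit integer can hold


def estimate_dbf_size(field_widths, n_records):
    """Approximate file size in bytes (hypothetical helper)."""
    header = 32 + 32 * len(field_widths) + 1  # file header + descriptors + terminator
    record = 1 + sum(field_widths)            # deletion flag + field data
    return header + record * n_records + 1    # plus the end-of-file marker


def max_records(field_widths):
    """Largest record count whose estimated size still fits in 32 bits."""
    header = 32 + 32 * len(field_widths) + 1
    record = 1 + sum(field_widths)
    return (MAX_SIGNED_32 - header - 1) // record


# Illustrative widths only: a category column plus x, y, and radius.
widths = [11, 20, 20, 20]
size = estimate_dbf_size(widths, 2_000_000)
print("estimated bytes:", size)
print("exceeds 32-bit limit:", size > MAX_SIGNED_32)
print("records that fit:", max_records(widths))
```

Wider attribute tables reach the ceiling with fewer records, so the record
count alone does not say whether a given file will fit.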
 
Lots of file formats that have been around a while have similar problems.
When they were designed, people weren't worried about files anywhere near
2GB in size.
 
So, the first thing is to find a storage format that is not intrinsically
limited in size.
 
I understand that GRASS can write to a PostGIS database. PostGIS is based on
PostgreSQL (a free, open-source DBMS), which has a maximum table size of 32
TB. If GRASS can read a buffer-full of the input data, process it, write
the results out to a PostGIS table, and repeat until all the input data
are processed, then that may be your solution. At least, as long as the
processing doesn't involve displaying the data (displaying very large
datasets has its own problems).
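
The read/process/write loop described above can be sketched in Python as
follows. This is only an outline of the pattern, not how GRASS itself works:
the `write_batch` callback is a hypothetical stand-in for whatever inserts
rows into a PostGIS table, and the in-memory CSV is sample data invented for
the example.

```python
import csv
import io
from itertools import islice


def read_batches(rows, batch_size):
    """Yield lists of at most batch_size items from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def process_in_batches(csv_file, batch_size, process, write_batch):
    """Read, process, and write one buffer at a time; return rows written."""
    total = 0
    for batch in read_batches(csv.reader(csv_file), batch_size):
        results = [process(row) for row in batch]
        write_batch(results)  # hypothetical sink, e.g. an INSERT into PostGIS
        total += len(results)
    return total


# Usage with an in-memory CSV and a plain list standing in for the database.
sample = io.StringIO("1.0,2.0,0.5\n3.0,4.0,1.5\n5.0,6.0,2.5\n")
written = []
count = process_in_batches(
    sample, 2, lambda row: tuple(map(float, row)), written.extend
)
print(count, len(written))
```

Only one batch is held in memory at a time, so peak memory use depends on the
batch size rather than on the size of the input file.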
 
Cheers,
 
Richard


From: owner-starserv at ucdavis.edu [mailto:owner-starserv at ucdavis.edu] On
Behalf Of Jonathan Greenberg
Sent: Thursday, October 05, 2006 5:31 PM
To: STARServ
Subject: Re: Working with large vector files

I've been working on techniques to perform tree crown recognition using high
spatial resolution remote sensing imagery -- the final output of these
algorithms is a polygon coverage representing each tree crown in an image --
as you can imagine, that's a LOT of trees for a standard QuickBird image (on
the order of 2 million polygons).  I understand that I could subset the
rasters and do smaller extractions, but this is, at best, a hack --
there's been a lot of work on efficient handling of massive raster images
(look at RS packages like ENVI and GRASS), but massive vector handling is
seriously lagging -- early estimates are that I'd need about 25 or so
subsets for a single QuickBird scene to keep myself under the memory
requirements.

Right now I'm just trying to import a csv of xloc, yloc, crown radius (the
output of my crown mapping algorithm) into SOME GIS, perform a buffer
operation on that crown radius parameter (to give me the crown polygon), and
work with that layer.  ArcMap can actually import the points, but the
buffering process completely overwhelms it (I noticed the DBF file hits 2GB
and then I get the error).  I'm trying GRASS right now but my first try also
got a memory error (I'm working on a 32-bit PC and a 64-bit Mac,
incidentally).
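
For reference, the buffer step described above (a point plus a crown radius
turned into a crown polygon) can be approximated outside a GIS. The Python
sketch below builds a regular polygon around each point and writes it as WKT
text. The function name, the vertex count, and the three-decimal output
precision are arbitrary choices for illustration, and a true circle is only
approximated.

```python
import math


def circle_wkt(x, y, radius, n_vertices=16):
    """Approximate a circular buffer around (x, y) as a WKT polygon."""
    points = []
    for i in range(n_vertices):
        angle = 2 * math.pi * i / n_vertices
        points.append((x + radius * math.cos(angle),
                       y + radius * math.sin(angle)))
    points.append(points[0])  # repeat the first vertex to close the ring
    coords = ", ".join(f"{px:.3f} {py:.3f}" for px, py in points)
    return f"POLYGON(({coords}))"


# One illustrative row of xloc, yloc, crown radius.
print(circle_wkt(100.0, 200.0, 2.5, n_vertices=8))
```

Because each row is converted independently, this step can run one row or one
batch at a time without holding the whole layer in memory.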

Besides GRASS and ArcMap, what else could I be trying out?  I should point
out ENVI also has a vector size problem -- displaying large vectors creates
an out of memory error (at least on a 32-bit PC, haven't tried it on my Mac
yet).

--j

On 10/5/06 2:08 PM, "Richard Pollock" <pollock at pcigeomatics.com> wrote:

> What software created these large files in the first place?
> 
> What format are the files  in?
> 
> Cheers,
> 
> Richard
> 
>  
> 
>  From: owner-starserv at ucdavis.edu [mailto:owner-starserv at ucdavis.edu]  On
> Behalf Of Joanna Grossman
> Sent: Thursday, October 05, 2006  4:23 PM
> To: starserv at ucdavis.edu
> Subject: Re: Working with  large vector files
> 
> I'm not sure, Jonathan, but it's certainly worth trying out GRASS and some of
> the other open source tools out there.
> http://www.freegis.org/database/?cat=4
> 
> Good luck!
> 
> Joanna
> 
> Jonathan Greenberg  <jgreenberg at arc.nasa.gov> wrote:
>  
>> After banging my head against this issue for the Nth time, I'm putting out
>> a plaintive cry of "HELP!" -- I am working (or would like to work) with
>> vector files which are larger than the 2GB limit imposed on them by
>> ArcMap -- can anyone recommend a GIS program that CAN deal with massive
>> vector coverages -- efficiently would be nice, but simply being able to
>> open and process them without getting corruption errors would be a great
>> start...
>> 
>> --j





------ End of Forwarded Message
