<HTML>
<HEAD>
<TITLE>FW: Working with large vector files</TITLE>
</HEAD>
<BODY>
<FONT FACE="Verdana, Helvetica, Arial"><SPAN STYLE='font-size:12.0px'>I sent a similar question about large vector files to a listserv I moderate (starserv), and one of the users made the comment below, indicating that DBF files themselves can’t be larger than 2 GB. Is that statement true? If so, can other types of databases be used as the backend for vector files? And how would this limit affect things like v.in.ascii (which I noticed runs a process called dbf for most of the import)?<BR>
<BR>
--j<BR>
<BR>
-- <BR>
Jonathan A. Greenberg, PhD<BR>
NRC Research Associate<BR>
NASA Ames Research Center<BR>
MS 242-4<BR>
Moffett Field, CA 94035-1000<BR>
Office: 650-604-5896<BR>
Cell: 415-794-5043<BR>
AIM: jgrn307<BR>
MSN: jgrn307@hotmail.com<BR>
<BR>
------ Forwarded Message<BR>
<B>From: </B>Richard Pollock &lt;pollock@pcigeomatics.com&gt;<BR>
<B>Reply-To: </B>&lt;starserv@ucdavis.edu&gt;<BR>
<B>Date: </B>Thu, 5 Oct 2006 18:40:54 -0400<BR>
<B>To: </B>&lt;starserv@ucdavis.edu&gt;<BR>
<B>Conversation: </B>Working with large vector files<BR>
<B>Subject: </B>RE: Working with large vector files<BR>
<BR>
</SPAN></FONT><SPAN STYLE='font-size:12.0px'><FONT COLOR="#0000FF"><FONT FACE="Lucida Sans">The file format can also be an issue. The maximum size of a .DBF file is 2 GB. That is because the file has to contain offsets to various other locations within the file, and those offsets are based on 32-bit integers. No software that is writing to a .DBF file can get around this. Ideally, the software should detect when it has written as much data as the output file format can accommodate, refuse to write any more, close the file, and inform the user of the situation. If the software keeps writing, then all it will do is convert a maximum-sized file that is at least usable into a corrupt file that may contain more data but is unusable because of messed-up offset values.<BR>
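<BR>
To put rough numbers on that limit, here is a minimal Python sketch. It assumes the ceiling is the largest signed 32-bit byte offset (2**31 - 1) and the classic dBase III layout (a 32-byte header, 32 bytes per field descriptor, a 1-byte terminator, and a 1-byte deletion flag per record). The helper names and the three 19-character fields in the usage note are illustrative assumptions, not anything taken from ArcMap or GRASS.<BR>
<PRE>
```python
# Assumption: offsets are signed 32-bit integers, so the largest usable
# file size is 2**31 - 1 bytes.
MAX_DBF_BYTES = 2**31 - 1


def header_len_for(n_fields):
    """dBase III header: 32 bytes + 32 bytes per field + 1 terminator byte."""
    return 32 + 32 * n_fields + 1


def record_len_for(field_widths):
    """Fixed-width record: 1 deletion-flag byte + the sum of the field widths."""
    return 1 + sum(field_widths)


def dbf_size_bytes(n_records, record_len, header_len):
    """Approximate file size: header + records + 1-byte end-of-file marker."""
    return header_len + n_records * record_len + 1


def max_records(record_len, header_len):
    """Largest record count whose file still fits under MAX_DBF_BYTES."""
    return (MAX_DBF_BYTES - header_len - 1) // record_len
```
</PRE>
For example, max_records(record_len_for([19, 19, 19]), header_len_for(3)) gives the ceiling for a hypothetical table with three 19-character fields. Adding attribute columns widens each record and lowers that ceiling.<BR>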
</FONT></FONT><FONT FACE="Verdana, Helvetica, Arial"> <BR>
</FONT><FONT COLOR="#0000FF"><FONT FACE="Lucida Sans">Lots of file formats that have been around a while have similar problems. When they were designed, people weren't worried about files anywhere near 2GB in size.<BR>
</FONT></FONT><FONT FACE="Verdana, Helvetica, Arial"> <BR>
</FONT><FONT COLOR="#0000FF"><FONT FACE="Lucida Sans">So, the first thing is to find a storage format that is not intrinsically limited in size.<BR>
</FONT></FONT><FONT FACE="Verdana, Helvetica, Arial"> <BR>
</FONT><FONT COLOR="#0000FF"><FONT FACE="Lucida Sans">I understand that GRASS can write to a PostGIS database. PostGIS is based on PostgreSQL (a free, open-source DBMS), which has a maximum table size of 32 TB. If GRASS can read a buffer-full of the input data, process it, write the results out to a PostGIS table, and repeat until all the input data are processed, then that may be your solution. At least, as long as the processing doesn't involve displaying the data (displaying very large datasets has its own problems).<BR>
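<BR>
The read-a-buffer, process, write, repeat loop described above could look roughly like this in Python. This is only a sketch of the pattern, not of how GRASS does it: process_row and write_batch are hypothetical callbacks, and the actual PostGIS write (for instance a batched INSERT through a database driver) is left to whatever write_batch is given.<BR>
<PRE>
```python
import csv
from itertools import islice


def iter_batches(rows, batch_size):
    """Yield lists of at most batch_size items from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def process_in_batches(src, process_row, write_batch, batch_size=10000):
    """Stream CSV rows from src, transform them, and hand each batch to a sink.

    Only one batch is held in memory at a time. write_batch is a
    hypothetical callback; in practice it would write to the database.
    Returns the total number of rows processed.
    """
    reader = csv.reader(src)
    total = 0
    for batch in iter_batches(reader, batch_size):
        write_batch([process_row(row) for row in batch])
        total += len(batch)
    return total
```
</PRE>
Memory use is then bounded by batch_size rather than by the size of the input file, which is the point of the approach.<BR>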
</FONT></FONT><FONT FACE="Verdana, Helvetica, Arial"> <BR>
</FONT><FONT COLOR="#0000FF"><FONT FACE="Lucida Sans">Cheers,<BR>
</FONT></FONT><FONT FACE="Verdana, Helvetica, Arial"> <BR>
</FONT><FONT COLOR="#0000FF"><FONT FACE="Lucida Sans">Richard<BR>
</FONT></FONT><FONT FACE="Verdana, Helvetica, Arial"><BR>
<HR ALIGN=CENTER SIZE="3" WIDTH="100%"></FONT><FONT FACE="Tahoma"><B>From:</B> owner-starserv@ucdavis.edu [<a href="mailto:owner-starserv@ucdavis.edu">mailto:owner-starserv@ucdavis.edu</a>] <B>On Behalf Of </B>Jonathan Greenberg<BR>
<B>Sent:</B> Thursday, October 05, 2006 5:31 PM<BR>
<B>To:</B> STARServ<BR>
<B>Subject:</B> Re: Working with large vector files<BR>
</FONT><FONT FACE="Verdana, Helvetica, Arial"><BR>
I’ve been working on techniques to perform tree crown recognition using high spatial resolution remote sensing imagery. The final output of these algorithms is a polygon coverage representing each tree crown in an image. As you can imagine, that’s a LOT of trees for a standard QuickBird image (on the order of 2 million polygons). I understand that I could subset the rasters and do smaller extractions, but this is, at best, a hack. There’s been a lot of work on efficient handling of massive raster images (look at RS packages like ENVI and GRASS), but massive vector handling is seriously lagging. Early estimates are that I’d need about 25 or so subsets for a single QuickBird scene to stay under the memory limits. <BR>
<BR>
Right now I’m just trying to import a CSV of xloc, yloc, crown radius (the output of my crown mapping algorithm) into SOME GIS, perform a buffer operation using that crown radius parameter (to give me the crown polygon), and work with that layer. ArcMap can actually import the points, but the buffering process completely overwhelms it (I noticed the DBF file hits 2 GB and then I get the error). I’m trying GRASS right now, but my first try also got a memory error (I’m working on a 32-bit PC and a 64-bit Mac, incidentally).<BR>
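<BR>
For reference, the buffer step itself is simple geometry. Here is a minimal Python sketch that turns one (xloc, yloc, crown radius) row into a polygon ring and a WKT string. The 16-segment circle approximation, the three-decimal rounding, and the function names are all assumptions made for illustration; this is not what ArcMap or GRASS do internally.<BR>
<PRE>
```python
import math


def crown_polygon(x, y, radius, n_segments=16):
    """Return a closed ring of (x, y) vertices approximating a circle."""
    ring = []
    for i in range(n_segments):
        angle = 2.0 * math.pi * i / n_segments
        ring.append((x + radius * math.cos(angle),
                     y + radius * math.sin(angle)))
    ring.append(ring[0])  # repeat the first vertex to close the ring
    return ring


def to_wkt(ring):
    """Format a ring as a WKT POLYGON string with three decimal places."""
    coords = ", ".join(f"{px:.3f} {py:.3f}" for px, py in ring)
    return f"POLYGON(({coords}))"
```
</PRE>
Since each crown is handled independently of the others, the polygons can be generated and written out one row at a time instead of holding all 2 million in memory.<BR>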
<BR>
Besides GRASS and ArcMap, what else could I be trying out? I should point out ENVI also has a vector size problem: displaying large vectors creates an out-of-memory error (at least on a 32-bit PC; I haven’t tried it on my Mac yet).<BR>
<BR>
--j<BR>
<BR>
On 10/5/06 2:08 PM, "Richard Pollock" &lt;pollock@pcigeomatics.com&gt; wrote:<BR>
<BR>
</FONT></SPAN><BLOCKQUOTE><SPAN STYLE='font-size:12.0px'><FONT COLOR="#0000FF"><FONT FACE="Lucida Sans">What software created these large files in the first place? <BR>
</FONT></FONT><FONT FACE="Verdana, Helvetica, Arial"><BR>
</FONT><FONT COLOR="#0000FF"><FONT FACE="Lucida Sans">What format are the files in?<BR>
</FONT></FONT><FONT FACE="Verdana, Helvetica, Arial"><BR>
</FONT><FONT COLOR="#0000FF"><FONT FACE="Lucida Sans">Cheers,<BR>
</FONT></FONT><FONT FACE="Verdana, Helvetica, Arial"><BR>
</FONT><FONT COLOR="#0000FF"><FONT FACE="Lucida Sans">Richard<BR>
</FONT></FONT><FONT FACE="Verdana, Helvetica, Arial"><BR>
<BR>
<HR ALIGN=CENTER SIZE="3" WIDTH="100%"> </FONT><FONT FACE="Tahoma"><B>From:</B> owner-starserv@ucdavis.edu [<a href="mailto:owner-starserv@ucdavis.edu">mailto:owner-starserv@ucdavis.edu</a>] <B>On Behalf Of </B>Joanna Grossman<BR>
<B>Sent:</B> Thursday, October 05, 2006 4:23 PM<BR>
<B>To:</B> starserv@ucdavis.edu<BR>
<B>Subject:</B> Re: Working with large vector files<BR>
</FONT><FONT FACE="Verdana, Helvetica, Arial"><BR>
I'm not sure, Jonathan, but it's certainly worth trying out GRASS and some of the other open source tools out there.<BR>
<a href="http://www.freegis.org/database/?cat=4">http://www.freegis.org/database/?cat=4</a><BR>
<BR>
Good luck!<BR>
<BR>
Joanna<BR>
<BR>
<B><I>Jonathan Greenberg &lt;jgreenberg@arc.nasa.gov&gt;</I></B> wrote: <BR>
<BR>
</FONT></SPAN><BLOCKQUOTE><SPAN STYLE='font-size:12.0px'><FONT FACE="Verdana, Helvetica, Arial">After banging my head against this issue for the Nth time, I'm putting out a<BR>
plaintive cry of "HELP!" -- I am working (or would like to work) with vector<BR>
files which are larger than the 2 GB limit imposed on them by ArcMap -- can<BR>
anyone recommend a GIS program that CAN deal with massive vector coverages<BR>
-- efficiently would be nice, but simply being able to open and process them<BR>
without getting corruption errors would be a great start...<BR>
<BR>
--j<BR>
</FONT></SPAN></BLOCKQUOTE></BLOCKQUOTE><SPAN STYLE='font-size:12.0px'><FONT FACE="Verdana, Helvetica, Arial"><BR>
<BR>
<BR>
------ End of Forwarded Message<BR>
</FONT></SPAN>
</BODY>
</HTML>