<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD>

<META http-equiv=Content-Type content="text/html; charset=us-ascii">

<META content="MSHTML 6.00.2900.3243" name=GENERATOR></HEAD>

<BODY>

<DIV dir=ltr align=left><SPAN class=176014515-22022008><FONT face=Arial 

color=#0000ff size=2>Hi,</FONT></SPAN></DIV>

<DIV dir=ltr align=left><SPAN class=176014515-22022008><FONT face=Arial 

color=#0000ff size=2></FONT></SPAN> </DIV>

<DIV dir=ltr align=left><SPAN class=176014515-22022008><FONT face=Arial 

color=#0000ff size=2>I'm processing a dataset for the Cairngorms National 
Park in the UK. The source is NextMap data on a 5 metre square-gridded 
raster with 30000 columns and 24000 rows. Amongst other things I calculate 
roughness for kernels taking in all values within 64 cell distances. The 

roughness output is calculated at the same resolution as the input (along with 

around 60 other metrics).</FONT></SPAN></DIV>
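<DIV><FONT face=Arial color=#0000ff size=2>As a minimal sketch of the kind of kernel calculation described above: the post does not define roughness, so this example assumes it is the population standard deviation of the values within the circular kernel. The function names and the edge handling (cells outside the grid are skipped) are illustrative assumptions, not taken from the Grids library.</FONT></DIV>

```python
import math


def kernel_offsets(radius):
    """Return (drow, dcol) offsets of all cells within `radius` cell distances."""
    r = int(radius)
    return [(dr, dc)
            for dr in range(-r, r + 1)
            for dc in range(-r, r + 1)
            if dr * dr + dc * dc <= radius * radius]


def roughness(grid, row, col, radius):
    """Standard deviation of the values in the circular kernel centred on (row, col).

    ASSUMPTION: 'roughness' is taken to be the population standard deviation;
    the original post does not define it. Cells outside the grid are skipped.
    """
    nrows, ncols = len(grid), len(grid[0])
    values = [grid[row + dr][col + dc]
              for dr, dc in kernel_offsets(radius)
              if 0 <= row + dr < nrows and 0 <= col + dc < ncols]
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
```

<DIV><FONT face=Arial color=#0000ff size=2>With a radius of 64 the kernel holds roughly &pi; &times; 64&sup2; &asymp; 12,900 cells, so evaluating it directly for every cell of a large grid is expensive. That is one reason to keep the cells a kernel needs in memory.</FONT></DIV>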

<DIV dir=ltr align=left><SPAN class=176014515-22022008><FONT face=Arial 

color=#0000ff size=2></FONT></SPAN> </DIV>

<DIV dir=ltr align=left><SPAN class=176014515-22022008><FONT face=Arial 

color=#0000ff size=2>This is a small dataset in comparison with some data for 

Mars that I am processing in a similar way. I am also grappling with the SRTM90 
data from 60 South to 60 North, which has several hundred thousand rows and 
columns. On the hardware side, I need terabytes of disc space, but 
only a gigabyte or so of faster-access memory to do this work. The point is, 
as many of you know, there are big raster datasets out there and now is as good 
a time as any to process them.</FONT></SPAN></DIV>
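<DIV><FONT face=Arial color=#0000ff size=2>A back-of-envelope calculation for the Cairngorms grid shows why disc rather than memory holds the data. The 8 bytes per cell is an assumption (one double-precision value per cell); the post does not state the storage type, and the output count is the approximate figure quoted above.</FONT></DIV>

```python
COLS, ROWS = 30_000, 24_000      # grid size quoted in the post
BYTES_PER_CELL = 8               # ASSUMPTION: one 8-byte double per cell
N_OUTPUTS = 60                   # "around 60" output metrics (approximate)

cells = COLS * ROWS
grid_bytes = cells * BYTES_PER_CELL
total_bytes = grid_bytes * (1 + N_OUTPUTS)   # one input grid plus the outputs

print(f"{cells:,} cells per grid")
print(f"{grid_bytes / 1e9:.2f} GB per grid")
print(f"{total_bytes / 1e9:.1f} GB for input plus outputs")
```

<DIV><FONT face=Arial color=#0000ff size=2>Under these assumptions a single grid is already several times larger than a gigabyte of memory, and the full set of outputs runs to hundreds of gigabytes.</FONT></DIV>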

<DIV><FONT face=Arial color=#0000ff size=2></FONT> </DIV>

<DIV><SPAN class=176014515-22022008><FONT face=Arial color=#0000ff size=2>I 

split the data into chunks and store them as files on disc. I have problems when 
the number of files gets too large and when the size of each chunk gets too 
large. As a compromise, at the moment I tend to use chunks with 
about 500 rows and 500 columns (my program could use any size so long as 
all chunks have the same dimensions). The problem of too many files is, I think, 
an operating system problem. The problem of overly large chunks is more down to 
the implementation of the raster processing and its memory handling. I try to 

hold enough data in memory in my program so that I get answers in a reasonable 

time frame. (BTW, my programs are FOSS and I'll release a new version of Grids 

soon which you can pick up via <A 

href="http://www.geog.leeds.ac.uk/people/a.turner/src/">http://www.geog.leeds.ac.uk/people/a.turner/src/</A>).</FONT></SPAN></DIV>
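<DIV><FONT face=Arial color=#0000ff size=2>The chunking scheme can be sketched as simple index arithmetic. This is an illustration only: the function names are hypothetical and it is not how Grids itself addresses chunks. It maps a cell to its chunk and lists the chunks that the bounding box of a kernel overlaps.</FONT></DIV>

```python
CHUNK_ROWS = CHUNK_COLS = 500    # chunk size mentioned in the post


def chunk_of(row, col):
    """Return ((chunk_row, chunk_col), (row_in_chunk, col_in_chunk)) for a cell."""
    return ((row // CHUNK_ROWS, col // CHUNK_COLS),
            (row % CHUNK_ROWS, col % CHUNK_COLS))


def chunks_touched(row, col, radius, nrows, ncols):
    """Chunk indices overlapped by the bounding box of a kernel centred on (row, col)."""
    top, bottom = max(row - radius, 0), min(row + radius, nrows - 1)
    left, right = max(col - radius, 0), min(col + radius, ncols - 1)
    return [(cr, cc)
            for cr in range(top // CHUNK_ROWS, bottom // CHUNK_ROWS + 1)
            for cc in range(left // CHUNK_COLS, right // CHUNK_COLS + 1)]
```

<DIV><FONT face=Arial color=#0000ff size=2>For a 24000 by 30000 grid this gives 48 by 60 = 2,880 chunk files per grid. A kernel of radius 64 spans 129 cells, which is less than one chunk side, so it touches at most four chunks at a time.</FONT></DIV>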

<DIV><SPAN class=176014515-22022008><FONT face=Arial color=#0000ff 

size=2></FONT></SPAN> </DIV>

<DIV><SPAN class=176014515-22022008><FONT face=Arial color=#0000ff size=2>I do 

the Geomorphometrics processing both on my PC and on some High End Computers. 

</FONT></SPAN><SPAN class=176014515-22022008><FONT face=Arial color=#0000ff 

size=2>On HECs I am considering using a federated datastore like SRB and 
parallelising, as the task is "embarrassingly parallel" and so reasonably easy to 
split up. I am also looking into more Grid/Web 
Services (SOA) ways of doing this.</FONT></SPAN></DIV>
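<DIV><FONT face=Arial color=#0000ff size=2>A minimal sketch of what "embarrassingly parallel" means here: each chunk can be processed without waiting on the result of any other chunk, so the work maps directly onto a pool of workers. The <TT>process_chunk</TT> body is a hypothetical placeholder, and a thread pool is used only to keep the example self-contained; real CPU-bound work would more likely use separate processes or cluster jobs.</FONT></DIV>

```python
from concurrent.futures import ThreadPoolExecutor


def process_chunk(chunk_id):
    """Placeholder for the per-chunk work (hypothetical); returns (chunk_id, result)."""
    chunk_row, chunk_col = chunk_id
    return chunk_id, chunk_row * 1000 + chunk_col   # stand-in for real metrics


def process_all(chunk_ids, workers=4):
    """Process chunks independently; no chunk depends on another chunk's output."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(process_chunk, chunk_ids))


chunk_ids = [(r, c) for r in range(3) for c in range(4)]
results = process_all(chunk_ids)
```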

<DIV><SPAN class=176014515-22022008><FONT face=Arial color=#0000ff 

size=2></FONT></SPAN> </DIV>

<DIV><SPAN class=176014515-22022008><FONT face=Arial color=#0000ff size=2>In the 

past I have used the database blob approach. I don't know which is best, but 
I'm working with files again now. I have found that for some things 
it is best to use RandomAccessFiles and manipulate the information directly on 
disc rather than, say, using a swap approach. I think what is best depends on what 
you are doing and how many raster datasets you use simultaneously in the 
processing. Nearly all of my processing these days involves computing for 
kernels at the same resolution as the inputs, usually with just one input 
(sometimes several) and multiple outputs (about 50 or 
so).</FONT></SPAN></DIV>
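<DIV><FONT face=Arial color=#0000ff size=2>The random-access idea can be sketched as seeking to a cell's byte offset in a flat binary file and reading or overwriting it in place, without loading the rest of the chunk. The file layout (row-major, big-endian 8-byte doubles) and the helper names are assumptions made for illustration.</FONT></DIV>

```python
import os
import struct
import tempfile

CELL = struct.Struct(">d")   # ASSUMPTION: big-endian 8-byte doubles, row-major order


def write_grid(path, grid):
    """Write a small grid to disc as a flat binary file."""
    with open(path, "wb") as f:
        for row in grid:
            for value in row:
                f.write(CELL.pack(value))


def read_cell(f, row, col, ncols):
    """Seek straight to one cell and read it."""
    f.seek((row * ncols + col) * CELL.size)
    return CELL.unpack(f.read(CELL.size))[0]


def write_cell(f, row, col, ncols, value):
    """Seek straight to one cell and overwrite it in place."""
    f.seek((row * ncols + col) * CELL.size)
    f.write(CELL.pack(value))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "chunk.bin")
    write_grid(path, [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    with open(path, "r+b") as f:
        before = read_cell(f, 1, 2, ncols=3)
        write_cell(f, 1, 2, ncols=3, value=60.0)
        after = read_cell(f, 1, 2, ncols=3)
```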

<DIV><SPAN class=176014515-22022008><FONT face=Arial color=#0000ff 

size=2></FONT></SPAN> </DIV>

<DIV><SPAN class=176014515-22022008><FONT face=Arial color=#0000ff size=2>Best 

wishes,</FONT></SPAN></DIV>

<DIV><SPAN class=176014515-22022008><FONT face=Arial color=#0000ff 

size=2></FONT></SPAN> </DIV>

<DIV><SPAN class=176014515-22022008></SPAN><FONT size=2>Andy<BR><A 

href="http://www.geog.leeds.ac.uk/people/a.turner/">http://www.geog.leeds.ac.uk/people/a.turner/</A><BR> </FONT> 

</DIV>

<DIV> </DIV><BR>

<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>

<HR tabIndex=-1>

<FONT face=Tahoma size=2><B>From:</B> discuss-bounces@lists.osgeo.org 

[mailto:discuss-bounces@lists.osgeo.org] <B>On Behalf Of 

</B>Bruce.Bannerman@dpi.vic.gov.au<BR><B>Sent:</B> 22 February 2008 

04:53<BR><B>To:</B> OSGeo Discussions<BR><B>Subject:</B> Re: [OSGeo-Discuss] 

Image Management in an RDBMS...(was OS Spatialenvironment 

'sizing')<BR></FONT><BR></DIV>

<DIV></DIV><BR><FONT face=sans-serif size=2>IMO</FONT> <BR><BR><FONT 

size=2><TT>> <BR>> 12 million records is teensy. Stuff it into PostGIS. 

It's the billion- <BR>> point LIDAR sets that leave me queasy, but I can't 

begin to think of a  <BR>> reasonable architecture for that without 

learning more about how the  <BR>> points are actually USED, which I 

really am not clear on at the moment.<BR>> <BR></TT></FONT><BR><FONT 

size=2><TT>Paul,</TT></FONT> <BR><BR><FONT size=2><TT>Agreed. 

</TT></FONT><BR><BR><FONT size=2><TT>Generation of TINs or surfaces of roughness 

over that number of points will challenge any data management 

solution.</TT></FONT> <BR><BR><FONT size=2><TT>However, the time is coming / has 

come when people will want to do it.</TT></FONT> <BR><BR><FONT size=2><TT>It is 

perhaps a good candidate for Grid architectures and high performance 

computing.</TT></FONT> <BR><BR><FONT size=2><TT>Bruce</TT></FONT> 


<P> </P></BODY></HTML>