[Live-demo] OSGeo-Live 7.0 - big data version?

Alex Mandel tech_dev at wildintellect.com
Fri Jul 5 14:10:29 PDT 2013


Similar to Hamish's response. We're looking for data that is a subset of 
what would normally be big data, and comes in the same formats. 
Something that people could test big data tools with. That way people 
can grasp the concepts with something manageable and know that it can be 
scaled when given more hardware. I don't think it's a good idea to hand 
people some 4 GB of data and have them try to interact with it off a live 
boot in a workshop. Better to get them familiar with the data structure and 
tools, and then say: if you want to do the whole world with MODIS data, go 
out and install tool X on some cluster.

If you want to put up something big (file size), my suggestion is an extra 
example data set available for download separately. Bundling it into 
the disk/USB just means really long downloads for many people.

Note this is exactly why we extract a subset of OpenStreetMap. Having 
the whole OSM on disk isn't practical or useful, but a subset that 
demonstrates the data structure and works with all the usual OSM tools 
shows how all the steps would work if you were to download the whole set. 
Rather than days, people in a workshop see results in 5-10 minutes.
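
To make the subsetting idea concrete, here is a rough sketch in Python of clipping point records to a bounding box. This is not how the OSGeo-Live OSM extract is actually built; the Node record, the clip_to_bbox function and the coordinates are all made up for illustration. The point is only that the same filter runs unchanged on a small sample or on a full data set streamed from disk.

```python
from typing import Iterable, Iterator, NamedTuple


class Node(NamedTuple):
    """Hypothetical point record, loosely modelled on an OSM node."""
    node_id: int
    lon: float
    lat: float


def clip_to_bbox(nodes: Iterable[Node], west: float, south: float,
                 east: float, north: float) -> Iterator[Node]:
    """Yield only the nodes inside the bounding box (edges inclusive).

    Works as a generator, so the input can be a small list in a workshop
    or a lazy reader over a much larger file.
    """
    for node in nodes:
        if west <= node.lon <= east and south <= node.lat <= north:
            yield node


# Illustrative sample data (assumed values, not a real extract).
sample = [
    Node(1, -121.74, 38.54),
    Node(2, 2.35, 48.86),
    Node(3, -121.50, 38.58),
]

# Keep only the nodes that fall inside a small, made-up study area.
subset = list(clip_to_bbox(sample, west=-122.0, south=38.0,
                           east=-121.0, north=39.0))
```

Real OSM extracts would normally be cut with the existing OSM tools rather than hand-written code; the sketch just shows the shape of the operation.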

Thanks,
Alex

On 07/04/2013 02:34 PM, James Klassen wrote:
> I think a distribution with more data (or a link where you could download
> it and load it into the VM) where it is set up to work with the included
> applications would be useful.  I am not sure it makes sense to limit this
> to an 8 GB USB/DVD.  I would guess that working in a VM off of a real drive
> rather than a slow thumb drive or slower DVD would make more sense when
> working with larger data sets.
>
> I am not sure I would call 4GB of extra data "Big Data".  To me, "Big Data"
> implies something bigger than fits easily in RAM on one node... For
> imagery, something closer to 100 TB+ on the low end.
> On Jul 4, 2013 4:00 PM, "Cameron Shorter" <cameron.shorter at gmail.com> wrote:
>
>> On IRC right now we are discussing the possibility of creating a "Big
>> Data" version of OSGeo-Live 7.0.
>>
>> This will likely be the standard OSGeo-Live (which will still need to work
>> stand-alone), plus an extra data directory which could include big data,
>> such as netCDF datasets. This could be distributed as a VM or on an 8 GB
>> USB.
>>
>> I'm interested to hear thoughts on whether this will work for those
>> interested in showing big data on OSGeo-Live.
>>
>> --
>> Cameron Shorter
>> Geospatial Solutions Manager
>> Tel: +61 (0)2 8570 5050
>> Mob: +61 (0)419 142 254


