[Live-demo] "big" data sets on OSGeo Live

Stephan Meißl stephan at meissl.name
Mon Jan 28 05:38:37 PST 2013


As promised, I created a ticket for the inclusion of Big Data in 7.0 [1].
Please feel free to add your thoughts.


[1] http://trac.osgeo.org/osgeo/ticket/1065

On 01/21/2013 06:46 PM, Stephan Meißl wrote:
> Angelos, All,
> thanks for moving this forward, which I should have done after the IRC
> meeting. Anyway, I'll try to briefly summarize the meeting results:
> We agreed to add an overview for "Big Data" similar to the one for
> Natural Earth [1] describing some hyperspectral data which has yet to be
> decided and added. Thanks Angelos for the first ideas. However, the
> deadline for inclusion in 6.5 is the next IRC meeting on Thursday 24th.
> So, are there any volunteers? Otherwise I'll create a ticket for 7.0.
> cu
> Stephan
> [1] http://live.osgeo.org/en/overview/naturalearth_overview.html
> [2] http://epi.whoi.edu/osgeolive/ossim_data/
> On 01/18/2013 12:33 AM, Angelos Tzotsos wrote:
>> Hi Peter, Stephan,
>> Since we are getting close to RC (just 2 weeks) perhaps we should think
>> of a quick solution for 6.5 and re-think the data disk space balance for
>> version 7.0.
>> Suggestions:
>> 1. Let's get a hyperspectral dataset of ~60MB on this version (are there
>> any free AVIRIS, Hyperion, CHRIS/PROBA, MODIS datasets for North
>> Carolina where we have a full dataset?)
>> 2. Let's ask all remote sensing related applications to use it
>> 3. Let's open a Trac ticket for OSGeoLive 7.0 about evaluating all common
>> datasets.
>> What do you think?
>> Best,
>> Angelos
>> On 01/18/2013 01:10 AM, Peter Baumann wrote:
>>> Stephan,
>>> Alan is handling the OSGeo finalization; but let me point out that
>>> there are projects that have hundreds of MB, also of vector data
>>> (according to the overview spreadsheet), so maybe this has a potential
>>> for an effective reduction.
>>> my 2 cents,
>>> Peter
>>> On 01/17/2013 08:21 PM, Stephan Meißl wrote:
>>>> Angelos, Peter,
>>>> great discussion and I would be more than happy to use a common dataset.
>>>> At the moment the EOxServer demonstration is using 5MB (confirmed on
>>>> 6.5beta1). I guess using a common and probably bigger dataset would make
>>>> the demonstration more convincing.
>>>> An alternative to MODIS data would be Landsat. Btw. which format,
>>>> projection, etc. should we use for the common data?
>>>> cu
>>>> Stephan
>>>> On 01/10/2013 10:47 PM, Angelos Tzotsos wrote:
>>>>> Hi Peter and thanks for bringing this topic here.
>>>>> It would be great to have more data space and perhaps we can save some
>>>>> space if we track which projects use non-common datasets.
>>>>> From the IRC discussion there was a thought to include a hyperspectral
>>>>> dataset that rasdaman, OTB, OSSIM, EOxServer and GRASS could use in
>>>>> common. Any thoughts on that? Should we play safe with a free MODIS
>>>>> dataset or is there another suggestion? Would it be ok if we limit this
>>>>> to ~50MB size or would we need more? I know this is not enough for a
>>>>> good remote sensing dataset, but we are short on disk space currently.
>>>>> I liked the idea of using some web services for demo purposes in
>>>>> addition to data included on the disk, as long as there is a notice to
>>>>> the user that those are available only with an internet connection.
>>>>> About projects mentioned:
>>>>> MapGuide is actually not included on the disk due to its large
>>>>> footprint (caused by the Mono dependency).
>>>>> Marble is large because its dependencies for Qt bring in large parts of
>>>>> the KDE libraries.
>>>>> We definitely need to hear more from the projects on this!
>>>>> Best,
>>>>> Angelos
>>>>>> Hi listers,
>>>>>> as space is running short on the ISO we need to reconsider sizings. In
>>>>>> today's OSGeo Live chat I was tasked with initiating this discussion.
>>>>>> Current disk footprint of each application is available from
>>>>>> https://docs.google.com/spreadsheet/ccc?key=0Al9zh8DjmU_RdGIzd0VLLTBpQVJuNVlHMlBWSDhKLXc#gid=13
>>>>>> Top riders currently:
>>>>>> - mapguide: 550 MB
>>>>>> - marble: 300 MB (includes disk caching!)
>>>>>> - grass: >250 MB
>>>>>> One way of gaining space is to enlarge the ISO (i.e., go to 8 GB), which is
>>>>>> uncertain due to some unresolved questions; another is to deflate
>>>>>> unused datasets; a third is to share data among applications.
>>>>>> Can I ask all those with 3-digit disk hunger to chime in on this
>>>>>> discussion?
>>>>>> thanks,
>>>>>> Peter
