[Live-demo] "big" data sets on OSGeo Live

Stephan Meißl stephan at meissl.name
Mon Jan 21 09:46:21 PST 2013


Angelos, All,

thanks for moving this forward, which I should have done after the IRC
meeting. Anyway, I'll try to briefly summarize the meeting results:

We agreed to add an overview for "Big Data" similar to the one for
Natural Earth [1] describing some hyperspectral data which has yet to be
decided and added. Thanks Angelos for the first ideas. However, the
deadline for inclusion in 6.5 is the next IRC meeting on Thursday 24th.
So, are there any volunteers? Otherwise I'll create a ticket for 7.0.

cu
Stephan

[1] http://live.osgeo.org/en/overview/naturalearth_overview.html
[2] http://epi.whoi.edu/osgeolive/ossim_data/


On 01/18/2013 12:33 AM, Angelos Tzotsos wrote:
> Hi Peter, Stephan,
> 
> Since we are getting close to RC (just 2 weeks) perhaps we should think
> of a quick solution for 6.5 and re-think the data disk space balance for
> version 7.0.
> 
> Suggestions:
> 1. Let's get a hyperspectral dataset of ~60MB on this version (are there
> any free AVIRIS, Hyperion, CHRIS/PROBA, MODIS datasets for North
> Carolina where we have a full dataset?)
> 2. Let's ask all remote sensing related applications to use it
> 3. Let's open a Trac Ticket for OSGeoLive 7.0 about evaluating all common
> datasets.
> 
> What do you think?
> 
> Best,
> Angelos
> 
> On 01/18/2013 01:10 AM, Peter Baumann wrote:
>> Stephan,
>>
>> Alan is handling the OSGeo finalization; but let me point out that
>> there are projects that have hundreds of MB, also of vector data
>> (according to the overview spreadsheet), so maybe this has a potential
>> for an effective reduction.
>>
>> my 2 cents,
>> Peter
>>
>>
>> On 01/17/2013 08:21 PM, Stephan Meißl wrote:
>>> Angelos, Peter,
>>>
>>> great discussion and I would be more than happy to use a common dataset.
>>> At the moment the EOxServer demonstration is using 5MB (confirmed on
>>> 6.5beta1). I guess using a common and probably bigger dataset would make
>>> the demonstration more convincing.
>>>
>>> An alternative to MODIS data would be Landsat. Btw, which format,
>>> projection, etc. should we use for the common data?
>>>
>>> cu
>>> Stephan
>>>
>>>
>>> On 01/10/2013 10:47 PM, Angelos Tzotsos wrote:
>>>> Hi Peter and thanks for bringing this topic here.
>>>>
>>>> It would be great to have more data space and perhaps we can save some
>>>> space if we track which projects use non-common datasets.
>>>> From the IRC discussion there was a thought to include a hyperspectral
>>>> dataset for rasdaman, OTB, OSSIM, EOxServer and GRASS to use in
>>>> common. Any thoughts on that? Should we play it safe with a free MODIS
>>>> dataset or is there another suggestion? Would it be ok if we limit this
>>>> to ~50MB size or would we need more? I know this is not enough for a
>>>> good remote sensing dataset, but we are short on disk space currently.
>>>>
>>>> I liked the idea of using some web services for demo purposes in
>>>> addition to data included on the disk, as long as there is a notice to
>>>> the user that those are available only with an internet connection.
>>>>
>>>> About projects mentioned:
>>>> MapGuide is actually not included on the disk due to its large
>>>> footprint (caused by the Mono dependency).
>>>> Marble is large because its dependencies for Qt bring in large parts of
>>>> the KDE libraries.
>>>>
>>>> We definitely need to hear more from the projects on this!
>>>>
>>>> Best,
>>>> Angelos
>>>>
>>>>> Hi listers,
>>>>>
>>>>> as space is running short on the ISO we need to reconsider sizings. In
>>>>> today's OSGeo Live chat I was tasked to initiate this discussion.
>>>>> Current disk footprint of each application is available from
>>>>> https://docs.google.com/spreadsheet/ccc?key=0Al9zh8DjmU_RdGIzd0VLLTBpQVJuNVlHMlBWSDhKLXc#gid=13
>>>>>
>>>>>
>>>>>
>>>>> Top riders currently:
>>>>> - mapguide: 550 MB
>>>>> - marble: 300 MB (includes disk caching!)
>>>>> - grass: >250 MB
>>>>>
>>>>> One way of gaining space is to enlarge the disk (i.e., go to 8 GB), which is
>>>>> unclear due to some unresolved questions; another is to deflate
>>>>> unused datasets; a third is to share data among applications.
>>>>>
>>>>> Can I ask all those with 3-digit disk hunger to chime in on this
>>>>> discussion?
>>>>>
>>>>> thanks,
>>>>> Peter




