[mapserver-users] Best way to import 4.5TB of imagery?

Rahkonen Jukka jukka.rahkonen at mmmtike.fi
Mon Jun 10 21:55:25 PDT 2013


Hi,

If you reproject the images one by one into a common coordinate system, most of them will get black collars because the images end up rotated. To make those collars transparent with nodata you must keep the images uncompressed, since lossy compression alters the pixel values along the collar edges. However, you may find it tempting to use jpeg-in-tiff compression, because it can save about 90% of the disk space, your users will not notice any difference in quality, and the speed can still be pretty good. For that reason it would be best to reproject into a regular grid of tiles oriented South-North and West-East in the target SRID. Then you can safely compress the reprojected tiles. That is a bit more complicated, but it is still possible to do with scripts. If you want to keep your coverage up to date and need to slip in new images here and there, it will be a bit more trouble, because the target tiles are usually mosaicked from many originals. This is what I am doing right now with a couple of terabytes of Finnish aerial photos.
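
As a rough sketch of the regular-grid idea, the Python snippet below splits a bounding box in the target coordinate system into north-up tiles and builds one gdalwarp command line per tile. The tile size, pixel size, EPSG code and file names are illustrative assumptions, not values from this thread. The snippet only constructs argument lists and does not run GDAL. The tiles are written uncompressed here, on the assumption that JPEG compression is applied in a separate step afterwards.

```python
import math
from typing import Iterator, List, Tuple

Extent = Tuple[float, float, float, float]  # xmin, ymin, xmax, ymax


def grid_tiles(bbox: Extent, tile_size: float) -> Iterator[Extent]:
    """Yield north-up tile extents, snapped to multiples of tile_size,
    that together cover bbox (all values in target SRS units)."""
    xmin, ymin, xmax, ymax = bbox
    col0, col1 = math.floor(xmin / tile_size), math.ceil(xmax / tile_size)
    row0, row1 = math.floor(ymin / tile_size), math.ceil(ymax / tile_size)
    for row in range(row0, row1):
        for col in range(col0, col1):
            yield (col * tile_size, row * tile_size,
                   (col + 1) * tile_size, (row + 1) * tile_size)


def warp_command(sources: List[str], target: str, t_srs: str,
                 extent: Extent, pixel_size: float) -> List[str]:
    """Build a gdalwarp call that mosaics the source images into one
    grid-aligned, uncompressed tiled GeoTIFF."""
    cmd = ["gdalwarp",
           "-t_srs", t_srs,                          # target SRS
           "-te", *(str(v) for v in extent),         # fixed tile extent
           "-tr", str(pixel_size), str(pixel_size),  # fixed pixel size
           "-r", "bilinear",
           "-co", "TILED=YES"]
    return cmd + sources + [target]


if __name__ == "__main__":
    # Hypothetical area and inputs, for illustration only.
    for extent in grid_tiles((500.0, 500.0, 2500.0, 1500.0), 1000.0):
        name = f"tile_{int(extent[0])}_{int(extent[1])}.tif"
        print(" ".join(warp_command(["a.tif", "b.tif"], name,
                                    "EPSG:3857", extent, 1.0)))
```

Because every tile has a fixed extent and pixel size, a tile can be rebuilt later from whichever originals overlap it when new images arrive.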

With your data I would do what Frank W. already suggested. Convert your jpeg files into tiled tiffs with jpeg compression and use the creation option PHOTOMETRIC=YCBCR. Also create overviews that are compressed in the same way. Make one layer for each UTM zone by combining its images with a tileindex, and group all the zones together with a layer group. You can also start from the original images by creating tileindexes from them, and convert them into faster tiled jpeg-in-tiff images once you have time for that.
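
The steps above could be scripted roughly as follows. This is only a sketch: it builds command lines for gdal_translate, gdaladdo and gdaltindex without running them, and the output and index file names are made up for illustration. It assumes 3-band RGB input, since PHOTOMETRIC=YCBCR with JPEG compression needs three bands. It also assumes that the UTM zone can be read from the file name as in the example James gave (m_2408002_ne_17_1_20100422_201001123.jp2), and that the data is NAD83 / UTM as in his Utah sample (EPSG 26912 for zone 12).

```python
import os
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Assumed naming pattern, based on the example file name in this thread:
# m_<quad id>_<quarter>_<UTM zone>_<resolution>_<dates>.<ext>
ZONE_RE = re.compile(r"^m_\d+_[a-z]{2}_(\d{1,2})_")


def utm_zone(path: str) -> int:
    """Return the UTM zone encoded in a NAIP-style file name."""
    match = ZONE_RE.match(os.path.basename(path))
    if match is None:
        raise ValueError(f"no UTM zone found in {path!r}")
    return int(match.group(1))


def nad83_utm_epsg(zone: int) -> int:
    """EPSG code for NAD83 / UTM zone N north, e.g. 26912 for zone 12."""
    return 26900 + zone


def group_by_zone(paths: Iterable[str]) -> Dict[int, List[str]]:
    """Group image paths by the UTM zone found in their names."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for path in paths:
        groups[utm_zone(path)].append(path)
    return dict(groups)


def translate_command(src: str, dst: str) -> List[str]:
    """Convert one image to a tiled, JPEG-compressed GeoTIFF."""
    return ["gdal_translate", "-of", "GTiff",
            "-co", "TILED=YES",
            "-co", "COMPRESS=JPEG",
            "-co", "PHOTOMETRIC=YCBCR",
            src, dst]


def overview_command(tiff: str) -> List[str]:
    """Build overviews that are JPEG/YCbCr compressed like the base image."""
    return ["gdaladdo", "-r", "average",
            "--config", "COMPRESS_OVERVIEW", "JPEG",
            "--config", "PHOTOMETRIC_OVERVIEW", "YCBCR",
            "--config", "INTERLEAVE_OVERVIEW", "PIXEL",
            tiff, "2", "4", "8", "16"]


def tileindex_command(zone: int, tiffs: List[str]) -> List[str]:
    """One shapefile tileindex per UTM zone (index name is hypothetical)."""
    return ["gdaltindex", f"naip_utm{zone}.shp", *tiffs]


if __name__ == "__main__":
    files = ["OK/m_3409501_ne_14_1_20100601.jp2"]  # hypothetical input
    for zone, paths in sorted(group_by_zone(files).items()):
        print("zone", zone, "-> EPSG", nad83_utm_epsg(zone))
        print(" ".join(tileindex_command(zone, paths)))
```

With a few hundred thousand files the lists will be too long for a single command line, so the gdaltindex calls would need to be issued in batches. gdaltindex appends to an index file that already exists.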

-Jukka Rahkonen-


________________________________________
 Evans, James wrote:

> For some reason I didn't see the reply from Robert Sanson.  Anyway, we have no requirement to stay with NAD83.  I have Global Mapper, and I think it will do a bulk reprojection, from my local hard drive to a network drive.  Maybe I will take a look at that tomorrow.  Is there a better tool for that?  Some GDAL utility?  Anyway, thanks for all the great suggestions.
James


-----Original Message-----
From: mapserver-users-bounces at lists.osgeo.org on behalf of Stephen Woodbridge
Sent: Mon 6/10/2013 6:36 PM
To: mapserver-users at lists.osgeo.org
Subject: Re: [mapserver-users] Best way to import 4.5TB of imagery?

On 6/10/2013 7:07 PM, Robert Sanson wrote:
> It all sounds very complicated. I would advise that you choose a single
> projection that will work across your entire area and then re-project
> your imagery first, and then build your image datastore around that.

I would agree with this: if there are no other constraints on the
problem, this is often the best way to go. But that will not work if his
users HAVE to have the data in UTM projection for some reason.

-Steve

>>>> "Evans, James R Civ USAF ACC 84 RADES/SCZE"
>>>> <James.Evans at hill.af.mil> 11/06/2013 10:48 a.m. >>>
> So, I'm guessing there's no easy way to automate this?  Even looking
> at the states, some of the states are in two zones, and Texas is
> across 3 zones. At least the naming convention of the files indicates
> the UTM zone.  For instance:
> m_2408002_ne_17_1_20100422_201001123.jp2, is in zone 17.  As far as I
> can tell, all the files in a particular directory are all in the
> same UTM zone.  I could create a layer for each UTM zone across
> CONUS, but that's not going to be particularly useful to my users.
> I'm thinking of making a layer for each state.  For the states that
> cross zones, there will probably be two layers.  For Texas, there
> would be Texas_east, Texas_middle, and Texas_west.  I will probably
> limit visibility until zoomed in sufficiently to see the whole state
> on the screen anyway, since the continental view of this data is
> pretty crappy anyway.  So now it seems like it will be a lot of grunt
> work just copying these directories up to the server, and going
> through and creating a shape file index for each state.  For states
> in more than one UTM, there would be more than one shape.  Then I'll
> have to add a layer to my mapfile for each shapefile, using the
> correct projection. Is there an easier way?  I'm starting with
> Oklahoma, which is also in three UTM zones.  I'll get that working
> before moving on.  Any suggestions on making this pretty would be
> welcomed.  :-)
>
>
>
>
> -----Original Message-----
> From: Stephen Woodbridge [mailto:woodbri at swoodbridge.com]
> Sent: Monday, June 10, 2013 12:34 PM
> To: Evans, James R Civ USAF ACC 84 RADES/SCZE
> Cc: mapserver-users at lists.osgeo.org
> Subject: Re: [mapserver-users] Best way to import 4.5TB of imagery?
>
> On 6/10/2013 12:57 PM, Evans, James R Civ USAF ACC 84 RADES/SCZE wrote:
>> Hi Stephen, Thanks, for the reply.  I previously got 4 sample
>> images from the USDA, and was able to get them to work just fine.
>> There was no processing required.
>> The sample images I got were all from Utah, and they are NAD83, UTM
>> zone 12.
>> I added the 4 sample images to a shape file using gdaltindex.   I
>> used EPSG 26912 and mapserver served them up very quickly for such
>> large files.
>>
>> Now I have this entire data set, and it stretches from UTM zone 10,
>> to UTM zone 19.  The data is divided into directories by two letter
>> state abbreviations, and under that into subdirectories.  I'm just
>> wondering how to add this to my mapfile.  Do I need a different
>> entry for each UTM zone?
>> How is it possible to get a single layer entry that includes
>> multiple projections?  This is looking like a huge job and I just
>> want to know the best strategy for getting this done.
>
> So now you have a problem. Your data is in UTM, spread over 10
> different projections. What do you plan to do when half of your image
> is in zone 10 and half is in zone 11, or if you zoom out and your
> image has 3-4 zones displayed?
>
> All data in an image must be rendered in the same projection. While I
> don't doubt that your test with 4 images worked fine, did you test
> this at multiple zoom levels? At some point you will probably want to
> create a super overlay image so you do not have to open multiple
> files just to pull a tiny overlay out of each one.
>
> Your use cases will determine how you want to deal with the data.
> For example does it HAVE to be in a UTM projection, or can you work
> with a Spherical Mercator or geographic projection? The end solution
> will be much easier if you can work with one common projection over
> your whole data set. Otherwise, you will have to deal with the
> transitions from one zone to the next or maybe set up 10 separate
> servers that only serve one zone.
>
> Having pushed large amounts (4-25TB) of imagery data more than once,
> I find it important to make these decisions up front and to prototype
> something like a 4-10 degree square across a UTM boundary, making
> sure that the results are going to be what you expect before you
> process all the data.
>
> -Steve
>
>> Thanks, James
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: mapserver-users-bounces at lists.osgeo.org
>> [mailto:mapserver-users-bounces at lists.osgeo.org] On Behalf Of Stephen Woodbridge
>> Sent: Friday, June 07, 2013 8:41 AM
>> To: mapserver-users at lists.osgeo.org
>> Subject: Re: [mapserver-users] Best way to import 4.5TB of imagery?
>>
>> On 6/7/2013 10:31 AM, James_in_Utah wrote:
>>> Hi, We just got 3 hard drives, loaded with 4.5TB of NAIP imagery
>>> for all of CONUS.  I think there's a total of about 400,000 jpgs.
>>> The data is in directories, by state.  Under each state, there
>>> are subfolders, probably referenced by longitude.  Other than
>>> going through folder by folder, adding each image to a shape file
>>> using gdaltindex, what's the best strategy for loading a couple
>>> of hundred thousand files up to our server and making the imagery
>>> available via our mapserver?  Should I maintain the current
>>> directory structure when I copy the imagery to the server, or
>>> just dump all of it into a single directory?  Do I want to stay
>>> with 1 shape file, or break it up by state?  We eventually want a
>>> contiguous layer for all of CONUS to be served up to our users.
>>
>> James,
>>
>> Since imagery data is served via gdal, you might want to also ask
>> this question on the gdal list.
>>
>> There are issues with jpg related to the fact that if you only want
>> a small part of the image you still have to uncompress the whole
>> image. So part of the answer might be that you need to pre-process
>> all the imagery into something like a jpg-compressed tiled geotiff or
>> something else.
>>
>> You also need to consider what projection your imagery is in and
>> what projection you want to display it in. Because if you need to
>> preprocess the data, that would also be a good time to reproject
>> it.
>>
>> Anyway the gdal list can probably ask additional questions to help
>> sort all that out.
>>
>> -Steve W
>>
>> _______________________________________________ mapserver-users
>> mailing list mapserver-users at lists.osgeo.org
>> http://lists.osgeo.org/mailman/listinfo/mapserver-users
>>
>
>
>
>
