[OSGeo-Discuss] OS Spatial environment 'sizing'

Lucena, Ivan ivan.lucena at pmldnet.com
Tue Feb 19 13:59:28 PST 2008


Hi Randy, Bruce,

That is a nice piece of advice, Randy. I am sorry to intrude on the 
conversation, but I would like to ask how that "heavy raster" 
manipulation would be treated by PostgreSQL/PostGIS: managed or unmanaged?

Best regards,

Ivan

Randy George wrote:
> Hi Bruce,
> 
>  
> 
> On the “scale relatively quickly” front, you should look 
> at Amazon’s EC2/S3 services. I’ve recently worked with them and find them an 
> attractive platform for scaling: http://www.cadmaps.com/gisblog
> 
>  
> 
> The stack I like is Ubuntu+Java+ Postgresql/PostGIS + Apache2 mod_jk 
> Tomcat + Geoserver + custom SVG or XAML clients run out of Tomcat
> 
>  
> 
> If you use the larger instances the cost is higher, but 
> it sounds like you plan on some heavy raster services (WMS, WCS) and lots 
> of memory will help.
> 
> Small EC2 instances ($0.10/hr) provide:
> 
> 1.7 GB of memory, 1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute 
> Unit), 160 GB of instance storage, 32-bit platform
> 
>  
> 
> Large EC2 instances ($0.40/hr) provide:
> 
> 7.5 GB of memory, 4 EC2 Compute Units (2 virtual cores with 2 EC2 
> Compute Units each), 850 GB of instance storage, 64-bit platform
> 
>  
> 
> Extra large EC2 instances ($0.80/hr) provide:
> 
> 15 GB of memory, 8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute 
> Units each), 1690 GB of instance storage, 64-bit platform
> 
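The three instance types listed above can be put into a small lookup to pick the cheapest one that meets a memory requirement. This is only a sketch: the figures are the ones quoted in this thread, and the names `InstanceType` and `cheapest_with_memory` are made up for illustration, not part of any AWS tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    usd_per_hour: float
    memory_gb: float
    compute_units: int
    storage_gb: int


# Figures copied from the list quoted in this thread (2008 pricing).
INSTANCE_TYPES = [
    InstanceType("small", 0.10, 1.7, 1, 160),
    InstanceType("large", 0.40, 7.5, 4, 850),
    InstanceType("extra large", 0.80, 15.0, 8, 1690),
]


def cheapest_with_memory(min_memory_gb):
    """Return the cheapest listed type with at least min_memory_gb, or None."""
    candidates = [t for t in INSTANCE_TYPES if t.memory_gb >= min_memory_gb]
    if not candidates:
        return None
    return min(candidates, key=lambda t: t.usd_per_hour)


if __name__ == "__main__":
    # Hypothetical requirement: a raster service that wants about 4 GB of memory.
    print(cheapest_with_memory(4).name)
```

If nothing in the list is big enough, the function returns None rather than guessing, which is the cue to split the load over several instances instead.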
>  
> 
> Note that the instances do not need to be permanent. Some people 
> (WeoGeo) have been using a couple of failover small instances and then 
> starting new large instances for specific requirements. The idea is to 
> start and stop instances as required rather than having ongoing 
> infrastructure costs. It only takes a minute or so to start an EC2 
> instance. If you are running a corporate service there may be parts of 
> the day with very little use, so you just schedule your heavy-duty 
> instances for peak times. If you can connect your raster to S3 buckets 
> rather than instance storage, you have built-in replicated backup.
> 
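To get a rough feel for what scheduling heavy-duty instances only at peak times might save, the hourly rates quoted above can be multiplied out. This is a back-of-the-envelope sketch: the 30-day month and the 10 peak hours per day are assumptions, and storage and bandwidth charges are ignored.

```python
SMALL_USD_PER_HOUR = 0.10  # rate quoted in this thread
LARGE_USD_PER_HOUR = 0.40  # rate quoted in this thread


def monthly_cost(usd_per_hour, hours_per_day, days=30):
    """Instance-hour cost for one instance over an assumed 30-day month."""
    return round(usd_per_hour * hours_per_day * days, 2)


# Option A: one large instance running around the clock.
always_on_large = monthly_cost(LARGE_USD_PER_HOUR, 24)

# Option B: one small instance always on, plus one large instance
# started only for an assumed 10 peak hours per day.
scheduled = monthly_cost(SMALL_USD_PER_HOUR, 24) + monthly_cost(LARGE_USD_PER_HOUR, 10)

if __name__ == "__main__":
    print("always-on large: $%.2f" % always_on_large)
    print("small + peak-time large: $%.2f" % scheduled)
```

Under these assumptions the always-on large instance comes to $288 a month and the scheduled mix to $192, so the saving depends entirely on how short the peak window really is.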
>  
> 
> I know that Java JAI can easily eat up memory and is core to Geoserver 
> WMS/WCS, so you probably want to look at a large memory footprint for any 
> platform with lots of raster service. I’m partial to Geoserver because 
> of its Java foundation.  I think I would try to keep the Apache2 mod_jk 
> Tomcat Geoserver on a separate server instance from PostGIS. This might 
> avoid problems for instance startup since your database would need to be 
> loaded separately. The instance AMI resides in a 10 GB partition; the 
> balance of data will probably reside on a /mnt partition separate from 
> ec2-run-instances. You may be able to avoid datadir problems by adding 
> something like Elastra to the mix. Elastra beta is a wrapper for 
> PostgreSQL that puts the datadir on S3 rather than local to an instance. 
> I suppose they still keep indices (GiST et al.) on the local instance.
> 
> (I still think it an interesting exercise to see what could be done 
> connecting PostGIS to AWS SimpleDB services.)
> 
>  
> 
> So thinking out loud here is a possible architecture–
> 
>     Basic permanent setup
> 
> put raster in S3 – this may require some customization of Geoserver,
> 
> build a datadir in PostGIS and back it up to S3
> 
> create a private AMI for PostgreSQL/PostGIS
> 
> create a private AMI for the load balancer instance
> 
> create a private AMI with your service stack for both a small and large 
> instance for flexibility,
> 
>    Startup services
> 
> start a balancer instance
> 
> point your DNS CNAME to this balancer instance
> 
> start a PostGIS instance (you could have more than one if necessary but 
> it would be easier to just scale to a larger instance type if the load 
> demands it)
> 
> have a scripted download from an S3 backup to your PostGIS datadir (I’m 
> assuming a relatively static data resource)
> 
>    Variable services
> 
> start service stack instance and connect to PostGIS
> 
> update balancer to see new instance – this could be tricky
> 
> repeat previous two steps as needed
> 
> at night scale back – cron scaling for a known cycle or use a controller 
> like weoceo to detect and respond to load fluctuation
> 
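The "variable services" steps above (start stack instances at peak, scale back at night) could be driven from cron by something like the following. It is a pure-Python sketch of the decision logic only: the peak window, the instance counts, and the function names are all assumptions, and actually starting or stopping instances and updating the balancer is left out.

```python
def desired_count(hour, peak_start=8, peak_end=18, peak=3, off_peak=1):
    """Number of service-stack instances wanted at a given hour (0-23).

    The window and counts are placeholder values for a known daily cycle.
    """
    return peak if peak_start <= hour < peak_end else off_peak


def reconcile(running, hour):
    """Compare the running pool with the schedule.

    running is a list of instance ids currently behind the balancer,
    oldest first. Returns (number_to_start, ids_to_stop).
    """
    want = desired_count(hour)
    if len(running) < want:
        return want - len(running), []
    # Keep the oldest `want` instances and stop the rest.
    return 0, running[want:]


if __name__ == "__main__":
    # Hypothetical instance ids, for illustration only.
    print(reconcile(["i-a"], 9))                 # morning: start more
    print(reconcile(["i-a", "i-b", "i-c"], 22))  # night: scale back
```

A cron job could call `reconcile` once an hour and hand the result to whatever script launches instances and rewrites the balancer's worker list, which is the part flagged above as potentially tricky.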
>  
> 
> By the way, the public AWS AMI with the best resources that I have found 
> is Ubuntu 7.10 Gutsy. The Debian dependency tools are much easier to use 
> and the resources are plentiful.
> 
>  
> 
> I’ve been toying with using an AWS stack adapted for serving some larger 
> PostGIS vector sets such as fully connected census demographic data and 
> block polygons here in the US. The idea would be to populate the data 
> directly from the census SF* and TIGER with a background Java bot. There 
> are some potentially novel 3D viewing approaches possible with XAML. 
> Anyway, lots of fun to have access to virtual systems like this.
> 
>  
> 
> As you can see I’m excited anyway.
> 
>  
> 
> randy
> 
>  
> 
>  
> 
> *From:* discuss-bounces at lists.osgeo.org 
> [mailto:discuss-bounces at lists.osgeo.org] *On Behalf Of 
> *Bruce.Bannerman at dpi.vic.gov.au
> *Sent:* Monday, February 18, 2008 6:35 PM
> *To:* OSGeo Discussions
> *Subject:* [OSGeo-Discuss] OS Spatial environment 'sizing'
> 
>  
> 
> 
> IMO:
> 
> 
> Hello everyone,
> 
> I'm trying to get a feel for server 'sizing' for a **hypothetical** 
> Corporate environment to support OS Spatial apps.
> 
> 
> 
> Assume that:
> 
> - this is a dedicated environment to allow the use of OS Spatial 
> applications to serve Corporate OGC Services.
> 
> - the applications of interest are GeoServer, Deegree, GeoNetwork, 
> MapServer, MapGuide and Postgres/PostGIS.
> 
> - the environment may need to scale relatively quickly.
> 
> - it will be required to serve in the vicinity of 5 to 10 TB of data 
> initially (WMS, WFS, WCS).
> 
> 
> 
> Can anyone shed some light on the following questions please?
> 
> - I'm assuming a Linux installation (SLES, Redhat or Debian) or possibly 
> Intel Solaris. Has anyone experienced any issues in these (or other) 
> environments that they'd like to share?
> 
> - Are there any recommendations as to dedicated network bandwidth that 
> should be allocated?
> 
> - Has anyone done any work with load balancing and would like to share 
> their experiences?
> 
> - Of the above OS Spatial products, which ones could co-exist on the 
> same server (excluding Postgres/PostGIS)?
> 
> 
> Any thoughts are appreciated.
> 
> 
> Bruce Bannerman
> Australia
> 
> 
>  
> 
>  
> 
>  
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Discuss mailing list
> Discuss at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/discuss


