[postgis-users] Hardware requirements for a server

Mathieu Basille basille.web at ase-research.org
Tue Feb 10 11:54:12 PST 2015


Hey Rémi,

Thanks for the feedback! This is exactly the type of information I'm 
looking for. Before commenting further: if anyone else has experience with 
such a multi-user PostGIS server in a small unit, please feel free to share! 
The more I hear from actual users, with different experience and different 
settings, the better!

Let me clear this one up right away: I did not talk about backups because I 
didn't think this was an issue here! I'm lucky enough that my center 
backs up every server daily and monthly to tape, and another backup will 
be added on top of that at the university level...
Thanks to James for the warning (I know about RAID, and never use it as a 
backup solution!); I will also look at pg_dump and see how we can make it 
part of their backup process.
Network is not an issue either. The same people as above make sure the 
network is good too.
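
As a rough illustration of the pg_dump idea above, here is a minimal Python 
sketch that only builds the command line for a dated, custom-format dump. 
The database name (lab_gis), host, and backup directory are hypothetical 
placeholders, and the file naming scheme is an assumption; the center's 
actual backup process may call for something quite different.

```python
import shlex
from datetime import date


def build_pg_dump_command(dbname, host, out_dir, day):
    """Return the argument list for a custom-format pg_dump of one database."""
    # Hypothetical naming scheme: <out_dir>/<dbname>_<YYYY-MM-DD>.dump
    out_file = f"{out_dir}/{dbname}_{day.isoformat()}.dump"
    return [
        "pg_dump",
        "--host", host,
        "--format=custom",  # archive format that pg_restore can read
        "--file", out_file,
        dbname,
    ]


# Placeholder values, for illustration only.
cmd = build_pg_dump_command("lab_gis", "localhost", "/backups", date(2015, 2, 10))
print(shlex.join(cmd))
```

The list could be handed to subprocess.run(cmd, check=True) from a scheduled 
job; it is only printed here so that nothing is executed by accident.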

This said, here are a couple of additional comments:

* I didn't know about pgpool. It looks like it may come in handy if 
connections become sluggish or simply impossible due to too many users. I 
will definitely keep this in mind, although I understand it is a 
UNIX-only solution.

* You suggest that I should ask about hardware requirements on the postgres 
list directly. The reason I didn't in the first place was that I thought 
that computer demands would be different using PostGIS, because of the very 
nature of the data (i.e. GIS layers that can be really big). But maybe I'm 
thinking about it the wrong way... I will nevertheless try to see what 
people there have to say about it (and let people here know!).

* About usage being mostly read: this will be true for most "pure GIS" 
tasks (mostly intersecting), but I find, from experience, that we usually 
end up with a lot of intermediate tables for our analyses (new tables for 
the most part, not new columns).

* About MS vs. Linux based servers: same here, as long as IT deals with 
it, I would be inclined to say this is not an issue (or at least it is 
not mine!). I agree though that, for personal use (i.e. a computer based in 
my lab), Linux systems are much easier to deal with (all my computers 
actually run Debian). But that's one of the reasons I want a server in the 
center: it would come with a sysadmin who would deal with the hassle... I 
was thinking more in terms of possibilities here (remote access, linking 
PostGIS with R, etc.); in other words, do we lose anything *as a user* with 
a MS server?

* I knew about Shiny—although I never used it. I think this is not the 
focus for the moment, the server (both PostGIS and R if we can make it) 
would be 100% research/analysis oriented. For instance, I'm not considering 
(yet) a visualization solution either (CartoDB...). Thanks for your 
additional suggestions on linking R and PostGIS, very useful.
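
To put rough numbers on the pgpool point above, here is a back-of-the-envelope 
sketch in Python. It takes the "dozen connections per QGIS user" figure from 
Rémi's message and compares the total with a limit of 100 connections, which 
is the usual PostgreSQL default for max_connections; the real limit on a 
given server is an assumption to check in postgresql.conf.

```python
def peak_connections(concurrent_users, connections_per_user):
    """Worst-case number of simultaneously open connections."""
    return concurrent_users * connections_per_user


# Assumption: the usual PostgreSQL default; check postgresql.conf.
ASSUMED_MAX_CONNECTIONS = 100
# "A dozen" open connections per QGIS user, as suggested in the thread.
CONNECTIONS_PER_USER = 12

for users in (4, 10):
    needed = peak_connections(users, CONNECTIONS_PER_USER)
    verdict = "fits within" if needed <= ASSUMED_MAX_CONNECTIONS else "exceeds"
    print(f"{users} users -> {needed} connections, {verdict} "
          f"the assumed limit of {ASSUMED_MAX_CONNECTIONS}")
```

With 4 concurrent users the total stays under the assumed limit, while 10 
users at a dozen connections each would go over it, which is the situation 
where a pooler becomes worth considering.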

Finally, may I ask you about your own setup (number of users, typical use 
cases, hardware specs, etc.)? It would probably help me.

Thank you,
Mathieu.


On 10/02/2015 04:39, Rémi Cura wrote:
> Hey,
> nice project =)
>
> If you use something like qgis, each user can easily have a dozen
> connections open to the server, so with 10 users, you may need to use
> something like pgpool.
>
> About hardware sizing, it is more a question for the postgres list.
>
> You may stress that your usage is probably mostly read, and that usage will
> be spread over a lot of tables.
> Your storage being external, you may need a good network.
> You didn't talk about backup; it is essential (raid, replication, backup
> script?).
>
> In my experience (research), it is totally impractical to use a ms based
> server, because all the good stuff needs to be compiled (sfcgal, geos, gdal,
> postgis, plr ...), and that is much easier on linux.
> I solved it by using a virtualbox with ubuntu.
>
> We used a NAS server to store postgres files, although it was not
> recommended. It worked very well over gigabit ethernet.
>
> About pl/r or pl/python, I used both (though much more plpython).
> For my settings the best is: small functions in a pl language (by small I
> mean not much memory and not too long, like a few minutes at most), big
> functions (like controlling your whole process, 60 hours of computing, etc.)
> in R or python with a postgres connector.
> Or, another rule of thumb: if it fits naturally into a transaction, use a pl
> language; if it is bigger, python or R.
>
> Having a dedicated R server would make it possible to use something like
> Shiny (R web applet).
>
> Cheers,
> Rémi-C
>
>
>
> 2015-02-10 5:07 GMT+01:00 Mathieu Basille <basille.web at ase-research.org>:
>
>     Dear PostGIS users,
>
>     I am currently planning to set up a PostGIS instance for my lab. Turns
>     out I believe this would be useful for the whole center, so that I'm
>     now considering setting up a PostGIS server for everyone—if interest is
>     shared of course. At the moment, I am however struggling with what
>     would be required in terms of hardware, and of course, the cost will
>     depend on that—at the end of the day, it's really a matter of money
>     well spent. I have then a series of questions/remarks, and I would
>     welcome any feedback from people with existing experience on setting up
>     a multi-user PostGIS server.
>
>     * My own experience is rather limited: I used PostGIS quite a bit, but
>     only on a desktop, with 2 users. The desktop was quite good (quad-core
>     Xeon, 12 GB RAM, 500 GB hd), running Debian, and we never had any
>     performance issue (although some queries were rather long, but still
>     acceptable).
>
>     * The use case I'm envisioning would be (at least in the foreseeable
>     future):
>     - About 10 faculty users (which means potentially a few more
>     students using it); I would have a hard time imagining more than 4
>     concurrent users;
>     - Data would primarily involve a lot (hundreds/thousands) of high
>     resolution (spatial and temporal) raster and vector maps, possibly over
>     large areas (Florida / USA / continental), as well as potentially
>     millions of GPS records (animals individually monitored);
>     - Queries will primarily involve retrieving points/maps over given
>     areas/time, as well as intersecting points over environmental layers;
>     other use cases will involve working with steps, i.e. the straight line
>     segment connecting two successive locations, and intersecting them with
>     environmental layers;
>
>     * I couldn't find comprehensive or detailed guidelines on-line about
>     hardware, but from what I could see, it seems that memory wouldn't be
>     the main issue, but the number of cores would be (one core per database
>     connection if I'm not mistaken). At the same time, we want to make sure
>     that the experience is smooth for everyone...
>
>     * Is there a difference in terms of performance and usability between a
>     Linux-based and a MS-based server? My center is unfortunately
>     MS-centered, and existing equipment runs with MS systems... It would
>     thus be easier for them to set up a MS-based server.
>
>     * Has anyone worked with a server running the DB engine, while
>     the DB itself was stored on another box/server? That would likely be
>     the case here since we already have a dedicated box for file storage.
>     Along these lines, does the system of the file storage box matter
>     (Linux vs. MS)?
>
>     * We may also use the server as a workstation to streamline PostGIS
>     processing with further R analyses/modeling (or even use R from within
>     the database using PL/R). Again, does anyone have experience doing it?
>     Is a single workstation the recommended way to work with such a workflow?
>     Or would it be better (but more costly) to have one server dedicated to
>     PostGIS and another one, with different specs, dedicated to analyses (R)?
>
>     I realize my questions and comments may be a bit confusing, likely
>     because of my lack of experience with these issues. I really
>     welcome any feedback from people working with PostGIS servers in a small
>     unit, or any similar setting that could be informative!
>
>     In advance, thank you very much!
>
>     Sincerely,
>     Mathieu Basille.
>
>
>     --
>
>     ~$ whoami
>     Mathieu Basille
>     http://ase-research.org/basille
>
>     ~$ locate --details
>     University of Florida \\
>     Fort Lauderdale Research and Education Center
>     (+1) 954-577-6314
>
>     ~$ fortune
>     « Le tout est de tout dire, et je manque de mots
>     Et je manque de temps, et je manque d'audace. »
>       -- Paul Éluard
>
>     _________________________________________________
>     postgis-users mailing list
>     postgis-users at lists.osgeo.org
>     http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>

-- 

~$ whoami
Mathieu Basille
http://ase-research.org/basille

~$ locate --details
University of Florida \\
Fort Lauderdale Research and Education Center
(+1) 954-577-6314

~$ fortune
« Le tout est de tout dire, et je manque de mots
Et je manque de temps, et je manque d'audace. »
  -- Paul Éluard


