[Geomoose-users] query.php performance advice sought

Brent Fraser bfraser at geoanalytic.com
Tue Jun 22 23:50:55 EDT 2010


Ok, disregards my comment below.  Based on your emails the likely cause is
 the query PHP and mapscript.

Brent

> It could be mapserver's processing of the query.    See
> http://trac.osgeo.org/mapserver/ticket/3069
>
> Try mapserver v5.6
>
> Best Regards,
> Brent Fraser
>
>
>> Hi Len, thanks for the PgAdmin suggestion; I hadn't thought to do the
>> SQL
>> there to test, only checked with a row filter. Here is the result of a
>> simple SELECT query to find all the records with syspop between 0 and
>> 5000.
>>
>> SELECT gid, pwsid, sysname, pwstype, syspop, parish, dhhregion,
>> sysstatus,
>>        other_sysn, sourcetype, sourcecoun, facid, facname, other_facn,
>>        wellregid, constr_dat, facstatus, facstatus_, lat, lon, datum,
>>        location_d, moreinfo, staticleve, groutingde, welldepth,
>> welldiam,
>>        point_x, point_y, wkb_geometry
>>   FROM webmaps_sdwis_wells
>>   WHERE (syspop > 0 and syspop < 5000);
>>
>> Total query runtime: 241 ms.
>> 3294 rows retrieved.
>>
>> I would submit that postgres is working ok with that runtime.
>>
>> I tested a brief version of your template suggestion by pointing to a
>> simple html that only returns a record id. With that I get back 3294
>> rows
>> of records in my "Search results" tab in 43 seconds. With my swanky
>> formatted template it takes 5 minutes and 35 seconds to get the same
>> results back.
>>
>> That's hard data that seems to suggest that it's the templating that
>> consumes all the time and resources.
>>
>> Looks like a workaround is prescribed, and I need to learn how to make
>> the
>> right links from the result list to the "more info" template.
>>
>> Thanks for the help!
>> /Johan.
>>
>>> -----Original Message-----
>>> From: Len Kne [mailto:lkne at houstoneng.com]
>>> Sent: Tuesday, June 22, 2010 4:38 PM
>>> To: Johan Forsman; geomoose-users at lists.sourceforge.net
>>> Subject: RE: query.php performance advice sought
>>>
>>> Johan
>>>
>>> Something seems wrong with the performance when accessing the Postgres
>>> DB.
>>> You say there are only 4000 records, unless there are a million
>>> attributes, seems like that should be a really quick query.  Have you
>>> tried running the query from the command line or PGAdmin to see how it
>>> performs?
>>>
>>> If you have a large number of records with a large mapserver template,
>>> it
>>> is going to take some time to get that data to the client.  Could you
>>> have
>>> a simple template that returns just the syspop attribute and then have
>>> a
>>> link to make a follow-up query if details are needed?
>>>
>>> Another thing to check out is FastCGI, I seen some performance boost
>>> using
>>> it.
>>> http://www.maptools.org/ms4w/index.phtml?page=README_INSTALL_3.0.html#f-
>>> fastcgi
>>>
>>> Len
>>>
>>> -----Original Message-----
>>> From: Johan Forsman [mailto:Johan.Forsman at LA.GOV]
>>> Sent: Tuesday, June 22, 2010 3:06 PM
>>> To: geomoose-users at lists.sourceforge.net
>>> Subject: Re: [Geomoose-users] query.php performance advice sought
>>>
>>> Please pardon the prior incomplete post; I inadvertently pressed "send"
>>> rather than "save draft". Blame it on the streaming World Cup coverage
>>> online. Here is the more complete version.
>>>
>>> Preemptive apologies for a long post. I have attempted to collect
>>> enough
>>> data for the collective to have a chance at theorizing.
>>>
>>> I am attempting to use the new query.php to extract a subset of a
>>> PostGIS
>>> table based on integer values stored in the "syspop" field. My service
>>> definition looks like this at the moment (I made a copy of the original
>>> query.php and changed the name to allow for troubleshooting without
>>> risking destroying the original, hence the different script name):
>>>
>>> 	<service name="advanced_search">
>>> 		<url>php/query_sdwp.php</url>
>>> 		<step type="input">
>>> 			<input type="hidden" name="highlight" value="false"/>
>>> 			<input type="hidden" name="mode" value="search"/>
>>>
>>> 			<input type="hidden" name="layer0"
>>> value="wells/sdwis_wells"/>
>>> 			<input type="hidden" name="template0"
>>> value="itemquery"/>
>>>
>>> 			<input type="hidden" name="fieldname0" value="syspop"/>
>>> 			<input type="user" name="value0" title="Population is
>>> greater than"/>
>>> 			<input type="hidden" name="comparitor0" value="gt"/>
>>>
>>> 			<input type="hidden" name="operator1" value="and"/>
>>>
>>> 			<input type="hidden" name="fieldname1" value="syspop"/>
>>>
>>> 			<input type="user" name="value1" title="and less than"/>
>>> 			<input type="hidden" name="comparitor1" value="lt"/>
>>> 		</step>
>>> 	</service>
>>>
>>> It's not pretty but it works. Sort of. I am having problems with the
>>> performance to such extent it's not usable. I have made some attempts
>>> to
>>> quantify this below. It's not scientific, but it is illustrative.
>>>
>>> Relevant stats:
>>> Server hardware is 2 x 2.0 GHz Xeon, 3 GB RAM. It's a repurposed Dell
>>> PowerEdge 4600, I think.
>>> Server OS is Ubuntu Server 9.10.
>>> GeoMoose 2.2 in FGS 9.5 with MapServer 5.4.2.
>>>
>>> Dataset of interest is a point set with 4100 total records, stored in
>>> PostGIS on the same server. An index on the syspop field has no
>>> discernible effect on the performance.
>>>
>>> During testing I was the only user on the server.
>>>
>>> I turned off highlighting so the query would only run once.
>>>
>>> I ran two scenarios:
>>> 1) syspop >5000 and syspop <10000
>>> 2) syspop >0 and syspop <5000
>>>
>>> I observed the server process list to determine where the CPU time was
>>> being spent.
>>>
>>> I uncommented the existing diagnostic lines in query.php to verify the
>>> request and the result count for both scenarios and they are both
>>> correct.
>>>
>>> I increased the php timeout to 10 minutes and the php memory to 128
>>> megs
>>> to stop those errors from interfering. I am logging the php errors and
>>> file is now clear with no errors.
>>>
>>> These are the results for the two scenarios:
>>>
>>> Search Layer: wells/sdwis_wells
>>> Template: itemquery
>>> FILTER: syspop > 5000 and syspop < 10000 Total Results: 306 Stop watch
>>> time to completion: 13 seconds Time spent by postgres: 7 seconds with
>>> CPU
>>> at 88% Time spent by httpd: 6 seconds with CPU at 92%
>>>
>>> Search Layer: wells/sdwis_wells
>>> Template: itemquery
>>> FILTER: syspop > 0 and syspop < 5000
>>> Total Results: 3323
>>> Stop watch time to completion: 5.5 minutes!
>>> Time spent by postgres: 30 seconds with CPU at 88% Time spent by httpd:
>>> 5
>>> minutes with CPU at 92%
>>>
>>> With that CPU usage I'd say stuff is going on, but what? SQL-wise this
>>> is
>>> a very simple query, is it not?
>>>
>>> My baseline comparison is that for both scenarios a SQL query from
>>> Access
>>> 2007 on my workstation on the linked table from the server, or a row
>>> filter from PgAdmin, returns the expected results instantaneously.
>>> Apples
>>> and oranges I know, but perhaps confirmation that the database can work
>>> quickly and serve the expected results in no time flat?
>>>
>>> I tried a MapServer direct URL query, with a simple template to just
>>> list
>>> the record IDs in the browser like so:
>>>
>>> .../cgi-
>>> bin/mapserv?map=/opt/fgs/apps/geomoose2/maps/sdwp/wells/wells_webmaps.map&
>>> layers=sdwis_wells&mode=itemnquery&qlayer=sdwis_wells&qitem=syspop&qstring
>>> =(syspop > 5000 and syspop < 10000)
>>>
>>> This returns the 306 records to the browser window in 3 seconds with
>>> barely a blip in the server CPU usage.
>>>
>>> Direct URL query with syspop > 0 and syspop < 5000:
>>> Returns the 3323 records to the browser window in 35 seconds with
>>> server
>>> CPU (postgres) at 85%. A respectable portion of that time is
>>> transferring
>>> the data to the browser window.
>>>
>>> If those URL queries are representative of what GM is doing, then I'd
>>> say
>>> that the 30-second mark is as fast as MapServer can query the PostGIS
>>> and
>>> return the results on this system, but what in the world is going on
>>> during those other 5 minutes? I know GM building the formatted results
>>> with the template, but what else? Is there a way to tell?
>>>
>>> Very oddly with the URL queries: The mapfile contains a FILTER
>>> "sysstatus
>>> = 'Active'" directive to only include the active records. If that
>>> directive is absent the URL query works, but if it is present I receive
>>> and error: "Content-type: text/html msDrawMap(): Image handling error.
>>> Failed to draw layer named 'sdwis_wells'". Depending on what numbers I
>>> use
>>> for the lower and upper bounds I can get some records returned before
>>> the
>>> error. MapServer bug? MapServer feature? PEBKAC? Just weird?
>>>
>>> I don't know that there is anything I can do, but please feel free to
>>> opine at will; I am fresh out of ideas. My coding skills are worse than
>>> non-existent; they are dangerous, and regrettably there is no budget
>>> for
>>> that shiny new 26 GHz Core i13, which surely would address some part of
>>> this.
>>>
>>> Sorry about the long post. After this, I'm thinking it's noon
>>> somewhere...
>>> /Johan.
>>>
>>> ----------------------
>>> Johan Forsman
>>> Geologist
>>> Safe Drinking Water Program
>>> Louisiana Department of Health and Hospitals Office of Public Health
>>> Telephone: 225.342.7309
>>> Telefax: 225.342.7303
>>>
>>>
>>> --------------------------------------------------------------------------
>>> ----
>>> ThinkGeek and WIRED's GeekDad team up for the Ultimate GeekDad Father's
>>> Day Giveaway. ONE MASSIVE PRIZE to the lucky parental unit.  See the
>>> prize
>>> list and enter to win:
>>> http://p.sf.net/sfu/thinkgeek-promo
>>> _______________________________________________
>>> Geomoose-users mailing list
>>> Geomoose-users at lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/geomoose-users
>>
>> ------------------------------------------------------------------------------
>> ThinkGeek and WIRED's GeekDad team up for the Ultimate
>> GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the
>> lucky parental unit.  See the prize list and enter to win:
>> http://p.sf.net/sfu/thinkgeek-promo
>> _______________________________________________
>> Geomoose-users mailing list
>> Geomoose-users at lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/geomoose-users
>>
>
>
>
>
> ------------------------------------------------------------------------------
> ThinkGeek and WIRED's GeekDad team up for the Ultimate
> GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the
> lucky parental unit.  See the prize list and enter to win:
> http://p.sf.net/sfu/thinkgeek-promo
> _______________________________________________
> Geomoose-users mailing list
> Geomoose-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/geomoose-users
>







More information about the Geomoose-users mailing list