[mapserver-users] Performance in regular expressions or an alternative way to select a list of features

Steve Lime Steve.Lime at dnr.state.mn.us
Tue Jul 21 10:52:09 PDT 2009


Thanks for the report, not unexpected I think... Steve

>>> On 7/21/2009 at 3:32 AM, in message <4A657D19.9050105 at romtelecom.ro>, Adrian
Popa <adrian_gh.popa at romtelecom.ro> wrote:
> Just for reference:
> 
> I tried to select about 130 items using a regular expression in the form 
> of EXPRESSION (/^ITEM1|ITEM2|ITEM3|...|ITEM130$/) and rendering took 
> about 4.5-5 seconds.
> 
> I tried to select the exact same 130 items using "IN" syntax: EXPRESSION 
> ("[myfield]" IN "ITEM1,ITEM2,ITEM3,...,ITEM130") and the query took 
> about 8.5-9 seconds.
> 
> The tests were run several times in the same conditions - so the 
> relative results should be relevant.
> 
> So - performance favors regular expressions :)
> 
> 
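For anyone reproducing the comparison, both forms can be generated from one list of names. This is a minimal Python sketch, assuming the names are plain alphanumeric strings. The helper names `regex_expression` and `in_expression` and the field name `myfield` are illustrative and not part of MapServer. The regex helper wraps the alternatives in a group so that ^ and $ apply to every name, which differs slightly from the pattern quoted above; whether that changes the timings was not tested in this thread.

```python
import re

# Hypothetical helpers for building the two EXPRESSION forms compared above.
# Only names made of letters, digits and underscores are accepted, so no
# regex escaping or quote handling is needed.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_]+")


def _check(names):
    """Reject empty lists and names that would need escaping."""
    names = list(names)
    if not names:
        raise ValueError("need at least one name")
    for name in names:
        if not _SAFE_NAME.fullmatch(name):
            raise ValueError(f"unsupported characters in {name!r}")
    return names


def regex_expression(names):
    """Regular-expression form; the group makes ^ and $ cover every name."""
    return "EXPRESSION /^(" + "|".join(_check(names)) + ")$/"


def in_expression(field, names):
    """IN-list form, using the quoting shown in the message above."""
    return 'EXPRESSION ("[' + field + ']" IN "' + ",".join(_check(names)) + '")'


if __name__ == "__main__":
    items = [f"ITEM{i}" for i in range(1, 5)]
    print(regex_expression(items))
    print(in_expression("myfield", items))
```

Generating both strings from the same list keeps the two test runs comparable, since the only thing that varies is the expression syntax.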
> 
> Adrian Popa wrote:
>> Here's how I "fixed" this issue. I ended up regenerating the shapefile 
>> and dbf and adding a separate column with the grouping I desired. Now, 
>> my map file selects from that column, and the syntax is much simpler 
>> (by one or two orders of magnitude). I am happy with the results, 
>> however, I didn't get the chance to try out all the other methods 
>> because of lack of time.
>>
>> Thanks again,
>> Adrian
>>
>> Adrian Popa wrote:
>>> Hello Steve,
>>>
>>> I haven't tried out the simplified regex so I don't know if it will 
>>> be faster. I will try to test it as part of a speed test of the 
>>> various methods...
>>>
>>> I'm not sure what you mean by writing a temporary set of geometries. 
>>> Do you mean adding an index to my data so that I can select it by a 
>>> different (grouping) field instead? Unfortunately I can't do that 
>>> because the same item can be part of 10-20 groups, so there would not 
>>> be an easy way to group items apart from duplicating them in the 
>>> shapefile/dbf. I'm not sure if there's a problem if the same feature 
>>> appears 12 times in the same shapefile.
>>>
>>> In the end data reorganizing might be the fastest method available. 
>>> Problem is some items will belong to groups dynamically, so I will 
>>> have to implement a selection mechanism based on item id...
>>>
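To make the duplication concern concrete: if every (item, group) membership becomes its own record, the record count is the sum of the group sizes rather than the number of distinct items. Here is a small Python sketch with made-up group names and item ids, and no shapefile I/O:

```python
# Hypothetical group definitions: group name -> item ids in that group.
groups = {
    "GROUP_A": ["ITEM1", "ITEM2", "ITEM3"],
    "GROUP_B": ["ITEM2", "ITEM3"],
    "GROUP_C": ["ITEM3"],
}


def membership_rows(groups):
    """Return one (item_id, group) row per membership, sorted for stability."""
    rows = [(item, name) for name, items in groups.items() for item in items]
    return sorted(rows)


def duplication_factor(groups):
    """Average number of records written per distinct item."""
    rows = membership_rows(groups)
    distinct = {item for item, _ in rows}
    if not distinct:
        return 0.0
    return len(rows) / len(distinct)


if __name__ == "__main__":
    for row in membership_rows(groups):
        print(row)
    print("records per item:", duplication_factor(groups))
```

With 5500 items and 10-20 groups per item, the same arithmetic gives a rough upper bound on how large a duplicated shapefile would become.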
>>> Regards,
>>> Adrian
>>>
>>> Steve Lime wrote:
>>>> Have you tried a simplified version of your regex? I think you can do:
>>>>
>>>>   EXPRESSION /^ITEM1|ITEM2|ITEM3|ITEM4$/
>>>>
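One caveat on the simplified pattern: alternation binds more loosely than the anchors, so ^ applies only to the first alternative and $ only to the last. The Python sketch below illustrates the difference. It uses Python's `re` module rather than the regex library MapServer links against, on the assumption that both follow the usual precedence rule. The grouped pattern is a suggested variant, not something tested in this thread.

```python
import re

# Pattern as suggested above: ^ applies only to ITEM1, $ only to ITEM4.
loose = re.compile(r"^ITEM1|ITEM2|ITEM3|ITEM4$")
# Grouped variant: both anchors apply to every alternative.
grouped = re.compile(r"^(ITEM1|ITEM2|ITEM3|ITEM4)$")


def matches(pattern, value):
    """Return True if the pattern is found anywhere in value."""
    return pattern.search(value) is not None


if __name__ == "__main__":
    # Columns: value, result with the loose pattern, result with the grouped one.
    for value in ["ITEM2", "ITEM12", "XITEM3X", "ITEM5"]:
        print(value, matches(loose, value), matches(grouped, value))
```

If names such as ITEM12 exist alongside ITEM1, the loose form would select more features than intended, so the grouped form is the safer equivalent of the fully anchored original.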
>>>> You might also consider writing a temporary set of geometries if a user will
>>>> continually display from that set. In that case your overhead would be in
>>>> managing the set of features which would be higher the first time but then
>>>> very fast to render. Your dynamic portion of the mapfile would reference the
>>>> temporary data.
>>>>
>>>> Steve
>>>>
>>>>   
>>>> On 7/14/2009 at 1:15 AM, in message <4A5C2277.80204 at romtelecom.ro>, Adrian
>>>> Popa <adrian_gh.popa at romtelecom.ro> wrote:
>>>>   
>>>>> Hello everyone,
>>>>>
>>>>> Here's my problem: I'm trying to highlight segments from a line layer by 
>>>>> using an expression in a specific class. This portion of the mapfile is 
>>>>> dynamically generated and when it is done, it is sent to mapserver for 
>>>>> rendering.
>>>>> My problem is that I have to select between 10 - 400 features at a time 
>>>>> and I noticed when I have a lot of features there is a severe 
>>>>> performance degradation in mapserver (takes a lot of time to render or 
>>>>> even times out).
>>>>> Right now, my expression is built using regular expressions: something like:
>>>>> *EXPRESSION /^ITEM1$|^ITEM2$|^ITEM3$|^ITEM4$/*
>>>>> This works ok, but as I said has a performance penalty when you reach 
>>>>> ~400 items.  My data is selected from a shapefile layer which has about 
>>>>> 5500 items.
>>>>>
>>>>> Since I wouldn't be using the regular expressions at full capacity (I'm 
>>>>> matching the full name), I might rewrite the expression using something 
>>>>> like:
>>>>> *EXPRESSION ( ([NAME]=="ITEM1") OR ([NAME]=="ITEM2") OR 
>>>>> ([NAME]=="ITEM3") OR ([NAME]=="ITEM4") )*
>>>>>
>>>>>  From the documentation I see that:
>>>>> /Regular expression with MapServer work similarly to string comparison, 
>>>>> but allow more complex operation. They are slower than pure string 
>>>>> comparisons, but might be still faster than logical expression. As with 
>>>>> the string comparison use regular expressions, a FILTERITEM or a 
>>>>> CLASSITEM has to defined, respectively./
>>>>>
>>>>> I would like to know if there is an efficient way of selecting a list 
>>>>> of elements from a layer, or what are your recommendations.
>>>>>
>>>>> Also - have there been significant changes in performance for this issue 
>>>>> from mapserver 4.10 (I am now migrating to mapserver 5.4)?
>>>>>
>>>>> Thanks,
>>>>> Adrian
>>>>>     
>>>>
>>>>
>>>>   
>>>
>>> ------------------------------------------------------------------------
>>>
>>> _______________________________________________
>>> mapserver-users mailing list
>>> mapserver-users at lists.osgeo.org 
>>> http://lists.osgeo.org/mailman/listinfo/mapserver-users 
>>>   
>>
>>
> 
> 
> -- 
> --- 
> Adrian Popa
> NOC Division
> Network Engineer
> Divizia Centrul National de Operare Retea
> Departament Transport IP & Metro
> Compartiment IP Core & Backbone
> Phone: +40 21 400 3099


