Label Cache Performance

Stephen Woodbridge woodbri at SWOODBRIDGE.COM
Tue Feb 27 11:27:57 EST 2007


OK, great. When you get to it, let me know and I would be happy to help 
with algorithm issues and with building some test cases, or anything 
else I can do to support this.

-Steve

PS: Here is an extra (Round Tuit) for you :)

Steve Lime wrote:
> This still is on my radar, it's just a bit down the list at the moment.
> I would propose testing a simple
> implementation (worry about MINFEATUREDISTANCE later) to see if the
> performance gains are
> real. 
> 
> Steve
> 
>>>> Stephen Woodbridge <woodbri at SWOODBRIDGE.COM> 2/22/2007 11:59:08 AM
>>>>
> Steve,
> 
> Can I get you to pop this back on your stack. I really think this would
> 
> represent a significant performance improvement to Mapserver and would
> 
> hate to see it die here.
> 
> -Steve W
> 
> Steve Lime wrote:
>> Sorry, will try to get back to this thread over the weekend. I was
>> hoping someone
>> more knowledgable than I would comment on the hash issue.
>>
>> Steve
>>
>>>>> Stephen Woodbridge <woodbri at SWOODBRIDGE.COM> 1/17/2007 3:32:58 PM
>>>>>
>> Steve,
>>
>> Hope you had a good vacation. As you wade through your email, any 
>> thoughts on this? I didn't see any follow up post from your on it.
>>
>> -Steve W.
>>
>> Stephen Woodbridge wrote:
>>> Steve Lime wrote:
>>>> Too bad our hash implmentation doesn't allow an arbitrary value
>> (e.g.
>>>> void *) per key. Looks like
>>>> it's char *'s.
>>> Bummer! So is this a show stopper?
>>> Can we cast another pointer into a char *?
>>> Seems like this is an opportunity to improve our hash
> implementation
>> so 
>>> it can be used by other pieces of the code. Or we could just use
> the
>> GNU 
>>> system has routines.
>>>
>>> And label_drawn_1 below probably needs to be a struct or array of
> the
>>> pixel x,y where the label was drawn so it can be used for the
>> distance 
>>> calculation.
>>>
>>> -Steve W
>>>
>>>> Steve
>>>>
>>>>>>> Stephen Woodbridge <woodbri at SWOODBRIDGE.COM> 1/8/2007 2:09:26
> PM
>>>> Steve,
>>>>
>>>> The hash key for say
>>>>     "Main St" => Array(label_drawn_1, label_drawn_2, ..]
>>>> We only need to remember the the items that were actually drawn on
>> the
>>>> map since these are the only ones that need to be compared against
>> when
>>>> considering a potentially new label.
>>>>
>>>> Since the MINDISTANCE test is localized to a LAYER and possibly a
>> CLASS
>>>> modifying the key to include just the CLASS will suffice, like 
>>>> "<class_num>:<label_text>". The hash would be created for each
>> layer
>>>> and then disposed and recreated for the next layer. This is ok
>> because 
>>>> MINDISTANCE does not span LAYERs unless I am mistaken.
>>>>
>>>> So here is pseudo-code for how this might work. Obviously there
> are
>>>> details omitted and this would be implemented to be efficient. My
>> goal
>>>> here is to express the concept:
>>>>
>>>> for each layer {
>>>>      if MINDISTANCE {
>>>>          hash = create_hash()
>>>>      } else {
>>>>          hash = NULL;
>>>>      }
>>>>      for each label {
>>>>          if (mindistance_ok(this.label, hash) &&
>>>>              label_fits(this.label, bitmap)) {
>>>>              # assumes it will update hash
>>>>              draw_label(this.label, hash, bitmap)
>>>>          }
>>>>      }
>>>> }
>>>>
>>>> mindistance_ok() returns true if hash == NULL
>>>> label_fits() compares the label polygon to the bitmap and returns
>> true
>>>> if it fits
>>>> draw_label() renders the label and updates the hash if it is not
>> NULL 
>>>> and updates the bitmap.
>>>>
>>>> I can expand the pseudo-code for mindistance_ok() and label_fits()
>> is 
>>>> needed, but it seems that we should agree in principal that this
>> will 
>>>> work for all existing cases. I have not heard any case that this
>> can
>>>> not be easily adapted to support.
>>>>
>>>> Steve - Does this cover the case you were concerned about? If not
>> can 
>>>> you explain the scenario a little better, because I'm missing it.
>>>>
>>>> -Steve
>>>>
>>>> Steve Lime wrote:
>>>>> Was thinking about this more and don't think a hash for
>> MINDISTANCE
>>>> will
>>>>> work since
>>>>> the hash keys need to be unique and label/class/layer certainly
>>>> isn't.
>>>>> You might have
>>>>> 50 occurances of the same label for a series of road segments.
> I'm
>>>> sure
>>>>> there's a workaround but it would undoubtedly be complex...
>>>>>
>>>>> Steve
>>>>>
>>>>>>>> Stephen Woodbridge <woodbri at SWOODBRIDGE.COM> 1/3/2007 9:35:26
>> PM
>>>>> Steve Lime wrote:
>>>>>> We use the loops for MINDISTANCE because we obviously are
> already
>>>>> doing
>>>>>> so for collisions.
>>>>>>
>>>>>> I thought about a hash too. Trouble is it's based not solely on
>> the
>>>>>> label text, but
>>>>>> also class and layer indexes. For example you wouldn't want to
>>>> ommit
>>>>> a
>>>>>> road
>>>>>> label 'Washington' because of it's proximity to 'Washington'
>> lake.
>>>>>> Those types of
>>>>>> situations are not all that rare...
>>>>> Sure, but if you do the bitmap collision detection and use the
>> hash
>>>> for
>>>>> MINDISTANCE, the hash would only exist while you are processing a
>>>>> single layer and then get dumped and a new hash created for the
>> next 
>>>>> layer.
>>>> If
>>>>> you need to keep hash entries unique by CLASS also, then you
> could
>>>>> either create an array of hashes and index by classnum or have a
>>>> single
>>>>> hash and prefix ot postfix the keys with the class like
>>>>> "classnum:key".
>>>>>
>>>>> This should be very straight forward and I think very, very fast 
>>>>> compared to the way it works now. I'm sure there are other little
> 
>>>>> adjustments that would need to be made, but I can't think of any
>> show
>>>>> stoppers.
>>>>>
>>>>> -Steve
>>>>>
>>>>>> Steve
>>>>>>
>>>>>>
>>>>>>>>> Stephen Woodbridge <woodbri at SWOODBRIDGE.COM> 1/3/2007
> 12:14:06
>>>> PM
>>>>>> If I were dealing with the MINDISTANCE issue in Perl I would
>> create
>>>> a
>>>>>> hash where the key was the label text, because that is what
>>>>> MINDISTANCE
>>>>>> is restricting, and make the hash value an array of locations.
>> This
>>>>>> would much faster than any search algorithm and you would narrow
>>>> the
>>>>>> search to the array of placed text items of the same name.
>>>>>>
>>>>>> There is no reason this approach could not be used for the
>>>>> MINDISTANCE
>>>>>> case in mapserver. Using this and the bitmap collision detection
>>>>> should
>>>>>> remove most of the overhead of the label cache processing and
>> make
>>>> it
>>>>>> very scalable with respect to number of items to be processed.
>>>>>>
>>>>>> -Steve
>>>>>>
>>>>>> Steve Lime wrote:
>>>>>>> I did think of one other gotcha. The cache is not just used for
>>>>>> physical
>>>>>>> overlap, there
>>>>>>> are other tests applied. For example, MINDISTANCE, which
> filters
>>>>> out
>>>>>>> like labels based
>>>>>>> on proximity. That requires direct comparison of the bounding
>>>>>> polygons.
>>>>>>> One might be
>>>>>>> able to reduce that to a test on label points (in fact that
>> might
>>>>> be
>>>>>>> what's there now,
>>>>>>> I'd have to check) but that still requires traversing placed
>>>> labels
>>>>>>> (albeit only those
>>>>>>> within a particular layer)...
>>>>>>>
>>>>>>> Steve
>>>>>>>
>>>>>>>>>> Stephen Woodbridge <woodbri at swoodbridge.com> 1/2/2007
> 1:27:47
>>>> PM
>>>>>>> Yes, it works for all cases. You have to think about the
> problem
>>>> as
>>>>> a
>>>>>>> raster problem not a vector or character problem.
>>>>>>>
>>>>>>> So you do this by creating an image the size of rectangular
> bbox
>>>> of
>>>>>> the
>>>>>>> label. Then you render the the character or label polygons
> boxes
>>>>> into
>>>>>>> the image. The image now becomes a mask representing the pixels
>> of
>>>>>>> where the label will fall if it was rendered.
>>>>>>>
>>>>>>> the test: Logical AND this mask image with the map label image
>>>>> given
>>>>>>> its position in that image. You have a collision on the first
>> AND
>>>>> result
>>>>>>> the it true.
>>>>>>>
>>>>>>> to update the map label image, just OR this label mask with the
>>>> map
>>>>>>> label image given its position in that image.
>>>>>>>
>>>>>>> if you want to test another location, ie: you are searching for
>> an
>>>>>>> location nearby that the label will fit, then change the
>> position
>>>>> by
>>>>>> a
>>>>>>> pixel and repeat the test. This is more costly than just a
> test,
>>>>> but
>>>>>> is
>>>>>>> still pretty efficient. I don't think you would want to apply
>> this
>>>>> to
>>>>>>> every label, but having it as an option for a layer might be 
>>>>>>> interesting. You could constrain this to say search up/down or 
>>>>>>> right/left 10 pixels and then something like lot labels could
>>>>> easily
>>>>>> be
>>>>>>> staggered. This is just an idea and it might not farm out to be
>>>>>> useful,
>>>>>>> but once the machinery is in, it falls out as a simple
> variation
>>>> on
>>>>>> the
>>>>>>> theme.
>>>>>>>
>>>>>>> -Steve
>>>>>>>
>>>>>>> Paul Spencer wrote:
>>>>>>>> This would need to work for 'curved' labels too.  Would it
> work
>>>> to
>>>>>> do
>>>>>>>> things per-character?
>>>>>>>>
>>>>>>>> Paul
>>>>>>>>
>>>>>>>> On 2-Jan-07, at 12:40 PM, Steve Lime wrote:
>>>>>>>>
>>>>>>>>> Remember also that you're not dealing strictly with bounding
>>>>> boxes
>>>>>>> (e.g.
>>>>>>>>> rects) but
>>>>>>>>> actual polygons whenever rotation is applied. This is so you
>> get
>>>>>>>>> relatively tight fits
>>>>>>>>> around labels and can cram labels on a map.
>>>>>>>>>
>>>>>>>>>>>> Stephen Woodbridge <woodbri at swoodbridge.com> 1/2/2007
>>>> 10:25:49
>>>>>>> AM
>>>>>>>>> Steve Lime wrote:
>>>>>>>>>> Interesting idea. No, it's not been looked at before, not by
>> me
>>>>>>>>>> anyway. Frank might have some thoughts on performance. You'd
>>>>>> still
>>>>>>>>>> have to loop through the affected area to find collisions
> but
>> I
>>>>>>>>> think
>>>>>>>>>> your correct in that the search time would basically be
>>>> constant
>>>>>>>>>> (affected slightly by the size of the label).
>>>>>>>>>>
>>>>>>>>>> Wouldn't be that hard to implement. I wonder if GD holding
>> the
>>>>>>> cache
>>>>>>>>>> image could be useful or if a custom 2-D bit array would be
>>>>>>> faster-
>>>>>>>>>> probably the latter...
>>>>>>>>> I thin a 2-D bit array would be much faster and require
>>>>>>> significantly
>>>>>>>>> less memory, but would require writing a little additional
>> code.
>>>>>>> The
>>>>>>>>> upside to this is that the code could be optimized for this
>>>>>>> specific
>>>>>>>>> task, unless there are other potential uses for this.
>>>>>>>>>
>>>>>>>>> Another idea I had related to this scheme would be that you
>>>> could
>>>>>>>>> implement a function could search for a placement position
>>>>> withing
>>>>>>> some
>>>>>>>>> number of pixels. Here the idea would be that you use the
>>>> polygon
>>>>>>> to
>>>>>>>>> create a mask and see if the mask collides, if not you are
>> done,
>>>>>>>>> otherwise you do a spiral search outwards from the anchor
>> point
>>>>>> for
>>>>>>>>> some
>>>>>>>>> number of pixels looking for a place the label will fit. This
>> is
>>>>>>> easy
>>>>>>>>> to
>>>>>>>>> do in some nested loops and very fast.
>>>>>>>>>
>>>>>>>>> This idea of a moving mask is fundamental to a lot of the
>> raster
>>>>>>>>> manipulation algorithm and is easy to implement and very fast
>> to
>>>>>>>>> execute.
>>>>>>>>>
>>>>>>>>> -Steve W
>>>>>>>>>
>>>>>>>>>> Steve
>>>>>>>>>>
>>>>>>>>>>>>> Stephen Woodbridge <woodbri at SWOODBRIDGE.COM> 1/2/2007
>>>> 9:46:46
>>>>>>>>>>>>> AM
>>>>>>>>>>>>>
>>>>>>>>>> HNY All,
>>>>>>>>>>
>>>>>>>>>> Last year I spent a fair amount of time optimizing a bunch
> of
>>>>>>>>>> mapfiles
>>>>>>>>>>
>>>>>>>>>> for performance. One of the areas that seems to always be a
>>>>>>>>>> performance
>>>>>>>>>>
>>>>>>>>>> hog if processing the Label Cache. I have been successful
> are
>>>>>>>>>> thinning
>>>>>>>>>>
>>>>>>>>>> down the number of labels going into the cache, but I
>> wondering
>>>>>> if
>>>>>>>>>> there might be a more efficient way of dealing with it.
> Since
>> I
>>>>>>> have
>>>>>>>>>> been playing with rasters over the holidays I was wondering
>> if
>>>>> it
>>>>>>>>>> might be possible to implement the label cache as a bitmap
>> mask
>>>>>>> and
>>>>>>>>>> us a raster
>>>>>>>>>>
>>>>>>>>>> OP to see if there is a collision?
>>>>>>>>>>
>>>>>>>>>> OK, I'm sure it is possible, I guess my real question(s)
> are:
>>>>>>>>>> Has this approach been looked at before? Any thoughts on
>>>>>>>>> performance?
>>>>>>>>>> My idea is:
>>>>>>>>>>
>>>>>>>>>> 1) create an label cache image 2) to check a label, get its
>>>> bbox
>>>>>>>>>> polyon check if it overlaps and set pixels 3) polygon fill
>> the
>>>>>>> bbox
>>>>>>>>>> of a label into the label cache image when a label is
>> rendered
>>>>>>>>>> we might be able to use a 2:1 or 4:1 reduced size label
> cache
>>>>>>> image
>>>>>>>>>> with some thought about how to work around the issues.
>>>>>>>>>>
>>>>>>>>>> I would think that this would be much faster than searching
> a
>>>>>> list
>>>>>>>>> of
>>>>>>>>>>  drawn labels and checking for intersections. And it should
>> be
>>>>>>>>> linear
>>>>>>>>>>  with the number of labels in the cache.
>>>>>>>>>>
>>>>>>>>>> -Steve
>> +-----------------------------------------------------------------+
>>>>>>>> |Paul Spencer                          pspencer at dmsolutions.ca 
> 
>>>>> |
>>>>>
>> +-----------------------------------------------------------------+
>>>>>>>> |Chief Technology Officer                                      
> 
>>>>> |
>>>>>>>> |DM Solutions Group Inc               
>> http://www.dmsolutions.ca/ 
>>>>> |
>>>>>
>> +-----------------------------------------------------------------+



More information about the mapserver-dev mailing list