Label Cache Performance
Stephen Woodbridge
woodbri at SWOODBRIDGE.COM
Wed Jan 17 16:32:58 EST 2007
Steve,
Hope you had a good vacation. As you wade through your email, any
thoughts on this? I didn't see any follow up post from your on it.
-Steve W.
Stephen Woodbridge wrote:
> Steve Lime wrote:
>> Too bad our hash implmentation doesn't allow an arbitrary value (e.g.
>> void *) per key. Looks like
>> it's char *'s.
>
> Bummer! So is this a show stopper?
> Can we cast another pointer into a char *?
> Seems like this is an opportunity to improve our hash implementation so
> it can be used by other pieces of the code. Or we could just use the GNU
> system has routines.
>
> And label_drawn_1 below probably needs to be a struct or array of the
> pixel x,y where the label was drawn so it can be used for the distance
> calculation.
>
> -Steve W
>
>> Steve
>>
>>>>> Stephen Woodbridge <woodbri at SWOODBRIDGE.COM> 1/8/2007 2:09:26 PM
>>>>>
>> Steve,
>>
>> The hash key for say
>> "Main St" => Array(label_drawn_1, label_drawn_2, ..]
>> We only need to remember the the items that were actually drawn on the
>>
>> map since these are the only ones that need to be compared against when
>>
>> considering a potentially new label.
>>
>> Since the MINDISTANCE test is localized to a LAYER and possibly a CLASS
>>
>> modifying the key to include just the CLASS will suffice, like
>> "<class_num>:<label_text>". The hash would be created for each layer
>> and then disposed and recreated for the next layer. This is ok because
>> MINDISTANCE does not span LAYERs unless I am mistaken.
>>
>> So here is pseudo-code for how this might work. Obviously there are
>> details omitted and this would be implemented to be efficient. My goal
>>
>> here is to express the concept:
>>
>> for each layer {
>> if MINDISTANCE {
>> hash = create_hash()
>> } else {
>> hash = NULL;
>> }
>> for each label {
>> if (mindistance_ok(this.label, hash) &&
>> label_fits(this.label, bitmap)) {
>> # assumes it will update hash
>> draw_label(this.label, hash, bitmap)
>> }
>> }
>> }
>>
>> mindistance_ok() returns true if hash == NULL
>> label_fits() compares the label polygon to the bitmap and returns true
>>
>> if it fits
>> draw_label() renders the label and updates the hash if it is not NULL
>> and updates the bitmap.
>>
>> I can expand the pseudo-code for mindistance_ok() and label_fits() is
>> needed, but it seems that we should agree in principal that this will
>> work for all existing cases. I have not heard any case that this can
>> not be easily adapted to support.
>>
>> Steve - Does this cover the case you were concerned about? If not can
>> you explain the scenario a little better, because I'm missing it.
>>
>> -Steve
>>
>> Steve Lime wrote:
>>> Was thinking about this more and don't think a hash for MINDISTANCE
>> will
>>> work since
>>> the hash keys need to be unique and label/class/layer certainly
>> isn't.
>>> You might have
>>> 50 occurances of the same label for a series of road segments. I'm
>> sure
>>> there's a workaround but it would undoubtedly be complex...
>>>
>>> Steve
>>>
>>>>>> Stephen Woodbridge <woodbri at SWOODBRIDGE.COM> 1/3/2007 9:35:26 PM
>>>>>>
>>> Steve Lime wrote:
>>>> We use the loops for MINDISTANCE because we obviously are already
>>> doing
>>>> so for collisions.
>>>>
>>>> I thought about a hash too. Trouble is it's based not solely on the
>>>> label text, but
>>>> also class and layer indexes. For example you wouldn't want to
>> ommit
>>> a
>>>> road
>>>> label 'Washington' because of it's proximity to 'Washington' lake.
>>>> Those types of
>>>> situations are not all that rare...
>>> Sure, but if you do the bitmap collision detection and use the hash
>> for
>>> MINDISTANCE, the hash would only exist while you are processing a
>>> single layer and then get dumped and a new hash created for the next
>>> layer.
>> If
>>> you need to keep hash entries unique by CLASS also, then you could
>>> either create an array of hashes and index by classnum or have a
>> single
>>> hash and prefix ot postfix the keys with the class like
>>> "classnum:key".
>>>
>>> This should be very straight forward and I think very, very fast
>>> compared to the way it works now. I'm sure there are other little
>>> adjustments that would need to be made, but I can't think of any show
>>
>>> stoppers.
>>>
>>> -Steve
>>>
>>>> Steve
>>>>
>>>>
>>>>>>> Stephen Woodbridge <woodbri at SWOODBRIDGE.COM> 1/3/2007 12:14:06
>> PM
>>>> If I were dealing with the MINDISTANCE issue in Perl I would create
>> a
>>>> hash where the key was the label text, because that is what
>>> MINDISTANCE
>>>> is restricting, and make the hash value an array of locations. This
>>
>>>> would much faster than any search algorithm and you would narrow
>> the
>>>> search to the array of placed text items of the same name.
>>>>
>>>> There is no reason this approach could not be used for the
>>> MINDISTANCE
>>>> case in mapserver. Using this and the bitmap collision detection
>>> should
>>>> remove most of the overhead of the label cache processing and make
>> it
>>>> very scalable with respect to number of items to be processed.
>>>>
>>>> -Steve
>>>>
>>>> Steve Lime wrote:
>>>>> I did think of one other gotcha. The cache is not just used for
>>>> physical
>>>>> overlap, there
>>>>> are other tests applied. For example, MINDISTANCE, which filters
>>> out
>>>>> like labels based
>>>>> on proximity. That requires direct comparison of the bounding
>>>> polygons.
>>>>> One might be
>>>>> able to reduce that to a test on label points (in fact that might
>>> be
>>>>> what's there now,
>>>>> I'd have to check) but that still requires traversing placed
>> labels
>>>>> (albeit only those
>>>>> within a particular layer)...
>>>>>
>>>>> Steve
>>>>>
>>>>>>>> Stephen Woodbridge <woodbri at swoodbridge.com> 1/2/2007 1:27:47
>> PM
>>>>> Yes, it works for all cases. You have to think about the problem
>> as
>>> a
>>>>> raster problem not a vector or character problem.
>>>>>
>>>>> So you do this by creating an image the size of rectangular bbox
>> of
>>>> the
>>>>> label. Then you render the the character or label polygons boxes
>>> into
>>>>> the image. The image now becomes a mask representing the pixels of
>>>>> where the label will fall if it was rendered.
>>>>>
>>>>> the test: Logical AND this mask image with the map label image
>>> given
>>>>> its position in that image. You have a collision on the first AND
>>> result
>>>>> the it true.
>>>>>
>>>>> to update the map label image, just OR this label mask with the
>> map
>>>>> label image given its position in that image.
>>>>>
>>>>> if you want to test another location, ie: you are searching for an
>>
>>>>> location nearby that the label will fit, then change the position
>>> by
>>>> a
>>>>> pixel and repeat the test. This is more costly than just a test,
>>> but
>>>> is
>>>>> still pretty efficient. I don't think you would want to apply this
>>> to
>>>>> every label, but having it as an option for a layer might be
>>>>> interesting. You could constrain this to say search up/down or
>>>>> right/left 10 pixels and then something like lot labels could
>>> easily
>>>> be
>>>>> staggered. This is just an idea and it might not farm out to be
>>>> useful,
>>>>> but once the machinery is in, it falls out as a simple variation
>> on
>>>> the
>>>>> theme.
>>>>>
>>>>> -Steve
>>>>>
>>>>> Paul Spencer wrote:
>>>>>> This would need to work for 'curved' labels too. Would it work
>> to
>>>> do
>>>>>> things per-character?
>>>>>>
>>>>>> Paul
>>>>>>
>>>>>> On 2-Jan-07, at 12:40 PM, Steve Lime wrote:
>>>>>>
>>>>>>> Remember also that you're not dealing strictly with bounding
>>> boxes
>>>>> (e.g.
>>>>>>> rects) but
>>>>>>> actual polygons whenever rotation is applied. This is so you get
>>>>>>> relatively tight fits
>>>>>>> around labels and can cram labels on a map.
>>>>>>>
>>>>>>>>>> Stephen Woodbridge <woodbri at swoodbridge.com> 1/2/2007
>> 10:25:49
>>>>> AM
>>>>>>> Steve Lime wrote:
>>>>>>>> Interesting idea. No, it's not been looked at before, not by me
>>>>>>>> anyway. Frank might have some thoughts on performance. You'd
>>>> still
>>>>>>>> have to loop through the affected area to find collisions but I
>>>>>>> think
>>>>>>>> your correct in that the search time would basically be
>> constant
>>>>>>>> (affected slightly by the size of the label).
>>>>>>>>
>>>>>>>> Wouldn't be that hard to implement. I wonder if GD holding the
>>>>> cache
>>>>>>>> image could be useful or if a custom 2-D bit array would be
>>>>> faster-
>>>>>>>> probably the latter...
>>>>>>> I thin a 2-D bit array would be much faster and require
>>>>> significantly
>>>>>>> less memory, but would require writing a little additional code.
>>>>> The
>>>>>>> upside to this is that the code could be optimized for this
>>>>> specific
>>>>>>> task, unless there are other potential uses for this.
>>>>>>>
>>>>>>> Another idea I had related to this scheme would be that you
>> could
>>>>>>> implement a function could search for a placement position
>>> withing
>>>>> some
>>>>>>> number of pixels. Here the idea would be that you use the
>> polygon
>>>>> to
>>>>>>> create a mask and see if the mask collides, if not you are done,
>>>>>>> otherwise you do a spiral search outwards from the anchor point
>>>> for
>>>>>>> some
>>>>>>> number of pixels looking for a place the label will fit. This is
>>>>> easy
>>>>>>> to
>>>>>>> do in some nested loops and very fast.
>>>>>>>
>>>>>>> This idea of a moving mask is fundamental to a lot of the raster
>>>>>>> manipulation algorithm and is easy to implement and very fast to
>>>>>>> execute.
>>>>>>>
>>>>>>> -Steve W
>>>>>>>
>>>>>>>> Steve
>>>>>>>>
>>>>>>>>>>> Stephen Woodbridge <woodbri at SWOODBRIDGE.COM> 1/2/2007
>> 9:46:46
>>>>>>>>>>> AM
>>>>>>>>>>>
>>>>>>>> HNY All,
>>>>>>>>
>>>>>>>> Last year I spent a fair amount of time optimizing a bunch of
>>>>>>>> mapfiles
>>>>>>>>
>>>>>>>> for performance. One of the areas that seems to always be a
>>>>>>>> performance
>>>>>>>>
>>>>>>>> hog if processing the Label Cache. I have been successful are
>>>>>>>> thinning
>>>>>>>>
>>>>>>>> down the number of labels going into the cache, but I wondering
>>>> if
>>>>>>>> there might be a more efficient way of dealing with it. Since I
>>>>> have
>>>>>>>> been playing with rasters over the holidays I was wondering if
>>> it
>>>>>>>> might be possible to implement the label cache as a bitmap mask
>>>>> and
>>>>>>>> us a raster
>>>>>>>>
>>>>>>>> OP to see if there is a collision?
>>>>>>>>
>>>>>>>> OK, I'm sure it is possible, I guess my real question(s) are:
>>>>>>>>
>>>>>>>> Has this approach been looked at before? Any thoughts on
>>>>>>> performance?
>>>>>>>> My idea is:
>>>>>>>>
>>>>>>>> 1) create an label cache image 2) to check a label, get its
>> bbox
>>>>>>>> polyon check if it overlaps and set pixels 3) polygon fill the
>>>>> bbox
>>>>>>>> of a label into the label cache image when a label is rendered
>>>>>>>>
>>>>>>>> we might be able to use a 2:1 or 4:1 reduced size label cache
>>>>> image
>>>>>>>> with some thought about how to work around the issues.
>>>>>>>>
>>>>>>>> I would think that this would be much faster than searching a
>>>> list
>>>>>>> of
>>>>>>>> drawn labels and checking for intersections. And it should be
>>>>>>> linear
>>>>>>>> with the number of labels in the cache.
>>>>>>>>
>>>>>>>> -Steve
>>> +-----------------------------------------------------------------+
>>>>>> |Paul Spencer pspencer at dmsolutions.ca
>>> |
>>> +-----------------------------------------------------------------+
>>>>>> |Chief Technology Officer
>>> |
>>>>>> |DM Solutions Group Inc http://www.dmsolutions.ca/
>>
>>> |
>>> +-----------------------------------------------------------------+
More information about the mapserver-dev
mailing list