[mapserver-dev] RFC98 status update

Tue Jul 9 06:58:36 PDT 2013

Devs,

Here's a status update about the rather major changes that are
happening for and around RFC98. As a reminder, RFC98 is about
refactoring the handling of text inside mapserver, to move the layout
of individual glyphs out from the renderers (agg, cairo), and into a
common framework. The user-visible changes will be the correct
handling of complex scripts (arabic, hindi variants, etc...) and
usually a speedup of the labelcache phase.
All this will be incorporated in the RFC, but here it is anyway as it
is probably easier to read than a diff. Most of you can skip to the
summary at the end.

Dependencies:
===========
- Harfbuzz is the library used for shaping and is an added dependency
- Fribidi stays, but is now only used for determining bidi runs and
no longer for shaping. Hopefully the thread safety issues in fribidi
were in the shaping part, in which case we may be able to remove the
thread locks around fribidi calls. ICU could have been a replacement
for bidi, but I preferred to keep using a current dependency rather
than replace it with a new one.
- UTHash (http://troydhanson.github.io/uthash/) is a header only
hashtable implementation that works with arbitrary keys and values,
and is used for accessing cached fonts and glyphs. Given some
performance testing, we might want to replace our own hashtable
implementation with this one some day.
- RFC99 (dropping GD support) is inside the rfc98 branch, so GD is no
longer a dependency.

Font and Glyph cache:
=================
(fontcache.c)
I'm still pondering as to where to store a font and glyph cache.
Making it global would ensure that cached glyphs are reusable across
multiple requests for the fastcgi case, but in turn requires some
thread-level protection and probably some pruning in order for it to
remain of reasonable size. For now its lifetime is tied to the
lifetime of the mapObj. Some APIs have changed in order to have the
fontcache accessible.
The fontcache contains caches for:
- font faces (i.e. the representation of a truetype file)
- glyphs (i.e. the metrics of an individual glyph at a given size)
- glyph bitmaps (the gray level rasterization of a glyph at a given
size, with no rotation)
Rotated glyph bitmaps do not get cached, as rotated text usually
results from data-driven parameters (i.e. angle follow or angle auto
labels), and such bitmaps are thus poor candidates for caching.
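
As an illustration only, the three cache levels can be sketched in
Python as below. This is not the C code in fontcache.c: the names
FaceEntry, FontCache, load_metrics and rasterize are hypothetical, and
the two callbacks stand in for whatever actually loads metrics and
rasterizes a glyph.

```python
from dataclasses import dataclass, field


@dataclass
class FaceEntry:
    """One cached font face (hypothetical layout)."""
    path: str
    metrics: dict = field(default_factory=dict)  # (codepoint, size) -> glyph metrics
    bitmaps: dict = field(default_factory=dict)  # (codepoint, size) -> unrotated bitmap


class FontCache:
    """Cache of faces, glyph metrics and unrotated glyph bitmaps."""

    def __init__(self, load_metrics, rasterize):
        self.faces = {}
        self.load_metrics = load_metrics  # assumed callback: (path, codepoint, size)
        self.rasterize = rasterize        # assumed callback: (path, codepoint, size, rotation)

    def face(self, path):
        if path not in self.faces:
            self.faces[path] = FaceEntry(path)
        return self.faces[path]

    def glyph_metrics(self, path, codepoint, size):
        face = self.face(path)
        key = (codepoint, size)
        if key not in face.metrics:
            face.metrics[key] = self.load_metrics(path, codepoint, size)
        return face.metrics[key]

    def glyph_bitmap(self, path, codepoint, size, rotation=0.0):
        # Rotated bitmaps are rasterized on every call and never stored.
        if rotation != 0.0:
            return self.rasterize(path, codepoint, size, rotation)
        face = self.face(path)
        key = (codepoint, size)
        if key not in face.bitmaps:
            face.bitmaps[key] = self.rasterize(path, codepoint, size, 0.0)
        return face.bitmaps[key]
```

Tying such an object to one map keeps it lock-free; a global variant
would need locking around the dictionary updates plus some eviction
policy.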

Text/Glyph representation:
====================
Text is represented by a "textPathObj", which is basically a list of
positioned glyphs (e.g. the word "Label" at size 10 in an arial font
is represented by the arial font's glyph "L" at position (x,y) = (0,0),
glyph "a" at position (10,0), "b" at (18,0), etc.). Multiline text
is handled transparently by having glyphs positioned at different y
values. A textPath can be either "absolute" (i.e. the glyph positions
are in absolute image coordinates, used to position glyphs for angle
follow labels), or "relative", in which case they must be offset by
their labeling point.
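
A minimal Python sketch of that representation, for illustration only.
The class and field names (Glyph, TextPath, absolute) are hypothetical
and do not mirror the actual textPathObj members.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Glyph:
    codepoint: int
    x: float
    y: float


@dataclass
class TextPath:
    glyphs: List[Glyph]
    absolute: bool = False  # True: positions are already image coordinates


def image_positions(path: TextPath, label_point: Tuple[float, float]):
    """Return the image coordinates at which each glyph should be drawn."""
    if path.absolute:
        # e.g. angle follow labels: glyphs were placed along the line already
        return [(g.x, g.y) for g in path.glyphs]
    px, py = label_point
    # relative path: shift every glyph by the labeling point
    return [(g.x + px, g.y + py) for g in path.glyphs]
```

For example, a relative path holding "L" at (0,0) and "a" at (10,0),
drawn at labeling point (100,50), yields (100,50) and (110,50).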

Renderer implications:
=================
- Functions to render a string of text (ex msDrawText), to render a
positioned string of text (ex msDrawTextLine), to render a truetype
marker symbol, and to compute the extents of a string of text are
removed.
- A function to render a textPathObj is added. A renderer may take
advantage of cached glyph bitmaps if needed.

LabelCache Implications:
===================
Work has been done to trim down the labelcache computations as much as possible:

When inserting features into the labelcache:
- We'll insert a reference to the original labelObj instead of a copy
if the labelObj and its child styleObjs don't contain any bindings.
This cuts down on memory usage when attribute bindings aren't in use.
- We don't insert features that will never get rendered (e.g. out of
scale, or too large for their feature (minfeaturesize keyword))
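
The reference-versus-copy decision can be sketched as follows. This is
a Python illustration under assumed names (Label, Style, bindings,
label_for_cache), not the actual labelcache code.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Style:
    bindings: dict = field(default_factory=dict)


@dataclass
class Label:
    size: float = 10.0
    bindings: dict = field(default_factory=dict)
    styles: list = field(default_factory=list)


def has_bindings(label: Label) -> bool:
    """True if the label or any of its child styles uses attribute bindings."""
    return bool(label.bindings) or any(s.bindings for s in label.styles)


def label_for_cache(label: Label) -> Label:
    # With bindings, per-feature values get written into the label, so each
    # cache entry needs its own copy; without them the shared original is safe.
    return copy.deepcopy(label) if has_bindings(label) else label
```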

At the msDrawLabelCache phase:
* We delay computing the label text bounding box until after we have
checked the conditions that would prevent the label from being
rendered, i.e.
  - for labels with MINDISTANCE set, whether a neighbouring label with
identical text has already been rendered
  - for labels without markers, whether the labelpoint collides with
an existing label.
* Collision detection has been optimized:
  - We keep a list of rendered labels and loop through those instead of
checking the status of every label in the labelcache for each
member
  - The bounding metrics for a label have been cut down from a full
shapeObj to a struct containing a bounding rect and an optional
lineObj. Non-rotated labels need nothing more than the bounding box,
which makes intersection detection much cheaper for two such labels
(i.e. the two labels overlap exactly when their bounding boxes
overlap, with no need for further geometric intersection primitives).
The speedups from these changes are substantial for cluttered
maps, cf. https://plus.google.com/u/0/118271009221580171800/posts/PrwhFYSkhea
(e.g. rendering time goes from 800 seconds to 1 second for 500,000 labels)
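
The collision test described above can be sketched roughly as follows
(Python, for illustration). LabelBounds, rects_overlap, collides and
the precise_test callback are hypothetical names; precise_test stands
in for whatever geometric test is applied to rotated labels.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LabelBounds:
    minx: float
    miny: float
    maxx: float
    maxy: float
    poly: Optional[list] = None  # outline points, only set for rotated labels


def rects_overlap(a: LabelBounds, b: LabelBounds) -> bool:
    # Touching edges are treated as overlapping in this sketch.
    return (a.minx <= b.maxx and b.minx <= a.maxx and
            a.miny <= b.maxy and b.miny <= a.maxy)


def collides(candidate: LabelBounds, rendered: list, precise_test) -> bool:
    """Check a candidate only against labels that were actually rendered."""
    for other in rendered:
        if not rects_overlap(candidate, other):
            continue  # cheap reject: bounding rects are disjoint
        if candidate.poly is None and other.poly is None:
            return True  # both unrotated: the rect test is the final answer
        if precise_test(candidate, other):
            return True  # at least one rotated label: refine with geometry
    return False
```

The loop cost grows with the number of labels already drawn rather
than with the total size of the labelcache.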

Miscellaneous libmapserver changes
===========================
- The code in mapprimitive to compute positions for angle auto and
angle follow has been sanitized and now directly uses computed glyph
metrics
- A number of functions have had their signatures changed to accommodate
the change of architecture

Text shaping
==========
None of the previous changes affect text shaping, which is a
major component of RFC98. All the shaping happens in textlayout.c,
whose principal role is to take a string of text as input and return
a list of positioned glyphs as output. The input string goes through
multiple steps and is split into multiple runs. Each run has a
single line number, bidi direction, and script "language".

As an example, we'll be working with the input unicode string "this is
some text in english, ARABIC and JAPANESE". Capital letters are used
to denote non latin glyphs, also note that ARABIC is stored in logical
(=reading) order, whereas it would be written as CIBARA.

- iconv encoding conversion to convert the string to unicode
 run1 = "this is some text in english, ARABIC and JAPANESE", line=0

- line wrapping: break on wrap character, break long lines on spaces
run1 = "this is some text in english,", line=0
run2 = "ARABIC and JAPANESE", line=1

- bidi levels
 => each run has a single bidi direction (i.e. left-to-right or right-to-left)
run1 = "this is some text in english,", line=0, direction=LTR
run2 = "ARABIC", line=1, direction=RTL
run3 = " and JAPANESE", line=1, direction=LTR

- script detection is applied to enable language-dependent shaping,
and also to refine which fonts will be used (more on that later)
run1 = "this is some text in english,", line=0, direction=LTR, script=latin
run2 = "ARABIC", line=1, direction=RTL, script=arabic
run3 = " and ", line=1, direction=LTR, script=latin
run4 = "JAPANESE", line=1, direction=LTR, script=hiragana

- for each run, we select which font should be used, so that a single
font is used inside a given run. A previous RFC made it possible to
specify multiple fonts for a LABEL; this has been extended so that you
can fine-tune which fonts are preferred for a given script:

LABEL "arialuni,arial,cjk,arabic"

can now be written prefixed by a script identifier, i.e.

LABEL "arialuni,en:arial,ja:cjk,ar:arabic"

This is needed because there is and will be overlap between font glyph
coverages, and it should be possible to prioritize which font is used
for which language.

- Each run is then fed into harfbuzz, which returns a list of
positioned glyphs. The number of returned glyphs is not necessarily
identical to the number of characters in the unicode string (e.g.
because of ligatures), and the glyphs are ordered from left to right.

- The glyphs of each run are reassembled to account for line numbers
and run positions (e.g. run 3 is offset down by one line, and placed
to the right of run 2)

- Each line is horizontally offset to account for ALIGN. LABEL ALIGN
no longer defaults to LEFT, so right-to-left runs will be right-aligned
instead of left-aligned as they are now.
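
The bidi and script splitting steps above can be approximated with the
toy Python sketch below. It is deliberately simplified: it uses
Python's unicodedata instead of fribidi, it does not implement the
full Unicode bidi algorithm, and it guesses the script from the first
word of the character name. Neutral characters (spaces, punctuation,
digits) simply join the preceding run, so run boundaries around spaces
can differ from the example runs listed above.

```python
import unicodedata
from itertools import groupby


def direction(ch):
    """Strong direction of a character, or None for neutral/weak ones."""
    bidi = unicodedata.bidirectional(ch)
    if bidi in ("R", "AL"):
        return "RTL"
    if bidi == "L":
        return "LTR"
    return None


def script(ch):
    """Crude script guess: first word of the Unicode character name."""
    if not ch.isalpha():
        return None
    name = unicodedata.name(ch, "")
    return name.split(" ")[0].lower() if name else None


def split_runs(line_text, line_no):
    """Split one wrapped line into runs of constant direction and script."""
    props, last = [], None
    for ch in line_text:
        d, s = direction(ch), script(ch)
        if d is None or s is None:
            props.append(last)  # neutral: inherit from the previous strong char
        else:
            last = (d, s)
            props.append(last)
    # Leading neutrals take the first strong properties (default LTR/latin).
    first = next((p for p in props if p is not None), ("LTR", "latin"))
    props = [p if p is not None else first for p in props]

    runs, pos = [], 0
    for (d, s), group in groupby(props):
        n = len(list(group))
        runs.append({"text": line_text[pos:pos + n], "line": line_no,
                     "direction": d, "script": s})
        pos += n
    return runs
```

Each resulting run carries exactly one line number, direction and
script, which is the granularity at which a font is selected and the
text is handed to the shaper.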

========
Summary
========

What we've gained
==============
- A consistent framework for laying out text. Future enhancements
(e.g. double-spaced text) should be simpler to implement.
- Correct (I hope) renderings of non latin scripts (multiline arabic
text, complex shaping)
- Fine grained control on which fonts to use. I also have some code
ready to use fontconfig for font selection, not used for now as it has
performance issues.
- Correct ALIGN support for RTL languages.
- Labelcache speedups and memory consumption decrease
- Correct handling of glyph placements with respect to their baseline

Backwards incompatible changes
=========================
- GD support has been dropped
- ANNOTATION layers are dropped (they've been deprecated since 6.0)
- HTML encoded entities inside attributes are not supported (e.g.
&eacute;, &#3456;). They are still supported for specifying which
character to use in truetype symbols.
- API changes (not sure how and if this affects mapscripts yet)
- Arabic shaping will depend on harfbuzz. A build without harfbuzz
will not fall back to fribidi shaping.
- All text related autotests fail, and output is slightly different
(hinting is disabled, metrics calculation may have slightly varied)

Open Issues
=========
- Where to store font and glyph cache
- For the kml renderer, is the labelcache phase necessary? If it is,
label bounds should be computed; however, we should be feeding the
original text, not individual glyphs, to the kml renderer.

regards,
Thomas