[GRASS-dev] HTML files
Michael Barton
michael.barton at asu.edu
Wed Aug 20 02:02:58 EDT 2008
Glynn,
Can you also put this important information into the wiki programming
guide so that it doesn't get lost as easily?
Michael
On Aug 19, 2008, at 3:46 PM, <grass-dev-request at lists.osgeo.org> wrote:
> Date: Tue, 19 Aug 2008 17:55:03 +0100
> From: Glynn Clements <glynn at gclements.plus.com>
> Subject: [GRASS-dev] HTML files
> To: <grass-dev at lists.osgeo.org>
> Message-ID: <18602.64231.566897.339008 at cerise.gclements.plus.com>
> Content-Type: text/plain; charset=us-ascii
>
>
> I have been through and fixed some problems which prevented some of
> the HTML files from validating. AFAICT, everything now validates (with
> the sole exception of missing "alt" attributes within <img> tags).
>
> Please ensure that all HTML files continue to validate against the
> HTML 4.0 Transitional DTD. At some point, I want to replace g.html2man
> with something more robust (e.g. something which handles tables), and
> I don't particularly want to make a "smart" (i.e. fault-tolerant) HTML
> parser (e.g. Beautiful Soup) a required dependency.
>
> If you have OpenSP or OpenJade, you can validate an HTML file with
> e.g.:
>
>     nsgmls -s -c /usr/share/sgml/openjade-1.3.2/pubtext/HTML4.soc <filename>.html
>
> [The program may be called nsgmls or onsgmls, and the exact location
> where the catalogues are installed will vary.]
>
> This needs to be done on the completed HTML file in
> dist.<arch>/docs/html; the <module>.html files in the module
> directories won't normally validate, as they lack the header which is
> added by running the module with the --html-description option.
>
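A minimal Python sketch of running that check over a whole directory of finished HTML files, assuming the same "-s -c <catalogue>" invocation quoted above. The function names and the html_dir/catalog arguments are hypothetical and not part of GRASS; the catalogue path and the output directory have to be supplied for the local installation.

```python
import shutil
import subprocess
from pathlib import Path


def find_validator(candidates=("nsgmls", "onsgmls")):
    """Return the first validator found on PATH, or None if there is none."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def build_command(validator, catalog, html_file):
    """Build the argument list: <validator> -s -c <catalog> <file>."""
    return [validator, "-s", "-c", str(catalog), str(html_file)]


def validate_dir(html_dir, catalog, validator=None):
    """Validate every *.html file in html_dir.

    Returns a dict mapping file name to the validator's error text,
    containing only the files that produced errors.
    """
    validator = validator or find_validator()
    if validator is None:
        raise RuntimeError("neither nsgmls nor onsgmls was found on PATH")
    failures = {}
    for html_file in sorted(Path(html_dir).glob("*.html")):
        result = subprocess.run(
            build_command(validator, catalog, html_file),
            capture_output=True,
            text=True,
        )
        # -s suppresses the normal parser output, so this sketch treats a
        # non-zero exit status or anything on stderr as a validation failure.
        if result.returncode != 0 or result.stderr.strip():
            failures[html_file.name] = result.stderr
    return failures
```

Calling validate_dir() on the completed docs/html directory with the local HTML4.soc catalogue would then give one entry per file that fails to validate.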
> FWIW, the most common error was using block elements (e.g. <div>,
> <pre>, <p>) in contexts where only inline elements are allowed
> (primarily <dt>).
>
> You can determine which elements are allowed where from the DTD:
>
> http://www.w3.org/TR/1998/REC-html40-19980424/sgml/loosedtd.html
>
> E.g. the definition:
>
> <!ELEMENT DT - O (%inline;)* -- definition term -->
>
> indicates that only inline elements are allowed inside DT, while e.g.:
>
> <!ELEMENT DD - O (%flow;)* -- definition description -->
>
> indicates that both block and inline elements are allowed inside DD.
>
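The DT rule can be checked roughly without an SGML parser. The following is only a sketch using Python's standard html.parser module: the class and function names are made up for illustration, the list of block elements is copied from the %block summary in this message, and it is not a substitute for validating against the DTD.

```python
from html.parser import HTMLParser

# Block-level element names, copied from the %block summary in this message.
BLOCK = {
    "p", "dl", "div", "center", "noscript", "noframes", "blockquote",
    "form", "isindex", "hr", "table", "fieldset", "address",
    "h1", "h2", "h3", "h4", "h5", "h6", "ul", "ol", "dir", "menu", "pre",
}


class DTBlockChecker(HTMLParser):
    """Record block elements that start while a DT is still open."""

    def __init__(self):
        super().__init__()
        self.in_dt = False
        self.errors = []  # list of (line number, tag name)

    def handle_starttag(self, tag, attrs):
        if tag == "dt":
            self.in_dt = True
        elif tag == "dd":
            # A DD start tag implicitly ends the preceding DT.
            self.in_dt = False
        elif self.in_dt and tag in BLOCK:
            line, _offset = self.getpos()
            self.errors.append((line, tag))

    def handle_endtag(self, tag):
        if tag in ("dt", "dl"):
            self.in_dt = False


def blocks_in_dt(html_text):
    """Return (line, tag) pairs for block elements found inside a DT."""
    checker = DTBlockChecker()
    checker.feed(html_text)
    checker.close()
    return checker.errors
```

An empty result only means this one rule was not violated; the other content-model errors still need the real validator.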
> If you don't want to read the DTD, here's a rough summary:
>
> Entity classes:
>
> %StyleSheet    = <CSS stylesheet>
> %Script        = <JavaScript code>
>
> %html.content  = HEAD, BODY
> %head.content  = TITLE, ISINDEX, BASE
> %heading       = H1, H2, H3, H4, H5, H6
> %fontstyle     = TT, I, B, U, S, STRIKE, BIG, SMALL
> %phrase        = EM, STRONG, DFN, CODE, SAMP, KBD, VAR, CITE, ABBR,
>                  ACRONYM
> %special       = A, IMG, APPLET, OBJECT, FONT, BASEFONT, BR, SCRIPT,
>                  MAP, Q, SUB, SUP, SPAN, BDO, IFRAME
> %formctrl      = INPUT, SELECT, TEXTAREA, LABEL, BUTTON
> %list          = UL, OL, DIR, MENU
> %head.misc     = SCRIPT, STYLE, META, LINK, OBJECT
> %pre.exclusion = IMG, OBJECT, APPLET, BIG, SMALL, SUB, SUP,
>                  FONT, BASEFONT
> %preformatted  = PRE
> %block         = P, DL, DIV, CENTER, NOSCRIPT, NOFRAMES,
>                  BLOCKQUOTE, FORM, ISINDEX, HR, TABLE, FIELDSET,
>                  ADDRESS, %heading, %list, %preformatted
> %inline        = #PCDATA, %fontstyle, %phrase, %special, %formctrl
> %flow          = %block, %inline
>
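For scripting purposes the entity classes above can be written out as plain Python sets. This is just a transcription of the summary with the nested entities expanded; the variable names and the classify() helper are illustrative only.

```python
# Entity classes from the summary above, as sets of element names.
HEADING = {"H1", "H2", "H3", "H4", "H5", "H6"}
FONTSTYLE = {"TT", "I", "B", "U", "S", "STRIKE", "BIG", "SMALL"}
PHRASE = {"EM", "STRONG", "DFN", "CODE", "SAMP", "KBD", "VAR", "CITE",
          "ABBR", "ACRONYM"}
SPECIAL = {"A", "IMG", "APPLET", "OBJECT", "FONT", "BASEFONT", "BR",
           "SCRIPT", "MAP", "Q", "SUB", "SUP", "SPAN", "BDO", "IFRAME"}
FORMCTRL = {"INPUT", "SELECT", "TEXTAREA", "LABEL", "BUTTON"}
LIST = {"UL", "OL", "DIR", "MENU"}
PREFORMATTED = {"PRE"}

# %block, %inline and %flow with the nested entities expanded.
BLOCK = {"P", "DL", "DIV", "CENTER", "NOSCRIPT", "NOFRAMES", "BLOCKQUOTE",
         "FORM", "ISINDEX", "HR", "TABLE", "FIELDSET", "ADDRESS"}
BLOCK = BLOCK | HEADING | LIST | PREFORMATTED
INLINE = {"#PCDATA"} | FONTSTYLE | PHRASE | SPECIAL | FORMCTRL
FLOW = BLOCK | INLINE


def classify(name):
    """Return "block", "inline", or None for elements in neither class."""
    name = name.upper()
    if name in BLOCK:
        return "block"
    if name in INLINE:
        return "inline"
    return None
```

Elements such as LI, DT, DD, or TR belong to neither class, so classify() returns None for them; where they may appear is determined by the content model of their parent.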
> The immediate children permitted for each element are:
>
> A: %inline
> ABBR: %inline
> ACRONYM: %inline
> ADDRESS: %inline, P
> APPLET: %flow, PARAM
> B: %inline
> BDO: %inline
> BIG: %inline
> BLOCKQUOTE: %flow
> BODY: %flow, INS, DEL
> BUTTON: %flow
> CAPTION: %inline
> CENTER: %flow
> CITE: %inline
> CODE: %inline
> COLGROUP: COL
> DD: %flow
> DEL: %flow
> DFN: %inline
> DIR: LI
> DIV: %flow
> DL: DT, DD
> DT: %inline
> EM: %inline
> FIELDSET: %flow, LEGEND
> FONT: %inline
> FORM: %flow
> FRAMESET: FRAMESET, FRAME, NOFRAMES
> H1: %inline
> H2: %inline
> H3: %inline
> H4: %inline
> H5: %inline
> H6: %inline
> HEAD: %head.content, %head.misc
> HTML: %html.content
> I: %inline
> IFRAME: %flow
> INS: %flow
> KBD: %inline
> LABEL: %inline
> LEGEND: %inline
> LI: %flow
> MAP: %block, AREA
> MENU: LI
> NOFRAMES: %flow
> NOSCRIPT: %flow
> OBJECT: %flow, PARAM
> OL: LI
> OPTGROUP: OPTION
> OPTION: #PCDATA
> P: %inline
> PRE: %inline
> Q: %inline
> S: %inline
> SAMP: %inline
> SCRIPT: %Script
> SELECT: OPTGROUP, OPTION
> SMALL: %inline
> SPAN: %inline
> STRIKE: %inline
> STRONG: %inline
> STYLE: %StyleSheet
> SUB: %inline
> SUP: %inline
> TABLE: CAPTION, COL, COLGROUP, THEAD, TFOOT, TBODY
> TBODY: TR
> TD: %flow
> TEXTAREA: #PCDATA
> TFOOT: TR
> TH: %flow
> THEAD: TR
> TITLE: #PCDATA
> TR: TH, TD
> TT: %inline
> U: %inline
> UL: LI
> VAR: %inline
>
> Some elements don't allow certain elements as descendants:
>
> A: A
> BUTTON: %formctrl, A, FORM, ISINDEX, FIELDSET, IFRAME
> DIR: %block
> FORM: FORM
> LABEL: LABEL
> MENU: %block
> PRE: %pre.exclusion
> TITLE: %head.misc
>
> Notes:
>
> 1. The children of DIR/MENU are LI, which is a block element, but
> those LI can't contain block elements. UL/OL don't have this
> restriction.
>
> 2. DT cannot contain block elements, but DD can. This means that you
> can't use <div class="code"><pre> in a DT; use <span class="code"><tt>
> instead. DIV and PRE are block elements; SPAN and TT are inline.
>
> 3. TABLE cannot have TR as a child. But TBODY can have TR, and TBODY
> allows both the start and end tags to be omitted, so
> <table><tr>....</tr></table> is really just a shorthand for
> <table><tbody><tr>....</tr></tbody></table>.
>
> 4. P cannot contain blocks. So <p>...<div> is actually shorthand for
> <p>...</p><div>. But <p>...<div>...</div>...</p> is an error, as the
> </p> doesn't match any open element (the <div> implicitly closed the
> original <p>, and P doesn't allow the start tag to be omitted).
>
> 5. HTML, HEAD, BODY, and TBODY allow the start tag to be omitted. With
> the exception of TBODY, this feature shouldn't be used (it's a
> nuisance to implement if the number of valid child tags is large).
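The situation in note 4 (a </p> left over after a block element has implicitly closed the paragraph) can be spotted with a small script. This is a rough sketch using Python's standard html.parser module, with hypothetical names; it follows only the implicit closing of P by block elements and ignores every other tag-omission rule, so it can miss cases that the real validator reports.

```python
from html.parser import HTMLParser

# Block-level element names, copied from the %block summary in this message.
BLOCK = {
    "p", "dl", "div", "center", "noscript", "noframes", "blockquote",
    "form", "isindex", "hr", "table", "fieldset", "address",
    "h1", "h2", "h3", "h4", "h5", "h6", "ul", "ol", "dir", "menu", "pre",
}


class StrayPChecker(HTMLParser):
    """Record the line numbers of </p> tags that match no open <p>."""

    def __init__(self):
        super().__init__()
        self.p_open = False
        self.stray = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK:
            # P cannot contain blocks, so any block start tag (including
            # another <p>) implicitly closes an open paragraph.
            self.p_open = False
        if tag == "p":
            self.p_open = True

    def handle_endtag(self, tag):
        if tag == "p":
            if not self.p_open:
                self.stray.append(self.getpos()[0])
            self.p_open = False
        elif tag in BLOCK:
            # The end tag of an enclosing block also ends an open paragraph.
            self.p_open = False


def stray_p_lines(html_text):
    """Return the line numbers of </p> tags with no matching open <p>."""
    checker = StrayPChecker()
    checker.feed(html_text)
    checker.close()
    return checker.stray
```

For the example in note 4, <p>...<div>...</div>...</p>, this reports the line of the final </p>.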
>
> --
> Glynn Clements <glynn at gclements.plus.com>