[GRASS-dev] HTML files

Michael Barton michael.barton at asu.edu
Wed Aug 20 02:02:58 EDT 2008


Glynn,

Can you also put this important information into the wiki programming
guide so that it doesn't get lost as easily?

Michael

On Aug 19, 2008, at 3:46 PM, <grass-dev-request at lists.osgeo.org> wrote:

> Date: Tue, 19 Aug 2008 17:55:03 +0100
> From: Glynn Clements <glynn at gclements.plus.com>
> Subject: [GRASS-dev] HTML files
> To: <grass-dev at lists.osgeo.org>
> Message-ID: <18602.64231.566897.339008 at cerise.gclements.plus.com>
> Content-Type: text/plain; charset=us-ascii
>
>
> I have been through and fixed some problems which prevented some of
> the HTML files from validating. AFAICT, everything now validates (with
> the sole exception of missing "alt" attributes within <img> tags).
>
> Please ensure that all HTML files continue to validate against the
> HTML 4.0 Transitional DTD. At some point, I want to replace g.html2man
> with something more robust (e.g. something which handles tables), and
> I don't particularly want to make a "smart" (i.e. fault-tolerant) HTML
> parser (e.g. Beautiful Soup) a required dependency.
>
> If you have OpenSP or OpenJade, you can validate an HTML file with
> e.g.:
>
> 	nsgmls -s -c /usr/share/sgml/openjade-1.3.2/pubtext/HTML4.soc <filename>.html
>
> [The program may be called nsgmls or onsgmls, and the exact location
> where the catalogues are installed will vary.]
>
> This needs to be done on the completed HTML file in
> dist.<arch>/docs/html; the <module>.html files in the module
> directories won't normally validate, as they lack the header which is
> added by running the module with the --html-description option.
>
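For anyone who wants to check a whole docs directory in one go, here is a minimal Python sketch of the command above wrapped in a loop. The function names (find_validator, build_command, validate_dir) are made up for illustration, and the default catalogue path is simply the one from the example command, so it will probably need adjusting on your system. It assumes the validator reports problems on stderr, which is what "-s" (suppress normal output) leaves you with.

```python
import shutil
import subprocess
from pathlib import Path

# Assumption: catalogue path copied from the example command; adjust locally.
DEFAULT_CATALOG = "/usr/share/sgml/openjade-1.3.2/pubtext/HTML4.soc"


def find_validator():
    """Return the first of onsgmls/nsgmls found on PATH, or None."""
    for name in ("onsgmls", "nsgmls"):
        path = shutil.which(name)
        if path:
            return path
    return None


def build_command(validator, html_file, catalog=DEFAULT_CATALOG):
    """Build the argument list: -s suppresses output, -c names the catalogue."""
    return [validator, "-s", "-c", catalog, str(html_file)]


def validate_dir(html_dir, catalog=DEFAULT_CATALOG):
    """Validate every *.html file in html_dir; return {path: error text}."""
    validator = find_validator()
    if validator is None:
        raise RuntimeError("neither onsgmls nor nsgmls was found on PATH")
    failures = {}
    for html_file in sorted(Path(html_dir).glob("*.html")):
        result = subprocess.run(
            build_command(validator, html_file, catalog),
            capture_output=True,
            text=True,
        )
        # Treat a non-zero exit status or any stderr output as a failure.
        if result.returncode != 0 or result.stderr.strip():
            failures[str(html_file)] = result.stderr
    return failures
```

Calling validate_dir() on the dist.<arch>/docs/html directory (with the real architecture name filled in) should give back an empty dictionary when everything validates.
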
> FWIW, the most common error was using block elements (e.g. <div>,
> <pre>, <p>) in contexts where only inline elements are allowed
> (primarily <dt>).
>
> You can determine which elements are allowed where from the DTD:
>
> http://www.w3.org/TR/1998/REC-html40-19980424/sgml/loosedtd.html
>
> E.g. the definition:
>
> <!ELEMENT DT - O (%inline;)*           -- definition term -->
>
> indicates that only inline elements are allowed inside DT, while e.g.:
>
> <!ELEMENT DD - O (%flow;)*             -- definition description -->
>
> indicates that both block and inline elements are allowed inside DD.
>
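If it helps to see how such a declaration breaks down, the following rough Python sketch splits an <!ELEMENT ...> line into its parts. The regular expression and the parse_element name are my own and only cover simple single-line declarations like the two quoted above, not the full SGML syntax. The two flags after the element name are the standard SGML tag-omission markers: "-" means the tag is required, "O" means it may be omitted (start tag first, then end tag).

```python
import re

# Matches simple declarations such as:
#   <!ELEMENT DT - O (%inline;)*   -- definition term -->
ELEMENT_RE = re.compile(
    r"<!ELEMENT\s+(?P<name>\S+)\s+(?P<start>[-O])\s+(?P<end>[-O])\s+"
    r"(?P<model>.*?)\s*(?:--(?P<comment>.*?)--\s*)?>",
    re.DOTALL,
)


def parse_element(decl):
    """Split one ELEMENT declaration into name, omission flags and model."""
    match = ELEMENT_RE.search(decl)
    if match is None:
        raise ValueError("not a recognised ELEMENT declaration: %r" % decl)
    return {
        "name": match.group("name"),
        "start_tag_optional": match.group("start") == "O",
        "end_tag_optional": match.group("end") == "O",
        "model": match.group("model"),
        "comment": (match.group("comment") or "").strip(),
    }


dt = parse_element("<!ELEMENT DT - O (%inline;)*           -- definition term -->")
dd = parse_element("<!ELEMENT DD - O (%flow;)*             -- definition description -->")
print(dt["name"], dt["model"])
print(dd["name"], dd["model"])
```

The "model" field is the part to read: (%inline;)* for DT versus (%flow;)* for DD.
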
> If you don't want to read the DTD, here's a rough summary:
>
> Entity classes:
>
> 	%StyleSheet	= <CSS stylesheet>
> 	%Script		= <JavaScript code>
> 	
> 	%html.content	= HEAD, BODY
> 	%head.content	= TITLE, ISINDEX, BASE
> 	%heading	= H1, H2, H3, H4, H5, H6
> 	%fontstyle	= TT, I, B, U, S, STRIKE, BIG, SMALL
> 	%phrase		= EM, STRONG, DFN, CODE, SAMP, KBD, VAR, CITE, ABBR,
> 			  ACRONYM
> 	%special	= A, IMG, APPLET, OBJECT, FONT, BASEFONT, BR, SCRIPT,
> 			  MAP, Q, SUB, SUP, SPAN, BDO, IFRAME
> 	%formctrl	= INPUT, SELECT, TEXTAREA, LABEL, BUTTON
> 	%list		= UL, OL,  DIR, MENU
> 	%head.misc	= SCRIPT, STYLE, META, LINK, OBJECT
> 	%pre.exclusion	= IMG, OBJECT, APPLET, BIG, SMALL, SUB, SUP,
> 			  FONT, BASEFONT
> 	%preformatted	= PRE
> 	%block		= P, DL, DIV, CENTER, NOSCRIPT, NOFRAMES,
> 			  BLOCKQUOTE, FORM, ISINDEX, HR, TABLE, FIELDSET,
> 			  ADDRESS, %heading, %list, %preformatted
> 	%inline		= #PCDATA, %fontstyle, %phrase, %special, %formctrl
> 	%flow		= %block, %inline
>
> The immediate children permitted for each element are:
> 	
> 	A:		%inline
> 	ABBR:		%inline
> 	ACRONYM:	%inline
> 	ADDRESS:	%inline, P
> 	APPLET:		%flow, PARAM
> 	B:		%inline
> 	BDO:		%inline
> 	BIG:		%inline
> 	BLOCKQUOTE:	%flow
> 	BODY:		%flow, INS, DEL
> 	BUTTON:		%flow
> 	CAPTION:	%inline
> 	CENTER:		%flow
> 	CITE:		%inline
> 	CODE:		%inline
> 	COLGROUP:	COL
> 	DD:		%flow
> 	DEL:		%flow
> 	DFN:		%inline
> 	DIR:		LI
> 	DIV:		%flow
> 	DL:		DT, DD
> 	DT:		%inline
> 	EM:		%inline
> 	FIELDSET:	%flow, LEGEND
> 	FONT:		%inline
> 	FORM:		%flow
> 	FRAMESET:	FRAMESET, FRAME, NOFRAMES
> 	H1:		%inline
> 	H2:		%inline
> 	H3:		%inline
> 	H4:		%inline
> 	H5:		%inline
> 	H6:		%inline
> 	HEAD:		%head.content, %head.misc
> 	HTML:		%html.content
> 	I:		%inline
> 	IFRAME:		%flow
> 	INS:		%flow
> 	KBD:		%inline
> 	LABEL:		%inline
> 	LEGEND:		%inline
> 	LI:		%flow
> 	MAP:		%block, AREA
> 	MENU:		LI
> 	NOFRAMES:	%flow
> 	NOSCRIPT:	%flow
> 	OBJECT:		%flow, PARAM
> 	OL:		LI
> 	OPTGROUP:	OPTION
> 	OPTION:		#PCDATA
> 	P:		%inline
> 	PRE:		%inline
> 	Q:		%inline
> 	S:		%inline
> 	SAMP:		%inline
> 	SCRIPT:		%Script
> 	SELECT:		OPTGROUP, OPTION
> 	SMALL:		%inline
> 	SPAN:		%inline
> 	STRIKE:		%inline
> 	STRONG:		%inline
> 	STYLE:		%StyleSheet
> 	SUB:		%inline
> 	SUP:		%inline
> 	TABLE:		CAPTION, COL, COLGROUP, THEAD, TFOOT, TBODY
> 	TBODY:		TR
> 	TD:		%flow
> 	TEXTAREA:	#PCDATA
> 	TFOOT:		TR
> 	TH:		%flow
> 	THEAD:		TR
> 	TITLE:		#PCDATA
> 	TR:		TH, TD
> 	TT:		%inline
> 	U:		%inline
> 	UL:		LI
> 	VAR:		%inline
>
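The two tables above can be turned into a small lookup, sketched below in Python. The ENTITIES and CHILDREN dictionaries are copied from the summary (only a handful of elements are included to keep it short), and the expand/allows helpers are hypothetical names of my own. It only checks immediate children and ignores ordering, repetition and the descendant exclusions, so it is a quick reference rather than a validator.

```python
# Entity classes, copied from the summary above.
ENTITIES = {
    "%heading": ["H1", "H2", "H3", "H4", "H5", "H6"],
    "%fontstyle": ["TT", "I", "B", "U", "S", "STRIKE", "BIG", "SMALL"],
    "%phrase": ["EM", "STRONG", "DFN", "CODE", "SAMP", "KBD", "VAR", "CITE",
                "ABBR", "ACRONYM"],
    "%special": ["A", "IMG", "APPLET", "OBJECT", "FONT", "BASEFONT", "BR",
                 "SCRIPT", "MAP", "Q", "SUB", "SUP", "SPAN", "BDO", "IFRAME"],
    "%formctrl": ["INPUT", "SELECT", "TEXTAREA", "LABEL", "BUTTON"],
    "%list": ["UL", "OL", "DIR", "MENU"],
    "%preformatted": ["PRE"],
    "%block": ["P", "DL", "DIV", "CENTER", "NOSCRIPT", "NOFRAMES",
               "BLOCKQUOTE", "FORM", "ISINDEX", "HR", "TABLE", "FIELDSET",
               "ADDRESS", "%heading", "%list", "%preformatted"],
    "%inline": ["#PCDATA", "%fontstyle", "%phrase", "%special", "%formctrl"],
    "%flow": ["%block", "%inline"],
}

# Immediate children for a few elements, copied from the list above.
CHILDREN = {
    "DL": ["DT", "DD"],
    "DT": ["%inline"],
    "DD": ["%flow"],
    "P": ["%inline"],
    "PRE": ["%inline"],
    "LI": ["%flow"],
    "UL": ["LI"],
    "TABLE": ["CAPTION", "COL", "COLGROUP", "THEAD", "TFOOT", "TBODY"],
    "TBODY": ["TR"],
    "TR": ["TH", "TD"],
    "TD": ["%flow"],
}


def expand(names):
    """Recursively replace %entity references with the elements they name."""
    result = set()
    for name in names:
        if name.startswith("%"):
            result |= expand(ENTITIES[name])
        else:
            result.add(name)
    return result


def allows(parent, child):
    """True if child may appear as an immediate child of parent."""
    return child in expand(CHILDREN[parent])


print(allows("DT", "DIV"), allows("DD", "DIV"))
```

For example, allows("DT", "SPAN") is true while allows("DT", "PRE") is false, and allows("TABLE", "TR") is false while allows("TBODY", "TR") is true.
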
> Some elements don't allow certain elements as descendants:
>
> 	A:		A
> 	BUTTON:		%formctrl, A, FORM, ISINDEX, FIELDSET, IFRAME
> 	DIR:		%block
> 	FORM:		FORM
> 	LABEL:		LABEL
> 	MENU:		%block
> 	PRE:		%pre.exclusion
> 	TITLE:		%head.misc
>
> Notes:
>
> 1. The children of DIR/MENU are LI, which is a block element, but
> those LI can't contain block elements. UL/OL don't have this
> restriction.
>
> 2. DT cannot contain block elements, but DD can. This means that you
> can't use <div class="code"><pre> in a DT; use <span class="code"><tt>
> instead. DIV and PRE are block elements; SPAN and TT are inline.
>
> 3. TABLE cannot have TR as a child. But TBODY can have TR, and TBODY
> allows both the start and end tags to be omitted, so
> <table><tr>....</tr></table> is really just a shorthand for
> <table><tbody><tr>....</tr></tbody></table>.
>
> 4. P cannot contain blocks. So <p>...<div> is actually shorthand for
> <p>...</p><div>. But <p>...<div>...</div>...</p> is an error, as the
> </p> doesn't match any open element (the <div> implicitly closed the
> original <p>, and P doesn't allow the start tag to be omitted).
>
> 5. HTML, HEAD, BODY, and TBODY allow the start tag to be omitted. With
> the exception of TBODY, this feature shouldn't be used (it's a
> nuisance to implement if the number of valid child tags is large).
>
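Notes 2 and 4 describe the two mistakes that are easiest to spot mechanically, so here is a rough lint sketch in Python using the standard html.parser module. The BlockLint class and lint function are hypothetical names, and the logic is deliberately simplistic: it tracks only whether a DT or a P is currently open, flags block elements started inside a DT, and flags a </p> that has no open P left to close (because a block element implicitly closed it). It is no substitute for validating against the DTD.

```python
from html.parser import HTMLParser

# %block from the summary, lower-cased because HTMLParser lower-cases tags.
BLOCK = {
    "p", "dl", "div", "center", "noscript", "noframes", "blockquote",
    "form", "isindex", "hr", "table", "fieldset", "address",
    "h1", "h2", "h3", "h4", "h5", "h6", "ul", "ol", "dir", "menu", "pre",
}


class BlockLint(HTMLParser):
    """Flag block elements inside <dt> and </p> tags with no open <p>."""

    def __init__(self):
        super().__init__()
        self.in_dt = False
        self.p_open = False
        self.problems = []  # list of (line number, message)

    def report(self, message):
        self.problems.append((self.getpos()[0], message))

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK:
            if self.in_dt:
                self.report("block element <%s> inside <dt>" % tag)
            # A block element implicitly closes any open <p> (note 4).
            self.p_open = False
        if tag == "p":
            self.p_open = True
        elif tag == "dt":
            self.in_dt = True
        elif tag == "dd":
            self.in_dt = False  # <dd> implicitly ends the preceding <dt>

    def handle_endtag(self, tag):
        if tag == "p":
            if not self.p_open:
                self.report("</p> does not match any open <p>")
            self.p_open = False
        elif tag in ("dt", "dl"):
            self.in_dt = False


def lint(text):
    """Return the list of (line, message) problems found in text."""
    parser = BlockLint()
    parser.feed(text)
    parser.close()
    return parser.problems


for line, message in lint("<p>intro<div>code</div>more</p>"):
    print(line, message)
```

With this sketch, "<p>...<div>...</div>" on its own passes, while adding a trailing "</p>" is reported, matching the behaviour described in note 4.
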
> -- 
> Glynn Clements <glynn at gclements.plus.com>


