[Live-demo] Re: [OSGeo] #896: sphinx doc build is broken because of BOM
edgar.soldin at web.de
Sat May 12 02:55:29 PDT 2012
On 12.05.2012 03:00, OSGeo wrote:
> #896: sphinx doc build is broken because of BOM
> ---------------------+------------------------------------------------------
> Reporter: fgdrf | Owner: live-demo@…
> Type: defect | Status: new
> Priority: major | Milestone:
> Component: LiveDVD | Keywords:
> ---------------------+------------------------------------------------------
>
> Comment(by hamish):
>
> the Byte Order Mark has been added and removed from the .csv lists of
> contributors for a while now.
>
> I haven't really been sure if they should be there or not so only did a
> quick edit just before the last release to stop the table creation from
> breaking.
>
> It's easy enough to open with vi and delete the first two chars in the
> file if needed.. Converting UTF back to ISO-8859-1 isn't too bad either:
> `iconv -f UTF-8 -t ISO_8859-1 utf_file > iso_file`
>
>
> Qs:
> * Should the BOM be there or not?
according to
http://en.wikipedia.org/wiki/Byte_order_mark
it is meaningless for UTF-8 but allowed.
> * What files (if any) should be saved in UTF-8, and why? (ISO will not
> handle non-Western multibytes, but that doesn't necessitate that the
> English/Western pages also be in UTF)
>
> this is out of my area of expertise, but the constant "last committer
> wins" back and forth of text file variants is as we see here causing
> problems.
>
WHICH:
i'd suggest defining realms in which everything uses *one* character encoding, e.g. UTF-8 for the docs, and announcing that encoding so people can set up their editor accordingly.
WHY:
users whose languages have characters outside latin-1 (aka ISO_8859-1) can then write names and texts natively without having to escape or convert them.
as UTF-8 is backwards compatible with ASCII, it also keeps at least that range (currently the most important one user-base-wise) intact even on misconversion.
editor software usually warns when trying to open or save characters that the selected character set does not support.
we could actually use svn properties (svn:mime-type) to assign a MIME type and character set to specific files, which most svn clients respect.
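to illustrate the points above, a small python sketch. the sample strings are made up for illustration and are not taken from the contributor lists. it shows that pure ASCII text has the same bytes in all three encodings, that a name outside latin-1 cannot be stored in ISO_8859-1 at all, and that reading UTF-8 as latin-1 garbles only the non-ASCII characters:

```python
# Sample strings are illustrative assumptions, not data from the repository.
ascii_text = "contributors"

# Pure ASCII text produces identical bytes in ASCII, UTF-8 and ISO-8859-1.
same_bytes = (
    ascii_text.encode("ascii")
    == ascii_text.encode("utf-8")
    == ascii_text.encode("iso-8859-1")
)

# A name with a character outside latin-1 cannot be encoded as ISO-8859-1.
try:
    "\u0141ukasz".encode("iso-8859-1")
    latin1_can_store_name = True
except UnicodeEncodeError:
    latin1_can_store_name = False

# Misconversion: UTF-8 bytes read as ISO-8859-1. Each non-ASCII character
# turns into two wrong characters, but the ASCII letters are untouched.
utf8_bytes = "J\u00fcrgen".encode("utf-8")
misread = utf8_bytes.decode("iso-8859-1")
```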
for the BOM issue:
i don't know the sphinx internals, but would it be too difficult to conditionally strip the BOM on each file read? just for safety?
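a minimal sketch of what i mean, in python since sphinx is written in it. both function names are hypothetical helpers, not sphinx functions, and i am not claiming this is where or how sphinx reads its files. the first relies on python's standard `utf-8-sig` codec, which drops a leading UTF-8 BOM if there is one and otherwise decodes like plain UTF-8; the second does the same check by hand on raw bytes:

```python
import codecs


def read_text_without_bom(path):
    """Read a UTF-8 text file, dropping a leading BOM if one is present.

    Hypothetical helper for illustration; not part of sphinx.
    """
    # 'utf-8-sig' skips the BOM bytes EF BB BF at the start of the file
    # and decodes everything else as ordinary UTF-8.
    with open(path, encoding="utf-8-sig") as handle:
        return handle.read()


def strip_bom(data):
    """Return the bytes unchanged, minus a leading UTF-8 BOM if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

files without a BOM pass through both helpers unchanged, so the stripping is safe to apply to every file regardless of which editor saved it last.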
..ede