[GRASS-dev] Re: new v.in.geonames: problems with UTF-8 Unicode text

Glynn Clements glynn at gclements.plus.com
Mon Jul 7 14:34:20 EDT 2008


Ivan Shmakov wrote:

>  > Operators such as || which operate on strings normally accept any
>  > operand types. Anything that isn't a string will be converted to one.
>  > But the result is always a string.
> 
>  > Typically, an operator is selected according to its operand types,
>  > and the operator's result type determines the type of the result
>  > column.  E.g. suppose that you have:
> 
>  > SELECT a + b FROM foo;
> 
>  > If a and/or b are REAL then the operator would be REAL * REAL -> REAL
>  > and the result would be REAL. If a and b were both INT then the
>  > operator would be INT * INT -> INT and the result would be INT.
> 
> 	From the above I could conclude that the other RDBMS are
> 	generally representing the results of expressions in SELECTs as
> 	having one type out of a narrow set of types (in absence of
> 	explicit type casts.)  This way, SQLite doesn't look much
> 	different to those.
> 
> 	Or, to put it differently, I see the discussed problem as not
> 	being specific to SQLite.

The problem is that SQLite doesn't type columns, but the values which
are in them. Typical RDBMS implementations type the columns. IOW,
dynamic typing (SQLite) versus static typing (everything else).

This means that, with SQLite, you need at least one row of data to
determine the types. It also means that if a column is null for that
particular row, the column will appear to be null.

It also means that, for VARCHAR columns, you cannot determine the
maximum width that can be stored in the column.

>  > One issue is that SQLite doesn't validate the column types. It uses
>  > some heuristics to determine the affinity type, but ultimately it
>  > allows any sequence of words along with an optional size specifier,
>  > e.g.:
> 
>  > CREATE TABLE foo ( i INT, t TEXT, b ROAST BEEF(99) );
> 
> 	Of course.  So, it becomes the user's responsibility to make
> 	sure that the definition of the table in question is compatible
> 	with the RDBMS to which it's copied.
> 
> 	I guess that much the same way, once the table is defined to use
> 	any types specific to an RDBMS implementation, it cannot be
> 	readily copied to any other RDBMS.
> 
> 	The only SQLite-specific problem that I see there is that it
> 	doesn't actually treat the type name as a constraint on the
> 	values put into that field.  But then, ``garbage in, garbage
> 	out'' is the guiding principle.

Yep. This isn't a problem for tables which are created through the
DBMI, as we can ensure that it accepts whatever it creates. The
problem arises if the user creates or modifies tables externally.

The other problem is that various DBMI drivers and clients don't
understand TEXT fields. Although that isn't inherently specific to the
SQLite driver, it does bring up the issue is how the driver should
treat fields whose decltype isn't recognised (given that, ultimately,
all SQLite columns are effectively TEXT columns).

-- 
Glynn Clements <glynn at gclements.plus.com>


More information about the grass-dev mailing list