[GRASS-dev] Re: new v.in.geonames: problems with UTF-8 Unicode text

Glynn Clements glynn at gclements.plus.com
Thu Jul 3 16:29:00 EDT 2008


Ivan Shmakov wrote:

> >>>>> Glynn Clements <glynn at gclements.plus.com> writes:
> 
>  >>>>> SQLite doesn't have VARCHAR. If you create a VARCHAR column, the
>  >>>>> driver translates it to TEXT. The size you specify when the table
>  >>>>> is created isn't stored anywhere.
> 
>  >> Not exactly.  While I don't know of the GRASS SQLite driver
>  >> specifics, the CREATE TABLE commands are preserved by SQLite:
> 
> [...]
> 
>  >> Of course, any width limit specified via VARCHAR () is ignored by
>  >> SQLite itself.
> 
>  > Interesting. But not particularly useful unless we add our own SQL
>  > parser to the SQLite driver.
> 
>  > Well, it should only need to handle
>  > CREATE TABLE statements, but it would still need to understand
>  > exactly the syntax which SQLite supports (and the table might have
>  > been created by a future version).
> 
> [...]
> 
>  > IOW, change db/drivers/sqlite/describe.c to parse the sqlite_master
>  > table according to the syntax used by
>  > db/drivers/sqlite/create_table.c.
> 
> 	Hm.  Now I see sqlite3_column_decltype () in the SQLite API
> 	reference?

Er, right. ISTR that there are reasons why the SQLite driver doesn't
rely upon that much. However, I suspect that it will produce the same
result as parsing the string from sqlite_master.

IIRC, the main problem is that sqlite3_column_decltype() only works
for actual columns, not expressions, subselects etc, but the code in
question has to be able to describe the format of rows returned by
arbitrary SELECT statements, not just tables.

So it falls back to sqlite3_column_type(), which returns the type of
the column's data. but that only works if you have a valid row. If a
select doesn't return any rows, you lose. Also, sqlite3_column_type()
only understands null, integer, float, text and blob types.

Apart from needing to fall-back to the data type for expressions,
another problem is that it coerces the decltype to one of SQLite's
limited set of types, then converts that to one of the DB_SQL_* types,
losing information in the process.

However, I don't know if it does this for a specific reason. I suspect
the only way to find out is to try it and see what breaks.

>  >> IIUC, the very reason to allow SQL type specifications to be used in
>  >> CREATE TABLE was the interoperability with other SQL
>  >> implementations.
> 
>  > SQLite allows them, but essentially ignores them (other than the
>  > information it puts into sqlite_master).
> 
> 	That's the point -- while it behaves its own way, it allows
> 	standards-compliant database schemata to be reused with little
> 	or no modification.  (Or, rather, it relies on de-facto
> 	standards, for that matter.)

This works well enough if the application already knows the schema,
but it's problematic if you're trying to obtain it from the database
itself.

>  >> But then, don't the major SQL RDBMS support the TEXT type?  It has
>  >> unlimited width and thus implies no incompatibility with SQLite.
> 
>  > They support the TEXT type, but the DBMI seems to assume fixed sizes,
>  > e.g. the aforementioned problem with the PostgreSQL driver failing to
>  > report the size for TEXT fields.
> 
>  > The MySQL driver simply reports the length reported by MySQL; I have
>  > no idea what that will be for TEXT columns.
> 
> 	Shouldn't there be a separate value for the ``no size limit''
> 	case?

Probably. But that potentially means re-writing the clients to handle
this case. OTOH, so far no-one seems to have actually noticed that you
get garbage for the length of a TEXT field with the PG driver:

	$ db.describe test
	table:test
	description:
	insert:?
	delete:?
	ncols:1
	nrows:0
	
	column:col
	description:
	type:TEXT
====>	len:134591376
	scale:0
	precision:0
	default:
	nullok:yes
	select:?
	update:?

-- 
Glynn Clements <glynn at gclements.plus.com>


More information about the grass-dev mailing list