[postgis-users] MySQL 4.1 GIS Functionality

Paul Ramsey pramsey at refractions.net
Thu Jun 19 14:42:08 PDT 2003


Dave and I have spent a little time over the last few days investigating 
the spatial capabilities which have been added to the MySQL database as 
of the 4.1 version. 4.1 is the current "testing" version, and 4.0 is the 
current "shipping" version.

1. Notes from Testing
---------------------

* Spatial data (geometry) only works with the MyISAM table type.  MySQL 
supports multiple tables types, with different performance 
characteristics. MyISAM is supposed to have high performance, but no 
transactional capability. So that means that you cannot have 
transactionally managed spatial updates.

* Trying to create create an InnoDB table with a geometry column, will 
crash the server and muddle the system tables. The systems tables must 
be MyISAM, since this is an effect of a failed transaction.

* The Spatial Index is almost certainly either tied to MyISAM or stored 
in a MyISAM-type structure.  At best this leaves your index vulnerable 
to ACID transactional issues.

* We used a 200K segment road database to test data handling.

* Data load time for the test table was significantly faster in MySQL. 
To load 200K rows it took 34 seconds on MySQL and 71 seconds on PgSQL. 
  MyISAM tables are noted for being fast at digesting nontransactional 
inserts. PgSQL has to manage the inserts within a transactional context.

* Spatial index creation took about 14 seconds in MySQL and 9 seconds in 
PostGIS.  This is probably because MySQL uses the quadatic pick-split 
algorthym and PostGIS uses a linear-time version.

* The MySQL spatial index is not null-safe. Marking the geometry column 
as "NOT NULL" is required in order to build a MySQL spatial index.  The 
PostGIS spatial index is null safe.

* We did three test queries to compare the retrieval times on the 
spatial index: first with a very small bounding box (enclosing 6 
records), then with a medium box (70K records), then with a huge box 
(all 200K records).

* To search for a very small area (6 records) in the 200,000 row table 
took about 20 milliseconds on MySQL and about 2 milliseconds on PostGIS.
* To search for a medium sized area (70,000 records) in the 200,000 row 
table took about 1.03 seconds in MySQL and 0.25 seconds in PostGIS.
* To search for a very large area (all records) in the 200,000 row table 
took about 2.25 seconds in MySQL and 0.75 seconds in PostGIS.

* MySQL spatial supports only 2D geometries.

* MySQL stores the spatial data internally as WKB. Retrieving WKB should 
be very fast, but we were not able to test that directly at this time.

* Because the internal representation of MySQL geometries is simple WKB, 
un-indexed bounding box searches are fairly slow. The database must 
calculate the bounding box for each feature in order to do the comparisons.


2. Notes from Examining the Source Code
---------------------------------------

* The MySQL code contains an assumption in the WKB handling that all WBK 
will be little endian. If you run MySQL spatial on a big endian machine 
(Sparc, PowerPC, PA-RISC) you will find things go amiss when you try to 
send WKB to little endian machines (x86), because your big endian 
machine will be creating invalid data (big endian data incorrectly 
flagged as little endian). Hopefully MySQL will fix this soon.

* Spatial types are stored as WKB Strings in the MySQL database.  They 
seem to have allocated a header on these for the SRID, but access to 
this is unimplemented.

* Inside the MySQL database, the geometries are stored in exactly the 
same type as String. The system does an Internal String -> Geometry 
conversion (which very easy because the Geometry type is just WKB).

* This String/Geometry duality causes the structure of the code to 
fairly complex. Geometry processing is done mostly in the String 
handling section of the code base, but also appears in the SQL parser 
and other scattered parts of the system.  We have no idea why they did 
this, but it sure makes it difficult to follow the logic in the system.

* We did not look too deeply at the indexing source code, but it in the 
myisam/ directory, so we assume it is either built inside a myisam 
structure or is only useable on myisam tables.  Either way, transactions 
  and ACID guarantees will be lacking.

* MySQL support for user defined functions has some limitations as well. 
  It appears that in order to create a user defined function, the 
programmer must allocate the memory for the results before seeing the 
return objects. For large and variable sized things like geometries, 
this could be problematic.  There might be work-arounds for this.

* MySQL does not offer any real support for object-relational 
functionality.  PostGIS is built entirely on PostgreSQL’s 
object-relational structure, making it very isolated from the postgresql 
development and very easy to maintain.  In fact, the only PgSQL-related 
maintenance to PostGIS has occurred because of significant improvements 
to the GIST index.

3. General Impressions
----------------------

* 4.1 is still just the "testing" release, but we did find it 
surprisingly brittle.  Our attempts to create geometries in InnoDB 
tables failed (fine) but created inconsistent system tables in the 
process (not fine).  Spatial indexes seemed impossible to create, until 
we realized the creation command seemed be sensitive to the case of the 
referenced table name.

* The MySQL developers made a real effort to follow the OpenGIS 
specification, sometimes to their own detriment. There are four 
different MBR comparison functions, with the correct (and confusing) 
OpenGIS-style function names (MBRContains, MBRTouches, MBRIntersects, 
MBRWithin). Intersects and Touches are subtly different things.

* There are no full predicates in this implementation, the predicate 
functions are all stubs against bounding box operations.

* There are no operators either (Intersection, Union, etc).

* Basically, it is a start. With some data loading tools and a little 
more stability for general purpose stuff, it could be a passable web 
serving platform.


-- 
       __
      /
      | Paul Ramsey
      | Refractions Research
      | Email: pramsey at refractions.net
      | Phone: (250) 885-0632
      \_




More information about the postgis-users mailing list