[postgis-users] Re: MySQL 4.1 GIS Functionality
Paul DuBois
paul at kitebird.com
Tue Aug 5 10:25:09 PDT 2003
At 10:17 -0700 8/5/03, Paul Ramsey wrote:
>In order to retain context, I'll not edit out any of the document,
>which means the nesting is all now +1. :)
Okay, I'll do the same.
>
>On Tuesday, August 5, 2003, at 09:49 AM, Paul DuBois wrote:
>
>>Sorry for this delayed response to a message that was posted some
>>weeks ago. I only recently became aware of the original message,
>>because I subscribed after June 19, 2003 when it was posted.
>>
>>Thanks to Paul Ramsey (and Dave Blasby?) for taking a look at
>
>The both of us. Dave did the lion's share of the write up.
>
>>MySQL's GIS support and giving it a test. I have a couple of pieces
>>of information in reply from Alexey Botchkov, one of the MySQL AB
>>developers. Replies to him can be sent through me, either at
>>paul at kitebird.com or paul at mysql.com. Or you can write to Alexey
>>directly at hf at mysql.com, but please also cc: me so that I can
>>follow the discussion. Thanks.
>>
>>The doubly quoted sections below are by Paul Ramsey, the singly quoted
>>sections are by Alexey. I have also added a few comments of my own.
>>
>>>> * Spatial data (geometry) only works with the MyISAM table type.
>>>
>>>We need a kind of RTree indexes implemented in InnoDB to make spatial
>>>extentions working with transactional bases.
>>
>>To confirm: Spatial column types indeed currently are supported only
>>for MyISAM tables.
>>
>>>
>>>> * Trying to create create an InnoDB table with a geometry column, will
>>>> crash the server and muddle the system tables.
>>>
>>>Yes, we should prevent creating spatial fields in InnoDB now.
>>
>>This means that you now will get an error if you try to create a
>>table other than a MyISAM table that includes spatial columns.
>>For example:
>>
>>mysql> CREATE TABLE t (g GEOMETRY) TYPE = INNODB;
>>ERROR 1178 (42000): The storage engine for the table doesn't support GEOMETRY
>>
>>Both Alexey and I were surprised that Paul experienced a server crash
>>and muddled system tables. (By "system tables", I assume what was meant
>>was the grant tables in the mysql database? If not, can you clarify?)
>>Alexey suspects this was a problem in an earlier version that has been
>>corrected.
>
>Dave can perhaps clarify. It has been a couple months. As I recall,
>the system thought we had a table which did not in fact exist
>(because the creation had failed).
>
>>By the way, what was the test system (or systems)?
>
>Linux 2.4 (Mandrake 8) on a dual AMD 2000+ w/ 2Gb RAM.
Did you compile MySQL yourself or use a precompiled distribution?
(Reason I ask is that we have observed some problems with GCC
and GIS columns unless optimization is turned off, or at least down
to -O1.)
>
>>
>>>
>>>> * The MySQL spatial index is not null-safe. Marking the geometry column
>>>> as "NOT NULL" is required in order to build a MySQL spatial index. The
>>>> PostGIS spatial index is null safe.
>>>
>>>We're working on this.
>>
>>To confirm: Spatial columns included in an index must indeed currently
>>be declared NOT NULL.
>>
>>>
>>>> * We did three test queries to compare the retrieval times on the
>>>> spatial index: first with a very small bounding box (enclosing 6
>>>> records), then with a medium box (70K records), then with a huge box
>>>> (all 200K records).
>>>
>>>When we tried PostGIS against MySQL we got almost equal productivity on
>>>selects.
>>>Can i get data you used for these tests?
>
>I'll have to check. I fear that the data we used was proprietary to
>one of our clients (but convenient to us). We can create a new
>similar test set from a non-proprietary data source.
>
>After publishing the note, we had some feedback from people to the
>effect that we were probably seeing cache effects between the two
>systems. The PostGIS index is smaller (alot smaller) so it was
>cached in memory (my correspondent pointed out that our 2ms result
>was faster than an expected disk seek time).
>
>So I repeated the (small area select) test in two ways. First I
>forced both systems to read off the disk (by cold starting both
>systems and feeding them the initial query right away). And they
>both gave the same times, about 20ms.
>
>But cold reading off the disk is not normal operational behavior. So
>I ensured that MySQL had enough (300Mb) shared memory space to cache
>the whole index (which at 22Mb was actually larger than the table
>from which it was derived) and then ran the test. The cold search
>for both systems remained 20ms, as the indexes were pulled off the
>disk. After that postgis (150Mb of shared memory) was doing 2-3ms
>searches, and the mysql searches varied from 10-20ms.
You could also enable MySQL's query cache, in which case the second and
subsequent execution of a query should yield almost instantaneous
results. :-)
>
>>>
>>>> * MySQL spatial supports only 2D geometries.
>>>
>>>We don't have any plans here.
>>>
>>>> * This String/Geometry duality causes the structure of the code to
>>>> fairly complex. Geometry processing is done mostly in the String
>>>> handling section of the code base, but also appears in the SQL parser
>>>> and other scattered parts of the system. We have no idea why they did
>>>> this, but it sure makes it difficult to follow the logic in the system.
>>>
>>>We couldn't find a better way to put OpenGIS recomendation in MySQL parser.
>>>Probably there's better solution. We plan to review this part of the code.
>>>
>>>> * We did not look too deeply at the indexing source code, but it in the
>>>> myisam/ directory, so we assume it is either built inside a myisam
>>>> structure or is only useable on myisam tables. Either way, transactions
>>>> and ACID guarantees will be lacking.
>>>
>>>I don't know the InnoDB code, but I suspect that implementation of
>>>spatial indexes will differ significally here.
>>>
>>>Regards.
>>>HF
>>
>>
>>Some additional comments in reply to other points in Paul's original
>>message:
>>
>>>* MySQL stores the spatial data internally as WKB. Retrieving WKB should
>>>be very fast, but we were not able to test that directly at this time.
>>
>>The internal structure is closer to WKB + SRID.
>>
>>>* The MySQL code contains an assumption in the WKB handling that all WBK
>>>will be little endian. If you run MySQL spatial on a big endian machine
>>>(Sparc, PowerPC, PA-RISC) you will find things go amiss when you try to
>>>send WKB to little endian machines (x86), because your big endian
>>>machine will be creating invalid data (big endian data incorrectly
>>>flagged as little endian). Hopefully MySQL will fix this soon.
>>
>>I'm not sure about this. For example, I've directly copied the
>>files corresponding to tables containing GIS data from Mac OS X
>>(big-endian) to Red Hat on Intel (little-endian) and been able to
>>use them with no problem.
>>
>>If you discovered a problem in this area, I'm sure the developers
>>would like to see a repeatable test case so that they can address it.
>>Thanks.
>>
>
>We would have to do some work for that. Our prices are reasonable :)
>But compiling on Solaris asking for WKB (AsBinary()) and then
>checking the results for validity against the WKB spec should be
>sufficient.
>
>>>
>>>* Spatial types are stored as WKB Strings in the MySQL database. They
>>>seem to have allocated a header on these for the SRID, but access to
>>>this is unimplemented.
>>
>>I'm unclear what this means. There is an SRID() function for obtaining
>>the SRID of a geometry value.
>
>Our mistake then.
>>
>>>* 4.1 is still just the "testing" release, but we did find it
>>>surprisingly brittle. Our attempts to create geometries in InnoDB
>>>tables failed (fine) but created inconsistent system tables in the
>>>process (not fine). Spatial indexes seemed impossible to create, until
>>>we realized the creation command seemed be sensitive to the case of the
>>>referenced table name.
>>
>>This is most likely a manifestation of MySQL's general behavior that
>>table (and database) names are case sensitive on machines with case
>>sensitive filesystems, unless you run the server with the
>>--lower-case-table-names=1 option.
>
>Yes, one of our in-house MySQL experts pointed out that we were
>going to look a little foolish about mentioning that as a "problem"
>and we do. I don't know what the SQL92 spec says about case
>sensitivity. It was a surprise at the time :)
>
>>>
>>>* The MySQL developers made a real effort to follow the OpenGIS
>>>specification, sometimes to their own detriment. There are four
>>>different MBR comparison functions, with the correct (and confusing)
>>>OpenGIS-style function names (MBRContains, MBRTouches, MBRIntersects,
>>>MBRWithin). Intersects and Touches are subtly different things.
>>
>>Can you amplify more on what "to their own detriment" means?
>>That it would have been better to implement these as operators
>>rather than as functions?
>
>It means that I think the OGC spec for naming the predicates is a
>fairly confusing, and that is about all it means. The MySQL guys
>implemented the OGC naming scheme correctly for their bbox operators
>(although they did not have to, it was a nice touch of consistency)
>and because the spec is confusing, using it is detrimental (but
>probably still a good idea). Clear as mud? It is just that the
>scheme is confusing. Our complete spatial predicates will use the
>same scheme (it's the spec, dammit!), and we'll have the same
>problems with it. Do you know what Intersects() means? Touches()?
>:) Subtle enough differences that you basically need diagrams to
>explain the damn things.
Gotcha. Thanks.
>
> Paul Ramsey
> Refractions Research
> Email: pramsey at refractions.net
> Phone: (250) 885-0632
>
>
>_______________________________________________
>postgis-users mailing list
>postgis-users at postgis.refractions.net
>http://postgis.refractions.net/mailman/listinfo/postgis-users
More information about the postgis-users
mailing list