[MetaCRS] Common SQLite-based dictionaries

Sun Aug 2 06:15:12 PDT 2015

All,

libgeotiff and GDAL both use a system of bespoke CSV files for their coordinate system dictionaries. proj.4 uses derived dictionaries made from GDAL's. Each is a slightly different subset and/or mix of the EPSG database along with other catalogs, customizations, and overrides. The situation is messy, fragile, and incomplete, especially for folks like me who are interested in supporting ever more complex systems of horizontal and vertical datums, the time epochs associated with them, and direct transformation.

There have been multiple attempts to build a C tribe API that handles the coordinate system description problem, but all have failed for various reasons. The one true library to rule them all is probably a pipe dream, but maybe it is possible to collaborate in a slightly messier way -- at the dictionary level.

One significant technology that was not widely available when GDAL, proj.4, and libgeotiff all originated is SQLite. A single-file, SQL-queryable database is a standard assumption in today's software, especially in things like HTML5 (the wars between WebSQL and IndexedDB), just about every significant phone application, and your favorite OGC super format [1].

I'd like to propose an attempt to standardize the GDAL, proj.4, and libgeotiff coordinate system dictionaries on a SQLite database that starts with EPSG, with each project adding its own auxiliary tables as necessary. I am writing this message to MetaCRS to see if there is support for such an effort, and to determine if there are other related projects that would like to collaborate at this level.
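To make the "EPSG base plus per-project auxiliary tables" idea concrete, here is a minimal sketch using Python's standard sqlite3 module. The table and column names (epsg_crs, gdal_crs_extra, proj4_text) are hypothetical placeholders of my own, not the actual EPSG schema or anything GDAL ships today; the point is only that a project can hang its own data off a shared EPSG code without touching the base table.

```python
import sqlite3


def build_dictionary(conn):
    """Create a shared base table plus one project-specific auxiliary table.

    All table and column names here are hypothetical, for illustration only.
    """
    conn.executescript(
        """
        -- Shared base table, imagined as populated from the EPSG database.
        CREATE TABLE epsg_crs (
            code INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        -- Auxiliary table owned by one project (here, imagined as GDAL's).
        CREATE TABLE gdal_crs_extra (
            code       INTEGER PRIMARY KEY REFERENCES epsg_crs(code),
            proj4_text TEXT NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO epsg_crs (code, name) VALUES (?, ?)",
        [(4326, "WGS 84"), (3857, "WGS 84 / Pseudo-Mercator")],
    )
    conn.execute(
        "INSERT INTO gdal_crs_extra (code, proj4_text) VALUES (?, ?)",
        (4326, "+proj=longlat +datum=WGS84 +no_defs"),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
build_dictionary(conn)

# A project reads the shared table and joins in its own extras where present.
rows = conn.execute(
    """
    SELECT e.code, e.name, x.proj4_text
    FROM epsg_crs AS e
    LEFT JOIN gdal_crs_extra AS x ON x.code = e.code
    ORDER BY e.code
    """
).fetchall()

for row in rows:
    print(row)
```

Entries without a project-specific row simply come back with NULL in the auxiliary columns, so each project can extend the dictionary at its own pace.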

For the GDAL stack, the benefits of this approach are significant. Multiply-defined, potentially conflicting definitions no longer need to be resolved. The dictionaries could be released on their own schedule, rather than with each individual project. Powerful new functionality would be much closer to software developers instead of hidden behind a rather opaque and fragile CSV dictionary-generating process. Mundane but important details like multithreaded access get handed off to a library and project that does that stuff all the time, instead of one-off implementations inside each individual project.

Database views/queries could be standardized for common lookups across projects. Lookups would be faster due to indexed query access. Transformation validation, based on EPSG or other databases, could be provided across all three projects. More complex topics, like those described above, could be developed in a way that has an impact across all three projects without tedious reimplementation in each.
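As a rough illustration of a standardized view with indexed lookup, the sketch below again uses Python's sqlite3 module. The schema (a crs table, a crs_lookup view, an idx_crs_name index) is an assumption made up for this example and is not the EPSG schema; EXPLAIN QUERY PLAN is used only to show which access path SQLite picks, and its exact wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical base table; names are illustrative only.
    CREATE TABLE crs (
        auth_name TEXT NOT NULL,
        code      INTEGER NOT NULL,
        name      TEXT NOT NULL,
        PRIMARY KEY (auth_name, code)
    );
    -- Index so that lookups by name do not scan the whole table.
    CREATE INDEX idx_crs_name ON crs (name);
    -- A view every project could share for the common lookup.
    CREATE VIEW crs_lookup AS
        SELECT auth_name || ':' || code AS srs_id, name
        FROM crs;
    """
)
conn.executemany(
    "INSERT INTO crs (auth_name, code, name) VALUES (?, ?, ?)",
    [
        ("EPSG", 4326, "WGS 84"),
        ("EPSG", 3857, "WGS 84 / Pseudo-Mercator"),
    ],
)

query = "SELECT srs_id FROM crs_lookup WHERE name = ?"

# The lookup itself, going through the shared view.
result = conn.execute(query, ("WGS 84",)).fetchall()
print(result)

# Ask SQLite how it would run the query; the last column is the description.
plan_details = [
    row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query, ("WGS 84",))
]
for detail in plan_details:
    print(detail)
```

The same view definition could live in the shared database file, so all three projects would issue identical queries rather than each parsing its own CSV layout.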

Consider this email a mix of questions:

1) Is this a good idea? What other benefits do you see this approach providing?
2) Does your project want to collaborate on this?
3) Does this belong in MetaCRS?
4) What are the pitfalls that make this untenable?

windmill-tilting'ly yours,

Howard

[1] http://www.geopackage.org/