<html dir="ltr">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
<style type="text/css" id="owaParaStyle"></style>
</head>
<body fpstyle="1" ocsi="0">
<div style="direction: ltr;font-family: Tahoma;color: #000000;font-size: 10pt;">
<div><font face="Arial">Hi All,</font></div>
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial">I am looking for a mentor to work with me on adding MongoDB support to GDAL.</font></div>
<div><font face="Arial">Below is the proposal I wrote; I hope you find it worth a try.</font></div>
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial">Thanks,</font></div>
<div><font face="Arial">shuai</font></div>
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial"><b>Title: OGR Driver for MongoDB</b></font></div>
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial"><b>Short description: </b></font></div>
<div><font face="Arial">MongoDB, a document database that provides high performance, high availability, and easy scalability, can be a good platform for storing extremely large spatial datasets and for supporting high-performance geo-computation and real-time spatial
analysis at large scale. This project aims to develop an OGR driver for MongoDB so that applications and software based on GDAL, such as QGIS, GeoServer, and MapServer, can read and write the spatial data stored in it, and thus bring this advanced NoSQL database
to the open source GIS ecosystem.</font></div>
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial">Describe your idea</font></div>
<div><font face="Arial"><b>1. Introduction</b></font></div>
<div><font face="Arial">MongoDB, a document database that provides high performance, high availability, and easy scalability, can be a good platform for storing extremely large spatial datasets and for supporting high-performance geo-computation and real-time spatial
analysis at large scale. Yet the GIS field has so far paid little attention to making the most of its strengths. This project aims to develop an OGR driver for MongoDB so that applications and software based on GDAL can read and write the spatial data stored in it,
and thus bring this advanced NoSQL database to the open source GIS ecosystem.</font></div>
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial"><b>2. Background</b></font></div>
<div><font face="Arial">We are living in the era of big data: today's tools and equipment capture spatial data at both the mega-scale and the milli-scale in overwhelming quantities. This data volume is well beyond the capability of any mainstream
geographic information system, yet the GIS field has no off-the-shelf solution for managing such massive spatial data. Relational spatial databases have been in charge for decades, but the situation now looks different.</font></div>
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial">A shift in computing patterns has been visible throughout the IT industry in recent years, and GIS is no exception. In particular, data analytics may not be achievable within a reasonable amount of time without resorting to high-performance
computing strategies. However, relational spatial databases tend to be too slow for these high-performance computing scenarios, and they often lack the flexible scalability needed to handle a growing workload.</font></div>
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial">Fortunately, several groups are trying to address the problem, and MongoDB is an apparent leader in this direction. MongoDB, which uses a document-oriented model and has native support for geospatial data, ranks fifth
in the DB-Engines Ranking of database management systems by popularity and is the highest-rated non-relational system. Since version 2.4 (released on March 19, 2013), MongoDB has supported a subset of GeoJSON geometries, including
basic shapes such as points, linestrings, and polygons. Quite a number of partners working on big data, NoSQL, cloud, mobile, and high-performance computing have joined the MongoDB ecosystem. Foursquare is one featured example: it benefits from MongoDB’s support
for geospatial indexing, which lets it easily query large location-based datasets.</font></div>
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial"><b>3. The idea</b></font></div>
<div><font face="Arial">MongoDB uses GeoJSON to store spatial data, and GDAL already supports access to features encoded in GeoJSON, so that existing code can be reused. This project will implement a MongoDB driver that follows the OGR format driver
interfaces, with subclasses of OGRSFDriver, OGRDataSource and OGRLayer, and is registered with the OGRSFDriverRegistrar at runtime, so that GDAL can use MongoDB as a datasource for accessing large-scale spatial data.</font></div>
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial"><b>4. Project plan (detailed timeline: how do you plan to spend your summer?)</b></font></div>
<div><font face="Arial">The first task is to design how spatial data is organized inside MongoDB. The OGR data model consists of Datasource, Layer and Feature, so each MongoDB database is treated as a Datasource,
each Collection within it as a Layer, and each Document as a Feature. PostGIS and other spatial databases often use system tables to maintain metadata, but since MongoDB is schema-free, metadata such as the spatial reference can
be stored within the Layer itself, in this case the Collection.</font></div>
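<div><font face="Arial"><br>
</font></div>
<div><font face="Arial">As a rough illustration of this mapping, the sketch below models the three levels with plain C++ structs. All type and function names (SketchDataSource, SketchLayer, SketchFeature, addFeature, layerByName) are hypothetical stand-ins invented for this example; they are neither GDAL classes nor MongoDB driver types, and the per-layer srs string is only one assumed way of keeping the spatial reference with the Collection.</font></div>
<pre>
```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only (not GDAL or MongoDB types).

// One MongoDB document, viewed as an OGR feature.
struct SketchFeature {
    std::string geometryJson;                       // GeoJSON geometry as text
    std::map<std::string, std::string> attributes;  // attribute name -> value
};

// One MongoDB collection, viewed as an OGR layer.
struct SketchLayer {
    std::string name;
    std::string srs;  // assumed per-collection metadata, e.g. "EPSG:4326"
    std::vector<SketchFeature> features;
};

// One MongoDB database, viewed as an OGR datasource.
struct SketchDataSource {
    std::string name;
    std::vector<SketchLayer> layers;

    // Returns the layer with the given name, or nullptr if there is none.
    const SketchLayer* layerByName(const std::string& layerName) const {
        for (const auto& layer : layers) {
            if (layer.name == layerName) return &layer;
        }
        return nullptr;
    }
};

// Appends a feature to a layer; a feature without geometry text is rejected.
void addFeature(SketchLayer& layer, SketchFeature feature) {
    assert(!feature.geometryJson.empty());
    layer.features.push_back(std::move(feature));
}
```
</pre>
<div><font face="Arial">For example, after pushing SketchLayer{"roads", "EPSG:4326", {}} into a SketchDataSource, layerByName("roads") returns that layer together with its spatial reference, while an unknown name yields nullptr.</font></div>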
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial">The most important part of a format driver is defining how the data is read and written, especially the Open and Create methods of the Datasource class. Because MongoDB organizes its spatial data in the GeoJSON
model, the GeoJSON driver already present in GDAL can be reused to encode and decode the GeoJSON exchanged with a MongoDB database. In total there would be four files to implement: ogr_mongo.h, ogrmongodriver.cpp, ogrmongodatasource.cpp,
and ogrmongolayer.cpp.</font></div>
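<div><font face="Arial"><br>
</font></div>
<div><font face="Arial">To make the write path more concrete, here is a minimal sketch of the kind of document a Create/write call might hand to MongoDB for a point feature. It uses only the C++ standard library; a real driver would more likely delegate the encoding to GDAL's existing GeoJSON code, as described above. The function name pointDocument and the Feature-style layout with a "properties" object are assumptions made for this example, and the attribute value is not JSON-escaped here.</font></div>
<pre>
```cpp
#include <cassert>
#include <cmath>
#include <sstream>
#include <string>

// Hypothetical helper: encodes a named point as a GeoJSON Feature document.
// Note: `name` is inserted as-is, so it must not contain quotes or backslashes.
std::string pointDocument(const std::string& name, double lon, double lat) {
    assert(std::isfinite(lon) && std::isfinite(lat));
    std::ostringstream out;
    out.precision(15);  // keep coordinate digits beyond the default of 6
    out << "{\"type\":\"Feature\","
        << "\"properties\":{\"name\":\"" << name << "\"},"
        << "\"geometry\":{\"type\":\"Point\",\"coordinates\":["
        << lon << ',' << lat << "]}}";
    return out.str();
}
```
</pre>
<div><font face="Arial">Keeping the geometry in its own "geometry" field is a design choice of this sketch: it gives a geospatial index a single, predictable field to target.</font></div>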
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial"><b>Test Plan</b></font></div>
<div><font face="Arial">[1] Once the MongoDB driver is compiled into the OGR framework, the ogr2ogr utility can serve as the test program for importing and exporting spatial data between shapefiles and MongoDB.</font></div>
<div><font face="Arial">[2] Run the same conversion in parallel against the file system and PostGIS to measure how fast the MongoDB driver is in comparison.</font></div>
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial"><b>Time Line</b></font></div>
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial"><u>May 19 - June 8 (Coding - Phase 1 - 3 weeks)</u></font></div>
<div><font face="Arial">Prepare the development environment and bring GDAL and the MongoDB C++ driver together. Implement OGRMongoDriver, OGRMongoDataSource and OGRMongoLayer according to the interfaces defined by OGRSFDriver, OGRDataSource and OGRLayer.</font></div>
<div><font face="Arial"><u>June 9 - June 23 (Coding - Phase 2 - 2 weeks)</u></font></div>
<div><font face="Arial">Build the MongoDB driver into the OGR framework, first supporting the exchange of small spatial datasets with MongoDB, and fix bugs along the way.</font></div>
<div><font face="Arial"><u>June 24 - July 13 (Coding - Phase 3 - 3 weeks)</u></font></div>
<div><font face="Arial">Pass the query string (a JSON-style document) for both spatial and attribute filters to MongoDB so that only the requested features are selected. Compile all the code, run several tests, fix bugs and improve speed.</font></div>
<div><font face="Arial"><u>July 14 - July 27 (Testing - Phase 1 - 2 weeks)</u></font></div>
<div><font face="Arial">Transfer large-scale spatial data to and from MongoDB with ogr2ogr to evaluate the driver's efficiency. Improve its efficiency and fix bugs.</font></div>
<div><font face="Arial"><u>July 28 - August 10 (Testing - Phase 2 - 2 weeks)</u></font></div>
<div><font face="Arial">Run a parallel conversion experiment to measure how fast the MongoDB driver is compared to the file system and PostGIS, and fix bugs.</font></div>
<div><font face="Arial"><u>August 11 - August 18 (pencils down)</u></font></div>
<div><font face="Arial">Write code documentation, including doxygen comments and techbase/userbase articles.</font></div>
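<div><font face="Arial"><br>
</font></div>
<div><font face="Arial">As an illustration of the JSON-style query document planned for Coding Phase 3, the sketch below builds a bounding-box filter using MongoDB's $geoWithin operator with a GeoJSON Polygon, which is roughly what a rectangular spatial filter on a layer might translate to. The helper name bboxQuery and the idea that geometries live in a field called "geometry" are assumptions made for this example, not a settled design.</font></div>
<pre>
```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: builds a MongoDB query document (as JSON text) that
// selects documents whose `field` geometry lies within the given bounding box.
std::string bboxQuery(const std::string& field,
                      double minx, double miny, double maxx, double maxy) {
    assert(minx <= maxx && miny <= maxy);
    std::ostringstream out;
    out.precision(15);  // keep coordinate digits beyond the default of 6
    // Closed ring: first and last positions are identical, as GeoJSON requires.
    out << "{\"" << field << "\":{\"$geoWithin\":{\"$geometry\":"
        << "{\"type\":\"Polygon\",\"coordinates\":[[["
        << minx << ',' << miny << "],["
        << maxx << ',' << miny << "],["
        << maxx << ',' << maxy << "],["
        << minx << ',' << maxy << "],["
        << minx << ',' << miny << "]]]}}}}";
    return out.str();
}
```
</pre>
<div><font face="Arial">An attribute filter could be merged into the same document as additional key/value pairs, so that spatial and attribute selection travel to the server in a single query.</font></div>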
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial"><b>5. Future ideas / How can your idea be expanded? </b></font></div>
<div><font face="Arial">MongoDB is also a promising platform for storing massive geo-raster data, so the next step would be to write a MongoDB driver for raster datasets.</font></div>
<div><font face="Arial"> </font></div>
<div><font face="Arial"><b>Explain how your SoC task would benefit the OSGeo member project and more generally the OSGeo Foundation as a whole:</b></font></div>
<div><font face="Arial">MongoDB can serve as a distributed and parallel NoSQL spatial database with high performance, high availability, and easy scalability, which makes it well suited to large-scale data-intensive computing. With a MongoDB driver implemented in
the OGR framework, the whole OSGeo ecosystem built on GDAL/OGR will benefit from it and be able to use MongoDB.</font></div>
<div><font face="Arial"> </font></div>
<div><font face="Arial"><b>Please provide details of general computing experience: (operating systems you use on a day-to-day basis, languages you could write a program in, hardware, networking experience, etc.)</b></font></div>
<div><font face="Arial">During college I mainly used .NET languages such as C# and VB.NET to build GIS software for the Windows platform. Since then, and since starting my PhD program, most of my work has been done in standard C++ on Linux.</font></div>
<div><font face="Arial"> </font></div>
<div><font face="Arial"><b>Please provide details of previous GIS experience:</b></font></div>
<div><font face="Arial">I have been a GIS student ever since I entered college. Right now I'm a Ph.D. candidate in Cartography and Geographic Information System, School of Geographic and Oceanographic Sciences, Nanjing University, China, and a visiting scholar at Geography
& GIScience and NCSA (The National Center for Supercomputing Applications), UIUC, IL, USA.</font></div>
<div><font face="Arial"> </font></div>
<div><font face="Arial"><b>Please provide details of any previous involvement with GIS programming and other software programming:</b></font></div>
<div><font face="Arial">[1] Climate Information Management System of Shanxi Province: Outstanding Award in ESRI Chinese College Student Software Development Contest, 2009.</font></div>
<div><font face="Arial">[2] Forest Fire Simulation Model based on Geographic Cellular Automata: Third Prize in ESRI Chinese College Student Software Development Contest, 2009.</font></div>
<div><font face="Arial">[3] High Performance Geospatial Computing System: HiGIS (2011-2013), supported by the National High Technology Research and Development Program of China (863 project), in progress.</font></div>
<div><font face="Arial">[4] NoSQL Expression of Massive Geospatial Information in the era of Big Data (2013-2015), supported by the Scientific Research Foundation of Graduate School of Nanjing University, in progress.</font></div>
<div><font face="Arial"> </font></div>
<div><font face="Arial"><b>Please tell us why you are interested in GIS and open source software:</b></font></div>
<div><font face="Arial">They are powerful and beautiful treasures of humankind, and I want to be part of them.</font></div>
<div><font face="Arial"> </font></div>
<div><font face="Arial"><b>Please tell us why you are interested in working for OSGeo and the software project you have selected:</b></font></div>
<div><font face="Arial">It’s part of my research: I have been trying to harness MongoDB to support high-performance geo-computing.</font></div>
<div><font face="Arial"> </font></div>
<div><font face="Arial"><b>Please tell us why you are interested in your specific coding project:</b></font></div>
<div><font face="Arial">I have spent a lot of time over the past three years learning how GDAL works and how to use it in high-performance computing applications, so I believe a GDAL with MongoDB support will greatly benefit my current research.</font></div>
<div><font face="Arial"> </font></div>
<div><font face="Arial"><b>Would your application contribute to your ongoing studies/ degree? If so, how?</b></font></div>
<div><font face="Arial">Yes. A MongoDB cluster is a good way to handle large quantities of spatial data, and if OGR provides a MongoDB driver, many of the tools we have developed on top of GDAL can be reused with MongoDB as the backend, which should make them much faster.</font></div>
<div><font face="Arial"> </font></div>
<div><font face="Arial"><b>Please explain how you intend to continue being an active member of your project and/or OSGeo AFTER the summer is over:</b></font></div>
<div><font face="Arial">I’ll do my best to keep following this work and to make the MongoDB driver stable and efficient.</font></div>
<div><font face="Arial"> </font></div>
<div><font face="Arial"><b>Do you understand this is a serious commitment, equivalent to a full-time paid summer internship or summer job?</b></font></div>
<div><font face="Arial">Yes, I understand. I’ll give my best.</font></div>
<div><font face="Arial"> </font></div>
<div><font face="Arial"><b>Do you have any known time conflicts during the official coding period? (May 19 to August 19)</b></font></div>
<div><font face="Arial">No, I don't.</font></div>
</div>
</body>
</html>