[mapguide-internals] MapGuide RFC 112 - sqlite based tile cache

Zac Spitzer zac.spitzer at gmail.com
Fri May 13 12:10:57 EDT 2011


Can we split off the repository discussion into a separate thread?

On Sat, May 14, 2011 at 2:06 AM, Trevor Wekel
<trevor_wekel at otxsystems.com> wrote:

> Hmm... Now how many times have I heard "Repository Corruption" on the
> mailing lists.  Moving to files on disk would make repository hacking easier
> and guarantee that you would not lose the entire library at once.  I do have
> a couple of concerns about moving to a file based approach.  We may run out
> of file handles if we keep the files open, and there may be some
> overhead for executing fopen/fclose on every resource access.
>
>
> Regards,
> Trevor
>
>
> -----Original Message-----
> From: mapguide-internals-bounces at lists.osgeo.org [mailto:
> mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Traian Stanev
> Sent: May 13, 2011 9:44 AM
> To: MapGuide Internals Mail List
> Subject: RE: [mapguide-internals] MapGuide RFC 112 - sqlite based tile
> cache
>
>
> For ultimate serving speed of static tile caches, it would be best to
> bypass everything MapGuide and serve the tile cache directly from a
> directory exposed via Apache. This way you would automatically get browser
> caching as well. This is also why file storage of the tile cache is best for
> optimal serving speed.
>
> As far as the resource repository, IMO it's small enough to use a direct
> storage of XML files on the file system (for example, mapping what are
> currently XMLdb paths to file system paths). Using a database to store blobs
> in there would complicate things more than necessary and also add an
> unnecessary dependency to what could be a really simple piece of code
> (reading and writing files).
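
A minimal sketch of the path mapping described above, assuming resource
identifiers of the form Library://Folder/Name.Type. The function name, the
root directory, and the ".xml" suffix are hypothetical choices for
illustration, not anything MapGuide defines.

```python
from pathlib import PurePosixPath


def resource_id_to_path(resource_id: str,
                        root: str = "/srv/mapguide/repository") -> PurePosixPath:
    """Map 'Library://Folder/Name.Type' to '<root>/Library/Folder/Name.Type.xml'.

    Hypothetical helper: the root directory and '.xml' suffix are assumptions.
    """
    repo, sep, rest = resource_id.partition("://")
    if not sep or not repo.isalnum():
        raise ValueError(f"not a resource id: {resource_id!r}")

    parts = [p for p in rest.split("/") if p]
    # Reject empty paths and anything that could escape the repository root.
    if not parts or any(p in (".", "..") or "\\" in p for p in parts):
        raise ValueError(f"unsafe or empty resource path: {resource_id!r}")

    path = PurePosixPath(root, repo, *parts)
    return path.with_name(path.name + ".xml")


print(resource_id_to_path("Library://Samples/Sheboygan/Maps/Sheboygan.MapDefinition"))
```

Rejecting "." and ".." segments matters here because resource identifiers
come from clients and would otherwise be able to address files outside the
repository directory.
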
>
> Traian
>
>
>
> -----Original Message-----
> From: mapguide-internals-bounces at lists.osgeo.org [mailto:
> mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Trevor Wekel
> Sent: Friday, May 13, 2011 11:33 AM
> To: MapGuide Internals Mail List
> Subject: RE: [mapguide-internals] MapGuide RFC 112 - sqlite based tile
> cache
>
> Since MapGuide is targeted for both Windows and Linux, I think SQLite is
> the only choice.  If we are going to introduce another local database into
> MapGuide, perhaps we should consider other use cases for it.  Here are a few
> just off the top of my head:
>
> Response caching for mapagent and possibly web extensions
> - Operations like getting tiles and dynamic map overlays (for initial
> views) may be cacheable if we remove SESSION from the HTTP GET/POST and put
> it in a cookie.  We would have to implement time to live logic for this to
> be truly effective.
>
> Move log files to database storage
> - This could make query and analysis of log files easier
>
> Serving tiles directly from the mapagent
> - Copying/propagating the SQLite database files to the web tier would
> eliminate the agent/server hop
>
> Move to a SQLite backend for MgResourceService
> - Maintaining multiple database technologies in one product could be
> additional overhead
> - BerkeleyDB doesn't seem to be a great fit for the Session repository.
>  After six years, we are still working on it.  Write-Ahead logging in SQLite
> could be effective for the Session repository
> http://www.sqlite.org/wal.html.
>
>
> A new service "MgStorageService" implemented in MapGuideCommon could wrap
> the SQLite database and make it accessible to the server, agent, and web
> extensions.   The API to MgStorageService would have to be considered
> carefully based on expected use cases.
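
To make the idea of a thin wrapper around SQLite concrete, here is a rough
sketch using Python's standard sqlite3 module. The class name TileStore, the
table layout, and the column names are all assumptions made for illustration;
the thread does not define an API for MgStorageService. The only SQLite
feature relied on is the documented "PRAGMA journal_mode=WAL" switch.

```python
import sqlite3


class TileStore:
    """Hypothetical SQLite-backed tile store; not an actual MapGuide API."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        # Ask for write-ahead logging. SQLite reports the mode actually in
        # effect, which may differ (for example on an in-memory database).
        self.journal_mode = self.conn.execute(
            "PRAGMA journal_mode=WAL").fetchone()[0]
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS tiles ("
            " map_name TEXT NOT NULL,"
            " base_group TEXT NOT NULL,"
            " scale_index INTEGER NOT NULL,"
            " tile_col INTEGER NOT NULL,"
            " tile_row INTEGER NOT NULL,"
            " image BLOB NOT NULL,"
            " PRIMARY KEY (map_name, base_group, scale_index, tile_col, tile_row))"
        )
        self.conn.commit()

    def put(self, map_name, base_group, scale_index, tile_col, tile_row, image):
        # 'with conn' commits on success and rolls back on an exception.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO tiles VALUES (?, ?, ?, ?, ?, ?)",
                (map_name, base_group, scale_index, tile_col, tile_row, image),
            )

    def get(self, map_name, base_group, scale_index, tile_col, tile_row):
        found = self.conn.execute(
            "SELECT image FROM tiles WHERE map_name = ? AND base_group = ?"
            " AND scale_index = ? AND tile_col = ? AND tile_row = ?",
            (map_name, base_group, scale_index, tile_col, tile_row),
        ).fetchone()
        return None if found is None else bytes(found[0])
```

A caller would open the store on a file path, call put() as tiles are
rendered, and call get() on each tile request, falling back to rendering when
it returns None.
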
>
>
>
> Regards,
> Trevor
>
> -----Original Message-----
> From: mapguide-internals-bounces at lists.osgeo.org [mailto:
> mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Traian Stanev
> Sent: May 13, 2011 8:49 AM
> To: 'MapGuide Internals Mail List'
> Subject: RE: [mapguide-internals] MapGuide RFC 112 - sqlite based tile
> cache
>
>
> It's very strange that you are only getting <20% space efficiency out of
> the file system -- the tiles must be really tiny or the block size really
> huge. Perhaps this is another thing to look into (i.e. pick a better file
> system, but I guess sqlite is just going to be used as a file system in a file
> in this case).
>
> Yes, copying one file once is faster due to less seeking involved compared
> to 300K files. But, if you want to do incremental backups, where there are
> only a few changed tiles, things will likely reverse.
>
> Anyway, if you are looking for a file system in a file, on Windows sqlite
> is probably the only choice. On Linux, there would be more options (ext3 in
> a file with B-tree indexing and tail-packing enabled, for example).
>
> Traian
>
>
> -----Original Message-----
> From: mapguide-internals-bounces at lists.osgeo.org [mailto:
> mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Zac Spitzer
> Sent: Friday, May 13, 2011 3:49 AM
> To: MapGuide Internals Mail List
> Subject: Re: [mapguide-internals] MapGuide RFC 112 - sqlite based tile
> cache
>
> Let's take a tiny tiled map as an example, using win7 x64, default
> configurations, quad core machine, fully seeded tile cache.
>
> Samples_Sheboygan_MapsTiled_Sheboygan
>
> Size: 267 MB (280,474,999 bytes)
> Size on disk: 1.50 GB (1,613,549,568 bytes)
> Contains: 375,084 Files, 527 Folders
>
> As a single zip file, using store (zero compression):
>
> Samples_Sheboygan_MapsTiled_Sheboygan_2.zip
> 361 MB (379,239,309 bytes)
>
> Block size is the issue, but you can't really optimise this as raster tiles
> will require a different blocksize than vector tiles.
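
The figures above can be checked with a few lines of arithmetic. This is only
a back-of-the-envelope sketch; the 4 KiB cluster size is an assumption (the
usual NTFS default), not something stated in the thread.

```python
# Figures quoted above for the Sheboygan tile cache.
logical_bytes = 280_474_999      # "Size"
on_disk_bytes = 1_613_549_568    # "Size on disk"
file_count = 375_084

CLUSTER = 4096  # assumption: default 4 KiB NTFS cluster size

efficiency = logical_bytes / on_disk_bytes
avg_tile_bytes = logical_bytes / file_count
clusters_used, remainder = divmod(on_disk_bytes, CLUSTER)

print(f"space efficiency : {efficiency:.1%}")
print(f"average tile size: {avg_tile_bytes:.0f} bytes")
print(f"clusters used    : {clusters_used} (remainder {remainder})")
print(f"clusters per file: {clusters_used / file_count:.2f}")
```

The space efficiency comes out at roughly 17%, with an average tile of about
748 bytes. The on-disk size divides evenly into 4 KiB clusters, at just over
one cluster per file, which is consistent with nearly every tile occupying a
full cluster regardless of its actual size.
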
>
> Some simple Robocopy backup tests (same disk/non raided/default options)
>
> copying a tile cache in a zip file
>
>               Total    Copied   Skipped  Mismatch    FAILED    Extras
>    Dirs :         1         0         1         0         0         0
>   Files :         1         1         0         0         0         0
>   Bytes :  361.67 m  361.67 m         0         0         0         0
>   Times :   0:00:06   0:00:06                       0:00:00   0:00:00
>
>
>   Speed :            62632420 Bytes/sec.
>   Speed :            3583.855 MegaBytes/min.
>
> copying the raw tile cache using /MIR (requiring 65% CPU utilisation)
>
>               Total    Copied   Skipped  Mismatch    FAILED    Extras
>    Dirs :       528       527         1         0         0         0
>   Files :    375084    375084         0         0         0         0
>   Bytes :  267.48 m  267.48 m         0         0         0         0
>   Times :   0:27:30   0:19:54                       0:00:00   0:07:35
>
>
>   Speed :              234717 Bytes/sec.
>   Speed :              13.430 MegaBytes/min.
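
For a quick comparison of the two Robocopy runs, the reported rates can be
put side by side. This sketch only reuses the numbers printed above; the
variable names are mine.

```python
zip_rate = 62_632_420         # Bytes/sec reported for the single zip file
raw_rate = 234_717            # Bytes/sec reported for the /MIR copy
file_count = 375_084          # files in the raw tile cache
copy_seconds = 19 * 60 + 54   # "Copied" time for the raw cache, 0:19:54

speedup = zip_rate / raw_rate
files_per_second = file_count / copy_seconds

print(f"single file is about {speedup:.0f}x faster by throughput")
print(f"raw cache copied at about {files_per_second:.0f} files/sec")
```

That is a difference of roughly 267x in throughput, with the raw cache moving
at around 314 files per second. With tiles this small, the numbers suggest
that per-file overhead rather than the volume of data dominates the copy time.
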
>
> z
>
>
> On Fri, May 13, 2011 at 9:28 AM, Trevor Wekel
> <trevor_wekel at otxsystems.com> wrote:
> > I agree with Traian.  There are alternative solutions for replication
> > and backup.
> >
> > If we are considering replication and backup for the tile sets, we
> > should also consider replication for the XML definitions (layer,
> > feature, map) used to generate those tiles.  In other words, I would
> > like to consider tile replication and repository replication together.
> >
> > The replication and backup functionality in MapGuide is certainly
> > lacking.  MGP files do not propagate user/group/role information.  The
> > only "easy" way to back up or replicate an entire server is to stop the
> > MapGuide Server and copy all the files around.  I doubt that Rsync or
> > robocopy could replicate live BerkeleyDB files.
> >
> > I also took a quick look at SQLite replication.  Google didn't turn up
> > anything that was LGPL and actively maintained.  SQLite does have an
> > internal hook that we could use to replicate stuff stored in SQLite
> > http://www.sqlite.org/capi3ref.html#sqlite3_update_hook.  We could roll
> > our own.
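
As a rough illustration of what "rolling our own" change capture might look
like: sqlite3_update_hook is part of the C API and, as far as I know, is not
exposed by Python's standard sqlite3 module, so the sketch below swaps in a
different technique, namely triggers that append to a change log table. The
table names, trigger names, and the pending_changes helper are all
hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE tiles (id INTEGER PRIMARY KEY, name TEXT UNIQUE, image BLOB);
CREATE TABLE change_log (
    seq INTEGER PRIMARY KEY AUTOINCREMENT, op TEXT, tile_id INTEGER);

-- Triggers stand in for the update hook: every change leaves a log row.
CREATE TRIGGER tiles_ins AFTER INSERT ON tiles BEGIN
    INSERT INTO change_log (op, tile_id) VALUES ('INSERT', NEW.id);
END;
CREATE TRIGGER tiles_upd AFTER UPDATE ON tiles BEGIN
    INSERT INTO change_log (op, tile_id) VALUES ('UPDATE', NEW.id);
END;
CREATE TRIGGER tiles_del AFTER DELETE ON tiles BEGIN
    INSERT INTO change_log (op, tile_id) VALUES ('DELETE', OLD.id);
END;
"""


def open_store(path=":memory:"):
    """Open a database with the hypothetical tiles and change_log tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def pending_changes(conn, after_seq=0):
    """Return (seq, op, tile_id) rows a replica has not yet applied."""
    return conn.execute(
        "SELECT seq, op, tile_id FROM change_log WHERE seq > ? ORDER BY seq",
        (after_seq,),
    ).fetchall()


conn = open_store()
conn.execute("INSERT INTO tiles (name, image) VALUES ('0_0', x'00')")
conn.execute("UPDATE tiles SET image = x'01' WHERE name = '0_0'")
conn.execute("DELETE FROM tiles WHERE name = '0_0'")
conn.commit()
print(pending_changes(conn))
```

A replica would remember the last seq it applied and ask only for later
rows. A real update hook would instead deliver the operation, table name, and
rowid to a callback inside the server process.
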
> >
> > Since we allow access to external data sources (SHP files, ECW files,
> > etc), replication of file based data from server to server would have
> > to be considered as part of the solution.  And replication to a UNC
> > path would be an easy way to implement backup.
> >
> > Master/Slave replication based on files and SQLite could be
> > implemented in phases.  Here's a very rough outline:
> >
> > Phase 1 - Tile and external data replication
> > - Reintroduce master/slave concept for MapGuide Server
> > - Implement server to server TCP/IP communication logic to transfer
> >   files
> > - Implement local "file copy (UNC backup)" logic
> >
> > Phase 2 - Switch to SQLite for tiles
> > - Add SQLite to the MapGuide Server.  Recode MgTileService to populate
> >   the database
> > - Implement a SQLite update hook
> > - Implement server to server TCP/IP communication logic for
> >   propagating SQLite updates
> >
> > Phase 3 - Full repository replication
> > - Rip out BerkeleyDB and replace it with SQLite
> > - Use existing mechanism from Phase 2 to implement full replication of
> >   repository
> >
> > Regards,
> > Trevor
> >
> > -----Original Message-----
> > From: mapguide-internals-bounces at lists.osgeo.org [mailto:
> > mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Traian Stanev
> > Sent: May 12, 2011 11:59 AM
> > To: 'MapGuide Internals Mail List'
> > Subject: RE: [mapguide-internals] MapGuide RFC 112 - sqlite based tile
> > cache
> >
> >
> > Hi Tom,
> >
> > It would depend on what exactly the problem is -- sure if one is using
> > Windows Explorer drag and drop to back up the files it would be faster
> > to have one file. But if one is using rsync or robocopy (or similar),
> > it still makes sense to use files, since those programs know how to
> > copy only the changed files (or even changed parts of files).
> >
> > Traian
> >
> >
> > _______________________________________________
> > mapguide-internals mailing list
> > mapguide-internals at lists.osgeo.org
> > http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> >
> >
>
>
>
> --
> Zac Spitzer
> Solution Architect / Director
> Ennoble Consultancy Australia
> http://www.ennoble.com.au
> http://zacster.blogspot.com
> +61 405 847 168


-- 
Zac Spitzer
Solution Architect / Director
Ennoble Consultancy Australia
http://www.ennoble.com.au
http://zacster.blogspot.com
+61 405 847 168

