[mapguide-internals] MapGuide RFC 112 - sqlite based tile cache
Trevor Wekel
trevor_wekel at otxsystems.com
Fri May 13 12:41:30 EDT 2011
We most certainly can. Let's get back to the tile cache then. SQLite will reduce the number of files we need to manage for a tile cache. A mapagent plus SQLite tile cache solution should be faster than the existing web tier / server / disk solution due to the elimination of the TCP/IP hop.
Exposing/copying the entire tile directory structure to a webserver would be faster than the mapagent plus SQLite solution, but it would not improve manageability: there would still be tons of files to manage.
There may be a trade-off between performance and manageability.
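To make the SQLite tile cache idea a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the table layout and the names `tiles`, `open_cache`, `put_tile` and `get_tile` are hypothetical and are not taken from RFC 112 or from the MapGuide code base. It shows one database file holding tiles keyed by map name, scale index, row and column, in place of one file per tile.

```python
import sqlite3

# Hypothetical schema: one row per tile, keyed the way a tile path would be.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tiles (
    map_name    TEXT    NOT NULL,
    scale_index INTEGER NOT NULL,
    tile_row    INTEGER NOT NULL,
    tile_col    INTEGER NOT NULL,
    image       BLOB    NOT NULL,
    PRIMARY KEY (map_name, scale_index, tile_row, tile_col)
)
"""


def open_cache(path):
    """Open (or create) the tile cache database at the given path."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def put_tile(conn, map_name, scale_index, row, col, image_bytes):
    """Store a tile, overwriting any existing tile with the same key."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO tiles VALUES (?, ?, ?, ?, ?)",
            (map_name, scale_index, row, col, image_bytes),
        )


def get_tile(conn, map_name, scale_index, row, col):
    """Return the tile image as bytes, or None if it is not cached."""
    cur = conn.execute(
        "SELECT image FROM tiles WHERE map_name = ? AND scale_index = ? "
        "AND tile_row = ? AND tile_col = ?",
        (map_name, scale_index, row, col),
    )
    found = cur.fetchone()
    return None if found is None else bytes(found[0])
```

A caller would use `open_cache("tiles.db")` once and then `get_tile(...)` per request, falling back to rendering and `put_tile(...)` on a miss. Whether that is actually faster than the current web tier / server / disk path would need measuring.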
Regards,
Trevor
-----Original Message-----
From: mapguide-internals-bounces at lists.osgeo.org [mailto:mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Zac Spitzer
Sent: May 13, 2011 10:11 AM
To: MapGuide Internals Mail List
Subject: Re: [mapguide-internals] MapGuide RFC 112 - sqlite based tile cache
Can we split off the repository discussion into a separate thread?
On Sat, May 14, 2011 at 2:06 AM, Trevor Wekel
<trevor_wekel at otxsystems.com> wrote:
> Hmm... Now how many times have I heard "Repository Corruption" on the
> mailing lists? Moving to files on disk would make repository hacking easier
> and guarantee that you would not lose the entire library at once. I do have
> a couple of concerns about moving to a file based approach. We may run out
> of file handles if we keep the files open, and there may be some overhead
> for executing fopen/fclose on every resource access.
>
>
> Regards,
> Trevor
>
>
> -----Original Message-----
> From: mapguide-internals-bounces at lists.osgeo.org [mailto:
> mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Traian Stanev
> Sent: May 13, 2011 9:44 AM
> To: MapGuide Internals Mail List
> Subject: RE: [mapguide-internals] MapGuide RFC 112 - sqlite based tile
> cache
>
>
> For ultimate serving speed of static tile caches, it would be best to
> bypass everything MapGuide and serve the tile cache directly from a
> directory exposed via Apache. This way you would automatically get browser
> caching as well. This is also why file storage of the tile cache is best for
> optimal serving speed.
>
> As far as the resource repository, IMO it's small enough to use a direct
> storage of XML files on the file system (for example, mapping what are
> currently XMLdb paths to file system paths). Using a database to store blobs
> in there would complicate things more than necessary and would also add an
> unnecessary dependency to what could be a really simple piece of code
> (reading and writing files).
>
> Traian
>
>
>
> -----Original Message-----
> From: mapguide-internals-bounces at lists.osgeo.org [mailto:
> mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Trevor Wekel
> Sent: Friday, May 13, 2011 11:33 AM
> To: MapGuide Internals Mail List
> Subject: RE: [mapguide-internals] MapGuide RFC 112 - sqlite based tile
> cache
>
> Since MapGuide is targeted for both Windows and Linux, I think SQLite is
> the only choice. If we are going to introduce another local database into
> MapGuide, perhaps we should consider other use cases for it. Here are a few
> just off the top of my head:
>
> Response caching for mapagent and possibly web extensions
> - Operations like getting tiles and dynamic map overlays (for initial
> views) may be cacheable if we remove SESSION from the HTTP GET/POST and put
> it in a cookie. We would have to implement time to live logic for this to
> be truly effective.
>
> Move log files to database storage
> - This could make query and analysis of log files easier
>
> Serving tiles directly from the mapagent
> - Copying/propagating the SQLite database files to the web tier would
> eliminate the agent/server hop
>
> Move to a SQLite backend for MgResourceService
> - Maintaining multiple database technologies in one product could be
> additional overhead
> - BerkeleyDB doesn't seem to be a great fit for the Session repository.
> After six years, we are still working on it. Write-Ahead logging in SQLite
> could be effective for the Session repository
> http://www.sqlite.org/wal.html.
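As a small illustration of the write-ahead logging point, the sketch below switches a SQLite database into WAL mode from Python. The function name `open_wal_database` and the file name `session.db` are hypothetical; the only SQLite feature relied on is the documented `PRAGMA journal_mode=WAL`, which applies to file-based databases.

```python
import os
import sqlite3
import tempfile


def open_wal_database(path):
    """Open a database file and ask SQLite to use write-ahead logging.

    Returns the connection and the journal mode SQLite reports back.
    """
    conn = sqlite3.connect(path)
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, mode


def demo():
    """Create a throwaway database and report its journal mode."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        conn, mode = open_wal_database(os.path.join(tmp_dir, "session.db"))
        conn.close()
        return mode
```

The journal mode is persistent for the database file, so it only needs to be set once. Whether WAL actually suits the Session repository workload is a separate question that this sketch does not answer.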
>
>
> A new service "MgStorageService" implemented in MapGuideCommon could wrap
> the SQLite database and make it accessible to the server, agent, and web
> extensions. The API to MgStorageService would have to be considered
> carefully based on expected use cases.
>
>
>
> Regards,
> Trevor
>
> -----Original Message-----
> From: mapguide-internals-bounces at lists.osgeo.org [mailto:
> mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Traian Stanev
> Sent: May 13, 2011 8:49 AM
> To: 'MapGuide Internals Mail List'
> Subject: RE: [mapguide-internals] MapGuide RFC 112 - sqlite based tile
> cache
>
>
> It's very strange that you are only getting <20% space efficiency out of
> the file system -- the tiles must be really tiny or the block size really
> huge. Perhaps this is another thing to look into (i.e. pick a better file
> system, but I guess sqlite is just going to be used as a file system in a file
> in this case).
>
> Yes, copying one file once is faster due to less seeking involved compared
> to 300K files. But, if you want to do incremental backups, where there are
> only a few changed tiles, things will likely reverse.
>
> Anyway, if you are looking for a file system in a file, on Windows sqlite
> is probably the only choice. On Linux, there would be more options (ext3 in
> a file with B-tree indexing and tail-packing enabled, for example).
>
> Traian
>
>
> -----Original Message-----
> From: mapguide-internals-bounces at lists.osgeo.org [mailto:
> mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Zac Spitzer
> Sent: Friday, May 13, 2011 3:49 AM
> To: MapGuide Internals Mail List
> Subject: Re: [mapguide-internals] MapGuide RFC 112 - sqlite based tile
> cache
>
> Let's take a tiny tiled map as an example, using win7 x64, default
> configurations, quad core machine, fully seeded tile cache.
>
> Samples_Sheboygan_MapsTiled_Sheboygan
>
> Size: 267 MB (280,474,999 bytes)
> Size on disk: 1.50 GB (1,613,549,568 bytes)
> Contains: 375,084 Files, 527 Folders
>
> As a single zip file, using store mode (i.e. zero compression):
>
> Samples_Sheboygan_MapsTiled_Sheboygan_2.zip
> 361 MB (379,239,309 bytes)
>
> Block size is the issue, but you can't really optimise this as raster tiles
> will require a different blocksize than vector tiles.
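The space figures quoted above can be checked with a few lines of arithmetic. This is only a sketch using the three numbers from the message (size, size on disk, file count); the function names are made up for illustration, and reading the result as allocation-unit slack (for example a 4 KB cluster per tiny tile) is an assumption, since the message does not state the cluster size.

```python
# Figures copied from the Sheboygan tile cache example above.
size_bytes = 280_474_999            # "Size"
size_on_disk_bytes = 1_613_549_568  # "Size on disk"
file_count = 375_084                # "Contains: ... Files"


def space_efficiency(logical_bytes, on_disk_bytes):
    """Fraction of the allocated disk space that holds actual tile data."""
    return logical_bytes / on_disk_bytes


def average_bytes(total_bytes, count):
    """Average number of bytes per file."""
    return total_bytes / count


efficiency = space_efficiency(size_bytes, size_on_disk_bytes)
avg_tile = average_bytes(size_bytes, file_count)
avg_on_disk = average_bytes(size_on_disk_bytes, file_count)

print(f"space efficiency:      {efficiency:.1%}")
print(f"average tile size:     {avg_tile:.0f} bytes")
print(f"average space on disk: {avg_on_disk:.0f} bytes")
```

The efficiency comes out at roughly 17%, with an average tile well under 1 KB occupying a little over 4 KB on disk, which is consistent with block size being the issue.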
>
> Some simple Robocopy backup tests (same disk/non raided/default options)
>
> copying a tile cache in a zip file
>
>                Total    Copied   Skipped  Mismatch    FAILED    Extras
>     Dirs :         1         0         1         0         0         0
>    Files :         1         1         0         0         0         0
>    Bytes :  361.67 m  361.67 m         0         0         0         0
>    Times :   0:00:06   0:00:06                       0:00:00   0:00:00
>
>
> Speed : 62632420 Bytes/sec.
> Speed : 3583.855 MegaBytes/min.
>
> copying the raw tilecache using /MIR (requiring a 65% CPU utilisation)
>
>                Total    Copied   Skipped  Mismatch    FAILED    Extras
>     Dirs :       528       527         1         0         0         0
>    Files :    375084    375084         0         0         0         0
>    Bytes :  267.48 m  267.48 m         0         0         0         0
>    Times :   0:27:30   0:19:54                       0:00:00   0:07:35
>
>
> Speed : 234717 Bytes/sec.
> Speed : 13.430 MegaBytes/min.
>
> z
>
>
> On Fri, May 13, 2011 at 9:28 AM, Trevor Wekel
> <trevor_wekel at otxsystems.com> wrote:
> > I agree with Traian. There are alternative solutions for replication and
> > backup.
> >
> > If we are considering replication and backup for the tile sets, we should
> > also consider replication for the XML definitions (layer, feature, map)
> > used to generate those tiles. In other words, I would like to consider
> > tile replication and repository replication together.
> >
> > The replication and backup functionality in MapGuide is certainly lacking.
> > MGP files do not propagate user/group/role information. The only "easy"
> > way to back up or replicate an entire server is to stop the MapGuide
> > Server and copy all the files around. I doubt that Rsync or robocopy could
> > replicate live BerkeleyDB files.
> >
> > I also took a quick look at SQLite replication. Google didn't turn up
> > anything that was LGPL and actively maintained. SQLite does have an
> > internal hook that we could use to replicate stuff stored in SQLite
> > http://www.sqlite.org/capi3ref.html#sqlite3_update_hook. We could roll
> > our own.
> >
> > Since we allow access to external data sources (SHP files, ECW files,
> > etc), replication of file based data from server to server would have to
> > be considered as part of the solution. And replication to a UNC path
> > would be an easy way to implement backup.
> >
> > Master/Slave replication based on files and SQLite could be implemented
> > in phases. Here's a very rough outline:
> >
> > Phase 1 - Tile and external data replication
> > - Reintroduce master/slave concept for MapGuide Server
> > - Implement server to server TCP/IP communication logic to transfer files
> > - Implement local "file copy (UNC backup)" logic
> >
> > Phase 2 - Switch to SQLite for tiles
> > - Add SQLite to the MapGuide Server. Recode MgTileService to populate the
> >   database
> > - Implement a SQLite update hook
> > - Implement server to server TCP/IP communication logic for propagating
> >   SQLite updates
> >
> > Phase 3 - Full repository replication
> > - Rip out BerkeleyDB and replace it with SQLite
> > - Use existing mechanism from Phase 2 to implement full replication of
> >   repository
> >
> > Regards,
> > Trevor
> >
> > -----Original Message-----
> > From: mapguide-internals-bounces at lists.osgeo.org [mailto:
> > mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Traian Stanev
> > Sent: May 12, 2011 11:59 AM
> > To: 'MapGuide Internals Mail List'
> > Subject: RE: [mapguide-internals] MapGuide RFC 112 - sqlite based tile
> > cache
> >
> >
> > Hi Tom,
> >
> > It would depend on what exactly the problem is -- sure, if one is using
> > Windows Explorer drag and drop to back up the files it would be faster to
> > have one file. But if one is using rsync or robocopy (or similar), it
> > still makes sense to use files, since those programs know how to copy only
> > the changed files (or even changed parts of files).
> >
> > Traian
> >
> >
> > _______________________________________________
> > mapguide-internals mailing list
> > mapguide-internals at lists.osgeo.org
> > http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> >
> >
>
>
>
> --
> Zac Spitzer
> Solution Architect / Director
> Ennoble Consultancy Australia
> http://www.ennoble.com.au
> http://zacster.blogspot.com
> +61 405 847 168
--
Zac Spitzer
Solution Architect / Director
Ennoble Consultancy Australia
http://www.ennoble.com.au
http://zacster.blogspot.com
+61 405 847 168