[mapguide-internals] MapGuide RFC 112 - sqlite based tile cache

Trevor Wekel trevor_wekel at otxsystems.com
Fri May 13 13:58:08 EDT 2011


Ok.  TMS makes sense from a mapagent perspective.  For the SQLite schema, we may want to include any metadata required to generate the various TMS XML responses.  A single SQLite database per map would be the easiest to manage.  A directory of map databases could then be queried for information like the name of the map definition, assuming we define common metadata table(s).
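
As a rough illustration of the one-database-per-map idea, here is a minimal sketch using Python's standard sqlite3 module.  The table names, column names and metadata keys (metadata, tiles, map_definition, tile_format) are assumptions made up for this example; they are not part of RFC 112 or of any existing MapGuide API.

```python
import sqlite3

# Hypothetical schema: one key/value metadata table shared by every map
# database, plus one table holding the tile images as BLOBs.
SCHEMA = """
CREATE TABLE IF NOT EXISTS metadata (
    name  TEXT PRIMARY KEY,
    value TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS tiles (
    base_layer_group TEXT    NOT NULL,
    scale_index      INTEGER NOT NULL,
    tile_column      INTEGER NOT NULL,
    tile_row         INTEGER NOT NULL,
    tile_data        BLOB    NOT NULL,
    PRIMARY KEY (base_layer_group, scale_index, tile_column, tile_row)
);
"""


def create_tile_db(path, map_definition, tile_format="png"):
    """Create (or open) a per-map tile database and record its metadata."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT OR REPLACE INTO metadata (name, value) VALUES (?, ?)",
        [("map_definition", map_definition), ("tile_format", tile_format)],
    )
    conn.commit()
    return conn


def put_tile(conn, group, scale_index, column, row, data):
    """Store one tile image, replacing any existing tile at that address."""
    conn.execute(
        "INSERT OR REPLACE INTO tiles VALUES (?, ?, ?, ?, ?)",
        (group, scale_index, column, row, data),
    )
    conn.commit()


def get_tile(conn, group, scale_index, column, row):
    """Return the tile bytes, or None if the tile has not been rendered."""
    found = conn.execute(
        "SELECT tile_data FROM tiles WHERE base_layer_group = ? "
        "AND scale_index = ? AND tile_column = ? AND tile_row = ?",
        (group, scale_index, column, row),
    ).fetchone()
    return None if found is None else bytes(found[0])


def map_definition_of(conn):
    """Read the map definition name back from the common metadata table."""
    found = conn.execute(
        "SELECT value FROM metadata WHERE name = 'map_definition'"
    ).fetchone()
    return None if found is None else found[0]
```

With a common metadata table like this, listing the maps in a directory would amount to opening each database file and calling map_definition_of on it.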

This could be a fairly involved project to implement - database integration in MapGuide, SQLite database schema, new MgTileService APIs and "under the covers" implementation, mapagent support for TMS, scripts to move file based tile caches into SQLite-based tile caches, etc.  This feels like a large project to me - possibly weeks or months of work.

As part of this work, we may also want to consider embedding PNG8 palette information into the MapDefinition and enhancing the MapDefinition to reference external tile sets via TMS.  TMS support in the Ajax and Fusion clients would have to be implemented if it's not already present.  We may also want to look at the OGC Web Map Tile Service as a future item.

Regards,
Trevor


-----Original Message-----
From: mapguide-internals-bounces at lists.osgeo.org [mailto:mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Zac Spitzer
Sent: May 13, 2011 11:16 AM
To: MapGuide Internals Mail List
Subject: Re: [mapguide-internals] MapGuide RFC 112 - sqlite based tile cache

The block size I'm using is 4096, but there are many files in the tile cache
which are one color and only 400-odd bytes, while others are 40k or more.
It's impossible to tune the disk use effectively.
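
To put numbers on that, here is a small back-of-the-envelope sketch in Python.  It assumes the simplest possible model, where every file is rounded up to a whole number of blocks, and ignores filesystem-specific behaviour such as storing very small files inline.  The function names are made up for this example.

```python
def allocated_size(file_size, block_size=4096):
    """Bytes occupied on disk when a file is rounded up to whole blocks."""
    blocks = -(-file_size // block_size)  # integer ceiling division
    return blocks * block_size


def space_efficiency(file_sizes, block_size=4096):
    """Fraction of the allocated disk space that holds real tile data."""
    used = sum(file_sizes)
    allocated = sum(allocated_size(size, block_size) for size in file_sizes)
    return used / allocated if allocated else 1.0
```

Under this model a 400 byte single-color tile still occupies a full 4096 byte block (under 10% efficient), while a 40,000 byte tile occupies 40,960 bytes (about 98% efficient), so no single block size suits both.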

With a map base the size of Australia, it becomes really impractical to
back up the tile cache.

A nightly backup of 40GB of files is no problem for my clients, but the
sheer volume of files presents a lot of trouble in terms of processing time.
Disks also get extremely fragmented due to the sporadic way the tiles are
written out to disk. Also, some IT departments run everything on a SAN, so
the waste of disk space does become a real issue cost-wise.

The existing performance is good; there's much more to gain from adding HTTP
cache headers to the tile responses and offloading requests to a proxy/CDN.
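
A minimal sketch of what that could look like, in Python with only the standard library.  This is not how mapagent builds responses today; the function name, the one-day max-age default and the use of a SHA-1 digest as the ETag are all assumptions for illustration.

```python
import hashlib
import time
from email.utils import formatdate


def tile_response(tile_bytes, if_none_match=None, max_age=86400,
                  content_type="image/png", now=None):
    """Return (status, headers, body) for a tile request.

    Adds Cache-Control, Expires and ETag so browsers and proxies can
    cache the tile, and answers a matching If-None-Match with 304.
    """
    now = time.time() if now is None else now
    # Assumption: a content hash is a good enough validator for a tile.
    etag = '"' + hashlib.sha1(tile_bytes).hexdigest() + '"'
    headers = {
        "Cache-Control": "public, max-age=%d" % max_age,
        "Expires": formatdate(now + max_age, usegmt=True),
        "ETag": etag,
    }
    if if_none_match == etag:
        # The client already holds this exact tile: send no body.
        return 304, headers, b""
    headers["Content-Type"] = content_type
    return 200, headers, tile_bytes
```

Once responses carry headers like these, a caching proxy or CDN in front of the web tier can answer repeat requests without touching MapGuide at all.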

TMS makes consuming tiled maps really easy via ajax/openlayers
http://wiki.osgeo.org/wiki/Tile_Map_Service_Specification
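
Part of what makes it easy is that a tile is just a plain URL.  The sketch below assumes the layout commonly used by TMS clients, version/layer/z/x/y.ext (for example 1.0.0/Sheboygan/3/5/2.png); the function name and the sample layer name are made up, and a real service should take its tile URLs from the TileMap resource rather than hard-coding this pattern.

```python
def parse_tms_path(path):
    """Split a TMS-style tile path into its components.

    Assumes the common layout version/layer/z/x/y.ext and raises
    ValueError for anything else.
    """
    parts = path.strip("/").split("/")
    if len(parts) != 5:
        raise ValueError("expected version/layer/z/x/y.ext, got %r" % path)
    version, layer, z, x, y_ext = parts
    y, dot, ext = y_ext.rpartition(".")
    if not dot:
        raise ValueError("tile name %r has no extension" % y_ext)
    return {
        "version": version,
        "layer": layer,
        "z": int(z),
        "x": int(x),
        "y": int(y),
        "ext": ext,
    }
```

A mapagent handler could use something like this to turn a request path into a lookup against whatever tile store sits behind it.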

z


On Sat, May 14, 2011 at 2:41 AM, Trevor Wekel
<trevor_wekel at otxsystems.com> wrote:

> We most certainly can.  Let's get back to the tile cache then.  SQLite will
> reduce the number of files we need to manage for a tile cache.  A mapagent
> plus SQLite tile cache solution should be faster than the existing web tier
> / server / disk solution due to the elimination of the TCP/IP hop.
>
> Exposing/copying the entire tile directory structure to a webserver would
> be faster than the mapagent plus SQLite solution but would not improve
> manageability.  Still tons of files to manage.
>
> There may be a trade off between performance and manageability.
>
> Regards,
> Trevor
>
>
> -----Original Message-----
> From: mapguide-internals-bounces at lists.osgeo.org [mailto:
> mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Zac Spitzer
> Sent: May 13, 2011 10:11 AM
> To: MapGuide Internals Mail List
> Subject: Re: [mapguide-internals] MapGuide RFC 112 - sqlite based tile
> cache
>
> can we split off the repository discussion into a separate thread?
>
> On Sat, May 14, 2011 at 2:06 AM, Trevor Wekel
> <trevor_wekel at otxsystems.com> wrote:
>
> > Hmm... Now how many times have I heard "Repository Corruption" on the
> > mailing lists?  Moving to files on disk would make repository hacking
> > easier and guarantee that you would not lose the entire library at
> > once.  I do have a couple of concerns about moving to a file based
> > approach.  We may run out of file handles if we keep the files open,
> > and there may be some overhead for executing fopen/fclose on every
> > resource access.
> >
> >
> > Regards,
> > Trevor
> >
> >
> > -----Original Message-----
> > From: mapguide-internals-bounces at lists.osgeo.org [mailto:
> > mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Traian Stanev
> > Sent: May 13, 2011 9:44 AM
> > To: MapGuide Internals Mail List
> > Subject: RE: [mapguide-internals] MapGuide RFC 112 - sqlite based tile
> > cache
> >
> >
> > For ultimate serving speed of static tile caches, it would be best to
> > bypass everything MapGuide and serve the tile cache directly from a
> > directory exposed via Apache. This way you would automatically get
> > browser caching as well. This is also why file storage of the tile
> > cache is best for optimal serving speed.
> >
> > As far as the resource repository, IMO it's small enough to use a direct
> > storage of XML files on the file system (for example, mapping what are
> > currently XMLdb paths to file system paths). Using a database to store
> > blobs in there would complicate things more than necessary and also add
> > an unnecessary dependency to what could be a really simple piece of code
> > (reading and writing files).
> >
> > Traian
> >
> >
> >
> > -----Original Message-----
> > From: mapguide-internals-bounces at lists.osgeo.org [mailto:
> > mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Trevor Wekel
> > Sent: Friday, May 13, 2011 11:33 AM
> > To: MapGuide Internals Mail List
> > Subject: RE: [mapguide-internals] MapGuide RFC 112 - sqlite based tile
> > cache
> >
> > Since MapGuide is targeted for both Windows and Linux, I think SQLite is
> > the only choice.  If we are going to introduce another local database
> > into MapGuide, perhaps we should consider other use cases for it.  Here
> > are a few just off the top of my head:
> >
> > Response caching for mapagent and possibly web extensions
> > - Operations like getting tiles and dynamic map overlays (for initial
> > views) may be cacheable if we remove SESSION from the HTTP GET/POST and
> > put it in a cookie.  We would have to implement time to live logic for
> > this to be truly effective.
> >
> > Move log files to database storage
> > - This could make query and analysis of log files easier
> >
> > Serving tiles directly from the mapagent
> > - Copying/propagating the SQLite database files to the web tier would
> > eliminate the agent/server hop
> >
> > Move to a SQLite backend for MgResourceService
> > - Maintaining multiple database technologies in one product could be
> > additional overhead
> > - BerkeleyDB doesn't seem to be a great fit for the Session repository.
> > After six years, we are still working on it.  Write-Ahead Logging in
> > SQLite could be effective for the Session repository
> > http://www.sqlite.org/wal.html.
> >
> >
> > A new service "MgStorageService" implemented in MapGuideCommon could wrap
> > the SQLite database and make it accessible to the server, agent, and web
> > extensions.   The API to MgStorageService would have to be considered
> > carefully based on expected use cases.
> >
> >
> >
> > Regards,
> > Trevor
> >
> > -----Original Message-----
> > From: mapguide-internals-bounces at lists.osgeo.org [mailto:
> > mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Traian Stanev
> > Sent: May 13, 2011 8:49 AM
> > To: 'MapGuide Internals Mail List'
> > Subject: RE: [mapguide-internals] MapGuide RFC 112 - sqlite based tile
> > cache
> >
> >
> > It's very strange that you are only getting <20% space efficiency out of
> > the file system -- the tiles must be really tiny or the block size really
> > huge. Perhaps this is another thing to look into (i.e. pick a better file
> > system, but I guess sqlite is just going to be used as a file system in a
> > file in this case).
> >
> > Yes, copying one file once is faster due to less seeking involved
> > compared to 300K files. But, if you want to do incremental backups, where
> > there are only a few changed tiles, things will likely reverse.
> >
> > Anyway, if you are looking for a file system in a file, on Windows sqlite
> > is probably the only choice. On Linux, there would be more options (ext3
> > in a file with B-tree indexing and tail-packing enabled, for example).
> >
> > Traian
> >
> >
> > -----Original Message-----
> > From: mapguide-internals-bounces at lists.osgeo.org [mailto:
> > mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Zac Spitzer
> > Sent: Friday, May 13, 2011 3:49 AM
> > To: MapGuide Internals Mail List
> > Subject: Re: [mapguide-internals] MapGuide RFC 112 - sqlite based tile
> > cache
> >
> > Let's take a tiny tiled map as an example, using win7 x64, default
> > configurations, quad core machine, fully seeded tile cache.
> >
> > Samples_Sheboygan_MapsTiled_Sheboygan
> >
> > Size: 267 MB (280,474,999 bytes)
> > Size on disk: 1.50 GB (1,613,549,568 bytes)
> > Contains: 375,084 Files, 527 Folders
> >
> > as a single zip file, stored (i.e. zero compression)
> >
> > Samples_Sheboygan_MapsTiled_Sheboygan_2.zip
> > 361 MB (379,239,309 bytes)
> >
> > Block size is the issue, but you can't really optimise this as raster
> > tiles will require a different blocksize than vector tiles.
> >
> > Some simple Robocopy backup tests (same disk/non raided/default options)
> >
> > copying a tile cache in a zip file
> >
> >               Total    Copied   Skipped  Mismatch    FAILED    Extras
> >    Dirs :         1         0         1         0         0         0
> >   Files :         1         1         0         0         0         0
> >   Bytes :  361.67 m  361.67 m         0         0         0         0
> >   Times :   0:00:06   0:00:06                       0:00:00   0:00:00
> >
> >
> >   Speed :            62632420 Bytes/sec.
> >   Speed :            3583.855 MegaBytes/min.
> >
> > copying the raw tilecache using /MIR  (requiring a 65% CPU utilisation)
> >
> >               Total    Copied   Skipped  Mismatch    FAILED    Extras
> >    Dirs :       528       527         1         0         0         0
> >   Files :    375084    375084         0         0         0         0
> >   Bytes :  267.48 m  267.48 m         0         0         0         0
> >   Times :   0:27:30   0:19:54                       0:00:00   0:07:35
> >
> >
> >   Speed :              234717 Bytes/sec.
> >   Speed :              13.430 MegaBytes/min.
> >
> > z
> >
> >
> > On Fri, May 13, 2011 at 9:28 AM, Trevor Wekel
> > <trevor_wekel at otxsystems.com> wrote:
> > > I agree with Traian.  There are alternative solutions for replication
> > > and backup.
> > >
> > > If we are considering replication and backup for the tile sets, we
> > > should also consider replication for the XML definitions (layer,
> > > feature, map) used to generate those tiles.  In other words, I would
> > > like to consider tile replication and repository replication together.
> > >
> > > The replication and backup functionality in MapGuide is certainly
> > > lacking.  MGP files do not propagate user/group/role information.  The
> > > only "easy" way to back up or replicate an entire server is to stop the
> > > MapGuide Server and copy all the files around.  I doubt that Rsync or
> > > robocopy could replicate live BerkeleyDB files.
> > >
> > > I also took a quick look at SQLite replication.  Google didn't turn up
> > > anything that was LGPL and actively maintained.  SQLite does have an
> > > internal hook that we could use to replicate stuff stored in SQLite
> > > http://www.sqlite.org/capi3ref.html#sqlite3_update_hook.  We could roll
> > > our own.
> > >
> > > Since we allow access to external data sources (SHP files, ECW files,
> > > etc), replication of file based data from server to server would have
> > > to be considered as part of the solution.  And replication to a UNC
> > > path would be an easy way to implement backup.
> > >
> > > Master/Slave replication based on files and SQLite could be implemented
> > > in phases.  Here's a very rough outline:
> > >
> > > Phase 1 - Tile and external data replication
> > > - Reintroduce master/slave concept for MapGuide Server
> > > - Implement server to server TCP/IP communication logic to transfer
> > > files
> > > - Implement local "file copy (UNC backup)" logic
> > >
> > > Phase 2 - Switch to SQLite for tiles
> > > - Add SQLite to the MapGuide Server.  Recode MgTileService to populate
> > > the database
> > > - Implement a SQLite update hook
> > > - Implement server to server TCP/IP communication logic for propagating
> > > SQLite updates
> > >
> > > Phase 3 - Full repository replication
> > > - Rip out BerkeleyDB and replace it with SQLite
> > > - Use existing mechanism from Phase 2 to implement full replication of
> > > repository
> > >
> > > Regards,
> > > Trevor
> > >
> > > -----Original Message-----
> > > From: mapguide-internals-bounces at lists.osgeo.org [mailto:
> > > mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Traian Stanev
> > > Sent: May 12, 2011 11:59 AM
> > > To: 'MapGuide Internals Mail List'
> > > Subject: RE: [mapguide-internals] MapGuide RFC 112 - sqlite based tile
> > > cache
> > >
> > >
> > > Hi Tom,
> > >
> > > It would depend on what exactly the problem is -- sure, if one is using
> > > Windows Explorer drag and drop to back up the files it would be faster
> > > to have one file. But if one is using rsync or robocopy (or similar), it
> > > still makes sense to use files, since those programs know how to copy
> > > only the changed files (or even changed parts of files).
> > >
> > > Traian
> > >
> > >
> > > _______________________________________________
> > > mapguide-internals mailing list
> > > mapguide-internals at lists.osgeo.org
> > > http://lists.osgeo.org/mailman/listinfo/mapguide-internals
> > >
> > >
> >
> >
> >
> > --
> > Zac Spitzer
> > Solution Architect / Director
> > Ennoble Consultancy Australia
> > http://www.ennoble.com.au
> > http://zacster.blogspot.com
> > +61 405 847 168
>
>

