[mapguide-internals] MapGuide RFC 112 - sqlite based tile cache
Traian Stanev
traian.stanev at autodesk.com
Fri May 13 10:48:59 EDT 2011
It's very strange that you are only getting <20% space efficiency out of the file system -- the tiles must be really tiny or the block size really huge. Perhaps this is another thing to look into (i.e. pick a better file system, though I guess sqlite is just going to be used as a file system in a file in this case).
Yes, copying one file once is faster due to less seeking involved compared to 300K files. But, if you want to do incremental backups, where there are only a few changed tiles, things will likely reverse.
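To illustrate the incremental case, here is a minimal Python sketch that copies only files that are new or whose size or modification time differs from the backup copy. This is not how rsync or robocopy work internally; the function name `incremental_copy` and the directory arguments are hypothetical.

```python
import os
import shutil


def incremental_copy(src_root, dst_root):
    """Copy only new or changed files (compared by size and mtime).

    Returns the number of files actually copied.
    """
    copied = 0
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        dst_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(dst_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_dir, name)
            s = os.stat(src)
            if os.path.exists(dst):
                d = os.stat(dst)
                # Unchanged: same size and the backup is not older than the source.
                if d.st_size == s.st_size and int(d.st_mtime) >= int(s.st_mtime):
                    continue
            shutil.copy2(src, dst)  # copy2 preserves the modification time
            copied += 1
    return copied
```

With a per-tile layout, a second run after a handful of tiles change copies only those tiles. With a single container file, the same check would flag the whole file as changed.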
Anyway, if you are looking for a file system in a file, on Windows sqlite is probably the only choice. On Linux, there would be more options (ext3 in a file with B-tree indexing and tail-packing enabled, for example).
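To make the "file system in a file" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `tiles` table, its columns, and the helper names are made up for illustration; they are not the schema proposed in RFC 112.

```python
import sqlite3


def open_tile_cache(path):
    """Open (or create) a single-file tile store. Schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tiles (
               scale_index INTEGER NOT NULL,
               tile_row    INTEGER NOT NULL,
               tile_col    INTEGER NOT NULL,
               image       BLOB    NOT NULL,
               PRIMARY KEY (scale_index, tile_row, tile_col)
           )"""
    )
    return conn


def put_tile(conn, scale_index, row, col, image_bytes):
    """Insert a tile, overwriting any existing tile at the same address."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO tiles VALUES (?, ?, ?, ?)",
            (scale_index, row, col, image_bytes),
        )


def get_tile(conn, scale_index, row, col):
    """Return the tile bytes, or None if the tile is not cached."""
    cur = conn.execute(
        "SELECT image FROM tiles"
        " WHERE scale_index = ? AND tile_row = ? AND tile_col = ?",
        (scale_index, row, col),
    )
    hit = cur.fetchone()
    return None if hit is None else bytes(hit[0])
```

The primary key plays the role of the folder/file path, so a lookup is an index search inside one file rather than a directory walk.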
Traian
-----Original Message-----
From: mapguide-internals-bounces at lists.osgeo.org [mailto:mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Zac Spitzer
Sent: Friday, May 13, 2011 3:49 AM
To: MapGuide Internals Mail List
Subject: Re: [mapguide-internals] MapGuide RFC 112 - sqlite based tile cache
Let's take a small tiled map as an example, using Win7 x64, default configurations, a quad core machine, and a fully seeded tile cache.
Samples_Sheboygan_MapsTiled_Sheboygan
Size: 267 MB (280,474,999 bytes)
Size on disk: 1.50 GB (1,613,549,568 bytes)
Contains: 375,084 Files, 527 Folders
As a single zip file, using the "store" method (i.e. zero compression):
Samples_Sheboygan_MapsTiled_Sheboygan_2.zip
361 MB (379,239,309 bytes)
Block size is the issue, but you can't really optimise this as raster tiles will require a different blocksize than vector tiles.
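A quick way to check the figures above is the following Python sketch. The 4096-byte cluster size is an assumption (the usual NTFS default, which is not stated in this thread), and `space_report` is a hypothetical helper name.

```python
def space_report(logical_bytes, on_disk_bytes, file_count, cluster_size=4096):
    """Summarise how much allocated disk space holds real tile data.

    cluster_size is an assumed value, not something reported above.
    """
    clusters_used = on_disk_bytes / cluster_size
    return {
        "efficiency": logical_bytes / on_disk_bytes,
        "avg_file_bytes": logical_bytes / file_count,
        "clusters_used": clusters_used,
        "clusters_per_file": clusters_used / file_count,
    }


# Figures taken from the Explorer properties quoted above.
report = space_report(280_474_999, 1_613_549_568, 375_084)
for key, value in report.items():
    print(f"{key}: {value:.4f}")
```

Under that assumption the tiles average roughly 750 bytes each but occupy about one 4 KB cluster apiece, which gives a space efficiency of roughly 17%.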
Some simple Robocopy backup tests (same disk/non raided/default options)
copying a tile cache in a zip file
               Total    Copied   Skipped  Mismatch    FAILED    Extras
    Dirs :         1         0         1         0         0         0
   Files :         1         1         0         0         0         0
   Bytes :  361.67 m  361.67 m         0         0         0         0
   Times :   0:00:06   0:00:06                       0:00:00   0:00:00

   Speed :            62632420 Bytes/sec.
   Speed :            3583.855 MegaBytes/min.
copying the raw tile cache using /MIR (which required 65% CPU utilisation)
               Total    Copied   Skipped  Mismatch    FAILED    Extras
    Dirs :       528       527         1         0         0         0
   Files :    375084    375084         0         0         0         0
   Bytes :  267.48 m  267.48 m         0         0         0         0
   Times :   0:27:30   0:19:54                       0:00:00   0:07:35

   Speed :              234717 Bytes/sec.
   Speed :              13.430 MegaBytes/min.
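To put the two Robocopy runs side by side, here is a small Python sketch that uses only the numbers reported above. The helper `parse_hms` and the variable names are hypothetical.

```python
def parse_hms(text):
    """Convert an h:mm:ss string, as printed by Robocopy, to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Values copied from the two Robocopy summaries.
zip_bytes_per_sec = 62_632_420
raw_bytes_per_sec = 234_717
raw_files = 375_084
raw_copy_seconds = parse_hms("0:19:54")

speedup = zip_bytes_per_sec / raw_bytes_per_sec
files_per_sec = raw_files / raw_copy_seconds

print(f"single file is ~{speedup:.0f}x faster in bytes/sec")
print(f"raw copy handled ~{files_per_sec:.0f} files/sec")
```

By these figures the single-file copy runs at roughly 267 times the byte rate of the per-tile copy, which managed about 314 files per second. That suggests per-file overhead, not raw bandwidth, is the limit.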
z
On Fri, May 13, 2011 at 9:28 AM, Trevor Wekel <trevor_wekel at otxsystems.com> wrote:
> I agree with Traian. There are alternative solutions for replication and backup.
>
> If we are considering replication and backup for the tile sets, we should also consider replication for the XML definitions (layer, feature, map) used to generate those tiles. In other words, I would like to consider tile replication and repository replication together.
>
> The replication and backup functionality in MapGuide is certainly lacking. MGP files do not propagate user/group/role information. The only "easy" way to back up or replicate an entire server is to stop the MapGuide Server and copy all the files around. I doubt that Rsync or robocopy could replicate live BerkeleyDB files.
>
> I also took a quick look at SQLite replication. Google didn't turn up anything that was LGPL and actively maintained. SQLite does have an internal hook that we could use to replicate stuff stored in SQLite (http://www.sqlite.org/capi3ref.html#sqlite3_update_hook). We could roll our own.
>
> Since we allow access to external data sources (SHP files, ECW files, etc), replication of file based data from server to server would have to be considered as part of the solution. And replication to a UNC path would be an easy way to implement backup.
>
> Master/Slave replication based on files and SQLite could be implemented in phases. Here's a very rough outline:
>
> Phase 1 - Tile and external data replication
> - Reintroduce master/slave concept for MapGuide Server
> - Implement server to server TCP/IP communication logic to transfer files
> - Implement local "file copy (UNC backup)" logic
>
> Phase 2 - Switch to SQLite for tiles
> - Add SQLite to the MapGuide Server. Recode MgTileService to populate the database
> - Implement a SQLite update hook
> - Implement server to server TCP/IP communication logic for propagating SQLite updates
>
> Phase 3 - Full repository replication
> - Rip out BerkeleyDB and replace it with SQLite
> - Use existing mechanism from Phase 2 to implement full replication of repository
>
> Regards,
> Trevor
>
> -----Original Message-----
> From: mapguide-internals-bounces at lists.osgeo.org [mailto:mapguide-internals-bounces at lists.osgeo.org] On Behalf Of Traian Stanev
> Sent: May 12, 2011 11:59 AM
> To: 'MapGuide Internals Mail List'
> Subject: RE: [mapguide-internals] MapGuide RFC 112 - sqlite based tile cache
>
>
> Hi Tom,
>
> It would depend on what exactly the problem is -- sure, if one is using Windows Explorer drag and drop to back up the files, it would be faster to have one file. But if one is using rsync or robocopy (or similar), it still makes sense to use files, since those programs know how to copy only the changed files (or even changed parts of files).
>
> Traian
>
>
> _______________________________________________
> mapguide-internals mailing list
> mapguide-internals at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/mapguide-internals
>
>
--
Zac Spitzer
Solution Architect / Director
Ennoble Consultancy Australia
http://www.ennoble.com.au
http://zacster.blogspot.com
+61 405 847 168