<div dir="ltr"><div dir="ltr">Walt and everyone,<br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Aug 12, 2020 at 9:00 AM Walt Ludwick <<a href="mailto:walt@valedalama.net">walt@valedalama.net</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Oh, i get it (duh!): overhead = minimum file size. Makes sense, since every .gpkg is its own SQLite instance -and 5mb is a small price to pay for a RDBMS in a single file, essentially. Still, something to bear in mind, while designing one's information architecture for the GIS. Thanks, Charles!</div></blockquote><div><br></div><div>Walt, since you state that you "aim to open it up to colleagues" at some point, you might just want to bite the bullet right from the start and stuff it all into PostGIS. Not to belittle all the good stuff in SQLite at all, but that's no kind of multi-user database. Another cool thing you get for free with PostGIS - when you get around to building your web-based access, you can do a lot of spatial processing right in PostGIS, without requiring any client or middleware libraries / integration.<br></div><div><div><br></div><div>I don't know about the OS/X setup but on my Ubuntu 20.04 I see I have somehow managed to install a "shp2pgsql-gui" tool that looks like it might be of use in facilitating a migration.<br></div></div><div><br></div><div>Worth checking out is Regina Obe's and Leo Hsu's Manning book, PostGIS in Action, which is in <a href="https://www.manning.com/books/postgis-in-action-third-edition?utm_source=google&utm_medium=search&utm_campaign=dynamicsearch&gclid=EAIaIQobChMI2bWl9YuW6wIVchh9Ch1JfwrcEAAYAiAAEgIix_D_BwE">early access third edition</a>. 
Chapter 5 "Using PostGIS on the desktop" gives an overview of using PostGIS-hosted data with OpenJUMP, QGIS, gvSIG and Jupyter.<br></div><div><br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Aug 11, 2020 at 9:02 PM Charles Dixon-Paver <<a href="mailto:charles@kartoza.com" target="_blank">charles@kartoza.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Sorry for the confusion Walt, but the "overhead" I was referring to here is actually the fact that gpkg is implemented as a SQLite container with a <i>minimum</i> filesize which adds a couple MB. I think the "overhead" will vary depending on the type of data stored. Basically, if you make one for every shapefile you could probably expect to end up with an additional ~5MB of bloat to your existing data store for each shapefile converted...<div><br></div><div>Upper limits as you stated should be (in theory) ~140TB, or at least somewhere upwards from whatever I would usually consider practical to store in a database that's stored as a single flat file...</div><div><br></div><div>Regarding geomoose on Mac, you could try use docker to test it out <a href="https://github.com/geomoose/docker-geomoose" target="_blank">https://github.com/geomoose/docker-geomoose</a></div><div><br></div><div>In terms of the specifics on how to restructure your data infrastructure, it seems like it's going to depend a lot on the specifics of your use case and is probably outside the scope of this mailing list, or at least this thread... 
Migrating projects is another beast altogether, so maybe someone else can offer advice on that.</div><div><br></div><div>Regards</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, 11 Aug 2020 at 20:20, Walt Ludwick <<a href="mailto:walt@valedalama.net" target="_blank">walt@valedalama.net</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>This makes good sense to me, Charles. I've got enough experience with databases (tho not so much with geographic ones) that i'm comfortable w/ SQL query tools. Unless a list or directory is small enough to eyeball with ease (certainly the case with this legacy QGIS instance i've inherited), i'd much rather search than dig for the data, so... In this sense at least, less fragmentation is more.</div><div><br></div><div>That being said: i don't know if i can bundle all into a single .gpkg; if there is a size limit as low as 5MB on each one, then certainly not. Google search on string "Geopackage size limit" returns multiple credible-looking pages that cite a limit (subject to filesystem constraints) of 140TB. Can you clarify about the "~5MB of storage overhead for each unique .gpkg" comment?</div><div><br></div><div>In any case: if i go for selective consolidation -selection scheme still TBD[1]- then i must certainly bear in mind your caution about the data loss risk associated with careless use of certain processing tools/ configurations. If there be tools & configs oriented to one & only one .gpkg file, i don't yet know about them... 
But i'll certainly watch out for that and keep a good backup!</div><div><br></div><div>[1] As to selection (or classification, i should say) and naming of .gpkg files that will consolidate any number of .shp files: i am thinking along lines of either data type (raster and vector being two high-level groupings, with subtypes that might have more to do with the schema of tabular data), or else data source (which often has much to do with data reliability, maintainability -and value, ultimately). Need to think a bit more deeply on this, and would be happy for any guidance from more experienced GIS admins.</div><div><br></div><div><br></div><div>On Tue, Aug 11, 2020 at 2:31 PM Charles Dixon-Paver <<a href="mailto:charles@kartoza.com" target="_blank">charles@kartoza.com</a>> wrote:</div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div></div>Regarding the one-vs-many approach to gpkgs, I recommend consolidation (within reason). I feel that the temptation to use gpkg as a drop-in replacement for shp is familiarity with processes I personally consider to be largely outmoded. I think it's worth getting over the initial (relatively shallow) learning curve so that when you start working with db oriented systems like PostGIS, everything makes sense right out of the gate.<div><br></div><div><div>Basically it boils down to how you want to manage or distribute them as you don't have traditional db roles. Personally, I try to package things into "data.gpkg/something" and
"data.gpkg/somethingelse"
wherever possible, rather than
"a.gpkg/a" and "b.gpkg/b". It usually makes moving data around easier for me. If you have a lot of inputs, maybe split them into unique gpkgs based on some categorising criteria (like you might do with a schema) rather than one monolithic gpkg. Performing maintenance (vacuum) on a large number of unique gpkgs seems like an unnecessary chore.</div><div></div><div><br></div><div>One limitation for gpkg is that certain processing tools/ configurations will only support writing to an entire gpkg, so if you lack experience you'll need to be careful not to overwrite all of your data and also have a decent backup plan in place. Usually you can get away with utilising a scratch.gpkg for that purpose with no risk to your primary datastore.</div><div><br></div><div>Using the one-per-item approach offers little data management benefit over shapefiles aside from removing the auxiliary files and being able to store styles (as well as lae). From what I understand there is little performance benefit over shp (both use WKB), and there is ~5MB of storage overhead for each unique gpkg (if I remember correctly), though this will depend on your use case.</div><div></div><div></div><div></div><div><br></div><div>Hope that helps.</div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, 11 Aug 2020 at 15:13, Basques, Bob (CI-StPaul) <<a href="mailto:bob.basques@ci.stpaul.mn.us" target="_blank">bob.basques@ci.stpaul.mn.us</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div lang="EN-US">
<div>
<p class="MsoNormal"><b><span style="color:rgb(0,176,240)">Depending on your end goal, you might be more suited to leaving things as they are and using some sort of content explorer to organize the existing data. Then worry about migrating to different formats as
needed.<u></u><u></u></span></b></p>
<p class="MsoNormal"><b><span style="color:rgb(0,176,240)"><u></u> <u></u></span></b></p>
<p class="MsoNormal"><b><span style="color:rgb(0,176,240)">We’ve been using GeoMoose for this purpose. It can connect to just about any data source on the back end, such as SHP, Postgres, and GeoPackage to name a few, and can also connect to proprietary services.
Because it can use MapServer as a display engine and data query tool, it lends itself to online exploration of the data without the need for a full-blown GIS tool. This allows for widespread use by non-GIS pros. The datasets can still be managed
by you with QGIS and/or in Postgres/PostGIS, or whatever you prefer for that purpose. The MapServer setup allows for connecting to just about any type of service behind the scenes, and with the right configuration, you can also enable each dataset in the GeoMoose
catalog as a WMS/WFS data source, the standard for open data format access and publishing.<u></u><u></u></span></b></p>
<p class="MsoNormal"><b><span style="color:rgb(0,176,240)"><u></u> <u></u></span></b></p>
<p class="MsoNormal"><b><span style="color:rgb(0,176,240)">Bobb<u></u><u></u></span></b></p>
<p class="MsoNormal"><b><span style="color:rgb(0,176,240)"><u></u> <u></u></span></b></p>
<p class="MsoNormal"><b><span style="color:rgb(0,176,240)"><u></u> <u></u></span></b></p>
<p class="MsoNormal"><b><span style="color:rgb(0,176,240)"><u></u> <u></u></span></b></p>
<div style="border-color:currentcolor currentcolor currentcolor blue;border-style:none none none solid;border-width:medium medium medium 1.5pt;padding:0in 0in 0in 4pt">
<div>
<div style="border-color:rgb(225,225,225) currentcolor currentcolor;border-style:solid none none;border-width:1pt medium medium;padding:3pt 0in 0in">
<p class="MsoNormal"><b>From:</b> Qgis-user <<a href="mailto:qgis-user-bounces@lists.osgeo.org" target="_blank">qgis-user-bounces@lists.osgeo.org</a>> <b>
On Behalf Of </b>Walt Ludwick<br>
<b>Sent:</b> Tuesday, August 11, 2020 7:45 AM<br>
<b>To:</b> <a href="mailto:qgis-user@lists.osgeo.org" target="_blank">qgis-user@lists.osgeo.org</a><br>
<b>Subject:</b> Re: [Qgis-user] Migrating legacy QGIS instance<u></u><u></u></p>
</div>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<div>
<div>
<p class="MsoNormal">I'm on MacOS -and not so very comfortable with command line scripting- so it looks like i might have to go the drag&drop way to import these .shp files. Will take some time, but at least that way i can be sure about what i've put where,
and in what form. <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">But i do wonder about the (a) "stick multiple shps into a single gpkg" OR (b) "create one per feature" decision, since i'm not experienced enough to have a clear preference about this. Can you say anything about pros & cons of going one
way vs the other?<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<div>
<p class="MsoNormal">On Tue, Aug 11, 2020 at 11:45 AM Charles Dixon-Paver <<a href="mailto:charles@kartoza.com" target="_blank">charles@kartoza.com</a>> wrote:<u></u><u></u></p>
</div>
<blockquote style="border-color:currentcolor currentcolor currentcolor rgb(204,204,204);border-style:none none none solid;border-width:medium medium medium 1pt;padding:0in 0in 0in 6pt;margin-left:4.8pt;margin-right:0in">
<div>
<p class="MsoNormal">Easiest way for me is to use the GDAL <a href="https://gdal.org/programs/ogr2ogr.html" target="_blank">
ogr2ogr</a> command using a bash script or cmd batch to traverse your directories (depending on how you installed QGIS this should be on your path). I don't know what environment you're running though.<u></u><u></u></p>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">You can either stick multiple shps into a single gpkg or create one per feature as you prefer. ogr2ogr can also push shp files directly into PostGIS. When you want to consolidate or migrate data (between gpkgs or from gpkg to PostGIS) you
can simply select the feature layers you want and use drag and drop from the QGIS 3 Browser panel to copy multiple features to a target location.<u></u><u></u></p>
</div>
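<div>
<p class="MsoNormal">A minimal sketch of the directory traversal described above, written in Python rather than bash. It assumes ogr2ogr is on your PATH and uses only the ogr2ogr options -f, -update and -nln. The function names (layer_name, build_commands, run), the example paths and the layer naming scheme (folder names joined with underscores) are illustrative assumptions, not something from this thread. By default it only prints the commands so they can be reviewed first.</p>
<pre>
```python
import subprocess
from pathlib import Path


def layer_name(shp, root):
    """Name a layer after its path relative to root, e.g. roads/main.shp -> roads_main."""
    rel = shp.relative_to(root).with_suffix("")
    # Joining the folder names avoids clashes between same-named files in different folders.
    return "_".join(rel.parts).replace(" ", "_").lower()


def build_commands(root, target_gpkg):
    """Return one ogr2ogr argument list per .shp file found under root."""
    root = Path(root)
    target = Path(target_gpkg)
    commands = []
    # Note: the "*.shp" pattern is case-sensitive on most systems, so .SHP files are skipped.
    for i, shp in enumerate(sorted(root.rglob("*.shp"))):
        cmd = ["ogr2ogr", "-f", "GPKG"]
        # The first call creates the GeoPackage; later calls add layers to it.
        if i > 0 or target.exists():
            cmd.append("-update")
        cmd += [str(target), str(shp), "-nln", layer_name(shp, root)]
        commands.append(cmd)
    return commands


def run(root, target_gpkg, dry_run=True):
    """Print every command; execute them only when dry_run is False."""
    for cmd in build_commands(root, target_gpkg):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical paths: adjust to your own folder tree before running.
    run("legacy_gis", "data.gpkg", dry_run=True)
```
</pre>
<p class="MsoNormal">Writing to a fresh GeoPackage (or a scratch copy) keeps the original shapefiles untouched. The same loop could in principle target PostGIS by swapping the output for ogr2ogr's PostgreSQL driver and a PG: connection string, though that variant is not shown here.</p>
</div>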
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Others might have different approaches though.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Regards<u></u><u></u></p>
</div>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<div>
<p class="MsoNormal">On Tue, 11 Aug 2020 at 12:24, Walt Ludwick <<a href="mailto:walt@valedalama.net" target="_blank">walt@valedalama.net</a>> wrote:<u></u><u></u></p>
</div>
<blockquote style="border-color:currentcolor currentcolor currentcolor rgb(204,204,204);border-style:none none none solid;border-width:medium medium medium 1pt;padding:0in 0in 0in 6pt;margin-left:4.8pt;margin-right:0in">
<div>
<p class="MsoNormal">I've inherited a legacy GIS, built up over some years in versions 2.x, that i'm now responsible to maintain. Being an almost complete n00b (did take a short course in QGIS a good few years ago, but still..), i could really use some advice
about migration.<br>
<br>
i've created a new QGIS instance in version 3.14, into which i am trying to bring all useful content from our old system: oodles of shapefiles, essentially, plus all those other files (each .shp file appears to bring with it a set of .shx, .dbf, .prj, .qpj files,
plus a .cpg file for each layer, it seems). This is a significant dataset- 14 GB, >1000 files -and that is just base data, not counting Projects built on this data or Layouts used for presenting these projects in various ways. Some of this is cruft that i
can happily do without, but still: i've got a lot of porting-over to do, without a clear idea of how best to do it.
<br>
<br>
The one thing i'm clear about is: i want it all in a non-proprietary database (i.e. no more mess of .shp and related files) that is above all quick & easy to navigate & manage. It is a single-user system at this point, but i do aim to open it up to colleagues
(off-LAN, i.e. via Internet) as soon as i've developed simple apps for them to use. No idea how long it'll take me to get there, so...<br>
<br>
Big question at this point is: What should be the new storage format for all this data? Having read a few related opinions on StackOverflow, i get the sense that GeoPackage will probably make for easiest migration (per
<a href="https://medium.com/@GispoFinland/learn-spatial-sql-and-master-geopackage-with-qgis-3-16b1e17f0291" target="_blank">
this encouraging article</a>, it's a simple matter of drag&drop -simple if you have just a few, i guess! [1]), and can easily support my needs in the short term, but then i wonder: How will i manage migration to PostGIS when i eventually put this system online
with different users/ roles enabled?<u></u><u></u></p>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">[1] Given that i need to pull in some hundreds of .shp files that are stored in a tree of many folders & subfolders, i also wonder: is there a simple way that i can ask QGIS to traverse a certain directory, pull in all the .shp files -each
as its own .gpkg layer, i suppose?<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Any advice about managing this migration would be much appreciated!<u></u><u></u></p>
</div>
</div>
<p class="MsoNormal">_______________________________________________<br>
Qgis-user mailing list<br>
<a href="mailto:Qgis-user@lists.osgeo.org" target="_blank">Qgis-user@lists.osgeo.org</a><br>
List info: <a href="https://lists.osgeo.org/mailman/listinfo/qgis-user" target="_blank">
https://lists.osgeo.org/mailman/listinfo/qgis-user</a><br>
Unsubscribe: <a href="https://lists.osgeo.org/mailman/listinfo/qgis-user" target="_blank">
https://lists.osgeo.org/mailman/listinfo/qgis-user</a><u></u><u></u></p>
</blockquote>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote></div>
</blockquote></div></div>
</blockquote></div>
</blockquote></div>
</blockquote></div><br clear="all"><br>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr">Chris Hermansen · clhermansen "at" gmail "dot" com<br><br>C'est ma façon de parler.</div></div></div>