[Qgis-user] Migrating legacy QGIS instance

Walt Ludwick walt at valedalama.net
Tue Aug 11 11:20:23 PDT 2020


This makes good sense to me, Charles.  I've got enough experience with
databases (tho not so much with geographic ones) that i'm comfortable w/
SQL query tools. Unless a list or directory is small enough to eyeball with
ease (certainly the case with this legacy QGIS instance i've inherited),
i'd much rather search than dig for the data, so... In this sense at least,
less fragmentation is more.
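
To make the "search rather than dig" idea concrete: a GeoPackage is a SQLite file, and the GeoPackage standard requires a gpkg_contents table that lists every layer. The sketch below, using only Python's standard library, lists the layers in one .gpkg and filters them by a search string. The function names and the example path are made up for illustration, and the file is assumed to already exist (sqlite3.connect would otherwise create an empty file).

```python
import sqlite3


def list_layers(gpkg_path):
    """Return (table_name, data_type, identifier) rows from gpkg_contents."""
    con = sqlite3.connect(gpkg_path)
    try:
        return con.execute(
            "SELECT table_name, data_type, identifier "
            "FROM gpkg_contents ORDER BY table_name"
        ).fetchall()
    finally:
        con.close()


def find_layers(gpkg_path, text):
    """Case-insensitive search over layer table names and identifiers."""
    needle = text.lower()
    return [
        row for row in list_layers(gpkg_path)
        if needle in row[0].lower() or needle in (row[2] or "").lower()
    ]
```

Something like find_layers("data.gpkg", "road") would then return only the layers whose table name or identifier contains "road", whichever kind of layer they are (data_type is typically "features" or "tiles").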

That being said: i don't know if i can bundle all into a single .gpkg; if
there is a size limit as low as 5MB on each one, then certainly not.
Google search on string "Geopackage size limit" returns multiple
credible-looking pages that cite a limit (subject to filesystem
constraints) of 140TB.  Can you clarify about the "~5MB of storage overhead
for each unique .gpkg" comment?

In any case: if i go for selective consolidation -selection scheme still
TBD[1]- then i must certainly bear in mind your caution about the data loss
risk associated with careless use of certain processing tools/
configurations.  If there be tools & configs oriented to one & only one
.gpkg file, i don't yet know about them... But i'll certainly watch out for
that and keep a good backup!

[1] As to selection (or classification, i should say) and naming of .gpkg
files that will consolidate any number of .shp files: i am thinking along
lines of either data type (raster and vector being two high-level
groupings, with subtypes that might have more to do with the schema of
tabular data), or else data source (which often has much to do with data
reliability, maintainability -and value, ultimately).  Need to think a bit
more deeply on this, and would be happy for any guidance from more
experienced GIS admins.
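
To illustrate the "group by data source" option, here is a rough sketch that walks a folder tree and plans one ogr2ogr call per shapefile, with one .gpkg per top-level folder. It only builds the command lines and does not run them. The grouping rule (first folder below the root stands for the data source), the "misc" fallback and the function names are assumptions made up for this example. The ogr2ogr flags used (-f, -update, -append, -nln) are documented ones, but their exact behaviour is worth checking against the installed GDAL version before running anything on real data.

```python
import re
from pathlib import Path


def layer_name(shp):
    """Turn a shapefile name into a conservative layer name (lowercase, underscores)."""
    return re.sub(r"[^0-9a-z]+", "_", shp.stem.lower()).strip("_")


def plan_commands(root, out_dir):
    """Build one ogr2ogr command per .shp found under root (nothing is executed)."""
    root, out_dir = Path(root), Path(out_dir)
    commands = []
    # Note: on a case-sensitive filesystem "*.shp" will not match "*.SHP".
    for shp in sorted(root.rglob("*.shp")):
        parts = shp.relative_to(root).parts
        # Hypothetical rule: the first folder below root stands for the data source.
        group = parts[0] if len(parts) > 1 else "misc"
        target = out_dir / f"{group}.gpkg"
        commands.append([
            "ogr2ogr", "-f", "GPKG", "-update", "-append",
            str(target), str(shp), "-nln", layer_name(shp),
        ])
    return commands
```

Printing the planned commands first (for example with shlex.join) makes it possible to review what would go where before anything is written. Two shapefiles with the same name in the same group would end up appended to the same layer, so the plan is worth scanning for duplicate layer names.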


On Tue, Aug 11, 2020 at 2:31 PM Charles Dixon-Paver <charles at kartoza.com>
wrote:

> Regarding the one-vs-many approach to gpkgs, I recommend consolidation
> (within reason). I feel that the temptation to use gpkg as a drop-in
> replacement for shp comes from familiarity with processes I personally
> consider to be largely outmoded. I think it's worth getting over the initial
> (relatively shallow) learning curve so that when you start working with db
> oriented systems like PostGIS, everything makes sense right out of the gate.
>
> Basically it boils down to how you want to manage or distribute them as
> you don't have traditional db roles. Personally, I try to package things
> into "data.gpkg/something" and "data.gpkg/somethingelse" wherever possible,
> rather than "a.gpkg/a" and "b.gpkg/b". It usually makes moving data around
> easier for me. If you have a lot of inputs, maybe split it into unique
> gpkgs based on some categorising criteria (like you might do with a schema)
> rather than one monolithic gpkg. Performing maintenance (vacuum) on a large
> number of unique gpkgs seems like an unnecessary chore.
>
> One limitation for gpkg is that certain processing tools/ configurations
> will only support writing to an entire gpkg, so if you lack experience
> you'll need to be careful not to overwrite all of your data and also have a
> decent backup plan in place. Usually you can get away with utilising a
> scratch.gpkg for that purpose with no risk to your primary datastore.
>
> Using the one-per-item approach offers little data management benefit over
> shapefiles aside from removing the auxiliary files and being able to store
> styles (as well as lae). There is little performance benefit over shp
> directly from what I understand (both use WKB), and there is ~5MB of
> storage overhead for each unique gpkg (if I remember correctly), though how
> much this matters will depend on your use case.
>
> Hope that helps.
>
> On Tue, 11 Aug 2020 at 15:13, Basques, Bob (CI-StPaul) <
> bob.basques at ci.stpaul.mn.us> wrote:
>
>> Depending on your end goal, you might be better off leaving things as
>> they are and using some sort of content explorer to organize the
>> existing data.  Then worry about migrating to different formats as needed.
>>
>> We've been using GeoMoose for this purpose.  It can connect to just
>> about any data source on the back end, such as SHP, Postgres, and
>> GeoPackage to name a few, and can also connect to proprietary services.
>> Because it can use MapServer as a display engine and data query
>> tool, it lends itself to online exploration of the data without the need
>> for a full-blown GIS tool.  This allows for widespread use by non-GIS
>> pros.  The datasets can still be managed by you with QGIS and/or in
>> Postgres/PostGIS, or whatever you prefer for that purpose.  The MapServer
>> setup allows for connecting to just about any type of service behind the
>> scenes, and with the right configuration, you can also enable each dataset
>> in the GeoMoose catalog as a WMS/WFS data source, the standard for open
>> data format access and publishing.
>>
>>
>>
>> *Bobb*
>>
>>
>>
>>
>>
>>
>>
>> *From:* Qgis-user <qgis-user-bounces at lists.osgeo.org> * On Behalf Of *Walt
>> Ludwick
>> *Sent:* Tuesday, August 11, 2020 7:45 AM
>> *To:* qgis-user at lists.osgeo.org
>> *Subject:* Re: [Qgis-user] Migrating legacy QGIS instance
>>
>>
>>
>> I'm on MacOS -and not so very comfortable with command line scripting- so
>> it looks like i might have to go the drag&drop way to import these .shp
>> files. Will take some time, but at least that way i can be sure about what
>> i've put where, and in what form.
>>
>>
>>
>> But i do wonder about the (a) "stick multiple shps into a single gpkg" OR
>> (b) "create one per feature" decision, since i'm not experienced enough to
>> have a clear preference about this.  Can you say anything about pros & cons
>> of going one way vs the other?
>>
>>
>>
>>
>>
>> On Tue, Aug 11, 2020 at 11:45 AM Charles Dixon-Paver <charles at kartoza.com>
>> wrote:
>>
>> Easiest way for me is to use the GDAL ogr2ogr
>> <https://gdal.org/programs/ogr2ogr.html> command using a bash script or
>> cmd batch to traverse your directories (depending on how you installed QGIS
>> this should be on your path). I don't know what environment you're running
>> though.
>>
>>
>>
>> You can either stick multiple shps into a single gpkg or create one per
>> feature as you prefer. ogr2ogr can also push shp files directly into
>> PostGIS. When you want to consolidate or migrate data (between gpkgs or
>> from gpkg to PostGIS) you can simply select the feature layers you want and
>> use drag and drop from the QGIS 3 Browser panel to copy multiple features
>> to a target location.
>>
>>
>>
>> Others might have different approaches though.
>>
>>
>>
>> Regards
>>
>>
>>
>> On Tue, 11 Aug 2020 at 12:24, Walt Ludwick <walt at valedalama.net> wrote:
>>
>> I've inherited a legacy GIS, built up over some years in versions 2.x,
>> that i'm now responsible to maintain.  Being an almost complete n00b (did
>> take a short course in QGIS a good few years ago, but still..), i could
>> really use some advice about migration.
>>
>> i've created a new QGIS instance in version 3.14, into which i am trying
>> to bring all useful content from our old system: oodles of shapefiles,
>> essentially, plus all those other files (each .shp file appears to bring
>> with it a set of .shx, .dbf, .prj, .qpj files, plus a .cpg file for each
>> layer, it seems).  This is a significant dataset- 14gb, >1000 files -and
>> that is just base data, not counting Projects built on this data or Layouts
>> used for presenting these projects in various ways. Some of this is cruft
>> that i can happily do without, but still:  i've got a lot of porting-over
>> to do, without a clear idea of how best to do it.
>>
>> The one thing i'm clear about is: i want it all in a non-proprietary
>> database (i.e. no more mess of .shp and related files) that is above all
>> quick & easy to navigate & manage. It is a single-user system at this
>> point, but i do aim to open it up to colleagues (off-LAN, i.e. via
>> Internet) as soon as i've developed simple apps for them to use.  No idea
>> how long it'll take me to get there, so...
>>
>> Big question at this point is: What should be the new storage format for
>> all this data?  Having read a few related opinions on StackOverflow, i get
>> the sense that GeoPackage will probably make for easiest migration (per this
>> encouraging article
>> <https://medium.com/@GispoFinland/learn-spatial-sql-and-master-geopackage-with-qgis-3-16b1e17f0291>,
>> it's a simple matter of drag&drop -simple if you have just a few, i guess!
>> [1]), and can easily support my needs in the short term, but then i wonder:
>> How will i manage migration to PostGIS when i eventually put this system
>> online with different users/ roles enabled?
>>
>>
>>
>> [1] Given that i need to pull in some hundreds of .shp files that are
>> stored in a tree of many folders & subfolders, i also wonder: is there a
>> simple way that i can ask QGIS to traverse a certain directory, pull in all
>> the .shp files -each as its own .gpkg layer, i suppose?
>>
>>
>>
>> Any advice about managing this migration would be much appreciated!
>>
>> _______________________________________________
>> Qgis-user mailing list
>> Qgis-user at lists.osgeo.org
>> List info: https://lists.osgeo.org/mailman/listinfo/qgis-user
>> Unsubscribe: https://lists.osgeo.org/mailman/listinfo/qgis-user
>
>