[SAC] [OSGeo] #2771: osgeo7 snapshot failing and secure can't restart

OSGeo trac_osgeo at osgeo.org
Wed Jun 1 22:08:40 PDT 2022


#2771: osgeo7 snapshot failing and secure can't restart
---------------------------+----------------------------------------
 Reporter:  robe           |       Owner:  sac@…
     Type:  task           |      Status:  new
 Priority:  normal         |   Milestone:  Sysadmin Contract 2022-II
Component:  Systems Admin  |  Resolution:
 Keywords:                 |
---------------------------+----------------------------------------
Comment (by robe):

 The first issue was that after I shut down secure, it wouldn't start.
 It gave an error to the effect of:


 {{{
 Failed to run: zfs set mountpoint
 }}}

 To fix I did:


 {{{
 sudo zfs set mountpoint=/var/snap/lxd/common/lxd/storage-pools/default/containers/secure canmount=noauto osgeo7/containers/secure
 zfs umount osgeo7/containers/secure
 zfs mount osgeo7/containers/secure
 }}}

 live was having a similar issue, so I did the same and started it up.
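
 Since the same steps were applied to both secure and live, they could be wrapped in a small helper. This is only a sketch: the function name fix_container_mount is made up here, and it assumes each container's dataset and mountpoint follow the same pattern as the one shown for secure (pool osgeo7, storage pool default).

```shell
#!/bin/sh
# Hypothetical helper: reapply the mountpoint fix for one LXD container.
# Assumes the dataset and path layout match the ones used for secure.
fix_container_mount() {
    name="$1"
    dataset="osgeo7/containers/$name"
    mnt="/var/snap/lxd/common/lxd/storage-pools/default/containers/$name"

    # Set the expected mountpoint and stop ZFS from auto-mounting it.
    zfs set "mountpoint=$mnt" canmount=noauto "$dataset" || return 1
    # Remount so the new mountpoint takes effect. umount may fail if the
    # dataset is not currently mounted, which is fine here.
    zfs umount "$dataset" 2>/dev/null
    zfs mount "$dataset"
}

# Usage (as root): fix_container_mount live
```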

 secure had an additional issue, one I couldn't find described anywhere:

 This was a complicated one. I documented my change here:
 https://discuss.linuxcontainers.org/t/lxc-snapshot-and-lxc-start-error-instance-snapshot-record-count-doesnt-match-instance-snapshot-volume-record-count/14245/3

 More detail here:

 First I made a backup of the LXD database to inspect, with this:


 {{{
  sudo cp /var/snap/lxd/common/lxd/database/global/db.bin lxd-global-220601
 }}}
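
 If this needs repeating for other containers or on other days, the copy could be given a dated name automatically. A minimal sketch, assuming the lxd-global-YYMMDD naming seen in the file name above; backup_lxd_db is a hypothetical helper, not an LXD command.

```shell
#!/bin/sh
# Hypothetical helper: copy the LXD global database to a dated file in the
# current directory and print the name of the copy.
# The default source path is the one used in this ticket.
backup_lxd_db() {
    src="${1:-/var/snap/lxd/common/lxd/database/global/db.bin}"
    dest="lxd-global-$(date +%y%m%d)"
    cp "$src" "$dest" && echo "$dest"
}

# Usage (as root): backup_lxd_db
```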

 Then I inspected the SQLite backup as follows:


 {{{
 sudo apt install sqlite3
 sqlite3 lxd-global-220601
 }}}

 In the sqlite3 console:
 {{{
 .tables
 .mode column
 .headers on

 SELECT count(*) FROM instances AS v INNER JOIN instances_snapshots AS vs
 ON v.id = vs.instance_id WHERE v.name = 'secure';
 }}}

 output: 32

 {{{
 SELECT count(*) FROM storage_volumes AS v INNER JOIN
 storage_volumes_snapshots AS vs ON v.id = vs.storage_volume_id WHERE
 v.name = 'secure';
 }}}

 output: 37
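
 The two counts (32 instance snapshots against 37 volume snapshots) are the mismatch the error refers to. To run the same comparison for another container, the two queries could be wrapped like this. It is a sketch under assumptions: snapshot_counts is a made-up name, it runs against a copy of the database with the sqlite3 command-line tool, and the container name is pasted into the SQL unescaped, so only trusted names should be passed.

```shell
#!/bin/sh
# Hypothetical helper: compare the two snapshot record counts for one
# instance in a copy of the LXD global database (queries as in this ticket).
# Returns 0 when the counts match and 1 when they differ.
snapshot_counts() {
    db="$1"
    name="$2"
    inst=$(sqlite3 "$db" "SELECT count(*) FROM instances AS v INNER JOIN instances_snapshots AS vs ON v.id = vs.instance_id WHERE v.name = '$name';")
    vol=$(sqlite3 "$db" "SELECT count(*) FROM storage_volumes AS v INNER JOIN storage_volumes_snapshots AS vs ON v.id = vs.storage_volume_id WHERE v.name = '$name';")
    echo "instance snapshots: $inst, volume snapshots: $vol"
    [ "$inst" = "$vol" ]
}

# Usage: snapshot_counts lxd-global-220601 secure || echo "counts differ"
```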

 {{{
 SELECT v.id
 FROM
 (SELECT vs.* FROM storage_volumes AS v INNER JOIN
 storage_volumes_snapshots AS vs ON v.id = vs.storage_volume_id WHERE
 v.name = 'secure') AS v

  LEFT JOIN
 (SELECT vs.* FROM instances AS v INNER JOIN instances_snapshots AS vs ON
 v.id = vs.instance_id WHERE v.name = 'secure') AS i ON i.name = v.name
 WHERE  i.name IS NULL;
 }}}

 This returned the ids of the storage_volumes_snapshots rows that have no matching instance snapshot:

 {{{
 4701
 4714
 4737
 4761
 4779
 }}}

 Then I ran this to delete those orphaned rows:


 {{{
 lxd sql global "DELETE FROM storage_volumes_snapshots WHERE id IN (4701,4714,4737,4761,4779)"
 }}}
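
 Because this DELETE runs against the live database, it may be worth building the statement from the id list and reading it over before running it. A minimal sketch, with build_snapshot_delete as a hypothetical helper that only prints SQL and refuses anything that is not a plain number:

```shell
#!/bin/sh
# Hypothetical helper: build the DELETE statement from a list of ids.
# It only prints the SQL; it never touches the database itself.
build_snapshot_delete() {
    ids=""
    for id in "$@"; do
        case "$id" in
            ''|*[!0-9]*) echo "not a numeric id: $id" >&2; return 1 ;;
        esac
        ids="${ids:+$ids,}$id"
    done
    # Refuse to emit a statement with an empty id list.
    [ -n "$ids" ] || return 1
    echo "DELETE FROM storage_volumes_snapshots WHERE id IN ($ids)"
}

# Usage: review the output first, then pass it on:
#   lxd sql global "$(build_snapshot_delete 4701 4714 4737 4761 4779)"
```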

 Then I was able to run:


 {{{
 lxc snapshot secure
 lxc start secure
 }}}


 I'll close this ticket out once I've fixed the other affected containers.
-- 
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/2771#comment:2>
OSGeo <https://osgeo.org/>
OSGeo committee and general foundation issue tracker.

