[SAC] Postmortem for LDAP troubles
Sandro Santilli
strk at kbt.io
Tue Mar 2 02:26:03 PST 2021
This somehow got fixed, but I'm not sure whether it was due to one of
my actions. What I did:
1. Ran the /usr/local/bin/copy_ldap_certs_to_secure.sh
script to update the SSL certs if needed
2. Found out that slapd did not restart successfully due
to wrong permissions on the certificates
3. Fixed the certificate permissions and successfully restarted
slapd
At the end of the above process things started to work again.
I've created a pull request adding the permission fix to the
copy_ldap_certs_to_secure.sh script (please review):
https://git.osgeo.org/gitea/sac/ansible-deployment/pulls/8
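For anyone hitting a similar slapd restart failure, here is a minimal
sketch of how the certificate permissions could be inspected before
restarting. It is not the change in the pull request above; the function
name show_cert_perms is made up for illustration, the file paths are
placeholders, and it assumes GNU stat (coreutils):

```shell
#!/bin/sh
# Hypothetical helper, not part of copy_ldap_certs_to_secure.sh.
# Prints "<octal mode> <owner>:<group> <path>" for each file given,
# or "missing <path>" when the file does not exist.
# Assumes GNU stat (coreutils), as shipped on Debian/Ubuntu.
show_cert_perms() {
    for f in "$@"; do
        if [ -e "$f" ]; then
            stat -c '%a %U:%G %n' "$f"
        else
            echo "missing $f"
        fi
    done
}

# Example invocation (paths are placeholders, adjust to the real ones):
#   show_cert_perms /path/to/cert.pem /path/to/privkey.pem
```

If the owner, group and mode shown do not let the account slapd runs
under read the files, the restart can be expected to fail in the same way.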
Why the copy_ldap_certs_to_secure.sh script was NOT invoked
automatically from the crontab of tech_dev is yet to be
understood; I ticketed it here:
https://git.osgeo.org/gitea/sac/ansible-deployment/issues/9
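One possible first check for that ticket, as a sketch only: verify that
the crontab actually contains an active (uncommented) entry for the
script. The function name cron_has_job is made up for illustration, and
this says nothing about why the job did not run if the entry is there:

```shell
#!/bin/sh
# Hypothetical helper: reads crontab text on stdin and succeeds (exit 0)
# if any non-comment line contains the given fixed string.
cron_has_job() {
    # Drop comment lines, then look for the string literally (-F).
    grep -v '^[[:space:]]*#' | grep -qF -- "$1"
}

# Example invocation (listing another user's crontab needs root):
#   crontab -l -u tech_dev | cron_has_job copy_ldap_certs_to_secure.sh \
#       && echo "entry present" || echo "entry missing"
```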
Looking forward to the new sysadmin contract!
--strk;
On Tue, Mar 02, 2021 at 09:57:09AM +0100, Sandro Santilli wrote:
> Today the tracsvn container cannot connect to the LDAP server.
>
> The current configuration for LDAP client on that machine
> is to use the public DNS name for the service (ldap.osgeo.org)
> but attempts to reach that host on port 389 hang indefinitely.
> Hitting the host on port 636 is fine, with netcat:
>
> tracsvn:~# nc -vz ldap.osgeo.org 636
> DNS fwd/rev mismatch: ldap.osgeo.org != base.osgeo.osuosl.org
> ldap.osgeo.org [140.211.15.57] 636 (ldaps) open
>
> But "can't contact" with ldapsearch:
>
> tracsvn:~# ldapsearch -H ldaps://ldap.osgeo.org:636 -x 'uid=strk'
> ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
>
> The LXD configuration on osgeo7 requests to listen on port 636
> for the ldap.osgeo.org IP (140.211.15.57) and connect it to port
> 636 of 127.0.0.1 of the "secure" container. Indeed I cannot contact
> the server on that port from secure:
>
> secure:~# ldapsearch -H ldaps://127.0.0.1:636 -x 'uid=strk'
> ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
>
> While I can see the ports open (both 636 and 389):
>
> secure:~# netstat -tnlp | grep '\(389\|636\)'
> tcp 0 0 0.0.0.0:636 0.0.0.0:* LISTEN 29044/slapd
> tcp 0 0 0.0.0.0:389 0.0.0.0:* LISTEN 29044/slapd
> tcp6 0 0 :::636 :::* LISTEN 29044/slapd
> tcp6 0 0 :::389 :::* LISTEN 29044/slapd
>
> Logs from the journal don't even show attempts to connect, but the
> startup messages do contain some info about failures:
>
> secure:~# journalctl -x -u slapd.service -f
> Mar 02 08:30:05 secure systemd[1]: slapd.service: Failed to reset devices.list: Operation not permitted
> Mar 02 08:30:05 secure systemd[1]: slapd.service: Failed to set invocation ID on control group /system.slice/slapd.service, ignoring: Operation not permitted
>
> Has anyone seen those messages before? Any idea what we could be up against?
> Shall I blindly try a stop/start cycle on the LXD container?
>
> --strk;