[postgis-users] ODBC connection problem, connection lost

Marco Boeringa marco at boeringa.demon.nl
Fri Aug 21 02:24:36 PDT 2020


Hi all,

I am running into an elusive issue that I have great difficulty to 
explain. Please note that, although I can find my way through it, I am 
not an Ubuntu nor network savvy person, so I may be overlooking 
something obvious.

What is my setup:

Ubuntu 20.04 with PostgreSQL 12 + PostGIS 3.0.2 running on an Oracle 
VirtualBox machine, so the Ubuntu OS being the guest operating system. 
This runs on a Windows 10 host.

Both guest and host have fixed, manually set, IP addresses set in their 
network connections. Ubuntu runs the UFW firewall with the proper ports 
set to allow traffic, Windows has the same set in the firewall software 
I use.

This all works fine, I can connect and do lots of stuff without issues.

*The problem*:

I have some code that does a multi-threaded set of INSERTs into my 
database, via Python multi-threading using 'pyodbc'. This needs to 
insert some 175M spatial records into a table. I have tested this with 
smaller tables (+/-15M), and it works fine. However, as soon as I try to 
insert the 175M records, at some point, probably close to 10-15% of 
records, I suddenly loose my network connection with a generic HY000 
error, that as far as I can tell is an ODBC level error, not returned by 
PostgreSQL. I have tried this multiple times, and it consistently 
happens. Note that the code commits in batches of about 10.000 records each.

At this point, I have verified the following:

- The guest Ubuntu OS runs on a 32GB RAM, quad core test machine. Memory 
usage is not at its limits when this happens, neither is swap space 
running out as far as I can tell (although I haven't been able to 
monitor this at the exact moment it fails, this is just an observation 
from logging into the guest after the fact). Note though that I set 72 
GB swap in Ubuntu, and I have never seen Ubuntu use more than a few GB 
of swap in its System Monitor, so the 72 GB is probably totally overkill.

- No problem with disk space, there is 2x 1.4TB SSD available to the 
guest, and only about half of total available disk space is used, so no 
disk space problem.

- The UFW firewall is up and running when I check the status with 'sudo 
ufw status', nothing else to be seen there.

- Not unsurprisingly, when I try to 'ping' the host from the guest or 
the other way around, the network connection is indeed down, and host / 
guest unreachable at this point.

- However, I *can* open Firefox on the Ubuntu guest, and connect to the 
internet and browse the internet no problem

- I have two Ubuntu network connections, both of them report status 
"Connected" at this point, so no error despite not being able to ping 
the host.

- If I open 'pgAdmin 4' on the 'host', it *cannot* connect to the 
database. When I open pgAdmin 4 on the guest, it connects without 
issues, and the database is shown to be up and running. I can open 
tables, view records, and even open the table receiving the INSERTs and 
view its contents. So PostgreSQL seems to be fine.

- I have had a look at the Ubuntu logs, but to be honest, it is like a 
foreign language to me, and I don't see anything obvious.

- When I restart the Ubuntu guest, the issue is resolved and I can 
connect and successfully ping host and guest again.

- As to pyodbc, I have set the connections 'maxwrite' to the recommended 
'1024 * 1024 * 1024' value according to the pyodbc documentation here: 
https://github.com/mkleehammer/pyodbc/wiki/Connecting-to-PostgreSQL, but 
I think this setting is wholly unrelated to this issue, but the 
documentation is in fact unclear as to what this setting really means.

Does anyone have a clue of what may be going on, suggestions how to 
tackle such issues, or ever encountered similar issues and been able to 
deal with it somehow? I am kind of lost at this point.

Marco



More information about the postgis-users mailing list