<div style="font-size:inherit"><p style="font-style:normal;font-weight:400;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:inherit;color:rgb(0,0,0)" dir="auto">Hello GDAL developers,</p><p style="font-style:normal;font-weight:400;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:inherit;color:rgb(0,0,0)">Over the past weeks, while contributing to GDAL and working on Python binding-related issues and PRs, I have been studying the current Python stub generation pipeline in detail. In particular, I explored the <code style="font-family:monospace">docstub</code> integration and the implementation in <code style="font-family:monospace">_analysis.py</code>, <code style="font-family:monospace">_docstrings.py</code>, and <code style="font-family:monospace">_stubs.py</code>, along with recent PRs related to docstring cleanup and stub generation.</p><p style="font-style:normal;font-weight:400;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:inherit;color:rgb(0,0,0)">From examining the code, I understand that:</p><ul style="font-style:normal;font-weight:400;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:inherit;color:rgb(0,0,0)"><li><p><code style="font-family:monospace">.pyi</code> files are generated entirely from docstrings using a custom Lark grammar.</p></li><li><p>Type resolution is handled through <code style="font-family:monospace">TypeMatcher</code> and import reconstruction.</p></li><li><p>Unresolved types fall back to <code style="font-family:monospace">_typeshed.Incomplete</code>.</p></li><li><p>There is currently no mechanical validation step ensuring that generated stubs remain consistent with the actual runtime callable signatures produced by SWIG.</p></li></ul><p 
style="font-style:normal;font-weight:400;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:inherit;color:rgb(0,0,0)">This means the stub layer is structurally decoupled from the runtime bindings, so drift can appear at any stage of the chain:</p><p style="font-style:normal;font-weight:400;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:inherit;color:rgb(0,0,0)">C++ → SWIG → Python runtime → docstrings → generated stubs</p><p style="font-style:normal;font-weight:400;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:inherit;color:rgb(0,0,0)">and go unnoticed, because no automated check currently compares the stubs against the runtime.</p><p style="font-style:normal;font-weight:400;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:inherit;color:rgb(0,0,0)">For GSoC, I would like to explore a project focused on hardening and modernizing this pipeline through runtime–stub consistency validation and stricter enforcement mechanisms.</p><p style="font-style:normal;font-weight:400;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:inherit;color:rgb(0,0,0)">A possible scope could include:</p><p style="font-style:normal;font-weight:400;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:inherit;color:rgb(0,0,0)" dir="auto"><strong>Runtime–Stub Signature Validator</strong></p><ul style="font-style:normal;font-weight:400;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:inherit;color:rgb(0,0,0)"><li><p>Import <code style="font-family:monospace">osgeo</code> modules and inspect public callables using <code style="font-family:monospace">inspect.signature()</code>.</p></li><li><p>Parse generated <code style="font-family:monospace">.pyi</code> files.</p></li><li><p>Detect 
mismatches in parameter names, counts, and defaults, and in whether a return annotation is present.</p></li><li><p>Produce structured reports of inconsistencies.</p></li></ul><p style="font-style:normal;font-weight:400;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:inherit;color:rgb(0,0,0)" dir="auto"><strong>Stricter Stub Generation Mode</strong></p><ul style="font-style:normal;font-weight:400;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:inherit;color:rgb(0,0,0)"><li><p>Optionally fail (or emit stronger diagnostics) on unresolved types instead of silently aliasing to <code style="font-family:monospace">_typeshed.Incomplete</code>.</p></li><li><p>Report metrics such as annotation coverage and the number of unresolved types.</p></li></ul><p style="font-style:normal;font-weight:400;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:inherit;color:rgb(0,0,0)" dir="auto"><strong>CI Integration</strong></p><ul style="font-style:normal;font-weight:400;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:inherit;color:rgb(0,0,0)"><li><p>Integrate validation checks into CI to prevent silent drift over time.</p></li><li><p>Keep the approach incremental and compatible with the existing docstring-driven workflow.</p></li></ul><p style="font-style:normal;font-weight:400;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:inherit;color:rgb(0,0,0)">The goal would not be to redesign SWIG bindings or replace the current system, but to introduce a validation and enforcement layer that increases confidence in typing correctness, IDE support, and long-term maintainability of the Python bindings.</p><p 
style="font-style:normal;font-weight:400;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:inherit;color:rgb(0,0,0)">Before developing this into a formal proposal, I would really appreciate feedback on:</p><ul style="font-style:normal;font-weight:400;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:inherit;color:rgb(0,0,0)"><li><p>Whether runtime–stub consistency validation aligns with current Python binding priorities.</p></li><li><p>Whether there are known constraints or prior efforts in this direction.</p></li><li><p>Whether this scope would be appropriate and realistic for a GSoC project.</p></li></ul><p style="font-style:normal;font-weight:400;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:inherit;color:rgb(0,0,0)">Thank you very much for your time. I would be happy to refine or narrow this idea based on feedback.</p><p style="font-style:normal;font-weight:400;letter-spacing:normal;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:inherit;color:rgb(0,0,0)" dir="auto">Best regards,<br>Sionigdha</p></div>