[QGIS-Developer] [Poll created] Re: "Human In The Loop" Policy For AI/Tool-Assisted Contributions

Even Rouault even.rouault at spatialys.com
Fri Feb 6 15:30:59 PST 2026


Vincent,

Thanks for your feedback. Yes, copyright / IP issues are a tricky problem
to deal with, and you make good points on how things could go wrong. That
said, in practice I believe most generated code would be derived
from QGIS itself, Qt code or docs, or be non-copyrightable material. I can
imagine, though, that contributions including non-trivial algorithms could
infringe copyright. In that situation, the reviewers could
ask the submitter to shed more light on the provenance of such code
(the contributor may ask the LLM to dig for references for the
provenance and check they are OK with GPL2 inclusion, being aware that
the LLM could hallucinate them...), and if no satisfactory answer is
given, reject the contribution. But that can admittedly be hard to spot
for reviewers. I'd be happy to amend the QEP with that if someone can
propose an adequate formulation.

I doubt a "no AI" policy is achievable in practice: people would simply
lie. As you mention, we can require people to mention the tool they
have used, possibly the prompt(s) they used, and which manual modifications
they applied on top of that. Doesn't the paragraph starting at line 35
(https://github.com/qgis/QGIS-Enhancement-Proposals/pull/363/changes#diff-4f4102e51f04fdfc82e843c6942abe9965c03ac85a92e9becf21bcca8b5571adR35)
sufficiently cover your point about "have a mandatory mention and description
of LLM usage for each contribution"?

The main driver for this QEP was to give us a tool to quickly
reject sloppy contributions, with a solid reference to back our
decisions, but we must indeed decide whether we go further than this.

For that purpose, I've created a quick poll at 
https://docs.google.com/forms/d/e/1FAIpQLSdnVWoD5DrwCbNXqPqHsLw2jfbLkPMKBkvfyQfTQOPZkj_EaQ/viewform 
so we can gather opinions on the general direction we want to take on this
subject. Everyone, please fill it in!

Even

On 06/02/2026 at 18:01, Vincent Picavet via QGIS-Developer wrote:
> Hi,
>
> I would double down on Greg Troxel's advice concerning copyright
> issues, especially concerning the introduction of LLM-generated code
> into the QGIS codebase.
>
> Opensource's success is based on these main characteristics : quality, 
> security, trust.
>
> AI contributions pose a threat to quality, security and trust alike.
>
> A human-in-the-loop policy for contributions written with AI may help
> with quality and security issues, but will still leave a huge problem
> for trust.
>
> Among the various aspects of trust, what worries me most right now is 
> the copyright issue. OpenSource software is based on intellectual 
> property laws, and especially on copyright, to be able to derive 
> copyleft and grant more rights to end-users.
>
> End users trust open source software from a legal point of view because:
>
> - they are backed by well-established copyright laws
>
> - they have clear and well-established end-user contracts
> (open source licences)
>
> - they have a full record of modifications of the source code, hence a 
> full lineage and certification of IP rights for the code
>
> - also, foundations like OSGeo additionally put a stamp on the
> software to guarantee that the process and initial IP can be trusted
> enough to give legal assurance concerning the software
>
> Introducing AI black boxes into the development process breaks the
> ability to control the lineage of the code and guarantee that it is a 
> genuine invention, and therefore allowed to be licenced under the GPL.
>
> For quality and security, a developer can always intrinsically assess 
> that the generated code has the required level of quality, and that it 
> does not include any security flaw.
>
> But **there is no way for a developer to evaluate the IP rights on
> code generated by an LLM**. How would one do it, since the code has
> been generated through a totally opaque black box ingesting
> enormous volumes of unidentified data?
>
> Today, we definitely know that LLMs (ChatGPT, Claude and others)
> have been trained on illegal copyrighted material. It is proven that
> they trained LLMs on pirated books. Furthermore, every time someone
> complains about IP violation by an LLM, big corps reach a financial
> settlement with the copyright owners and move on.
>
> There is therefore no doubt that they have also trained LLMs on 
> proprietary code. And also on opensource code not compliant with GPLv2+.
>
> Big corps currently hide behind a "fair use" argument, but this is
> clearly rubbish; otherwise, why would they bother to settle large
> financial deals with copyright owners?
>
> So, LLM-generated code contributed to QGIS will at some point be 
> plagiarized from random code available on the internet, and neither 
> QGIS.org nor the contributor will be able to know.
>
> If we start accepting such code without being able to check provenance
> or copyright issues, it will end up buried deep inside QGIS, and the
> day we discover that it infringes copyright, it will be a
> nightmare to solve: in this case we will want to revert all
> incriminated code, and also all code depending on the plagiarized code
> **and have it rewritten from scratch by someone who has never read the
> plagiarized code** (ref: SCO/UNIX for example). This is almost
> impossible.
>
> This would be a nightmare, just for one identified contribution.
>
> Even more, if/when the fair-use principle for LLMs falls, then all
> LLM-generated code would have to be removed from QGIS, along with all code
> depending on it. This is a really high risk with high impact.
>
> You may say : "ok but everyone does it, the chances of being caught 
> are low, why not benefit from the opportunity ?"
>
> Then what about "everyone copies GPL code into proprietary code, the 
> chances of being caught are low, why not benefit from the opportunity ?"
>
> Copyright is at the foundation of OpenSource software, and especially
> GPL-based software. If we choose to deny it, then we lose our core
> principle.
>
> In the text Even proposes, there is a copyright section, pushing the
> responsibility of IP compliance control back to the contributor. It
> may protect QGIS.org or other developers from being sued whenever
> there is a problem, or they could in turn sue the faulty contributor, but
> this is not enough:
>
> - the faulty contributor has no way to ensure their generated code has
> no IP issue (other than NOT using LLMs): responsibility without any
> means of action is neither fair nor sustainable
>
> - even if the QGIS project can avoid being convicted by transferring
> responsibility, the situation would still be unresolved and a
> nightmare: removing plagiarized code entangled deep in the core of the
> software, along with all code depending on it, and rewriting it without
> IP issues is really hard
>
> Therefore, I do not think this mention is enough for IP protection.
>
> This rationale concerns the generated code itself, contributed to QGIS
> or other software in the ecosystem. LLMs may be useful and without IP
> risks to help find bugs, write parts of the documentation where there is
> no risk of plagiarism, or for other use cases.
>
> But I would definitely **forbid any generated code to be introduced 
> into the main source code because of IP risk**.
>
> Also, the least we can do for any contribution, is not only to have a 
> human in the loop, but also to have a mandatory mention and 
> description of LLM usage for each contribution. This would at least 
> give traceability. It does not solve anything, but in case of a 
> problem, we could at least start to investigate.
>
> I am glad this conversation is taking place, and am willing to continue
> the discussion. Sorry for having been long.
>
> Have a nice weekend,
>
> Vincent
>
>
>
>
>
> On 31/01/2026 01:01, Greg Troxel via QGIS-Developer wrote:
>> I would suggest a much stronger policy:
>>
>>    no LLM-generated code or discussion may be submitted to any QGIS forum
>>
>>
>> The idea that LLM-generated code has been "reviewed" intends to be that
>> it is of high enough quality that it is reasonable for *humans* to spend
>> time reviewing it.  But I don't believe that asking that it be reviewed
>> will achieve that in practice.
>>
>> I've already had the experience (in a different project) of seeing a
>> posted PR(ish, patch on list), taking the time to comment, and getting
>> LLM-generated (vacuous) replies to my comments.
>>
>> Besides the ethical problems with asking humans to review, improve,
>> judge or in any other way pay attention to LLM output, there's the
>> problem of copyright.  While machine-generated text isn't copyrightable
>> as is, LLM output is a derived work of stolen human work, scraped
>> and used without permission, often as DDOS.
>>
>> On the basis of each reason, I believe the policy about LLM should just
>> be "no".
>> _______________________________________________
>> QGIS-Developer mailing list
>> QGIS-Developer at lists.osgeo.org
>> List info: https://lists.osgeo.org/mailman/listinfo/qgis-developer
>> Unsubscribe: https://lists.osgeo.org/mailman/listinfo/qgis-developer

-- 
http://www.spatialys.com
My software is free, but my time generally not.


