<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p><font face="monospace">Hello,</font></p>
<div class="moz-cite-prefix">On 11/02/2026 16:03, Martin Dobias
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAC2XbFeMCEiio4D5LTxpdopyXarUaByqWWqqxsPamcH7FBiS2w@mail.gmail.com">
<div dir="ltr">
<div>Hi all</div>
<div><br>
</div>
<div>My perspective on AI tools is that it's just another tool
in a developer's toolbox, just like a debugger or a code
analyzer.</div>
</div>
</blockquote>
<p>AI tools, and LLMs in particular, have very specific characteristics
that are important to take into account: cost, transparency, vendor
lock-in, and social and environmental impacts, to cite just a few
aspects, are totally different for LLMs than for other tools like
a debugger or a code analyzer.</p>
<p>Having a developer-centric approach to tooling would most
probably lead us to choices that are not compatible with our global
missions and views.</p>
<blockquote type="cite"
cite="mid:CAC2XbFeMCEiio4D5LTxpdopyXarUaByqWWqqxsPamcH7FBiS2w@mail.gmail.com">
<div dir="ltr">
<div>There are lots of unknowns about the training data, I
agree. But being conservative and adopting a strict no AI
tools policy seems like restricting ourselves from using the
best tools for the job. In the end, it is still the
responsibility of the developer to ensure their contribution
is correct, and not violating copyright, whether they use AI
tools or not.</div>
</div>
</blockquote>
<p>I have not been talking about a strict no-AI policy. I am saying
that **code generation** by LLMs poses an existential threat to
OpenSource projects like QGIS.</p>
<p>AI tools may be useful for tasks other than code generation.</p>
<p>Again, pushing the responsibility onto individual contributors
would be very hypocritical: there is **no way** for anyone to
assess whether a contribution generated by an LLM violates
copyright. Putting responsibility on people
without giving them any means of action or verification is definitely
not something we should defend.</p>
<p>Quality and security are concerns that can be intrinsically
assessed by individuals; IP issues are not.</p>
<blockquote type="cite"
cite="mid:CAC2XbFeMCEiio4D5LTxpdopyXarUaByqWWqqxsPamcH7FBiS2w@mail.gmail.com">
<div dir="ltr">
<div>It is also hard to draw a line for a conservative "no AI"
policy:</div>
<div>- is it acceptable to brainstorm design with AI?</div>
<div>- is it acceptable to get a prototype built with AI, for
inspiration?</div>
<div>- is it acceptable to get AI to check code for bugs?</div>
<div>- is it acceptable to ask AI to improve tone of my reviews?</div>
<div><br>
</div>
<div>With strict no AI policy, I guess we would also need to
make sure that all of the 50+ dependencies also have strict no
AI policy, otherwise QGIS builds could still in theory contain
AI-generated copyrighted code? I am not sure that is
realistic...</div>
</div>
</blockquote>
<p>Again, who has been arguing for a no-AI policy? There are more
balanced policies that can be evaluated.</p>
<blockquote type="cite"
cite="mid:CAC2XbFeMCEiio4D5LTxpdopyXarUaByqWWqqxsPamcH7FBiS2w@mail.gmail.com">
<div dir="ltr">
<div>Let's be pragmatic: AI tools are here to stay, we can
either ignore them, or we can learn to use them responsibly to
deliver even more QGIS goodness :-) And we can expect that
with the increased risk of copyright issues, there will be
automated tools to scan the code for possible copyright
problems, integrated in CI, which will flag any risky
contributions.</div>
</div>
</blockquote>
<p>"AI tools are here to stay" is your own opinion, and an idea that LLM
companies are trying to convince everyone is a fact. Given the
economics of data centers and LLM companies, this may or may not be
true, or at least not at current costs, which are also a
serious issue. As a comparison, note that there was a time
when asbestos was here to stay too.</p>
<p>Also, "we can expect that..." sounds like magical thinking,
definitely not something we can really count on.</p>
<p>We cannot ignore LLMs and AI though, which is why this
discussion is taking place.</p>
<p>Also, I saw a claim that "fair use" is the default
legal position in the US right now. This is again what the
Magnificent Seven want you to believe, but the actual analysis prepared
for the US Congress is far from this statement:</p>
<p><a class="moz-txt-link-freetext" href="https://www.congress.gov/crs_external_products/LSB/PDF/LSB10922/LSB10922.8.pdf">https://www.congress.gov/crs_external_products/LSB/PDF/LSB10922/LSB10922.8.pdf</a></p>
<p>Last but not least, let me point to the latest development in "AI
contributions to OpenSource":</p>
<p><a class="moz-txt-link-freetext" href="https://theshamblog.com/an-ai-agent-published-a-hit-piece-on-me/">https://theshamblog.com/an-ai-agent-published-a-hit-piece-on-me/</a></p>
<p>We are living in interesting times…</p>
<p>Vincent</p>
<blockquote type="cite"
cite="mid:CAC2XbFeMCEiio4D5LTxpdopyXarUaByqWWqqxsPamcH7FBiS2w@mail.gmail.com">
<div dir="ltr">
<div><br>
</div>
<div>Cheers</div>
<div>Martin</div>
<div><br>
</div>
<br>
<div class="gmail_quote gmail_quote_container">
<div dir="ltr" class="gmail_attr">On Tue, Feb 10, 2026 at
6:06 PM Vincent Picavet via QGIS-Developer &lt;<a
href="mailto:qgis-developer@lists.osgeo.org"
moz-do-not-send="true" class="moz-txt-link-freetext">qgis-developer@lists.osgeo.org</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hello,<br>
<br>
On 07/02/2026 00:30, Even Rouault wrote:<br>
> Thanks for your feedback. Yes copyright / IP issues are
a tricky problem to deal with and you make good points on
how things could go wrong. That said, in practice I believe
most generated code would be mostly derived from QGIS
itself, QT code or doc or be non-copyrightable material. I
can <br>
<br>
I seriously doubt that, given the amount of copyrighted
material that has already been used to train LLMs. And most
important, we have no way to know that, except if the LLM
companies get subpoenaed and disclose hidden practices when
asked in court.<br>
<br>
Believing is not enough when talking about legal stuff.<br>
<br>
> imagine though that contributions including non-trivial
algorithms could possibly be infringing copyright. In that
situation, the reviewers could ask the <br>
Even simpler code can be plagiarized, not only non-trivial
algorithms.<br>
> submitter to bring more light on the provenance of such
code (the contributor may ask the LLM to dig for references
for the provenance and check they are OK with GPL2
inclusion, being aware that the LLM could hallucinate
them...), and if no satisfactory answer is given, reject the
contribution. But that can be admittedly hard to spot for
reviewers. I'd be happy to amend the QEP with that if
someone can propose an adequate formulation.<br>
Again, a reviewer has no way to know or imagine that a code
has been plagiarized : you would have to be aware of the
full training dataset of the LLM, and this is 1. behind
closed fences 2. not humanly possible.<br>
> I've doubts a "no AI" policy is achievable in practice,
or people will lie. As you mention, we can require people
to mention the tool they have used, possibly the prompt(s)
they use, which manual modifications they applied on top of
that. Doesn't the paragraph starting at line 35 (<a
href="https://github.com/qgis/QGIS-Enhancement-Proposals/pull/363/changes#diff-4f4102e51f04fdfc82e843c6942abe9965c03ac85a92e9becf21bcca8b5571adR35"
rel="noreferrer" target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">https://github.com/qgis/QGIS-Enhancement-Proposals/pull/363/changes#diff-4f4102e51f04fdfc82e843c6942abe9965c03ac85a92e9becf21bcca8b5571adR35</a>)
cover enough your point about "have a mandatory mention and
description of LLM usage for each contribution" ?<br>
<br>
Lying about not using AI is just like lying about being sure
one has the right to contribute the code ( see contributor's
agreement ) : if this is a rule, we ask people and they lie,
then we should also have sanctions and be strict about it.<br>
<br>
For me, full transparency about usage of a blackbox **for
code generation** is not enough as a protection against
legal matters. For anything else that can be AI-aided, this
is more about resiliency, transparency and trust, and we
should make it clear that explaining exactly how AI has been
used is mandatory and should be given along every
contribution.<br>
<br>
> The main driver for this QEP was to give us a tool to
be able to quickly reject sloppy contributions with a solid
reference to back our decisions, but we must indeed decide
whether we go further than this.<br>
<br>
I guess there is matter to debate and a lot of uncertainty.
Hence the conservative approach, to avoid the worst and
maybe open it later on whenever we see the situation
clearer.<br>
<br>
> For that purpose, I've created a quick poll at <a
href="https://docs.google.com/forms/d/e/1FAIpQLSdnVWoD5DrwCbNXqPqHsLw2jfbLkPMKBkvfyQfTQOPZkj_EaQ/viewform"
rel="noreferrer" target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">https://docs.google.com/forms/d/e/1FAIpQLSdnVWoD5DrwCbNXqPqHsLw2jfbLkPMKBkvfyQfTQOPZkj_EaQ/viewform</a>
so we can gather opinions on the general direction we want
on that subject. All, please fill!<br>
<br>
Ok to gather more advice, thanks for running the poll,<br>
<br>
<br>
Vincent<br>
<br>
><br>
> Even<br>
><br>
> On 06/02/2026 at 18:01, Vincent Picavet via
QGIS-Developer wrote:<br>
>> Hi,<br>
>><br>
>> I would double-down on Greg Troxel's advice
concerning copyright issues, especially concerning the
introduction of LLM-generated code into QGIS codebase.<br>
>><br>
>> Opensource's success is based on these main
characteristics : quality, security, trust.<br>
>><br>
>> AI contributions pose a threat to quality, security
and trust alike.<br>
>><br>
>> A human-in-the-loop policy for contributions
written with AI may help for quality and security issues,
but will still leave a huge problem for trust.<br>
>><br>
>> Among the various aspects of trust, what worries me
most right now is the copyright issue. OpenSource software
is based on intellectual property laws, and especially on
copyright, to be able to derive copyleft and grant more
rights to end-users.<br>
>><br>
>> End users trust opensource software from a legal
point of view because :<br>
>><br>
>> - they are backed by well-established copyright
laws<br>
>><br>
>> - they have clear and well established end-users
contracts ( opensource licences )<br>
>><br>
>> - they have a full record of modifications of the
source code, hence a full lineage and certification of IP
rights for the code<br>
>><br>
>> - also, foundations like OSGeo additionally put a
stamp on the software to guarantee that process and initial
IP can be trusted enough to have a legal insurance
concerning the software<br>
>><br>
>> Introducing AI black boxes into the development
process breaks the ability to control the lineage of the
code and guarantee that it is a genuine invention, and
therefore allowed to be licenced under the GPL.<br>
>><br>
>> For quality and security, a developer can always
intrinsically assess that the generated code has the
required level of quality, and that it does not include any
security flaw.<br>
>><br>
>> But **there is no way for a developer to evaluate
the IP rights on a code generated by a LLM**. How would one
do it, since the code has been generated through a totally
opaque black box ingesting non-identified enormous volumes
of data ?<br>
>><br>
>> Today, we definitely know that LLMs ( ChatGPT,
Claude and others ) have been trained on illegal copyrighted
material. It is proven that they trained LLMs on pirated
books. Furthermore, every time someone complains about IP
violation by LLM, big corps settle a financial arrangement
with the copyright owners and move on.<br>
>><br>
>> There is therefore no doubt that they have also
trained LLMs on proprietary code. And also on opensource
code not compliant with GPLv2+.<br>
>><br>
>> Big corp. currently hide behind a "fair use"
argument, but this is clearly rubbish, otherwise why would
they bother to settle large financial deals with copyright
owners ?<br>
>><br>
>> So, LLM-generated code contributed to QGIS will at
some point be plagiarized from random code available on the
internet, and neither QGIS.org nor the contributor will be
able to know.<br>
>><br>
>> If we start accepting such code without being able
to check provenance or copyright issues, it will end up
buried deep inside QGIS, and the day we will discover that
it infringes copyright, it will be a nightmare to solve : in
this case we will want to revert all incriminated code, and
also all code depending on the plagiarized code **and have
it rewritten from scratch by someone who has never read the
plagiarized code** ( ref : SCO/UNIX for example ). This is
almost impossible.<br>
>><br>
>> This would be a nightmare, just for one identified
contribution.<br>
>><br>
>> Even more, if/when the fair-use principle of LLMs
falls down, then all LLM-generated code should be removed
from QGIS, and all code depending on it. This is a really
high risk with high impact.<br>
>><br>
>> You may say : "ok but everyone does it, the chances
of being caught are low, why not benefit from the
opportunity ?"<br>
>><br>
>> Then what about "everyone copies GPL code into
proprietary code, the chances of being caught are low, why
not benefit from the opportunity ?"<br>
>><br>
>> Copyright is at the foundation of OpenSource
software, and especially GPL-based software. If we choose to
deny it, then we lose our core principle.<br>
>><br>
>> In the text Even proposes, there is a copyright
section, pushing the responsibility of IP compliance control
back to the contributor. It may protect QGIS.org or other
developers from being sued whenever there is a problem, or
they could sue back the faulty contributor, but this is not
enough :<br>
>><br>
>> - the faulty contributor has no way to ensure his
generated code has no IP issue ( other than NOT using LLMs )
: responsibility without any means of action is not fair and
sustainable<br>
>><br>
>> - even if the QGIS project can avoid being convicted
by transferring responsibility, then the situation would
still be open and be a nightmare : removing plagiarized code
entangled down the core of the software and all its
dependency code, and rewriting it without IP issues is really
hard<br>
>><br>
>> Therefore, I do not think this mention is enough
for IP protection.<br>
>><br>
>> This rationale concerns the generated code itself,
contributed to QGIS or other software in the ecosystem. LLMs
may be useful and without IP risks to help find bugs, write
parts of documentations where there is no risk of
plagiarism, or other use cases.<br>
>><br>
>> But I would definitely **forbid any generated code
to be introduced into the main source code because of IP
risk**.<br>
>><br>
>> Also, the least we can do for any contribution, is
not only to have a human in the loop, but also to have a
mandatory mention and description of LLM usage for each
contribution. This would at least give traceability. It does
not solve anything, but in case of a problem, we could at
least start to investigate.<br>
>><br>
>> I am glad this conversation takes place, and
willing to pursue the discussion, sorry for having been
long.<br>
>><br>
>> Have a nice weekend,<br>
>><br>
>> Vincent<br>
>><br>
>><br>
>><br>
>><br>
>><br>
>> On 31/01/2026 01:01, Greg Troxel via QGIS-Developer
wrote:<br>
>>> I would suggest a much stronger policy:<br>
>>><br>
>>> no LLM-generated code or discussion may be
submitted to any QGIS forum<br>
>>><br>
>>><br>
>>> The idea that LLM-generated code has been
"reviewed" intends to be that<br>
>>> it is of high enough quality that it is
reasonable for *humans* to spend<br>
>>> time reviewing it. But I don't believe that
asking that it be reviewed<br>
>>> will achieve that in practice.<br>
>>><br>
>>> I've already had the experience (in a different
project) of seeing a<br>
>>> posted PR(ish, patch on list), taking the time
to comment, and getting<br>
>>> LLM-generated (vacuous) replies to my comments.<br>
>>><br>
>>> Besides the ethical problems with asking humans
to review, improve,<br>
>>> judge or in any other way pay attention to LLM
output, there's the<br>
>>> problem of copyright. While machine-generated
text isn't copyrightable<br>
>>> as is, LLM output is a derived work of stolen
human work, scraped<br>
>>> and used without permission, often as DDOS.<br>
>>><br>
>>> On the basis of each reason, I believe the
policy about LLM should just<br>
>>> be "no".<br>
>>> _______________________________________________<br>
>>> QGIS-Developer mailing list<br>
>>> <a
href="mailto:QGIS-Developer@lists.osgeo.org"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">QGIS-Developer@lists.osgeo.org</a><br>
>>> List info: <a
href="https://lists.osgeo.org/mailman/listinfo/qgis-developer"
rel="noreferrer" target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">https://lists.osgeo.org/mailman/listinfo/qgis-developer</a><br>
>>> Unsubscribe: <a
href="https://lists.osgeo.org/mailman/listinfo/qgis-developer"
rel="noreferrer" target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">https://lists.osgeo.org/mailman/listinfo/qgis-developer</a><br>
</blockquote>
</div>
</div>
</blockquote>
</body>
</html>