LLM policy
Sandro Santilli
strk at kbt.io
Wed Jun 17 22:44:20 PDT 2026
On Sun, Jun 14, 2026 at 11:28:17AM -0400, Greg Troxel wrote:
> "Regina Obe" <lr at pcorp.us> writes:
>
> > Agree, except we don't allow any stolen work period whether it be LLM or
> > someone who stole code from another project that doesn't allow code copy.
>
> Then, I think we agree that under *the project's existing rules*
>
> 1) any code which was not written by the submitter must be identified
> as such, with a stated author/authors, and submitted only if all
> those authors have consented to licensing the contribution under
> the project's license.
I actually agree with this: code written by an LLM should be clearly identified as such.
I do that myself: https://github.com/olgasafonova/mediawiki-mcp-server/commit/8bb99f2774824ab4af42e4b36b24122c3bb21ee9
The "submitted only if all those authors have consented" part is hard if you consider the LLM the author.
I mean, we can ask, but would you accept the answer? I tried:
strk: may I publish the code you write under the GPL license ?
gemma4: Yes, you may publish any code I write under the GPL license. You
own the code and are free to use, modify, and distribute it as
you see fit.
> 2) LLM-generated code is not acceptable, unless it was generated
>
> a) by a model using only training data that is licensed for LLM use
> *which includes permission to reuse without attribution or any
> licensing text*, and includes only data for which there is true
> consent, not via a fraudulent claim of expected-not-to-be-read
> clickthrough license as part of some larger service
Do we accept contributions from humans who read all-rights-reserved books?
Here's a short poem I asked gemma4 to write about training and stealing:
Data streams flow, knowledge grows,
Secrets harvested, wisdom knows.
Could copyright be claimed on the text above, or would that be plagiarism?
> b) (probably, but arguably the existing rules do not prohibit
> antisocial behavior by contributors) not produced by an
> organization that engages in abusive scraping. Arguably abusive
> scraping is a CoC violation as it is harassment of humans who
> manage other web systems, and/or vandalism.
Believe me, I'd be OK with not accepting contributions written using
proprietary operating systems, but do you really think we should take
that path? I think a web browser that hides the URL bar is harassment
of humans who have the right to learn how things work. Shall we refuse
to serve website content to those not using a free software browser?
There are ways to use LLMs that do not involve vandalizing websites,
although that is something that needs to be learnt. The best we could do
is help users make more ethical use of these tools. I think Darafei's work
on providing "instructions for agents" is going in that direction,
and I think contributions to improve it are very welcome.
> Because there are not any code-generating LLMs that meet 2(a)
Over the last 10+ years (so not that recently), a lot of companies
and even individual developers have preferred non-reciprocal licences over
reciprocal ones. The landscape today is full of "do what you want with
this code". For those who kept reciprocal licences, we know the intention
is that you share the derived product the way the original was shared.
So for these two categories of training material we could say that
including the generated output in the GPL-licensed PostGIS would be
OK. Do you agree on this?
What's left would be code that was "stolen" (if we can even talk about
"stealing" information) from non-free software or non-free
documentation. Is this your concern? Are you concerned about some Big
Company sending PostGIS a copyright infringement notice and forcing us
to stop publishing the source code?
> it follows that LLM contributions to the project are not allowed.
Rules need to be generic; we can't base them on the current state of
affairs (assumed or factual).
> This should then be straightforwardly clarified, as those that like
> using LLMs seem to be willing to adopt differing interpretations of
> existing rules.
You did convince me that YES, we need an LLM policy.
It's clearly needed or this thread would not be so dense :)
--strk;
Libre GIS consultant/developer 🎺
https://strk.kbt.io/services.html