CONTRIBUTING: add AI contributions policy #32287

Open: wants to merge 1 commit into main
Conversation

delan (Sponsor Member) commented May 15, 2024

Following the recent TSC discussion (servo/project#85), this patch proposes a policy for AI contributions.

Co-authored-by: Martin Robinson <mrobinson@igalia.com>
mrego (Member) commented May 16, 2024

Just for future reference, this was discussed at https://github.com/servo/project/blob/main/governance/tsc/tsc-2024-04-29.md#ai-generated-prs

gterzian (Member) left a comment

I don't think we need to outright ban the use of AI to address the issues we experienced, and I don't think we can enforce such a ban. So I propose rewording this to specifically address the problems we experienced (like the maintainer burden created by PRs that appeared to contain generated code which didn't seem to make much sense), and, in other cases, to point out broader issues to raise awareness of them among contributors.

@@ -63,6 +63,20 @@ To run unit tests or Web Platform Tests against a pull request, add one or more
| `T-macos` | Unit tests: macOS |
| `T-windows` | Unit tests: Windows |

## AI contributions

Contributions must not include content generated by large language models or other probabilistic tools, including but not limited to Copilot or ChatGPT. This policy covers code, documentation, pull requests, issues, comments, and any other contributions to the Servo project.
gterzian (Member):

I would reword this as: "Contributions must not include content generated by large language models, unless the contributor has made reasonable efforts to check the correctness of the code."

I would add the sentence: "Contributors are also reminded of the representation they are making with regard to intellectual property when contributing to Servo; see '2.5. Representation' of the MPL license."

delan (Author):

I would argue that using large language models, by definition, is a failure to make reasonable efforts to check the correctness of our contributions. When we write code, for example, we act as the author. We are in the driver’s seat, deciding how to solve problems and reason about the correctness of the compiler and its output.

This is not the case for the output of generative AI tools, where we act as something more like a reviewer. Being in the passenger seat is always going to put us at a disadvantage, because we have to fight the incentive to accept the output as correct in order to think critically about it.


**Maintainer burden:** Reviewers depend on contributors to write and test their code before submitting it. We have found that these tools make it easy to generate large amounts of plausible-looking code that the contributor does not understand, is often untested, and does not function properly. This is a drain on the (already limited) time and energy of our reviewers.

**Correctness and security:** Even when code generated by AI tools does seem to function, there is no guarantee that it is correct, and no indication of what security implications it may have. A web browser engine is built to run in hostile execution environments, so all code must take into account potential security issues. Contributors play a large role in considering these issues when creating contributions, something that we cannot trust an AI tool to do.
gterzian (Member):

I would drop this one, because it would appear to me as covered by the "unless the contributor has made reasonable efforts to check the correctness of the code" requirement.



**Copyright issues:** Publicly available models are trained on copyrighted content, both accidentally and intentionally, and their output often includes that content verbatim. Since the legality of this is uncertain, these contributions could make the project liable for copyright infringement.
gterzian (Member):

Ditto for this, as it is already covered by the MPL license (but see the reminder I suggested above).

For another example of this logic, see https://github.com/llvm/llvm-project/pull/91014/files#diff-6dc5eb832e108d0f445f3481be924b7e70f33443d86c828c0b3d012a71c22bdaR1310



**Ethical issues:** AI tools require an unreasonable amount of energy and water to build and operate, their models are built with heavily exploited workers in unacceptable working conditions, and are being used to undermine labor and justify layoffs. These are harms that we do not want to perpetuate, even if only indirectly.
gterzian (Member):

I would reword this one as a plea to contributors to consider the ethics of their AI use, by:

  1. Adding: "We therefore ask contributors to consider these issues before using AI tools to contribute to Servo, especially given the limits of their current usefulness on a large codebase."
  2. Rephrasing it as "It has been reported in the press that certain AI tools require (...), and that their models are built with (...)", and adding a few links as examples.

SimonSapin (Member) left a comment

I agree with the proposed policy as it is in the PR at the time of review, and disagree with the suggestions above to weaken it.

I think probabilistic code generation is fundamentally subject to creating subtle bugs, and hand-waving unspecified "reasonable efforts to check correctness" is far from enough.

We may not always be able to "enforce the ban" and detect that a malicious contributor is trying to sneak generated slop through review. That doesn’t mean it’s not worth clearly stating that such behavior is unwelcome.

Authorship concerns are worth mentioning explicitly even if already implied somewhere in the 2426 words of the MPL-2 text.

delan (Author) commented May 28, 2024

I’ve left a reply above, but given the discussion on Zulip, I also want to say that as much as we can debate whether a helpful and ethical generative AI tool could exist someday, our policy should stick to the actual conditions of the present day. If the conditions change, then the policy should change, but until then, we should not allow hypothetical future tools to compromise a policy written to address concrete problems with the concrete tools that exist today.

> We may not always be able to "enforce the ban" and detect that a malicious contributor is trying to sneak generated slop through review. That doesn’t mean it’s not worth clearly stating that such behavior is unwelcome.

I would even say that if anything, the difficulty in reliably enforcing this is a good reason to take a strong and clear stance. Otherwise the wiggle room and uncertainty will make that difficult task impossible.

gterzian (Member) commented May 30, 2024

@delan @SimonSapin Thank you for your comments.

To give you a bit more context, I have used GitHub Copilot to write a simple Rust script, using only the standard library, and found it surprisingly useful for boilerplate code (example: generating a `use` statement). I have also tried using Copilot for Servo, as well as doing something with a local model and Python, and neither proved very useful in that context (the tools seemed to lack Servo-specific context). I think having a useful AI tool in the context of Servo would require some research, and I was actually hoping the TSC itself could drive some of that.

I will address the arguments you have put forward, while trying to focus on our contribution policy.

> We may not always be able to "enforce the ban" and detect that a malicious contributor is trying to sneak generated slop through review. That doesn’t mean it’s not worth clearly stating that such behavior is unwelcome.

Isn't it easy for us to detect problematic PRs, whether they were generated by AI or not? The current discussion originates in the opening of PRs whose content was perceived as being generated with AI, but the main problem was that those PRs were perceived to create an undue maintenance burden by having been poorly prepared. Detecting the problem required no sophisticated tools. On the other hand, it seems impossible to know whether a proper PR included, say, a `use` statement generated by GitHub Copilot. Is such a decent PR, created by a contributor who happened to have Copilot turned on in VS Code, "unwelcome behavior"? I will answer this rhetorical question below.

> I think probabilistic code generation is fundamentally subject to creating subtle bugs

That is an interesting argument, given that human coding can also be said to be an activity fundamentally subject to creating subtle bugs (and with a much larger empirical data set to back the claim up). What prevents subtle bugs in code is not human skill, but rather automated tools, for example the Rust compiler. On the other hand, preventing subtle bugs in the logic behind the code, whether a formal spec or just an idea in your mind, is a fully human task. So is structuring code in a way that reflects these logical concepts. This leaves plenty of boilerplate coding: code similar to code already written, but still too unique to generate deterministically, and so probabilistic generation seems promising to me.

> we have to fight the incentive to accept the output as correct in order to think critically about it

This makes sense, but, on the other hand, an equally strong argument could be made that humans will be more critical towards generated code, since it will lack the emotional bond one can feel towards code one wrote by hand.

> Authorship concerns are worth mentioning explicitly

Then the "Copyright issues" paragraph should be reworded so that it is correct: the project itself cannot be held liable(because of the "no warranty" part in the license), the risk is on end-users(and perhaps contributors who make false representations). For a (somewhat dated but still relevant I think)primer: see this article. The problem is rather the attractiveness of the project for end-users, which could be damaged by a reputation for copyright infringement.

The LLVM policy on the matter provides a good description of the situation, I think.

I do think an argument can be made that code generation could result in more inadvertent infringement (although if one goes 20 years back in time, we could replace "code generation" with "using information from search engines"). But, from my experience with Copilot, it appears that the snippets that are generated are not specific enough to come directly from a copyrighted source. I also noticed a feature in Copilot that seemed to mitigate this: alerting you if the generated code was a match for existing code on GitHub (in my case it would then point to forks of Servo, the specific code being a variant of a message enum). I also think it extremely unlikely that there would be a significant chunk of code somewhere on the internet that one could just copy/paste into Servo. I think there is more risk in using a library with an incompatible license (which is not a risk we mitigate in any way, as far as I know).

> the difficulty in reliably enforcing this is a good reason to take a strong and clear stance

Debatable. I actually think a nudge, such as pointing out the ethical issues, is more likely to result in increased awareness and behavioral changes.

> Is such a decent PR, created by a contributor who happened to have Copilot turned on in VS Code, "unwelcome behavior"? (I will answer this rhetorical question below)

To answer my own question above: I am doubtful that our contribution policy should make the choice for contributors of whether to use some kind of AI tool. I thank you for pointing some of these ethical issues out on Zulip; I wasn't aware of most of them. But this is a choice that contributors can make for themselves, like the choice of which computer to use, which operating system to install, and which search engine to use for information.

> we should not allow hypothetical future tools to compromise a policy written to address concrete problems with the concrete tools that exist today

We do need to take potential benefits into account, especially given the developing nature of the technology.

I've just read an article entitled "Mapping the Ethics of Generative AI" by Thilo Hagendorff at the University of Stuttgart (with the only disclosed financial support coming from "the Ministry of Science, Research, and the Arts Baden-Württemberg"), and I'd like to quote a passage (from the "Discussion" section) which I found particularly relevant to our current discussion:

> The literature on the ethics of generative AI is predominantly characterized by a bias towards negative aspects of the technology, putting much greater or exclusive weight on risks and harms instead of chances and benefits. This negativity bias is in line with how human psychology is wired (...) can nevertheless result in suboptimal decision-making (...) should be approached with a critical mindset. The numerous benefits and opportunities of adopting generative AI, which may be more challenging to observe or foresee, are usually overshadowed or discussed in a fragmentary manner in the literature.

In conclusion: I am not convinced by your arguments. I think my proposed rewording would be sufficient to deal with the maintenance problem (which is the only problem the project experienced), and I would like to leave the door open for the use of AI within Servo (including research led by the project itself, which may be a prerequisite for developing more useful tools).

delan (Author) commented May 30, 2024

> Isn't it easy for us to detect problematic PRs, whether they were generated by AI or not?

It is not. Reviews are already time-consuming, and it’s rare for us to reject a PR outright. Let’s not amplify this by allowing people to “contribute” large volumes of nonsense that statistically resembles code.

> But, from my experience with Copilot, it appears that the snippets that are generated are not specific enough to come directly from a copyrighted source. […] I also think it extremely unlikely that there would be a significant chunk of code somewhere on the internet that one could just copy/paste into Servo.

These are well-documented phenomena, as much as your experience or intuition may say otherwise.

> We do need to take potential benefits into account, especially given the developing nature of the technology.

Not when it comes at the expense of the wellbeing of our reviewers and the correctness and legality of our work.

mrego (Member) commented May 30, 2024

It seems the discussion here is not going to reach consensus, as there are very different positions. We should discuss this at the TSC level and see if we can reach some kind of agreement.

On the April TSC call it looked like people were fine with a proposal like this, but it now seems there are some concerns, and the positions are quite different.

webbeef commented May 30, 2024

While there is no consensus, I think it's pretty clear that only one participant is arguing for AI contributions. And that just looks like moving the Overton window; it should be ignored here rather than waste the TSC's time.

delan (Author) commented May 30, 2024

> While there is no consensus, I think it's pretty clear that only one participant is arguing for AI contributions. And that just looks like moving the Overton window; it should be ignored here rather than waste the TSC's time.

Let’s keep this discussion constructive please.

sagudev (Member) commented May 31, 2024

One thing that could be relevant for us is the policy by Mozilla (given that we share some code and send patches there), but when I asked about it I got a mixed response: https://matrix.to/#/!lrZtdjyLpBmoKbMdyx:mozilla.org/$JkENHrOtFRq91_nLpCfE0Ys0ijouRayov2Clt9smZuU?via=chat.mozilla.org

gterzian (Member) commented:

> One thing that could be relevant for us is the policy by Mozilla

Here is the Mozilla policy for community contributors in general: https://support.mozilla.org/en-US/kb/contributor-policy-generative-ai-usage

sagudev (Member) commented May 31, 2024

> > One thing that could be relevant for us is the policy by Mozilla
>
> Here is the Mozilla policy for community contributors in general: https://support.mozilla.org/en-US/kb/contributor-policy-generative-ai-usage

I was searching for it and couldn't find it (I used the wrong wording). Strangely enough, nobody gave me this link on Matrix.

gterzian (Member) commented May 31, 2024

> allowing people to “contribute” large volumes of nonsense that statistically resembles code.

> at the expense of the wellbeing of our reviewers and the correctness and legality of our work.

Of course I cannot argue in favor of either of these harms, but I see what you mean.

> These are well-documented phenomena

Do you have a link perhaps?


> We should discuss this at the TSC level and see if we can reach some kind of agreement.

Perhaps as a last try here, see below.

> Reviews are already time-consuming, and it’s rare for us to reject a PR outright.

Alright, so then if we do stick with the overall ban, how about we add to "we may decide to revise this policy at a later date" something like "and we are open to proposals of beneficial use cases from our community", just to be clear that the TSC will remain open-minded about potential beneficial uses of AI, with a commitment to spend some time reviewing potential tools brought forward by either members of our community or TSC members? That is the minimal change that I think changes the tone of the policy, and it would make it easier for people to actually present such beneficial use cases going forward.

Also please change "these contributions could make the project liable for copyright infringement" into something along the lines of "these contributions could lessen the appeal of the project to end-users from a copyright protection perspective".

fabricedesre (Contributor) commented:

> > > One thing that could be relevant for us is the policy by Mozilla
> >
> > Here is the Mozilla policy for community contributors in general: https://support.mozilla.org/en-US/kb/contributor-policy-generative-ai-usage
>
> I was searching for it and couldn't find it (I used the wrong wording). Strangely enough, nobody gave me this link on Matrix.

Note that this is for community contributors to the support knowledge base of Mozilla. I don't think it applies to code contributions - there would certainly be details about potential IP issues if that was applicable to code.

fabricedesre (Contributor) commented:

> Also please change "these contributions could make the project liable for copyright infringement" into something along the lines of "these contributions could lessen the appeal of the project to end-users from a copyright protection perspective".

Who are the "end users" here? Are they developers re-using Servo, or people using a Servo based product?
