.. SPDX-License-Identifier: GPL-2.0

.. _researcher_guidelines:

Researcher Guidelines
+++++++++++++++++++++

The Linux kernel community welcomes transparent research on the Linux
kernel, the activities involved in producing it, and any other byproducts
of its development. Linux benefits greatly from this kind of research, and
most aspects of Linux are driven by research in one form or another.

The community greatly appreciates it when researchers share preliminary
findings before making their results public, especially if such research
involves security. Getting involved early helps improve both the quality
of the research and the ability of Linux to benefit from it. In any case,
sharing open access copies of the published research with the community
is recommended.

This document seeks to clarify what the Linux kernel community considers
acceptable and non-acceptable practices when conducting such research. At
the very least, such research and related activities should follow
standard research ethics rules. For more background on research ethics
generally, ethics in technology, and research of developer communities
in particular, see:

* `History of Research Ethics <https://www.unlv.edu/research/ORI-HSR/history-ethics>`_
* `IEEE Ethics <https://www.ieee.org/about/ethics/index.html>`_
* `Developer and Researcher Views on the Ethics of Experiments on Open-Source Projects <https://arxiv.org/pdf/2112.13217.pdf>`_

The Linux kernel community expects that everyone interacting with the
project is participating in good faith to make Linux better. Research on
any publicly available artifact (including, but not limited to, source
code) produced by the Linux kernel community is welcome, though research
on developers must be distinctly opt-in.

Passive research that is based entirely on publicly available sources,
including posts to public mailing lists and commits to public
repositories, is clearly permissible. As with any research, though,
standard ethics must still be followed.

Active research on developer behavior, however, must be done with the
explicit agreement of, and full disclosure to, the individual developers
involved. Developers cannot be interacted with/experimented on without
consent; this, too, is standard research ethics.

Surveys
=======

Research often takes the form of surveys sent to maintainers or
contributors.  As a general rule, though, the kernel community derives
little value from these surveys.  The kernel development process works
because every developer benefits from their participation, even when
working with others who have different goals.  Responding to a survey,
though, is a
one-way demand placed on busy developers with no corresponding benefit to
themselves or to the kernel community as a whole.  For this reason, this
method of research is discouraged.

Kernel community members already receive far too much email and are likely
to perceive survey requests as just another demand on their time.  Sending
such requests deprives the community of valuable contributor time and is
unlikely to yield a statistically useful response.

As an alternative, researchers should consider attending developer events,
hosting sessions where the research project and its benefits to the
participants can be explained, and interacting directly with the community
there.  The information received will be far richer than that obtained from
an email survey, and the community will gain from the ability to learn from
your insights as well.

Patches
=======

To help clarify: sending patches to developers *is* interacting
with them, but they have already consented to receiving *good faith
contributions*. Sending intentionally flawed/vulnerable patches or
contributing misleading information to discussions is not consented
to. Such communication can be damaging to the developer (e.g. draining
time, effort, and morale) and damaging to the project by eroding
the entire developer community's trust in the contributor (and the
contributor's organization as a whole), undermining efforts to provide
constructive feedback to contributors, and putting end users at risk of
software flaws.

Participation in the development of Linux itself by researchers, as
with anyone, is welcomed and encouraged. Research into Linux code is
a common practice, especially when it comes to developing or running
analysis tools that produce actionable results.

When engaging with the developer community, sending a patch has
traditionally been the best way to make an impact. Linux already has
plenty of known bugs -- what's much more helpful is having vetted fixes.
Before contributing, carefully read the appropriate documentation:

* Documentation/process/development-process.rst
* Documentation/process/submitting-patches.rst
* Documentation/admin-guide/reporting-issues.rst
* Documentation/process/security-bugs.rst

Then send a patch (including a commit log with all the details listed
below) and follow up on any feedback from other developers.

When sending patches produced from research, the commit logs should
contain at least the following details, so that developers have
appropriate context for understanding the contribution. Answer:

* What is the specific problem that has been found?
* How could the problem be reached on a running system?
* What effect would encountering the problem have on the system?
* How was the problem found? Specifically include details about any
  testing, static or dynamic analysis programs, and any other tools or
  methods used to perform the work.
* Which version of Linux was the problem found on? Using the most recent
  release or a recent linux-next branch is strongly preferred (see
  Documentation/process/howto.rst).
* What was changed to fix the problem, and why is it believed to be correct?
* How was the change build tested and run-time tested?
* What prior commit does this change fix? This should go in a "Fixes:"
  tag as the documentation describes.
* Who else has reviewed this patch? This should go in appropriate
  "Reviewed-by:" tags; see below.

For example::

  From: Author <author@email>
  Subject: [PATCH] drivers/foo_bar: Add missing kfree()

  The error path in foo_bar driver does not correctly free the allocated
  struct foo_bar_info. This can happen if the attached foo_bar device
  rejects the initialization packets sent during foo_bar_probe(). This
  would result in a 64 byte slab memory leak once per device attach,
  wasting memory resources over time.

  This flaw was found using an experimental static analysis tool we are
  developing, LeakMagic[1], which reported the following warning when
  analyzing the v5.15 kernel release:

   path/to/foo_bar.c:187: missing kfree() call?

  Add the missing kfree() to the error path. No other references to
  this memory exist outside the probe function, so this is the only
  place it can be freed.

  x86_64 and arm64 defconfig builds with CONFIG_FOO_BAR=y using GCC
  11.2 show no new warnings, and LeakMagic no longer warns about this
  code path. As we don't have a FooBar device to test with, no runtime
  testing was able to be performed.

  [1] https://url/to/leakmagic/details

  Reported-by: Researcher <researcher@email>
  Fixes: aaaabbbbccccdddd ("Introduce support for FooBar")
  Signed-off-by: Author <author@email>
  Reviewed-by: Reviewer <reviewer@email>
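
Before sending, the presence of the tags shown above can be checked
mechanically. The following is only a minimal sketch: the helper name
``missing_trailers`` and the choice of required tags are assumptions made
for illustration, not part of any kernel tooling, and the script says
nothing about whether the prose of the commit log answers the questions
listed earlier.

.. code-block:: python

  import re

  # Hypothetical list for illustration; adjust to the tags your patch needs.
  REQUIRED_TRAILERS = ("Fixes", "Signed-off-by", "Reviewed-by")


  def missing_trailers(commit_log):
      """Return the required tag names that do not appear in commit_log."""
      found = set()
      for line in commit_log.splitlines():
          # A tag line looks like "Name: value"; names hold letters and dashes.
          match = re.match(r"\s*([A-Za-z-]+):\s+\S", line)
          if match:
              found.add(match.group(1))
      return [name for name in REQUIRED_TRAILERS if name not in found]


  example_log = """\
  drivers/foo_bar: Add missing kfree()

  Fixes: aaaabbbbccccdddd ("Introduce support for FooBar")
  Signed-off-by: Author <author@email>
  """

  # Lists the tags that still need to be added before posting.
  print(missing_trailers(example_log))

A result naming "Reviewed-by" suggests the patch has not yet been through
the private review described below.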

If you are a first-time contributor, it is recommended that the patch
itself be vetted by others privately before being posted to public lists.
(This is required if you have been explicitly told your patches need
more careful internal review.) These people are expected to have their
"Reviewed-by" tag included in the resulting patch. Finding another
developer familiar with contributing to Linux, especially within your own
organization, and having them help review patches before they are sent to
the public mailing lists tends to significantly improve the quality of the
resulting patches, and thereby reduces the burden on other developers.

If no one can be found to internally review patches and you need
help finding such a person, or if you have any other questions
related to this document and the developer community's expectations,
please reach out to the private Technical Advisory Board mailing list:
<tech-board@lists.linux-foundation.org>.