summaryrefslogtreecommitdiff
path: root/scripts/spdxcheck.py
AgeCommit message (Collapse)AuthorFilesLines
2024-05-04scripts/spdxcheck: Add count of missing files to stats outputBird, Tim1-0/+3
Add a count of files missing an SPDX header to the stats output. This is useful detailed information for working on SPDX header additions. Signed-off-by: Tim Bird <tim.bird@sony.com> Link: https://lore.kernel.org/r/SA3PR13MB6372DB9F9F2C09F8A1E1B99BFD1A2@SA3PR13MB6372.namprd13.prod.outlook.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-18scripts/spdxcheck: Put excluded files and directories into a separate fileThomas Gleixner1-6/+64
The files and directories which are excluded from scanning are currently hard coded in the script. That's not maintainable and not accessible for external tools. Move the files and directories which should be excluded into a file. The default file is scripts/spdxexclude. This can be overridden with the '-e $FILE' command line option. The file format and syntax is similar to the .gitignore file. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-18scripts/spdxcheck: Add option to display files without SPDXThomas Gleixner1-2/+19
Makes life easier when chasing the missing ones. Is activated with '-f' on the command line. # scripts/spdxcheck.py -f kernel/ Files without SPDX: ./kernel/cpu.c ./kernel/kmod.c ./kernel/relay.c ./kernel/bpf/offload.c ./kernel/bpf/preload/.gitignore ./kernel/bpf/preload/iterators/README ./kernel/bpf/ringbuf.c ./kernel/cgroup/cgroup.c ./kernel/cgroup/cpuset.c ./kernel/cgroup/legacy_freezer.c ./kernel/debug/debug_core.h ./kernel/debug/kdb/Makefile ./kernel/debug/kdb/kdb_bp.c ./kernel/debug/kdb/kdb_bt.c ./kernel/debug/kdb/kdb_cmds ./kernel/debug/kdb/kdb_debugger.c ./kernel/debug/kdb/kdb_io.c ./kernel/debug/kdb/kdb_keyboard.c ./kernel/debug/kdb/kdb_main.c ./kernel/debug/kdb/kdb_private.h ./kernel/debug/kdb/kdb_support.c ./kernel/locking/lockdep_states.h ./kernel/locking/mutex-debug.c ./kernel/locking/spinlock_debug.c ./kernel/sched/pelt.h With the optional -D parameter the directory depth can be limited: # scripts/spdxcheck.py -f -D 0 kernel/ Files without SPDX: ./kernel/cpu.c ./kernel/kmod.c ./kernel/relay.c Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-18scripts/spdxcheck: Add [sub]directory statisticsThomas Gleixner1-10/+57
Add functionality to display [sub]directory statistics. This is enabled by adding '-d' to the command line. The optional -D parameter allows to limit the directory depth. If supplied the subdirectories are accumulated # scripts/spdxcheck.py -d kernel/ Incomplete directories: SPDX in Files ./kernel : 111 of 114 97% ./kernel/bpf : 43 of 45 95% ./kernel/bpf/preload : 4 of 5 80% ./kernel/bpf/preload/iterators : 4 of 5 80% ./kernel/cgroup : 10 of 13 76% ./kernel/configs : 0 of 9 0% ./kernel/debug : 3 of 4 75% ./kernel/debug/kdb : 1 of 11 9% ./kernel/locking : 29 of 32 90% ./kernel/sched : 38 of 39 97% The result can be accumulated by restricting the depth via the new command line option '-d $DEPTH': # scripts/spdxcheck.py -d -D1 Incomplete directories: SPDX in Files ./ : 6 of 13 46% ./Documentation : 4096 of 8451 48% ./arch : 13476 of 16402 82% ./block : 100 of 101 99% ./certs : 11 of 14 78% ./crypto : 145 of 176 82% ./drivers : 24682 of 30745 80% ./fs : 1876 of 2110 88% ./include : 5175 of 5757 89% ./ipc : 12 of 13 92% ./kernel : 493 of 527 93% ./lib : 393 of 524 75% ./mm : 151 of 159 94% ./net : 1713 of 1900 90% ./samples : 211 of 273 77% ./scripts : 341 of 435 78% ./security : 241 of 250 96% ./sound : 2438 of 2503 97% ./tools : 3810 of 5462 69% ./usr : 9 of 10 90% Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-18scripts/spdxcheck: Add directory statisticsThomas Gleixner1-0/+27
For better insights. Directories accounted: 4646 Directories complete: 2565 55% Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-18scripts/spdxcheck: Add percentage to statisticsThomas Gleixner1-1/+3
Files checked: 75856 Lines checked: 294516 Files with SPDX: 59410 78% Files with errors: 0 Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-04spdxcheck.py: Fix a type errorDing Xiang1-1/+1
remove unused variable "col", otherwise there will be a type error as below: typeerror: not all arguments converted during string formatting Signed-off-by: Ding Xiang <dingxiang@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20220114024058.74536-1-dingxiang@cmss.chinamobile.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-12scripts/spdxcheck.py: Strictly read license files in utf-8Nishanth Menon1-1/+1
Commit bc41a7f36469 ("LICENSES: Add the CC-BY-4.0 license") unfortunately introduced LICENSES/dual/CC-BY-4.0 in UTF-8 Unicode text While python will barf at it with: FAIL: 'ascii' codec can't decode byte 0xe2 in position 2109: ordinal not in range(128) Traceback (most recent call last): File "scripts/spdxcheck.py", line 244, in <module> spdx = read_spdxdata(repo) File "scripts/spdxcheck.py", line 47, in read_spdxdata for l in open(el.path).readlines(): File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode return codecs.ascii_decode(input, self.errors)[0] UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 2109: ordinal not in range(128) While it is indeed debatable if 'Licensor.' used in the license file needs unicode quotes, instead, force spdxcheck to read utf-8. Reported-by: Rahul T R <r-ravikumar@ti.com> Signed-off-by: Nishanth Menon <nm@ti.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210707204840.30891-1-nm@ti.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-03-28scripts/spdxcheck.py: Fix a typoBhaskar Chowdhury1-1/+1
s/Initilize/Initialize/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Link: https://lore.kernel.org/r/20210326091443.26525-1-unixbhaskar@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27spdxcheck.py: Use Python 3Bert Vermeulen1-1/+1
Python 2.x has been officially EOL'ed for some time, and in any case the git module for it is hard to come by. Signed-off-by: Bert Vermeulen <bert@biot.com> Link: https://lore.kernel.org/r/20210121085412.265400-1-bert@biot.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-02scripts/spdxcheck.py: handle license identifiers in XML commentsLukas Bulwahn1-0/+3
Commit cc9539e7884c ("media: docs: use the new SPDX header for GFDL-1.1 on *.svg files") adds SPDX-License-Identifiers enclosed in XML comments, i.e., <!-- ... -->, for svg files. Unfortunately, ./scripts/spdxcheck.py does not handle SPDX-License-Identifiers in XML comments, so it simply fails on checking these files with 'Invalid License ID: --'. Strip the XML comment ending simply by copying how it was done for comments in C. With that, ./scripts/spdxcheck.py handles the svg files properly. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-02spdxcheck.py: fix directory structuresVincenzo Frascino1-3/+4
The LICENSE directory has recently changed structure and this makes spdxcheck fails as per below: FAIL: "Blob or Tree named 'other' not found" Traceback (most recent call last): File "scripts/spdxcheck.py", line 240, in <module> spdx = read_spdxdata(repo) File "scripts/spdxcheck.py", line 41, in read_spdxdata for el in lictree[d].traverse(): [...] KeyError: "Blob or Tree named 'other' not found" Fix the script to restore the correctness on checkpatch License checking. References: 62be257e986d ("LICENSES: Rename other to deprecated") References: 8ea8814fcdcb ("LICENSES: Clearly mark dual license only licenses") Link: http://lkml.kernel.org/r/20190523084755.56739-1-vincenzo.frascino@arm.com Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Joe Perches <joe@perches.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jeremy Cline <jcline@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-21scripts/spdxcheck.py: Add dual license subdirectorySven Eckelmann1-1/+1
The licenses from the other directory were partially moved to the dual directory in commit 8ea8814fcdcb ("LICENSES: Clearly mark dual license only licenses"). checkpatch therefore rejected files like drivers/staging/android/ashmem.h with WARNING: 'SPDX-License-Identifier: GPL-2.0 OR Apache-2.0 */' is not supported in LICENSES/... #1: FILE: drivers/staging/android/ashmem.h:1: +/* SPDX-License-Identifier: GPL-2.0 OR Apache-2.0 */ Fixes: 8ea8814fcdcb ("LICENSES: Clearly mark dual license only licenses") Signed-off-by: Sven Eckelmann <sven@narfation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-05-20scripts/spdxcheck.py: Fix path to deprecated licensesSven Eckelmann1-1/+1
The directory name for other licenses was changed to "deprecated" in commit 62be257e986d ("LICENSES: Rename other to deprecated"). But it was not changed for spdxcheck.py. As result, checkpatch failed with FAIL: "Blob or Tree named 'other' not found" Traceback (most recent call last): File "scripts/spdxcheck.py", line 240, in <module> spdx = read_spdxdata(repo) File "scripts/spdxcheck.py", line 41, in read_spdxdata for el in lictree[d].traverse(): File "/usr/lib/python2.7/dist-packages/git/objects/tree.py", line 298, in __getitem__ return self.join(item) File "/usr/lib/python2.7/dist-packages/git/objects/tree.py", line 244, in join raise KeyError(msg % file) KeyError: "Blob or Tree named 'other' not found" Fixes: 62be257e986d ("LICENSES: Rename other to deprecated") Signed-off-by: Sven Eckelmann <sven@narfation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-02-22scripts/spdxcheck.py: fix C++ comment style detectionAurélien Cedeyn1-1/+1
With the last commit to support the SuperH boot code files, we have the following regression: $ ./scripts/checkpatch.pl -f <(echo '/* SPDX-License-Identifier: MIT */') WARNING: 'SPDX-License-Identifier: MIT */' is not supported in LICENSES/.. +/* SPDX-License-Identifier: MIT */ total: 0 errors, 1 warnings, 1 lines checked NOTE: For some of the reported defects, checkpatch may be able to mechanically convert to the typical style using --fix or --fix-inplace. /dev/fd/63 has style problems, please review. NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. This is not obvious, but spdxcheck.py is launched in checkpatch.pl with : ... } elsif ($rawline =~ /(SPDX-License-Identifier: .*)/) { my $spdx_license = $1; if (!is_SPDX_License_valid($spdx_license)) { WARN("SPDX_LICENSE_TAG", "'$spdx_license' is not supported in LICENSES/...\n" . \ $herecurr); } ... sub is_SPDX_License_valid { my ($license) = @_; ... my $status = `cd "$root_path"; echo "$license" | python scripts/spdxcheck.py -`; ... } The first chars before 'SPDX-License-Identifier:' are ignored. This commit fixes this regression. Fixes:959b49687838 (scripts/spdxcheck.py: Handle special quotation mark comments) Signed-off-by:Aurélien Cedeyn <aurelien.cedeyn@gmail.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-01-17scripts/spdxcheck.py: Handle special quotation mark commentsThomas Gleixner1-1/+7
The SuperH boot code files use a magic format for the SPDX identifier comment: LIST "SPDX-License-Identifier: .... " The trailing quotation mark is not stripped before the token parser is invoked and causes the scan to fail. Handle it gracefully. Fixes: 6a0abce4c4cc ("sh: include: convert to SPDX identifiers") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Simon Horman <horms+renesas@verge.net.au> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2018-12-15scripts/spdxcheck.py: always open files in binary modeThierry Reding1-2/+4
The spdxcheck script currently falls over when confronted with a binary file (such as Documentation/logo.gif). To avoid that, always open files in binary mode and decode line-by-line, ignoring encoding errors. One tricky case is when piping data into the script and reading it from standard input. By default, standard input will be opened in text mode, so we need to reopen it in binary mode. The breakage only happens with python3 and results in a UnicodeDecodeError (according to Uwe). Link: http://lkml.kernel.org/r/20181212131210.28024-1-thierry.reding@gmail.com Fixes: 6f4d29df66ac ("scripts/spdxcheck.py: make python3 compliant") Signed-off-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Jeremy Cline <jcline@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Joe Perches <joe@perches.com> Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-18scripts/spdxcheck.py: make python3 compliantUwe Kleine-König1-1/+0
Without this change the following happens when using Python3 (3.6.6): $ echo "GPL-2.0" | python3 scripts/spdxcheck.py - FAIL: 'str' object has no attribute 'decode' Traceback (most recent call last): File "scripts/spdxcheck.py", line 253, in <module> parser.parse_lines(sys.stdin, args.maxlines, '-') File "scripts/spdxcheck.py", line 171, in parse_lines line = line.decode(locale.getpreferredencoding(False), errors='ignore') AttributeError: 'str' object has no attribute 'decode' So as the line is already a string, there is no need to decode it and the line can be dropped. /usr/bin/python on Arch is Python 3. So this would indeed be worth going into 4.19. Link: http://lkml.kernel.org/r/20181023070802.22558-1-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Joe Perches <joe@perches.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-18scripts: add Python 3 compatibility to spdxcheck.pyJeremy Cline1-2/+5
"dict.has_key(key)" on dictionaries has been replaced with "key in dict". Additionally, when run under Python 3 some files don't decode with the default encoding (tested with UTF-8). To handle that, don't open the file in text mode and decode text line-by-line, ignoring encoding errors. This remains compatible with Python 2 and should have no functional change. Link: http://lkml.kernel.org/r/20180717190635.29467-1-jcline@redhat.com Signed-off-by: Jeremy Cline <jcline@redhat.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-18scripts/spdxcheck.py: work with current HEAD LICENSES/ directoryJoe Perches1-3/+1
Depending on how old your -next tree is, it may not have a master that has the LICENSES directory. Change the lookup to HEAD and find whatever LICENSE directory files are used in that branch. Miscellanea: - Remove the checkpatch test as it will have its own SPDX license identifier. Link: http://lkml.kernel.org/r/7eeefc862194930c773e662cb2152e178441d3b8.camel@perches.com Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-28scripts: Add SPDX checker scriptThomas Gleixner1-0/+284
The SPDX-License-Identifiers are growing in the kernel and so grow expression failures and license IDs are used which have no corresponding license text file in the LICENSES directory. Add a script which gathers information from the LICENSES directory, i.e. the various tags in the licenses and exception files and then scans either input from stdin, which it treats as a single file or if started without arguments it scans the full kernel tree. It checks whether the license expression syntax is correct and also validates whether the license identifiers used in the expressions are available in the LICENSES files. scripts/spdxcheck.py -h usage: spdxcheck.py [-h] [-m MAXLINES] [-v] [path [path ...]] SPDX expression checker positional arguments: path Check path or file. If not given full git tree scan. For stdin use "-" optional arguments: -h, --help show this help message and exit -m MAXLINES, --maxlines MAXLINES Maximum number of lines to scan in a file. Default 15 -v, --verbose Verbose statistics output include/dt-bindings/reset/amlogic,meson-axg-reset.h: 9:41 Invalid License ID: BSD drivers/pinctrl/sh-pfc/pfc-r8a77965.c: 1:28 Invalid License ID: GPL-2. include/dt-bindings/reset/amlogic,meson-axg-reset.h: 9:41 Invalid License ID: BSD arch/x86/kernel/jailhouse.c: 1:28 Invalid License ID: GPL2.0 include/dt-bindings/reset/amlogic,meson-axg-reset.h: 9:41 Invalid License ID: BSD arch/arm/mach-s3c24xx/h1940-bluetooth.c: 1:28 Invalid License ID: GPL-1.0 arch/x86/kernel/jailhouse.c: 1:28 Invalid License ID: GPL2.0 drivers/pinctrl/sh-pfc/pfc-r8a77965.c: 1:28 Invalid License ID: GPL-2. include/dt-bindings/reset/amlogic,meson-axg-reset.h: 9:41 Invalid License ID: BSD arch/x86/include/asm/jailhouse_para.h: 1:28 Invalid License ID: GPL2.0 arch/arm/mach-s3c24xx/h1940-bluetooth.c: 1:28 Invalid License ID: GPL-1.0 arch/x86/kernel/jailhouse.c: 1:28 Invalid License ID: GPL2.0 drivers/pinctrl/sh-pfc/pfc-r8a77965.c: 1:28 Invalid License ID: GPL-2. include/dt-bindings/reset/amlogic,meson-axg-reset.h: 9:41 Invalid License ID: BSD arch/x86/include/asm/jailhouse_para.h: 1:28 Invalid License ID: GPL2.0 License files: 14 Exception files: 1 License IDs 19 Exception IDs 1 Files checked: 61332 Lines checked: 669181 Files with SPDX: 16169 Files with errors: 5 real 0m2.642s user 0m2.231s sys 0m0.467s That's a full tree sweep on my laptop. Note, this runs single threaded. It scans by default the first 15 lines for a SPDX identifier where the current max inside a top comment is at line 10. But that's going to be faster once the identifiers are all in the first two lines as documented. The python wizards will surely know how to do that smarter and faster, but its at least better than no tool at all. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [jc: Fixed ironically erroneous SPDX tag and did chmod +x ] Signed-off-by: Jonathan Corbet <corbet@lwn.net>