Licenses only used in REUSE.toml get ignored #1062

Open
juhannc opened this issue Aug 19, 2024 · 2 comments
Labels
spec Requires specification change

Comments

juhannc commented Aug 19, 2024

If a license is referenced only in REUSE.toml and not in a file header or .license file, reuse download will not download it, and reuse lint will report it as unused.

cd "$(mktemp -d)"
touch test.txt
# printf expands \n in every POSIX shell; plain echo does so only in some (e.g. zsh)
printf '# SPDX-FileCopyrightText: 2024 foo\n#\n# SPDX-License-Identifier: MIT\n\nversion = 1\n\n[[annotations]]\npath = "test.txt"\nSPDX-FileCopyrightText = "foo"\nSPDX-License-Identifier = "0BSD"\n' > REUSE.toml

Now, reuse download --all will only download the MIT license, not the BSD Zero Clause License (0BSD).
If we manually create the 0BSD file, e.g., with mkdir -p LICENSES && touch LICENSES/0BSD.txt, reuse lint will complain that 0BSD is unused:

# UNUSED LICENSES

The following licenses are not used:
* 0BSD



# SUMMARY

* Bad licenses: 0
* Deprecated licenses: 0
* Licenses without file extension: 0
* Missing licenses: 0
* Unused licenses: 0BSD
* Used licenses: MIT
* Read errors: 0
* Files with copyright information: 1 / 1
* Files with license information: 1 / 1

Unfortunately, your project is not compliant with version 3.2 of the REUSE Specification :-(


# RECOMMENDATIONS

* Fix unused licenses: At least one of the license text files in 'LICENSES' is
  not referenced by any file, e.g. by an 'SPDX-License-Identifier' tag. Please
  make sure that you either tag the accordingly licensed files properly, or
  delete the unused license text if you are sure that no file or code snippet is
  licensed as such.
mxmehl added the bug label (Something isn't working) on Aug 19, 2024
carmenbianca (Member) commented

I can reproduce as described, but this is partially expected behaviour :) Empty files are ignored by REUSE according to the spec. So when reuse download --all finds all used licences in the project, 0BSD isn't found, because it's not actually being applied to the empty file. If you populate test.txt with a single letter, reuse download --all will work just fine.
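
To check whether a project has files that fall under the empty-file rule described above, one option is to list zero-byte files before running reuse. This is a minimal sketch: list_empty_files is a hypothetical helper name and not part of the reuse tool, and it only looks at file size, so it does not reproduce whatever other ignore rules reuse applies.

```shell
# Hypothetical helper (not part of reuse): print every zero-byte regular
# file below the given directory (default: current directory).
# The .git directory is pruned so repository internals are not listed.
list_empty_files() {
    find "${1:-.}" -type d -name .git -prune -o -type f -size 0 -print
}
```

In the reproduction above, list_empty_files . would list ./test.txt, because the file was created with touch and never written to.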


Now the question is whether this behaviour is correct. The spec says:

A Project MUST include a License File for every license under which Covered Files are licensed.

and

A Project MUST NOT include License Files for licenses under which none of the files in the Project are licensed. The LICENSES/ directory MUST NOT include any other files.

and

path (REQUIRED), a string or list of strings representing paths. A path MUST use forward slashes as path separators. A path SHOULD resolve to one or more Covered Files relative to the REUSE.toml file’s directory. A path that resolves to a non-existent or non-Covered File is ignored.

The last one could be less ambiguous, I suppose. But in my reading, the current behaviour is correct spec-wise.

We could maybe improve the spec by adding some words about the corner case 'what if REUSE.toml has an entry for a file that does not exist, using a licence used by no other file'. Super super super corner case, but I have to assume that you didn't report this bug for no reason, so I would love to hear your use case for this corner case :)

Thanks @juhannc !

carmenbianca added the spec label (Requires specification change) and removed the bug label (Something isn't working) on Aug 22, 2024

juhannc commented Aug 26, 2024

Thanks for getting back so quickly.

It turns out I had a typo in my REUSE.toml.

So here is a more complicated MWE.

This is working:

cd "$(mktemp -d)"
# printf expands \n in every POSIX shell; plain echo does so only in some (e.g. zsh)
printf '# SPDX-FileCopyrightText: 2024 foo\n#\n# SPDX-License-Identifier: MIT\n\nversion = 1\n\n[[annotations]]\npath = "figures/sample*"\nSPDX-FileCopyrightText = "pdfobject.com"\nSPDX-License-Identifier = "0BSD"\n\n[[annotations]]\npath = "figures/image*"\nSPDX-FileCopyrightText = "picsum.photos"\nSPDX-License-Identifier = "CC0-1.0"\n' > REUSE.toml
mkdir -p figures/
wget https://picsum.photos/200 -O figures/image.jpg
wget https://pdfobject.com/pdf/sample.pdf -O figures/sample.pdf

Here, reuse download --all && reuse lint works as expected.

However, let's introduce a typo in the REUSE.toml.
Note the path for the PDF: it is now figues instead of figures (the r is missing).

cd "$(mktemp -d)"
# printf expands \n in every POSIX shell; plain echo does so only in some (e.g. zsh)
printf '# SPDX-FileCopyrightText: 2024 foo\n#\n# SPDX-License-Identifier: MIT\n\nversion = 1\n\n[[annotations]]\npath = "figues/sample*"\nSPDX-FileCopyrightText = "pdfobject.com"\nSPDX-License-Identifier = "0BSD"\n\n[[annotations]]\npath = "figures/image*"\nSPDX-FileCopyrightText = "picsum.photos"\nSPDX-License-Identifier = "CC0-1.0"\n' > REUSE.toml
mkdir -p figures/
wget https://picsum.photos/200 -O figures/image.jpg
wget https://pdfobject.com/pdf/sample.pdf -O figures/sample.pdf

Now, reuse obviously cannot find any files in the (wrong) figues folder.
However, reuse download --all silently ignores this and exits with code 0. From this, I would conclude that everything is correct (which is why it took me so long to realize it was a stupid typo...).
Moreover, running reuse lint now gives the following output:

# MISSING COPYRIGHT AND LICENSING INFORMATION

The following files have no copyright and licensing information:
* /tmp/tmp.FrWuaULWxM/figures/sample.pdf


# SUMMARY

* Bad licenses: 0
* Deprecated licenses: 0
* Licenses without file extension: 0
* Missing licenses: 0
* Unused licenses: 0
* Used licenses: MIT, CC0-1.0
* Read errors: 0
* Files with copyright information: 2 / 3
* Files with license information: 2 / 3

Unfortunately, your project is not compliant with version 3.2 of the REUSE Specification :-(


# RECOMMENDATIONS

* Fix missing copyright/licensing information: For one or more files, the tool
  cannot find copyright and/or licensing information. You typically do this by
  adding 'SPDX-FileCopyrightText' and 'SPDX-License-Identifier' tags to each
  file. The tutorial explains additional ways to do this:
  <https://reuse.software/tutorial/>

Because the error message shows the correct path, figures/sample.pdf, and never mentions the misspelled glob, it was really hard for me to find this typo. I feel that additional information would be helpful here, such as a warning that lists the paths/globs in REUSE.toml that do not match any file.
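
Until reuse reports this itself, a small shell check can catch such typos. The following is only a sketch: check_reuse_paths is a hypothetical helper, not part of the reuse tool. It assumes every path is a single double-quoted string on its own line (lists of paths and TOML escapes are not handled) and that paths contain no spaces, and it uses the shell's own globbing, which only approximates the glob rules of REUSE.toml.

```shell
# Hypothetical helper (not part of reuse): print each `path` glob in a
# REUSE.toml that matches no existing file, and return 1 if any was found.
check_reuse_paths() {
    toml=$1
    root=$(dirname "$toml")
    unmatched=0
    # Pull the quoted value out of every `path = "..."` line.
    patterns=$(sed -n 's/^[[:space:]]*path[[:space:]]*=[[:space:]]*"\(.*\)"[[:space:]]*$/\1/p' "$toml")
    while IFS= read -r pattern; do
        [ -n "$pattern" ] || continue
        matched=0
        # $pattern is unquoted on purpose so the shell expands the glob
        # relative to the directory that contains the REUSE.toml.
        for candidate in "$root"/$pattern; do
            if [ -e "$candidate" ]; then
                matched=1
                break
            fi
        done
        if [ "$matched" -eq 0 ]; then
            echo "no file matches path: $pattern"
            unmatched=1
        fi
    done <<EOF
$patterns
EOF
    return "$unmatched"
}
```

Run as check_reuse_paths REUSE.toml in the second example, this prints a line for figues/sample* and returns a non-zero status, while the figures/image* entry passes.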
