Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Autodiff Upstreaming - rustc_codegen_llvm changes #178

Open
wants to merge 863 commits into
base: master
Choose a base branch
from

Conversation

ZuseZ4
Copy link
Member

@ZuseZ4 ZuseZ4 commented Sep 7, 2024

Now that the autodiff/Enzyme backend is merged, this is an upstream PR for the rustc_codegen_llvm changes.
It also includes small changes to three files under compiler/rustc_ast, which overlap with my frontend PR.
Here I only include minimal definitions of structs and enums to be able to build this backend code.
The same goes for minimal changes to compiler/rustc_codegen_ssa, the majority of changes there will be in another PR, once either this or the frontend gets merged.

We currently have 68 files left to merge, 19 in the frontend PR, 21 (+3 from the frontend) in this PR, and then ~30 in the middle-end.

To already highlight the things which reviewers might want to discuss:

  1. enzyme_ffi.rs: I do have a fallback module to make sure that we don't link rustc against Enzyme when we build rustc without autodiff support.

  2. add_panic_msg_to_global was a pain to write and I currently can't even use it. Enzyme writes gradients into shadow memory. Pass in one float scalar? We'll allocate and return an extra float telling you how this float affected the output. Pass in a slice of floats? We'll let you allocate the vector and pass in a mutable reference to a float slice, we'll then write the gradient into that slice. It should be at least as large as your original slice, so we check that and panic if not. Currently we panic silently, but I already generate a nicer panic message with this function. I just don't know how to print it to the user. yet. I discussed this with a few rustc devs and the best we could come up with (for now), was to look for mangled panic calls in the IR and pick one, which works surprising reliably. If someone knows a good way to clean this up and print the panic message I'm all in, otherwise I can remove the code that writes the nicer panic message and keep the silent panic, since it's enough for soundness. Especially since this PR is already a bit larger.

  3. SanitizeHWAddress: When differentiating C++, Enzyme can use TBAA to "understand" enums/unions, but for Rust we don't have this information. LLVM might to speculative loads which (without TBAA) confuse Enzyme, so we disable those with this attribute. This attribute is only set during the first opt run before Enzyme differentiates code. We then remove it again once we are done with autodiff and run the opt pipeline a second time. Since enums are everywhere in Rust, support for them is crucial, but if this looks too cursed I can remove these ~100 lines and keep them in my fork for now, we can then discuss them separately to make this PR simpler?

  4. Duplicated llvm-opt runs: Differentiating already optimized code (and being able to do additional optimizations on the fly, e.g. for GPU code) is the reason why Enzyme is so fast, so the compile time is acceptable for autodiff users: https://enzyme.mit.edu/talks/Publications/ (There are also algorithmic issues in Enzyme core which are more serious than running opt twice).

  5. I assume that if we merge these minimal cg_ssa changes here already, I also need to fix the other backends (GCC and cliff) to have dummy implementations, correct?

  6. I'm happy to split this PR up further if reviewers have recommendations on how to.

For the full implementation, see: rust-lang#129175

Tracking:

ChrisDenton and others added 29 commits September 25, 2024 17:53
…=spastorino,RalfJung

Simple validation for unsize coercion in MIR validation

This adds the most basic validity check to unsize coercions in MIR. The src and target of an unsize cast must *at least* implement `Src: CoerceUnsized<Target>` for this to be valid.

This doesn't the second, more subtle validity check that is taken of advantage in codegen [here](https://github.com/rust-lang/rust/blob/914193c8f40528fe82696e1054828de8c399882e/compiler/rustc_codegen_ssa/src/base.rs#L126), but I did leave a beefy FIXME for that explaining what it is.

As a consequence, this also fixes an ICE with GVN and invalid unsize coercions. This is somewhat coincidental, since MIR inlining will check that a body is valid before inlining it; so now that we determine it to be invalid, we don't inline it, and we don't encounter the GVN ICE. I'm not certain if the same GVN ICE is triggerable without the inliner, and perhaps instead with trivial where clauses or something.

cc `@RalfJung`
…le_osx, r=davidtwco

Fix up setting strip = true in Cargo.toml makes build scripts fail in…

Fix issue: rust-lang#110536
Strip binary is PATH dependent which breaks builds in MacOS.
For example, on my Mac, the output of 'which strip' is '/opt/homebrew/opt/binutils/bin/strip', which leads to incorrect 'strip' results. Therefore, just like on other systems, it is also necessary to specify 'stripcmd' on macOS. However, it seems that there is a bug in binutils [bugzilla-Bug 31571](https://sourceware.org/bugzilla/show_bug.cgi?id=31571), which leads to the problem mentioned above.
add link from random() helper fn to extensive DefaultRandomSource docs
…r=Noratrieb

Add `must_use` attribute to `len_utf8` and `len_utf16`.

The `len_utf8` and `len_utf16` methods in `char` should have the `must_use` attribute.

The somewhat similar method `<[T]>::len` has had this attribute since rust-lang#95274. Considering that these two methods would most likely be used to test the size of a buffer (before a call to `encode_utf8` or `encode_utf16`), *not* using their return values could indicate a bug.

According to ["When to add `#[must_use]`](https://std-dev-guide.rust-lang.org/policy/must-use.html), this is **not** considered a breaking change (and could be reverted again at a later time).
…ubilee

fix some cfg logic around optimize_for_size and 16-bit targets

Fixes rust-lang#130818.
Fixes rust-lang#129910.

There are still some warnings when building on a 16bit target:
```
warning: struct `AlignedStorage` is never constructed
   --> /home/r/src/rust/rustc.2/library/core/src/slice/sort/stable/mod.rs:135:8
    |
135 | struct AlignedStorage<T, const N: usize> {
    |        ^^^^^^^^^^^^^^
    |
    = note: `#[warn(dead_code)]` on by default

warning: associated items `new` and `as_uninit_slice_mut` are never used
   --> /home/r/src/rust/rustc.2/library/core/src/slice/sort/stable/mod.rs:141:8
    |
140 | impl<T, const N: usize> AlignedStorage<T, N> {
    | -------------------------------------------- associated items in this implementation
141 |     fn new() -> Self {
    |        ^^^
...
145 |     fn as_uninit_slice_mut(&mut self) -> &mut [MaybeUninit<T>] {
    |        ^^^^^^^^^^^^^^^^^^^

warning: function `quicksort` is never used
  --> /home/r/src/rust/rustc.2/library/core/src/slice/sort/unstable/quicksort.rs:19:15
   |
19 | pub(crate) fn quicksort<'a, T, F>(
   |               ^^^^^^^^^

warning: `core` (lib) generated 3 warnings
```

However, the cfg stuff here is sufficiently messy that I didn't want to touch more of it. I think all `feature = "optimize_for_size"` should become `any(feature = "optimize_for_size", target_pointer_width = "16")` but I am not entirely certain. Warnings are fine, Miri will just ignore them.

Cc `@Voultapher`
…s, r=jieyouxu

Add tracking issue for io_error_inprogress

I forgot to mention this in rust-lang#130789
…iaskrgr

Rollup of 6 pull requests

Successful merges:

 - rust-lang#130735 (Simple validation for unsize coercion in MIR validation)
 - rust-lang#130781 (Fix up setting strip = true in Cargo.toml makes build scripts fail in…)
 - rust-lang#130811 (add link from random() helper fn to extensive DefaultRandomSource docs)
 - rust-lang#130819 (Add `must_use` attribute to `len_utf8` and `len_utf16`.)
 - rust-lang#130832 (fix some cfg logic around optimize_for_size and 16-bit targets)
 - rust-lang#130842 (Add tracking issue for io_error_inprogress)

r? `@ghost`
`@rustbot` modify labels: rollup
…, r=lcnr

Collect relevant item bounds from trait clauses for nested rigid projections

Rust currently considers trait where-clauses that bound the trait's *own* associated types to act like an item bound:

```rust
trait Foo where Self::Assoc: Bar { type Assoc; }
// acts as if:
trait Foo { type Assoc: Bar; }
```

### Background

This behavior has existed since essentially forever (i.e. before Rust 1.0), since we originally started out by literally looking at the where clauses written on the trait when assembling `SelectionCandidate::ProjectionCandidate` for projections. However, looking at the predicates of the associated type themselves was not sound, since it was unclear which predicates were *assumed* and which predicates were *implied*, and therefore this was reworked in rust-lang#72788 (which added a query for the predicates we consider for `ProjectionCandidate`s), and then finally item bounds and predicates were split in rust-lang#73905.

### Problem 1: GATs don't uplift bounds correctly

All the while, we've still had logic to uplift associated type bounds from a trait's where clauses. However, with the introduction of GATs, this logic was never really generalized correctly for them, since we were using simple equality to test if the self type of a trait where clause is a projection. This leads to shortcomings, such as:

```rust
trait Foo
where
    for<'a> Self::Gat<'a>: Debug,
{
    type Gat<'a>;
}

fn test<T: Foo>(x: T::Gat<'static>) {
    //~^ ERROR `<T as Foo>::Gat<'a>` doesn't implement `Debug`
    println!("{:?}", x);
}
```

### Problem 2: Nested associated type bounds are not uplifted

We also don't attempt to uplift bounds on nested associated types, something that we couldn't really support until rust-lang#120584. This can be demonstrated best with an example:

```rust
trait A
    where Self::Assoc: B,
    where <Self::Assoc as B>::Assoc2: C,
{
    type Assoc; // <~ The compiler *should* treat this like it has an item bound `B<Assoc2: C>`.
}

trait B { type Assoc2; }
trait C {}

fn is_c<T: C>() {}

fn test<T: A>() {
    is_c::<<Self::Assoc as B>::Assoc2>();
    //~^ ERROR the trait bound `<<T as A>::Assoc as B>::Assoc2: C` is not satisfied
}
```

Why does this matter?

Well, generalizing this behavior bridges a gap between the associated type bounds (ATB) feature and trait where clauses. Currently, all bounds that can be stably written on associated types can also be expressed as where clauses on traits; however, with the stabilization of ATB, there are now bounds that can't be desugared in the same way. This fixes that.

## How does this PR fix things?

First, when scraping item bounds from the trait's where clauses, given a trait predicate, we'll loop of the self type of the predicate as long as it's a projection. If we find a projection whose trait ref matches, we'll uplift the bound. This allows us to uplift, for example `<Self as Trait>::Assoc: Bound` (pre-existing), but also `<<Self as Trait>::Assoc as Iterator>::Item: Bound` (new).

If that projection is a GAT, we will check if all of the GAT's *own* args are all unique late-bound vars. We then map the late-bound vars to early-bound vars from the GAT -- this allows us to uplift `for<'a, 'b> Self::Assoc<'a, 'b>: Trait` into an item bound, but we will leave `for<'a> Self::Assoc<'a, 'a>: Trait` and `Self::Assoc<'static, 'static>: Trait` alone.

### Okay, but does this *really* matter?

I consider this to be an improvement of the status quo because it makes GATs a bit less magical, and makes rigid projections a bit more expressive.
Don't keep the `external_traits` as shared mutable data between the
`DocContext` and `clean::Crate`. Instead, move the data over when necessary.
This allows us to get rid of a borrowck hack in the `DocVisitor`.
This allows the visitor to borrow from the visitees.
Since the stabilization in rust-lang#127679 has reached stage0, 1.82-beta, we can
start using `&raw` freely, and even the soft-deprecated `ptr::addr_of!`
and `ptr::addr_of_mut!` can stop allowing the unstable feature.

I intentionally did not change any documentation or tests, but the rest
of those macro uses are all now using `&raw const` or `&raw mut` in the
standard library.
…idtwco

Reorder stack spills so that constants come later.

Currently constants are "pulled forward" and have their stack spills emitted first. This confuses LLVM as to where to place breakpoints at function entry, and results in argument values being wrong in the debugger. It's straightforward to avoid emitting the stack spills for constants until arguments/etc have been introduced in debug_introduce_locals, so do that.

Example LLVM IR (irrelevant IR elided):
Before:
```
define internal void `@_ZN11rust_1289457binding17h2c78f956ba4bd2c3E(i64` %a, i64 %b, double %c) unnamed_addr #0 !dbg !178 { start:
  %c.dbg.spill = alloca [8 x i8], align 8
  %b.dbg.spill = alloca [8 x i8], align 8
  %a.dbg.spill = alloca [8 x i8], align 8
  %x.dbg.spill = alloca [4 x i8], align 4
  store i32 0, ptr %x.dbg.spill, align 4, !dbg !192            ; LLVM places breakpoint here.
    #dbg_declare(ptr %x.dbg.spill, !190, !DIExpression(), !192)
  store i64 %a, ptr %a.dbg.spill, align 8
    #dbg_declare(ptr %a.dbg.spill, !187, !DIExpression(), !193)
  store i64 %b, ptr %b.dbg.spill, align 8
    #dbg_declare(ptr %b.dbg.spill, !188, !DIExpression(), !194)
  store double %c, ptr %c.dbg.spill, align 8
    #dbg_declare(ptr %c.dbg.spill, !189, !DIExpression(), !195)
  ret void, !dbg !196
}
```
After:
```
define internal void `@_ZN11rust_1289457binding17h2c78f956ba4bd2c3E(i64` %a, i64 %b, double %c) unnamed_addr #0 !dbg !178 { start:
  %x.dbg.spill = alloca [4 x i8], align 4
  %c.dbg.spill = alloca [8 x i8], align 8
  %b.dbg.spill = alloca [8 x i8], align 8
  %a.dbg.spill = alloca [8 x i8], align 8
  store i64 %a, ptr %a.dbg.spill, align 8
    #dbg_declare(ptr %a.dbg.spill, !187, !DIExpression(), !192)
  store i64 %b, ptr %b.dbg.spill, align 8
    #dbg_declare(ptr %b.dbg.spill, !188, !DIExpression(), !193)
  store double %c, ptr %c.dbg.spill, align 8
    #dbg_declare(ptr %c.dbg.spill, !189, !DIExpression(), !194)
  store i32 0, ptr %x.dbg.spill, align 4, !dbg !195            ; LLVM places breakpoint here.
    #dbg_declare(ptr %x.dbg.spill, !190, !DIExpression(), !195)
  ret void, !dbg !196
}
```
Note in particular the position of the "LLVM places breakpoint here" comment relative to the stack spills for the function arguments. LLVM assumes that the first instruction with with a debug location is the end of the prologue. As LLVM does not currently offer front ends any direct control over the placement of the prologue end reordering the IR is the only mechanism available to fix argument values at function entry in the presence of MIR optimizations like SingleUseConsts. Fixes rust-lang#128945

r? `@michaelwoerister`
bump rustc-build-sysroot version

This removes an implicit `--cap-lints` in the sysroot build. Let's see if that causes any trouble for the targets we test.
Use `&raw` in the standard library

Since the stabilization in rust-lang#127679 has reached stage0, 1.82-beta, we can
start using `&raw` freely, and even the soft-deprecated `ptr::addr_of!`
and `ptr::addr_of_mut!` can stop allowing the unstable feature.

I intentionally did not change any documentation or tests, but the rest
of those macro uses are all now using `&raw const` or `&raw mut` in the
standard library.
dingxiangfei2009 and others added 29 commits September 30, 2024 04:21
…otriddle

In redesigned rustdoc toolbar: Adjust spacings and sizing to improve behavior with over-long names

Fixes rust-lang#130993.

Some additional adjustments also fix more issues I’ve noticed such as:

* on small screens, opening search made the 3 bottons move down very slightly (because the row with the crate picker got larger, enlarging the whole grid), this is fixed with a `min-height: 60px` on the toolbar
* with long names in the “breadcrumps” area, wrapping was very broken
  * ![Screenshot_20240929_031831](https://github.com/user-attachments/assets/6f46bbb7-004b-4606-bf17-8a6f3289a8f7)
  * fixed:
  * ![Screenshot_20240929_035312](https://github.com/user-attachments/assets/4e2f8dd2-043e-4279-b588-0a72c7533f8e)
* the left grid area has a minimal width (105px); like before, that leaves about enough space for crate names becoming as short as “all cra…”;
    to save even more space, there’s support for a little bit of extra squeezing of the buttons
  * ![Screenshot_20240929_034511](https://github.com/user-attachments/assets/7c6788ee-8ec1-4a38-b341-8d67704f5575)
  * ![Screenshot_20240929_034525](https://github.com/user-attachments/assets/e141756d-37a9-4205-bc4d-235ddd1c0609)
  * ![Screenshot_20240929_034535](https://github.com/user-attachments/assets/526447f3-48b6-47aa-8a60-e5b0d4d055f0)

I’m really not super good with HTML or CSS stuff at all; there seem to be many magical numbers already, I’ve just used `px` values until things look right, I hope that’s okay 🤷‍♂️

r? `@GuillaumeGomez`

cc `@notriddle`
…mpiler-errors

properly elaborate effects implied bounds for super traits

Summary: This PR makes it so that we elaborate `<T as Tr>::Fx: EffectsCompat<somebool>` into `<T as SuperTr>::Fx: EffectsCompat<somebool>` when we know that `trait Tr: ~const SuperTr`.

Some discussion at rust-lang/project-const-traits#2.

r? project-const-traits
`@rust-lang/project-const-traits:` how do we feel about this approach?
avoid `{type error}` being leaked in user-facing messages,
particularly when using the `adt_const_params` feature
weekly update: add header to compiler deps update

Look at rust-lang#131000: library and rustbook have nice headers for deps updates (i.e. `library dependencies:`), but not compiler one. Fixes this.
add has_enzyme/needs-enzyme to the test infra

This unblocks merging the Enzyme / Autodiff frontend.
For the full implementation, see: rust-lang#129175

We don't want to run tests that require Enzyme / Autodiff support when we build rustc without the required features.

It correctly filtered out a test which started with `//@ needs-enzyme`.
```
running 80 tests
i...............................................................................

test result: ok. 79 passed; 0 failed; 1 ignored; 0 measured; 0 filtered out; finished in 380.41ms
```

Tracking:

- rust-lang#124509

r? jieyouxu
These conditionally applied flags trigger rebuilds because they can
invalidate previous cargo build cache.
make type-check-4 asm tests about non-const expressions

These tests recently got changed in rust-lang#129759. I asked the PR author to make the tests read from a `static mut` (rather than just making them "pass"), but I now think that was a mistake: previously the tests failed because the const was not a valid const expression, after the PR they failed because the const failed to evaluate.

So this PR restores the tests to "fail because the const is not a valid const expression". That can be done in a target-independent way so I unified the x86 and aarch64 tests into one.

Cc `@oli-obk` as the original [author](rust-lang@0d88631) of these tests -- not sure if you still remember what they were intended to test.
Reject leading unsafe in `cfg!(...)` and `--check-cfg`

This PR reject leading unsafe in `cfg!(...)` and `--check-cfg`.

Fixes (after-backport) rust-lang#131055
r? `@jieyouxu`
Drop conditionally applied cargo `-Zon-broken-pipe=kill` flags to fix stage 1 cargo rebuilds

The conditionally applied `-Zon-broken-pipe=kill` flag trigger rebuilds because they can invalidate previous tool build cache due to differing flags. This PR removes those flags to stop tool build cache invalidation.

> The build cache for clippy will be broken because bootstrap sets `-Zon-broken-pipe=kill` for every invocation except for cargo. Which means building cargo will invalidate any tool build cache which was built previously.
>
> *From <https://rust-lang.zulipchat.com/#narrow/stream/326414-t-infra.2Fbootstrap/topic/Modifying.20run-make.20tests.20unnecessarily.20rebuild.20stage.201.20cargo/near/473486972>*

Fixes rust-lang#130980.
But introduces rust-lang#131059 (breaks 2 cargo tests that relied upon some behavior related to `-Zon-broken-pipe=kill` being set).

r? `@onur-ozkan` (or bootstrap)
replace manual verbose checks with `Config::is_verbose`

self-explanatory
…iaskrgr

Rollup of 4 pull requests

Successful merges:

 - rust-lang#130895 (make type-check-4 asm tests about non-const expressions)
 - rust-lang#131057 (Reject leading unsafe in `cfg!(...)` and `--check-cfg`)
 - rust-lang#131060 (Drop conditionally applied cargo `-Zon-broken-pipe=kill` flags to fix stage 1 cargo rebuilds)
 - rust-lang#131061 (replace manual verbose checks with `Config::is_verbose`)

r? `@ghost`
`@rustbot` modify labels: rollup
…obzol

Bump `cc` and run `cargo update` for bootstrap

Bump `cc` to 1.1.22, which includes a caching fix. Also run `cargo update` which does a minor increment on a few dependencies.
Copy correct path to clipboard for modules/keywords/primitives

Fixes rust-lang#131021
…ope-lint, r=jieyouxu

Preserve brackets around if-lets and skip while-lets

r? `@jieyouxu`

Tracked by rust-lang#124085

Fresh out of rust-lang#129466, we have discovered 9 crates that the lint did not successfully migrate because the span of `if let` includes the surrounding brackets `(..)` like the following, which surprised me a bit.

```rust
if (if let .. { .. } else { .. }) {
// ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// the span somehow includes the surrounding brackets
}
```

There is one crate that failed the migration because some suggestion spans cross the macro expansion boundaries. Surely there is no way to patch them with `match` rewrite. To handle this case, we will instead require all spans to be tested for admissibility as suggestion spans.

Besides, there are 4 false negative cases discovered with desugared-`while let`. We don't need to lint them, because the `else` branch surely contains exactly one statement because the drop order is not changed whatsoever in this case.

```rust
while let Some(value) = droppy().get() {
..
}
// is desugared into
loop {
    if let Some(value) = droppy().get() {
        ..
    } else {
        break;
        // here can be nothing observable in this block
    }
}
```

I believe this is the one and only false positive that I have found. I think we have finally nailed all the corner cases this time.
…8179, r=compiler-errors

Fix `adt_const_params` leaking `{type error}` in error msg

Fixes the confusing diagnostic described in rust-lang#118179. (users would see `{type error}` in some situations, which is pretty weird)

`adt_const_params` tracking issue: rust-lang#95174
Improve `--print=check-cfg` documentation

This PR improves the `--print=check-cfg` documentation by:
 1. switching to a table for better readability
 2. adding a clear indication where the specific check-cfg syntax starts
 3. adding a link to the main `--check-cfg` documentation

`@rustbot` label +F-check-cfg
…int, r=Kobzol

enable compiler fingerprint logs in verbose mode

This provides very useful logs especially when debugging build cache-related stuff.
…iaskrgr

Rollup of 5 pull requests

Successful merges:

 - rust-lang#131023 (Copy correct path to clipboard for modules/keywords/primitives)
 - rust-lang#131035 (Preserve brackets around if-lets and skip while-lets)
 - rust-lang#131038 (Fix `adt_const_params` leaking `{type error}` in error msg)
 - rust-lang#131053 (Improve `--print=check-cfg` documentation)
 - rust-lang#131056 (enable compiler fingerprint logs in verbose mode)

r? `@ghost`
`@rustbot` modify labels: rollup
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.