
[WIP] Add implementation to forward linear iterators in strided views #2797

Open · spectre-ns wants to merge 3 commits into master from strided-view-linear-iterator-forwarding
Conversation

spectre-ns (Contributor)

Checklist

  • The title and commit message(s) are descriptive.
  • Small commits made to fix your PR have been squashed to avoid history pollution.
  • Tests have been added for new features or bug fixes.
  • The API of new functions and classes is documented.

Description

@spectre-ns (Contributor, Author) commented:

@JohanMabille I'm trying to forward linear iterators where possible, but I think the implementations expect steppers in certain places, and there are some subtle differences I haven't figured out yet. I thought it was weird that the linear methods on the strided view actually return xiterators of steppers rather than linear iterators, as the interface would suggest.

Test bench from issue #2734

Without changes:
case 1: 10.6235ms
case 2a: 48.4154ms
case 2b: 8.93694ms
case 2c: 48.7691ms
case 3a: 47.3316ms
case 3b: 9.05212ms
case 3c: 47.3312ms

With changes:
case 1: 10.0266ms
case 2a: 9.23594ms
case 2b: 8.97092ms
case 2c: 9.09802ms
case 3a: 8.86476ms
case 3b: 9.55695ms
case 3c: 8.67326ms

The implementation is still rough, but I wanted to show you the basic idea. I ran into a pesky issue with reshaped views that forced me to fall back to the old implementation. I might need some help there, but the performance improvements speak for themselves.
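(For illustration only, here is a minimal, self-contained sketch of the forwarding idea described above. It is not the actual patch; it uses a hypothetical contiguous_view class over a std::vector. When the inner stride is one, the view simply hands out the underlying container's iterators instead of computing positions through a stepper.)

#include <algorithm>
#include <cstddef>
#include <iostream>
#include <numeric>
#include <vector>

// Hypothetical, simplified view over a contiguous container. When the inner
// stride is one, adjacent view elements are adjacent in memory, so the view
// can forward the container's own (linear) iterators instead of stepping
// through multi-dimensional index arithmetic.
template <class Container>
class contiguous_view
{
public:
    contiguous_view(Container& data, std::size_t offset, std::size_t size, std::ptrdiff_t stride)
        : m_data(data), m_offset(offset), m_size(size), m_stride(stride)
    {
    }

    bool is_linearly_iterable() const
    {
        return m_stride == 1;
    }

    auto linear_begin()
    {
        return m_data.begin() + static_cast<std::ptrdiff_t>(m_offset);
    }

    auto linear_end()
    {
        return linear_begin() + static_cast<std::ptrdiff_t>(m_size);
    }

private:
    Container& m_data;
    std::size_t m_offset;
    std::size_t m_size;
    std::ptrdiff_t m_stride;
};

int main()
{
    std::vector<double> storage(10);
    std::iota(storage.begin(), storage.end(), 0.0);

    contiguous_view<std::vector<double>> view(storage, 2, 5, 1);
    if (view.is_linearly_iterable())
    {
        // Assignment runs over the forwarded linear iterators (elements 2..6).
        std::fill(view.linear_begin(), view.linear_end(), -1.0);
    }
    for (double v : storage)
    {
        std::cout << v << ' ';
    }
    std::cout << '\n';
}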

@spectre-ns (Contributor, Author) commented on Jul 12, 2024:

Can a linear assignment be used when the inner stride is not one? It seems like the linear iterators increment by one, but when converting from row-major to column-major the strides between adjacent elements are non-unity.

spectre-ns force-pushed the strided-view-linear-iterator-forwarding branch from bdde934 to b4f7e3d on July 12, 2024 at 23:52
@spectre-ns (Contributor, Author) commented:

> Can a linear assignment be used when the inner stride is not one? It seems like the linear iterators increment by one, but when converting from row-major to column-major the strides between adjacent elements are non-unity.

The issue here was trying to iterate over the object in column-major order when the internal storage is row-major. That requires a stepper. In these cases the implementation falls back to a stepper even though the underlying expression might be linearly iterable.
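(A small standalone example of why the fallback is needed: visiting a row-major buffer in column-major order touches memory at non-unit strides, so a plain unit-stride linear iterator cannot express the traversal.)

#include <array>
#include <cstddef>
#include <iostream>

int main()
{
    constexpr std::size_t rows = 2;
    constexpr std::size_t cols = 3;
    std::array<int, rows * cols> data = {0, 1, 2, 3, 4, 5};  // row-major storage

    std::cout << "column-major element order from row-major storage: ";
    for (std::size_t j = 0; j < cols; ++j)
    {
        for (std::size_t i = 0; i < rows; ++i)
        {
            // Row-major offset of element (i, j): the step between consecutive
            // visits is the column stride (3 here), not 1.
            std::size_t offset = i * cols + j;
            std::cout << data[offset] << ' ';  // prints 0 3 1 4 2 5
        }
    }
    std::cout << '\n';
}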

spectre-ns force-pushed the strided-view-linear-iterator-forwarding branch from b4f7e3d to 2d268ad on July 12, 2024 at 23:55
spectre-ns force-pushed the strided-view-linear-iterator-forwarding branch from 2d268ad to 7935a69 on July 13, 2024 at 00:15
@spectre-ns (Contributor, Author) commented:

See benchmark code below:

#include <chrono>
#include <iostream>

#include <xtensor/xbuilder.hpp>       // xt::zeros
#include <xtensor/xnoalias.hpp>
#include <xtensor/xrandom.hpp>
#include <xtensor/xstrided_view.hpp>  // xt::strided_view
#include <xtensor/xtensor.hpp>

// Convert the accumulated duration into an average per-iteration time in milliseconds.
double mean_milliseconds_from_total(std::chrono::nanoseconds total,
    size_t num_repeats) {
    std::chrono::duration<double, std::milli> total_ms = total;
    return total_ms.count() / static_cast<double>(num_repeats);
}

int main() {
    size_t num_repeats = 100;
    xt::xtensor<double, 1> a = xt::random::rand<double>({ 10000000 });
    xt::xtensor<double, 1> b = xt::random::rand<double>({ 10000000 });
    xt::xtensor<double, 1> c = xt::zeros<double>({ 10000000 });

    // case 1: full tensor
    auto started = std::chrono::high_resolution_clock::now();
    for (size_t i = 0; i < num_repeats; ++i)
        xt::noalias(c) = a + b;
    auto finished = std::chrono::high_resolution_clock::now();
    std::cout << "case 1:  "
        << mean_milliseconds_from_total(finished - started, num_repeats)
        << "ms" << std::endl;

    // case 2a: view of tensor and expression with xt::all()
    started = std::chrono::high_resolution_clock::now();
    for (size_t i = 0; i < num_repeats; ++i)
        xt::noalias(xt::strided_view(c, { xt::all() })) = xt::strided_view(a + b, { xt::all() });
    finished = std::chrono::high_resolution_clock::now();
    std::cout << "case 2a: "
        << mean_milliseconds_from_total(finished - started, num_repeats)
        << "ms" << std::endl;

    // case 2b: view of only tensor with xt::all()
    started = std::chrono::high_resolution_clock::now();
    for (size_t i = 0; i < num_repeats; ++i)
        xt::noalias(xt::strided_view(c, { xt::all() })) = a + b;
    finished = std::chrono::high_resolution_clock::now();
    std::cout << "case 2b: "
        << mean_milliseconds_from_total(finished - started, num_repeats)
        << "ms" << std::endl;

    // case 2c: view of only expression with xt::all()
    started = std::chrono::high_resolution_clock::now();
    for (size_t i = 0; i < num_repeats; ++i)
        xt::noalias(c) = xt::strided_view(a + b, { xt::all() });
    finished = std::chrono::high_resolution_clock::now();
    std::cout << "case 2c: "
        << mean_milliseconds_from_total(finished - started, num_repeats)
        << "ms" << std::endl;

    // case 3a: view of tensor and expression with xt::range()
    started = std::chrono::high_resolution_clock::now();
    for (size_t i = 0; i < num_repeats; ++i)
        xt::noalias(xt::strided_view(c, { xt::range(0, c.size()) })) =
        xt::strided_view(a + b, { xt::range(0, c.size()) });
    finished = std::chrono::high_resolution_clock::now();
    std::cout << "case 3a: "
        << mean_milliseconds_from_total(finished - started, num_repeats)
        << "ms" << std::endl;

    // case 3b: view of only tensor with xt::range()
    started = std::chrono::high_resolution_clock::now();
    for (size_t i = 0; i < num_repeats; ++i)
        xt::noalias(xt::strided_view(c, { xt::range(0, c.size()) })) = a + b;
    finished = std::chrono::high_resolution_clock::now();
    std::cout << "case 3b: "
        << mean_milliseconds_from_total(finished - started, num_repeats)
        << "ms" << std::endl;

    // case 3c: view of only expression with xt::range()
    started = std::chrono::high_resolution_clock::now();
    for (size_t i = 0; i < num_repeats; ++i)
        xt::noalias(c) = xt::strided_view(a + b, { xt::range(0, c.size()) });
    finished = std::chrono::high_resolution_clock::now();
    std::cout << "case 3c: "
        << mean_milliseconds_from_total(finished - started, num_repeats)
        << "ms" << std::endl;
    return 0;
}

@JohanMabille (Member) left a comment:

Thanks a lot for tackling this! I think the linear / non-linear iteration was supposed to be handled in the flat_storage_adaptor, but we probably missed some cases.

I wonder if it would make sense to merge the new linear_flat_expression_adaptor with the existing flat_expression_adaptor. This would put the responsibility of choosing between the linear iterator and the regular one in the flat adaptor rather than in the strided_view. It would probably also benefit other classes using the flat_expression_adaptor.
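(A rough sketch of that alternative, using simplified stand-in classes rather than the real xtensor adaptors: a hypothetical has_linear_iterator trait lets a single flat adaptor pick between the linear iterator and the regular one at compile time, so callers such as strided_view would not have to choose.)

#include <iostream>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical trait: does E expose a linear_begin() method?
template <class E, class = void>
struct has_linear_iterator : std::false_type
{
};

template <class E>
struct has_linear_iterator<E, std::void_t<decltype(std::declval<E&>().linear_begin())>>
    : std::true_type
{
};

// Simplified stand-in for a merged flat adaptor: it decides by itself whether
// to forward the fast linear iterator or fall back to the regular one.
template <class E>
class flat_adaptor
{
public:
    explicit flat_adaptor(E& e)
        : m_e(e)
    {
    }

    auto begin()
    {
        if constexpr (has_linear_iterator<E>::value)
        {
            return m_e.linear_begin();  // contiguous fast path
        }
        else
        {
            return m_e.begin();  // regular (stepper-backed) fallback
        }
    }

private:
    E& m_e;
};

// Two toy expressions: one with a linear interface, one without.
struct with_linear
{
    int v[3] = {1, 2, 3};
    int* linear_begin()
    {
        return v;
    }
};

struct without_linear
{
    std::vector<int> v = {4, 5, 6};
    auto begin()
    {
        return v.begin();
    }
};

int main()
{
    with_linear a;
    without_linear b;
    flat_adaptor<with_linear> fa(a);
    flat_adaptor<without_linear> fb(b);
    std::cout << *fa.begin() << ' ' << *fb.begin() << '\n';  // 1 4
}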

include/xtensor/xstrided_view.hpp (outdated review comments, resolved)
include/xtensor/xstrided_view.hpp (outdated review comments, resolved)
include/xtensor/xstrided_view_base.hpp (outdated review comments, resolved)
@@ -75,6 +76,52 @@ namespace xt
size_type m_size;
};

template <class CT, layout_type L>
class linear_flat_expression_adaptor : public flat_expression_adaptor<CT, L>
@JohanMabille (Member) commented:

Why inherit from flat_expression_adaptor here? It does not seem that you reuse anything from it.

@spectre-ns (Contributor, Author) replied:

@JohanMabille I only implemented the linear iterator methods in the new class; linear_flat_expression_adaptor relies on inheriting the implementation of flat_expression_adaptor for calls to methods such as cbegin.
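(A self-contained toy version of that structure, with simplified stand-ins for flat_expression_adaptor and linear_flat_expression_adaptor: the derived adaptor adds only the linear entry points and inherits everything else, such as cbegin and size, from the base.)

#include <cstddef>
#include <iostream>
#include <vector>

// Simplified stand-in for flat_expression_adaptor: provides the regular interface.
template <class Storage>
class base_adaptor
{
public:
    explicit base_adaptor(Storage& s)
        : m_storage(s)
    {
    }

    std::size_t size() const
    {
        return m_storage.size();
    }

    auto cbegin() const
    {
        return m_storage.cbegin();
    }

    auto cend() const
    {
        return m_storage.cend();
    }

protected:
    Storage& storage()
    {
        return m_storage;
    }

private:
    Storage& m_storage;
};

// Simplified stand-in for linear_flat_expression_adaptor: adds only the linear
// entry points; size(), cbegin(), cend(), ... are inherited from the base.
template <class Storage>
class linear_adaptor : public base_adaptor<Storage>
{
public:
    using base_type = base_adaptor<Storage>;
    using base_type::base_type;  // reuse the base constructors

    auto linear_begin()
    {
        return this->storage().begin();
    }

    auto linear_end()
    {
        return this->storage().end();
    }
};

int main()
{
    std::vector<double> v = {1.0, 2.0, 3.0};
    linear_adaptor<std::vector<double>> adaptor(v);
    std::cout << adaptor.size() << ' ' << *adaptor.linear_begin() << '\n';  // 3 1
}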

@spectre-ns (Contributor, Author) replied:

@JohanMabille what do you think? Should I implement this differently?

@spectre-ns (Contributor, Author) commented:

> Thanks a lot for tackling this! I think the linear / non-linear iteration was supposed to be handled in the flat_storage_adaptor, but we probably missed some cases.
>
> I wonder if it would make sense to merge the new linear_flat_expression_adaptor with the existing flat_expression_adaptor. This would put the responsibility of choosing between the linear iterator and the regular one in the flat adaptor rather than in the strided_view. It would probably also benefit other classes using the flat_expression_adaptor.

@JohanMabille I debated that as well. Let me play around with the logic and see if I can make this work without modifying the strided_view class, instead using the flat_expression_adaptor implementation. That would likely be cleaner. I was hoping to demonstrate the kind of performance improvement possible before settling on a final solution.

spectre-ns force-pushed the strided-view-linear-iterator-forwarding branch from 9f834af to 7935a69 on July 21, 2024 at 01:15
@spectre-ns (Contributor, Author) commented on Jul 21, 2024:

> Thanks a lot for tackling this! I think the linear / non-linear iteration was supposed to be handled in the flat_storage_adaptor, but we probably missed some cases.
>
> I wonder if it would make sense to merge the new linear_flat_expression_adaptor with the existing flat_expression_adaptor. This would put the responsibility of choosing between the linear iterator and the regular one in the flat adaptor rather than in the strided_view. It would probably also benefit other classes using the flat_expression_adaptor.

@JohanMabille combining them turned out to be difficult because the uvector implementation doesn't expose the linear_iterator interface. Its begin and end methods are effectively linear iterators, but the interface is not consistent. I'm not sure whether this is intentional.
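(A toy illustration of the inconsistency, using a uvector-like stand-in rather than the real class: the flat storage's begin()/end() already behave like linear iterators, but because there is no linear_begin()/linear_end(), generic code written against the linear-iterator interface cannot use it directly.)

#include <algorithm>
#include <vector>

// uvector-like stand-in (not the real class): begin()/end() already behave
// like linear iterators, but there is no linear_begin()/linear_end().
struct flat_storage
{
    std::vector<double> data;

    double* begin()
    {
        return data.data();
    }

    double* end()
    {
        return data.data() + data.size();
    }
    // no linear_begin() / linear_end()
};

template <class C>
void assign_from(C& dst, const std::vector<double>& src)
{
    // Written against the linear-iterator interface, this would not compile
    // for flat_storage:
    //     std::copy(src.begin(), src.end(), dst.linear_begin());
    // Only the plain interface works, even though it is effectively linear:
    std::copy(src.begin(), src.end(), dst.begin());
}

int main()
{
    flat_storage fs;
    fs.data.resize(3);
    std::vector<double> src = {1.0, 2.0, 3.0};
    assign_from(fs, src);
}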
