Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor from 0.7.1 to 0.10.0 causing unrecoverable crash #36

Closed
mattiZed opened this issue Apr 24, 2024 · 11 comments
Closed

Refactor from 0.7.1 to 0.10.0 causing unrecoverable crash #36

mattiZed opened this issue Apr 24, 2024 · 11 comments

Comments

@mattiZed
Copy link
Contributor

mattiZed commented Apr 24, 2024

Hi!

First of, thanks a lot for your work on this crate!

I am experiencing a bug when using georust/polyline through a python wrapper called pypolyline.

If that's okay, I will link to the issue that I filed in their repository which contains the necessary information.

To repeat and summarize some of the information I gave on that other issue:

Any help very much appreciated.

Let me know if you need more information.

Best,
Stefan

@urschrei
Copy link
Member

@purew I believe the perf changes you made have this side effect. Can you take a look?

@urschrei
Copy link
Member

The checked_shl() method is available, but it may have perf implications.

@mattiZed
Copy link
Contributor Author

mattiZed commented Apr 24, 2024

The checked_shl() method is available, but it may have perf implications.

Hm, I don't quite understand how that could help here, can you elaborate?

IMHO a check comparing index against the length of chars is necessary before doing chars[index] to prevent this bug from happening - this should be possible using a simple condition and reacting with some kind of error when it is triggered. Or maybe just constrain the loop with an upper bound/switch to a while loop (as done in the function wrapping trans) so it may not run indefinitely. This should hopefully only have a negligible performance impact.

@urschrei
Copy link
Member

Hm, I don't quite understand how that could help here, can you elaborate?

Sure.

fn trans(chars: &[u8], mut index: usize) -> Result<(i64, usize), String> {
    let mut at_index;
    let mut shift = 0;
    let mut result = 0;
    let mut byte;
    loop {
        at_index = chars[index];
        byte = (at_index as u64).saturating_sub(63);
        index += 1;
        let to_add = (byte & 0x1f)
            .checked_shl(shift)
            .ok_or(format!("Couldn't decode character at index {}", index - 1))?;
        result |= to_add;
        shift += 5;
        if byte < 0x20 {
            break;
        }
    }
    let coordinate_change = if (result & 1) > 0 {
        !(result >> 1)
    } else {
        result >> 1
    } as i64;
    Ok((coordinate_change, index))
}

However, the performance impact of this change is huge, so it's not feasible:

    Finished bench [optimized] target(s) in 0.12s
     Running benches/benchmarks.rs (target/release/deps/benchmarks-797a7b4ddb4d36ce)
bench encode: 1000 coordinates
                        time:   [77.693 µs 78.226 µs 78.783 µs]
                        change: [-0.0171% +0.2637% +0.6474%] (p = 0.13 > 0.05)
                        No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
  2 (2.00%) high mild
  9 (9.00%) high severe

bench decode: 21502 coordinates
                        time:   [11.087 ms 11.100 ms 11.114 ms]
                        change: [+5942.8% +5967.2% +5988.8%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

bench polyline6 decoding
                        time:   [1.4322 µs 1.4341 µs 1.4361 µs]
                        change: [+2978.0% +2987.4% +2996.4%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
  8 (8.00%) high mild
  5 (5.00%) high severe

bench HUGE polyline6 decoding
                        time:   [1.2993 ms 1.3056 ms 1.3128 ms]
                        change: [+5460.0% +5504.3% +5549.9%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

IMHO a check comparing index against the length of chars is necessary before doing chars[index] to prevent this bug from happening

Please post some code demonstrating your idea so we can test and benchmark it!

@mattiZed
Copy link
Contributor Author

mattiZed commented Apr 24, 2024

Oh wow, that's quite the impact. Okay, I think I understand a bit better now. I'm not totally up to speed has to how the polyline encoding/decoding works, I will probably do some research on it.

I was thinking something like this - it's basically the current code but the loop {} has been changed to a while {}. What do you think?

I don't have rust environment set up currently, so I wrote this out of thin air:

fn trans(chars: &[u8], mut index: usize) -> Result<(i64, usize), String> {
    let mut at_index;
    let mut shift = 0;
    let mut result = 0;
    let mut byte;
    let mut valid = False;
    while index < chars.len() {
        at_index = chars[index];
        byte = (at_index as u64).saturating_sub(63);
        index += 1;
        result |= (byte & 0x1f) << shift;
        shift += 5;
        if byte < 0x20 {
            valid = True;
            break;
        }
    }

    if not valid {
      // this would mean that we have terminated above loop
      // but not because of the original condition on "byte < 0x20",
      // so this should be treated as an error
      Err( ? )
    }

    let coordinate_change = if (result & 1) > 0 {
        !(result >> 1)
    } else {
        result >> 1
    } as i64;
    Ok((coordinate_change, index))
}

Edit: slightly refactored.

@urschrei
Copy link
Member

That doesn't catch the error, unfortunately:

fn trans(chars: &[u8], mut index: usize) -> Result<(i64, usize), String> {
    let mut at_index;
    let mut shift = 0;
    let mut result = 0;
    let mut byte;
    let mut valid = false;
    while index < chars.len() {
        at_index = chars[index];
        byte = (at_index as u64).saturating_sub(63);
        index += 1;
        result |= (byte & 0x1f) << shift;
        shift += 5;
        if byte < 0x20 {
            valid = true;
            break;
        }
    }

    if !valid {
        // this would mean that we have terminated above loop
        // but not because of the original condition on "byte < 0x20",
        // so this should be treated as an error
        return Err("Couldn't shift".to_string());
    }

    let coordinate_change = if (result & 1) > 0 {
        !(result >> 1)
    } else {
        result >> 1
    } as i64;
    Ok((coordinate_change, index))
}

The problem is this line: result |= (byte & 0x1f) << shift;

@mattiZed
Copy link
Contributor Author

mattiZed commented Apr 24, 2024

While I'm aware that this does not fix the actual error per-se, it should at least handle the binary crashing violently, right? I understand that it would be desirable to also know the exact character that is invalid though.

Currently, the main thing thats bugging me is that certain input to the pypolyline.cutil.decode_polyline function causes the rust binary to crash which then also kills the python interpreter, so there is no way of handling this problem in some way.

This is not much different than the "solution" I proposed earlier, but might further clarify the situation:

fn trans(chars: &[u8], mut index: usize) -> Result<(i64, usize), String> {
    let mut at_index;
    let mut shift = 0;
    let mut result = 0;
    let mut byte;
    loop {
        if index >= chars.len() {
           Err("Cannot decode polyline".to_string())
        }
        at_index = chars[index];
        byte = (at_index as u64).saturating_sub(63);
        index += 1;
        result |= (byte & 0x1f) << shift;
        shift += 5;
        if byte < 0x20 {
            break;
        }
    }

    let coordinate_change = if (result & 1) > 0 {
        !(result >> 1)
    } else {
        result >> 1
    } as i64;
    Ok((coordinate_change, index))
}

@urschrei
Copy link
Member

Both of your solutions work with release optimisations, and have negligible perf impact. But they don't work with test optimisations (running cargo test) presumably due to overflow checks, so we need to figure out a way around that.

There's no need to worry about the pypolyline crash – once we're able to correctly return a Result from trans the error will bubble up to the wrapper and pypolyline will do the right thing.

@mattiZed
Copy link
Contributor Author

Ah, okay! So if I read you correctly, cargo detects a possible overflow on byte due to the repeated ORing? Then I also understand better why you'd rather fix the root of the problem. I'll try to think of a solution as well.

@mattiZed
Copy link
Contributor Author

mattiZed commented Apr 24, 2024

What about something like this?

fn trans(chars: &[u8], mut index: usize) -> Result<(i64, usize), String> {
    let mut at_index;
    let mut shift = 0;
    let mut result = 0;
    let mut byte;
    loop {
        at_index = chars[index];
        byte = (at_index as u64).saturating_sub(63);
        index += 1;
        // Protect against overflow by checking if any bits would be moved out of the u64
        if ((byte & 0x1f) >> (64 - shift)) != 0 {
            Err(format!("Couldn't decode character at index {}", index - 1))
        }
        result |= (byte & 0x1f) << shift;
        shift += 5;
        if byte < 0x20 {
            break;
        }
    }

    let coordinate_change = if (result & 1) > 0 {
        !(result >> 1)
    } else {
        result >> 1
    } as i64;
    Ok((coordinate_change, index))
}

Edit:

I haven't fully read up on the logic of polyline decoding, but probably we just need to check if (64 - shift) > 5, e.g. if another shift by 5 is still possible without overflow?

loop {
    at_index = chars[index];
    byte = (at_index as u64).saturating_sub(63);
    result |= (byte & 0x1f) << shift;

    if byte < 0x20 {
        break;
    }

    index += 1;
    shift += 5;

    if shift > (64 - 5) {
        Err(format!("Couldn't decode character at index {}", index - 1))
    }
}

I have prepared a PR: #37

@urschrei
Copy link
Member

This has been fairly comprehensively addressed by #42, #43, #45, #47.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants