Renovating CpuId #68

Open
JonathanWoollett-Light opened this issue Aug 19, 2022 · 0 comments

JonathanWoollett-Light commented Aug 19, 2022

At the moment the implementation of CpuId, and the FamStructWrapper and FamStruct underlying it, is not as simple as it could be. For interop with kvm_bindings I wrote a custom structure that is safe, has a memory layout identical to kvm_cpuid2, and acts as a zero-cost wrapper.

This structure could directly replace CpuId, or possibly a more generic variant of it could be used in place of FamStructWrapper and FamStruct.
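
For the sake of discussion, a very rough sketch of what such a generic variant might look like (the name RawFam and its design are hypothetical, not part of the code below; it assumes a header of a `u32` count followed by a `u32` pad/flags word, which matches `kvm_cpuid2` and `kvm_msrs` but not necessarily every FAM struct):

// Hypothetical generic variant (sketch only, names invented here): the entry type
// becomes a parameter, mirroring what FamStructWrapper parameterizes over today.
// Reuses the `Padding` helper defined in the code below.
#[repr(C)]
pub struct RawFam<Entry> {
    /// Number of entries.
    len: u32,
    /// Header padding (or flags, depending on the struct being mimicked).
    padding: Padding<{ size_of::<u32>() }>,
    /// Pointer to the entries.
    entries: NonNull<Entry>,
    _marker: PhantomData<Entry>,
}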

It would be good to get some feedback and discussion on which areas a change like this may affect.
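
For context, a hedged usage sketch of how this might slot into existing code (the `Kvm::get_supported_cpuid` call and the `KVM_MAX_CPUID_ENTRIES` constant from kvm-ioctls/kvm_bindings are assumptions here; `RawCpuid` is the type defined below):

// Usage sketch only: assumes kvm-ioctls' `Kvm::get_supported_cpuid` and
// kvm_bindings' `KVM_MAX_CPUID_ENTRIES`; `RawCpuid` as defined in the code below.
fn example() {
    let kvm = kvm_ioctls::Kvm::new().unwrap();
    let kvm_cpuid = kvm
        .get_supported_cpuid(kvm_bindings::KVM_MAX_CPUID_ENTRIES)
        .unwrap();
    // Wrap the KVM-provided table; the only copy is the clone inside
    // `From<kvm_bindings::CpuId>`.
    let raw_cpuid = RawCpuid::from(kvm_cpuid);
    // Leaf 0x1, sub-leaf 0x0 carries the basic feature bits in ecx/edx.
    if let Some(entry) = raw_cpuid.get(0x1, 0x0) {
        println!("{:x}", entry);
    }
    // Convert back when the table needs to go through e.g. `VcpuFd::set_cpuid2`.
    let _kvm_cpuid: kvm_bindings::CpuId = raw_cpuid.into();
}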

use std::alloc::Layout;
use std::fmt;
use std::marker::PhantomData;
use std::mem::{size_of, MaybeUninit};
use std::ops::{Deref, DerefMut};
use std::ptr::NonNull;

use serde::{Deserialize, Serialize};

/// A rusty mimic of
/// [`kvm_cpuid`](https://elixir.bootlin.com/linux/v5.10.129/source/arch/x86/include/uapi/asm/kvm.h#L226).
///
/// [`RawCpuid`] has an identical memory layout to
/// [`kvm_cpuid`](https://elixir.bootlin.com/linux/v5.10.129/source/arch/x86/include/uapi/asm/kvm.h#L226).
///
/// This allows [`RawCpuid`] to function as a simpler replacement for [`kvm_bindings::CpuId`]. In
/// the future it may replace [`kvm_bindings::CpuId`] fully.
///
/// For implementation details see <https://doc.rust-lang.org/nomicon/vec/vec.html>.
#[derive(Debug)]
#[repr(C)]
pub struct RawCpuid {
    /// Number of entries.
    nent: u32,
    /// Padding.
    padding: Padding<{ size_of::<u32>() }>,
    /// Pointer to entries.
    entries: NonNull<RawCpuidEntry>,
    _marker: PhantomData<RawCpuidEntry>,
}
// This implementation could be significantly more efficient.
impl Clone for RawCpuid {
    fn clone(&self) -> Self {
        let mut new_raw_cpuid = Self::new();
        new_raw_cpuid.resize(self.nent as usize);
        for i in 0..self.nent as usize {
            // Write into the freshly allocated (uninitialized) memory rather than
            // assigning through the slice, which would drop an uninitialized entry.
            unsafe {
                std::ptr::write(new_raw_cpuid.entries.as_ptr().add(i), self[i].clone());
            }
        }
        new_raw_cpuid
    }
}
impl serde::Serialize for RawCpuid {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        use serde::ser::SerializeSeq;
        let mut seq = serializer.serialize_seq(Some(self.nent as usize))?;
        for i in 0..self.nent as usize {
            seq.serialize_element(&self[i])?;
        }
        seq.end()
    }
}
struct RawCpuidVisitor;
impl<'de> serde::de::Visitor<'de> for RawCpuidVisitor {
    type Value = RawCpuid;

    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter.write_str("a sequence of RawCpuidEntry")
    }

    fn visit_seq<A>(self, mut seq: A) -> Result<Self::Value, A::Error>
    where
        A: serde::de::SeqAccess<'de>,
    {
        let mut entries = Vec::new();
        while let Some(next) = seq.next_element::<RawCpuidEntry>()? {
            entries.push(next);
        }
        Ok(Self::Value::from(entries))
    }
}
impl<'de> serde::Deserialize<'de> for RawCpuid {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: serde::Deserializer<'de>,
    {
        deserializer.deserialize_seq(RawCpuidVisitor)
    }
}
impl PartialEq for RawCpuid {
    fn eq(&self, other: &Self) -> bool {
        if self.nent == other.nent {
            for i in 0..self.nent as usize {
                if self[i] != other[i] {
                    return false;
                }
            }
            true
        } else {
            false
        }
    }
}
impl Eq for RawCpuid {}
unsafe impl Send for RawCpuid {}
unsafe impl Sync for RawCpuid {}
impl RawCpuid {
    /// Alias for [`RawCpuid::default()`].
    #[must_use]
    pub fn new() -> Self {
        Self::default()
    }
    /// Returns number of elements.
    #[must_use]
    pub fn nent(&self) -> u32 {
        self.nent
    }
    /// Returns an entry for a given leaf (function) and sub-leaf (index).
    ///
    /// Returns `None` if it is not present.
    #[must_use]
    pub fn get(&self, leaf: u32, sub_leaf: u32) -> Option<&RawCpuidEntry> {
        // TODO Would using binary search here for leaf offer much speedup?
        self.iter()
            .find(|entry| entry.function == leaf && entry.index == sub_leaf)
    }
    /// Resizes the allocated memory.
    #[allow(clippy::cast_ptr_alignment)]
    fn resize(&mut self, n: usize) {
        // alloc
        if self.nent == 0 && n > 0 {
            let new_layout = Layout::array::<RawCpuidEntry>(n).unwrap();

            // Ensure that the new allocation doesn't exceed `isize::MAX` bytes.
            assert!(
                isize::try_from(new_layout.size()).is_ok(),
                "Allocation too large"
            );

            let new_ptr = unsafe { std::alloc::alloc(new_layout) };
            self.entries = match NonNull::new(new_ptr.cast::<RawCpuidEntry>()) {
                Some(p) => p,
                None => std::alloc::handle_alloc_error(new_layout),
            };
        }
        // realloc
        else if self.nent > 0 && n > 0 {
            let new_layout = Layout::array::<RawCpuidEntry>(n).unwrap();

            // Ensure that the new allocation doesn't exceed `isize::MAX` bytes.
            assert!(
                isize::try_from(new_layout.size()).is_ok(),
                "Allocation too large"
            );

            let old_layout =
                Layout::array::<RawCpuidEntry>(usize::try_from(self.nent).unwrap()).unwrap();
            let old_ptr = self.entries.as_ptr().cast::<u8>();
            let new_ptr = unsafe { std::alloc::realloc(old_ptr, old_layout, new_layout.size()) };

            self.entries = match NonNull::new(new_ptr.cast::<RawCpuidEntry>()) {
                Some(p) => p,
                None => std::alloc::handle_alloc_error(new_layout),
            };
        }
        // dealloc
        else if self.nent > 0 && n == 0 {
            let old_layout =
                Layout::array::<RawCpuidEntry>(usize::try_from(self.nent).unwrap()).unwrap();
            let old_ptr = self.entries.as_ptr().cast::<u8>();
            unsafe { std::alloc::dealloc(old_ptr, old_layout) };
            self.entries = NonNull::dangling();
        }
        self.nent = u32::try_from(n).unwrap();
    }

    /// Pushes entry onto end.
    ///
    /// # Panics
    ///
    /// On allocation failure.
    pub fn push(&mut self, entry: RawCpuidEntry) {
        // `resize` updates `self.nent`, so record the old length first; the new
        // entry belongs at the old end of the allocation.
        let old_nent = usize::try_from(self.nent).unwrap();
        self.resize(old_nent + 1);
        unsafe { std::ptr::write(self.entries.as_ptr().add(old_nent), entry) }
    }
    /// Pops entry from end.
    ///
    /// # Panics
    ///
    /// On allocation failure.
    pub fn pop(&mut self) -> Option<RawCpuidEntry> {
        if self.nent > 0 {
            let u_nent = usize::try_from(self.nent).unwrap();
            // The last element lives at index `u_nent - 1`.
            let rtn = unsafe { Some(std::ptr::read(self.entries.as_ptr().add(u_nent - 1))) };
            self.resize(u_nent - 1);
            rtn
        } else {
            None
        }
    }
}
impl Default for RawCpuid {
    fn default() -> Self {
        Self {
            nent: 0,
            padding: Padding::default(),
            entries: NonNull::dangling(),
            _marker: PhantomData,
        }
    }
}

// We implement custom drop which drops all entries using `self.nent`
impl Drop for RawCpuid {
    fn drop(&mut self) {
        if self.nent != 0 {
            unsafe {
                std::alloc::dealloc(
                    self.entries.as_ptr().cast::<u8>(),
                    Layout::array::<RawCpuidEntry>(usize::try_from(self.nent).unwrap()).unwrap(),
                );
            }
        }
    }
}
impl Deref for RawCpuid {
    type Target = [RawCpuidEntry];
    fn deref(&self) -> &Self::Target {
        unsafe {
            std::slice::from_raw_parts(self.entries.as_ptr(), usize::try_from(self.nent).unwrap())
        }
    }
}
impl DerefMut for RawCpuid {
    fn deref_mut(&mut self) -> &mut Self::Target {
        unsafe {
            std::slice::from_raw_parts_mut(
                self.entries.as_ptr(),
                usize::try_from(self.nent).unwrap(),
            )
        }
    }
}
impl From<kvm_bindings::CpuId> for RawCpuid {
    fn from(value: kvm_bindings::CpuId) -> Self {
        // As we cannot acquire ownership of the underlying slice, we clone it. Passing
        // it through a boxed slice ensures length equals capacity, so the allocation
        // matches the `Layout::array::<RawCpuidEntry>(nent)` freed in `Drop`.
        let cloned = value.as_slice().to_vec();
        let (ptr, len, _cap) = vec_into_raw_parts(cloned.into_boxed_slice().into_vec());
        Self {
            nent: u32::try_from(len).unwrap(),
            padding: Padding::default(),
            entries: NonNull::new(ptr.cast::<RawCpuidEntry>()).unwrap(),
            _marker: PhantomData,
        }
    }
}
impl From<Vec<RawCpuidEntry>> for RawCpuid {
    fn from(vec: Vec<RawCpuidEntry>) -> Self {
        // Pass through a boxed slice so length equals capacity, matching the
        // `Layout::array::<RawCpuidEntry>(nent)` freed in `Drop`.
        let (ptr, len, _cap) = vec_into_raw_parts(vec.into_boxed_slice().into_vec());
        Self {
            nent: u32::try_from(len).unwrap(),
            padding: Padding::default(),
            entries: NonNull::new(ptr.cast::<RawCpuidEntry>()).unwrap(),
            _marker: PhantomData,
        }
    }
}
impl From<RawCpuid> for kvm_bindings::CpuId {
    fn from(this: RawCpuid) -> Self {
        let cpuid_slice = unsafe {
            std::slice::from_raw_parts(this.entries.as_ptr(), usize::try_from(this.nent).unwrap())
        };
        #[allow(clippy::transmute_ptr_to_ptr)]
        let kvm_bindings_slice = unsafe { std::mem::transmute(cpuid_slice) };
        kvm_bindings::CpuId::from_entries(kvm_bindings_slice).unwrap()
    }
}

/// Mimic of the currently unstable
/// [`Vec::into_raw_parts`](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.into_raw_parts).
fn vec_into_raw_parts<T>(v: Vec<T>) -> (*mut T, usize, usize) {
    let mut me = std::mem::ManuallyDrop::new(v);
    (me.as_mut_ptr(), me.len(), me.capacity())
}
/// A structure for owning unused memory for padding.
///
/// A wrapper around an uninitialized `N` element array of `u8`s (`MaybeUninit<[u8;N]>` constructed
/// with `Self(MaybeUninit::uninit())`).
#[derive(Debug, Clone)]
#[repr(C)]
pub struct Padding<const N: usize>(MaybeUninit<[u8; N]>);
impl<const N: usize> Default for Padding<N> {
    fn default() -> Self {
        Self(MaybeUninit::uninit())
    }
}
impl<const N: usize> serde::Serialize for Padding<N> {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        serializer.serialize_unit_struct("Padding")
    }
}
impl<'de, const N: usize> serde::Deserialize<'de> for Padding<N> {
    fn deserialize<D>(_deserializer: D) -> Result<Padding<N>, D::Error>
    where
        D: serde::Deserializer<'de>,
    {
        Ok(Padding(MaybeUninit::uninit()))
    }
}
impl<const N: usize> PartialEq for Padding<N> {
    fn eq(&self, _other: &Self) -> bool {
        true
    }
}
impl<const N: usize> Eq for Padding<N> {}

/// CPUID entry (a mimic of <https://elixir.bootlin.com/linux/v5.10.129/source/arch/x86/include/uapi/asm/kvm.h#L232>).
#[derive(Debug, Clone, Eq, PartialEq, Default, Serialize, Deserialize)]
#[repr(C)]
pub struct RawCpuidEntry {
    /// CPUID function (leaf).
    pub function: u32,
    /// CPUID index (subleaf).
    pub index: u32,
    /// TODO
    pub flags: u32,
    /// EAX register.
    pub eax: u32,
    /// EBX register.
    pub ebx: u32,
    /// ECX register.
    pub ecx: u32,
    /// EDX register.
    pub edx: u32,
    /// CPUID entry padding.
    pub padding: Padding<{ size_of::<[u32; 3]>() }>,
}
impl From<RawCpuidEntry> for (u32, u32, u32, u32) {
    fn from(this: RawCpuidEntry) -> Self {
        (this.eax, this.ebx, this.ecx, this.edx)
    }
}
impl fmt::LowerHex for RawCpuidEntry {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("RawCpuidEntry")
            .field("function", &format!("{:x}", self.function))
            .field("index", &format!("{:x}", self.index))
            .field("eax", &format!("{:x}", self.eax))
            .field("ebx", &format!("{:x}", self.ebx))
            .field("ecx", &format!("{:x}", self.ecx))
            .field("edx", &format!("{:x}", self.edx))
            .finish()
    }
}

As currently used, FamStruct is not a zero-cost abstraction (in this case the size of CpuId is larger than necessary). This change would make it zero-cost.
I believe this is a plain improvement: it simplifies the code, improves readability (in large part due to the simplification), and improves performance (by decreasing memory usage). However, the side effects of this change are not yet well understood and require further discussion.
I would advocate for the deprecation of FamStruct moving forward.
There are implementations we could use that would avoid making this a breaking change. I think it could be implemented without breaking anything, although being sure of that would require actually implementing it.
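
To make the zero-cost claim easy to check, here is a hedged layout-test sketch (it assumes the `RawCpuid`/`RawCpuidEntry` types above and the bindgen-generated `kvm_bindings::kvm_cpuid_entry2`; the module and test names are made up for illustration):

// Layout-check sketch (assumption: kvm_bindings re-exports the bindgen-generated
// kvm_cpuid_entry2). The entry type must match in size and alignment for the slice
// transmute in `From<RawCpuid> for kvm_bindings::CpuId` to be sound.
#[cfg(test)]
mod layout_tests {
    use super::*;
    use std::mem::{align_of, size_of};

    #[test]
    fn entry_layout_matches_kvm_bindings() {
        assert_eq!(
            size_of::<RawCpuidEntry>(),
            size_of::<kvm_bindings::kvm_cpuid_entry2>()
        );
        assert_eq!(
            align_of::<RawCpuidEntry>(),
            align_of::<kvm_bindings::kvm_cpuid_entry2>()
        );
    }
}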
