Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for sendmmsg/recvmmsg #494

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

Tuetuopay
Copy link

This PR aims to add support for the sendmmsg/recvmmsg syscalls.

The API exposed here is a bit simplified, where the "user friendly" functions don't expose the scatter/gather stuff (the _vectored variants found for recvmsg/sendmsg). This was not exposed to not blow up the API surface, while keeping the functionality using MmsgHdr(Mut) for advanced usages.

Thanks!

FreeBSD use a `size_t` for the `len` argument, while other UNIX-likes
use an `unigned int`.
From a quick glance, Windows does not have the equivalent syscall in the
WinSock API.
This one is great. The whole unix world uses an int for the flags, and
documents it as such. But musl uses an unsigned int there [1], while
still documenting a signed int.

This API is cursed.

[1]: https://git.musl-libc.org/cgit/musl/tree/include/sys/socket.h?id=39838619bb8b65a8897abcfda8c17ad6de0115d8#n70
For some braindead reason, old versions of android on 32 bit  defined
socklen_t to be an int, instead of the unsigned int it was everywhere
else.
Copy link
Collaborator

@Thomasdezeeuw Thomasdezeeuw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tried using &[MsgHdr] and &mut [MshHdrMut] instead of new types (MmsgHdr and MmsgHdrMut)?

src/sys/unix.rs Outdated Show resolved Hide resolved
@Tuetuopay
Copy link
Author

Have you tried using &[MsgHdr] and &mut [MshHdrMut] instead of new types (MmsgHdr and MmsgHdrMut)?

Sadly yes, but this is not possible because mmsghdr interleaves the buffer size inbetween msghdrs. So it's basically a &[(MsgHdr, usize)] that's required, hence the newtype. And the kernel has no struct-of-array constructs so there's no escape hatch there neither.

target_vendor = "apple"
))
))]
pub struct MmsgHdr<'addr, 'bufs, 'control> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about matching mmsghdr here? For example

pub struct MmsgHdr<'addr, 'bufs, 'control> {
    msg: MsgHdr,
    len: libc::c_uint,
}

This way the caller can decide if they want to use a Vec, a slice an array or something else.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Being able to avoid the allocation which the vec implies in the hot I/O loop when the batch size is const would be nice for sure.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see two options:

  • go with your suggestion, and let callers provide their own slices of MmsgHdr. This works for the "lower level" Socket::sendmmsg function, but not for Socket::send_multiple_to as it needs some memory layout adaptation
  • const generics may be an option? though we may still want to defer to the caller for that

@xv-ian-c
Copy link

xv-ian-c commented Sep 4, 2024

@Tuetuopay are you still interested in pursuing this functionality?

}

/// Returns the flags of the message.
pub fn flags(&self, n: usize) -> Vec<RecvFlags> {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think accessors are also needed for len and control_len fields in order to be able to usefully consume the result of the syscall.

Rather than the 3 vec allocations it might also be nice to have a method to return all 3 at once so we'd at least have only 1, or to avoid it completely with an iterator based API.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

diff --git a/src/lib.rs b/src/lib.rs
index 6c49cd1..51ebcc2 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -923,7 +923,43 @@ impl<'addr, 'bufs, 'control> MmsgHdrMut<'addr, 'bufs, 'control> {
         self
     }
 
-    /// Returns the flags of the message.
+    /// Returns the lengths, flags and control lengths of the messages.
+    pub fn results(&self, n: usize) -> Vec<(usize, RecvFlags, usize)> {
+        self.inner
+            .iter()
+            .take(n)
+            .map(|msg| {
+                (
+                    msg.msg_len as _,
+                    sys::msghdr_flags(&msg.msg_hdr),
+                    sys::msghdr_control_len(&msg.msg_hdr),
+                )
+            })
+            .collect()
+    }
+
+    /// Extends the vec with the length, flags and control lengths of the messages.
+    /// This avoids the need to allocate a new vec on each use which affects `results`.
+    pub fn extend_with_results(&self, v: &mut Vec<(usize, RecvFlags, usize)>, n: usize) {
+        v.extend(self.inner.iter().take(n).map(|msg| {
+            (
+                msg.msg_len as _,
+                sys::msghdr_flags(&msg.msg_hdr),
+                sys::msghdr_control_len(&msg.msg_hdr),
+            )
+        }));
+    }
+
+    /// Returns the lengths of the messages.
+    pub fn lens(&self, n: usize) -> Vec<usize> {
+        self.inner
+            .iter()
+            .take(n)
+            .map(|msg| msg.msg_len as _)
+            .collect()
+    }
+
+    /// Returns the flags of the messages.
     pub fn flags(&self, n: usize) -> Vec<RecvFlags> {
         self.inner
             .iter()
@@ -931,6 +967,15 @@ impl<'addr, 'bufs, 'control> MmsgHdrMut<'addr, 'bufs, 'control> {
             .map(|msg| sys::msghdr_flags(&msg.msg_hdr))
             .collect()
     }
+
+    /// Returns the control lengths of the messages.
+    pub fn control_lens(&self, n: usize) -> Vec<usize> {
+        self.inner
+            .iter()
+            .take(n)
+            .map(|msg| sys::msghdr_control_len(&msg.msg_hdr))
+            .collect()
+    }
 }
 
 #[cfg(all(

Is what I ended up with.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I quite like the idea of an iterator-based API. This lets the user choose if they want a Vec.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not too bad for results, lens and control_lens since they are Copy but for recvmmsg getting back mutable access to the received buffer is a bit of a lifetime quagmire.

You basically setup the largest number of buffers you want to receive at once, which is fine if you use them all, but often you'll receive some smaller number.

The overhead of repopulating from scratch on each iteration is noticeable enough that it leads to populating a smaller number of potential receives than you otherwise would want to.

Ideally you'd want to repopulate the buffers which did get used but reuse the ones which didn't. I've absolutely failed to make that work with the lifetimes involved, since the MmsgHdrs end up holding a mutable reference into whatever owns the destination buffers1, which means one needs to drop the mmsghdr which you might otherwise be able to partially reuse.

AFAICT the same lifetime issue exists whether the collection in MsgHdrMut is internal (as in the Vec here) or external (as in a slice of the MsgHdrMut proposed in #494 (comment)).

Is there some technique where the lifetime of the (used) head can be split from the (unused) tail?

Footnotes

  1. In my case the buffers are owned by an array of BytesMut, but it could be a Vec<Vec<u8>> or whatever.

@Tuetuopay
Copy link
Author

@Tuetuopay are you still interested in pursuing this functionality?

@xv-ian-c Hi! Thanks for the reminder. I completely forgot about this. Yes I still mean to get this through, whenever I get time to work on this :)

Copy link
Collaborator

@Thomasdezeeuw Thomasdezeeuw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not going to accept all the usage of Vectors here, we can do this without allocations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants