CGlue 0.3 Future and Beyond

Published: Aug. 1, 2024, 3:43 p.m.

CGlue 0.3 now supports Future, Stream, and Sink! They are not only safe, but also relatively fast. All this took around a year to build, and I can now explain the key pieces behind the async support.

Recap

CGlue is an ABI safety solution that focuses on expressing dynamic trait objects in a stable manner. Adding the #[cglue_trait] attribute to a trait triggers CGlue's code generator to spit out C-ABI wrappers. These wrappers can then be used to build plugin systems, and with cglue-bindgen it is possible to generate C or C++ code for calling or implementing said plugins from foreign languages.

Associated types

To express Future under CGlue, we must first be able to express associated types under CGlue. This turned out to be no easy feat.

I will not bore you with the details, which took 6 months to work out, but essentially we have to add an extra generic parameter to the CGlue vtable for every associated type.

Effectively we get the following:

#[cglue_trait]
pub trait WithAssoc<T> {
	type AssocTy: Clone;

	fn with_assoc(&self, assoc: &Self::AssocTy) -> T;
}

// #[cglue_trait] spits out the following (albeit more complex)

#[repr(C)]
pub struct WithAssocVtbl<
	'cglue_a,
	CGlueC: 'cglue_a + crate::trait_group::CGlueObjBase,
	T,
	CGlueAAssocTy: Clone,
> {
	with_assoc: extern "C" fn(cont: &CGlueC, assoc: &CGlueAAssocTy) -> T,
	/* phantoms... */
}

/// Default vtable reference creation.
impl<
	'cglue_a,
	CGlueC: /* ... */,
	CGlueCtx: crate::trait_group::ContextBounds,
	T,
	CGlueAAssocTy: Clone,
> Default for &'cglue_a WithAssocVtbl<'cglue_a, CGlueC, T, CGlueAAssocTy>
where
	CGlueC::ObjType: WithAssoc<T, AssocTy = CGlueAAssocTy>,
	/* ... */
>: crate::trait_group::CGlueBaseVtbl,
{
	/// Create a static vtable for the given type.
	fn default() -> Self {
		&WithAssocVtbl {
			with_assoc: cglue_wrapped_with_assoc,
			/* phantoms... */
		}
	}
}

// C wrapper

extern "C" fn cglue_wrapped_with_assoc<
	CGlueC: /* ... */,
	CGlueCtx: crate::trait_group::ContextBounds,
	T,
	CGlueAAssocTy: Clone,
>(cont: &CGlueC, assoc: &CGlueAAssocTy) -> T
where
	CGlueC::ObjType: WithAssoc<T, AssocTy = CGlueAAssocTy>,
{
	let (this, ret_tmp, cglue_ctx) = cont.cobj_ref();
	let ret = <CGlueC::ObjType as WithAssoc<T>>::with_assoc(this, assoc);
	ret
}

// Implementation on CGlue Opaque Object

impl<
	'cglue_a,
	CGlueO: 'cglue_a + WithAssocVtblGet<'cglue_a, T, CGlueAAssocTy>
		+ /* ... */,
	T,
	CGlueAAssocTy: Clone,
> WithAssoc<T> for CGlueO
where
	T: crate::trait_group::GenericTypeBounds,
	CGlueAAssocTy: crate::trait_group::GenericTypeBounds,
{
	type AssocTy = CGlueAAssocTy;
	#[inline(always)]
	fn with_assoc(&self, assoc: &Self::AssocTy) -> T {
		let __cglue_vfunc = self.get_vtbl().with_assoc;
		let cont = self.ccont_ref();
		let assoc = assoc;
		let mut ret = __cglue_vfunc(cont, assoc);
		ret
	}
}

The trick here is that the CGlue vtable encodes the specific associated type used by the given opaque object, which allows us to have multiple CGlue objects that use different associated types.
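
To make the trick more tangible, here is a heavily simplified, hand-written sketch of the same idea. The names Scale, ScaleVtbl, ScaleObj, ByInt, and ByLen are made up for illustration and are not part of CGlue. The sketch also stores the vtable inline and skips context handling, which the real generated code does not do.

```rust
use std::ffi::c_void;
use std::marker::PhantomData;

// A trait with an associated type (hypothetical, for illustration only).
pub trait Scale {
    type Factor;
    fn scale(&self, factor: &Self::Factor) -> u64;
}

// The associated type becomes the generic parameter `A` of the vtable.
#[repr(C)]
pub struct ScaleVtbl<A> {
    scale: unsafe extern "C" fn(this: *const c_void, factor: &A) -> u64,
}

// Opaque object: the concrete type is erased, the associated type is not.
#[repr(C)]
pub struct ScaleObj<'a, A> {
    this: *const c_void,
    vtbl: ScaleVtbl<A>, // stored inline to keep the sketch short
    _lifetime: PhantomData<&'a ()>,
}

// C-ABI shim that restores the concrete type and calls the trait method.
unsafe extern "C" fn wrapped_scale<C: Scale>(this: *const c_void, factor: &C::Factor) -> u64 {
    // SAFETY: `this` was created from a `&C` in `ScaleObj::new`.
    let this = unsafe { &*(this as *const C) };
    this.scale(factor)
}

impl<'a, A> ScaleObj<'a, A> {
    pub fn new<C: Scale<Factor = A>>(obj: &'a C) -> Self {
        Self {
            this: obj as *const C as *const c_void,
            vtbl: ScaleVtbl { scale: wrapped_scale::<C> },
            _lifetime: PhantomData,
        }
    }
}

// The opaque object implements the trait again, forwarding through the vtable.
impl<'a, A> Scale for ScaleObj<'a, A> {
    type Factor = A;
    fn scale(&self, factor: &A) -> u64 {
        // SAFETY: `this` and the vtable entry were created together in `new`.
        unsafe { (self.vtbl.scale)(self.this, factor) }
    }
}

// Two implementors with different associated types.
pub struct ByInt(pub u64);
impl Scale for ByInt {
    type Factor = u64;
    fn scale(&self, factor: &u64) -> u64 { self.0 * factor }
}

pub struct ByLen(pub u64);
impl Scale for ByLen {
    type Factor = String;
    fn scale(&self, factor: &String) -> u64 { self.0 * factor.len() as u64 }
}
```

Wrapping a ByInt yields a ScaleObj<u64>, while wrapping a ByLen yields a ScaleObj<String>, so both opaque objects can coexist even though their concrete types are erased.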

Task module

Once we have associated types, we need to safely and efficiently transfer &Waker across the FFI boundary. To that end, we define CRefWaker:

#[repr(C)]
#[derive(Clone, Copy)]
pub struct CRefWaker<'a> {
    raw: &'a OpaqueRawWaker,
    clone: unsafe extern "C" fn(*const ()) -> CRawWaker,
    wake_by_ref: unsafe extern "C" fn(*const ()),
}

#[repr(transparent)]
#[derive(Clone, Copy)]
struct OpaqueRawWaker {
    waker: [*const (); 2],
}

We then create a conversion from &Waker to CRefWaker:

impl<'a> CRefWaker<'a> {
    pub unsafe fn from_raw(raw: &'a RawWaker) -> Self {
        let raw: &'a OpaqueRawWaker = core::mem::transmute(raw);

        Self {
            raw,
            clone: waker_clone,
            wake_by_ref: waker_wake_by_ref,
        }
    }
}

impl<'a> From<&'a Waker> for CRefWaker<'a> {
    fn from(waker: &'a Waker) -> Self {
        const _: [(); core::mem::size_of::<Waker>()] = [(); core::mem::size_of::<OpaqueRawWaker>()];
        unsafe { Self::from_raw(core::mem::transmute(waker)) }
    }
}
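
The const line in the From impl is a compile-time size check: two array types with different lengths do not unify, so the build fails if Waker ever stops being the size of two pointers. Below is a minimal standalone sketch of that trick. TwoPtrs and waker_and_opaque_sizes are hypothetical names, and the two-pointer size of Waker is an observation about current standard library versions rather than a documented guarantee.

```rust
use std::mem::size_of;
use std::task::Waker;

// Stand-in for `OpaqueRawWaker` (hypothetical name): two pointer-sized words.
#[repr(transparent)]
pub struct TwoPtrs {
    _words: [*const (); 2],
}

// Compiles only if both sizes are equal: `[(); N]` and `[(); M]` are
// different types whenever `N != M`, which turns a mismatch into a type error.
const _: [(); size_of::<Waker>()] = [(); size_of::<TwoPtrs>()];

// Runtime view of the same two numbers, purely for demonstration.
pub fn waker_and_opaque_sizes() -> (usize, usize) {
    (size_of::<Waker>(), size_of::<TwoPtrs>())
}
```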

The waker_clone and waker_wake_by_ref functions are defined as follows:

unsafe extern "C" fn waker_clone(waker: *const ()) -> CRawWaker {
    let waker: &Waker = &*(waker as *const Waker);
    let waker = core::mem::transmute(waker.clone());

    CRawWaker {
        waker,
        vtable: Default::default(),
    }
}

unsafe extern "C" fn waker_wake_by_ref(waker: *const ()) {
    let waker: &Waker = &*(waker as *const Waker);
    waker.wake_by_ref()
}

And finally, a nice wrapper to go from CRefWaker to &Waker:

pub fn with_waker<T>(&self, cb: impl FnOnce(&Waker) -> T) -> T {
	unsafe fn unreach(_: *const ()) {
		unreachable!()
	}
	unsafe fn noop(_: *const ()) {}
	unsafe fn clone(data: *const ()) -> RawWaker {
		let this = &*(data as *const CRefWaker);
		let waker = unsafe { (this.clone)(this.raw as *const _ as *const ()) };
		let waker = BaseArc::new(waker);
		CRawWaker::to_raw(waker)
	}
	unsafe fn wake_by_ref(data: *const ()) {
		let this = &*(data as *const CRefWaker);
		unsafe { (this.wake_by_ref)(this.raw as *const _ as *const ()) };
	}

	let vtbl = &RawWakerVTable::new(clone, unreach, wake_by_ref, noop);
	let waker = RawWaker::new(self as *const Self as *const (), vtbl);
	let waker = unsafe { Waker::from_raw(waker) };

	cb(&waker)
}

Here, we create a stack-allocated Waker that creates an owned CRawWaker only when it gets cloned. We could skip this step if we could rely on the calling convention of the underlying RawWaker's vtable, but we cannot (its functions use the unstable Rust ABI), so we must allocate. Additional optimizations could be made to improve allocation performance, but that can be done at a future point.
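
The allocate-only-on-clone idea can be tried out without any FFI. The following is a minimal pure-Rust sketch under that assumption: it borrows an existing Waker, hands out a temporary one built from a custom RawWakerVTable, and moves to an Arc only inside clone. All names (with_borrowed_waker, REF_VTABLE, OWNED_VTABLE, Counter, count_wakes) are hypothetical, and CGlue's real code goes through CRawWaker and extern "C" functions instead.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{RawWaker, RawWakerVTable, Wake, Waker};

// --- Borrowed waker: `data` points at a `Waker` owned by the caller. ---
unsafe fn ref_clone(data: *const ()) -> RawWaker {
    // Cloning is the only operation that allocates.
    let borrowed = unsafe { &*(data as *const Waker) };
    let owned = Arc::new(borrowed.clone());
    RawWaker::new(Arc::into_raw(owned) as *const (), &OWNED_VTABLE)
}
unsafe fn ref_wake(_: *const ()) {
    // Only a `&Waker` is ever handed out, so by-value `wake` cannot be reached.
    unreachable!()
}
unsafe fn ref_wake_by_ref(data: *const ()) {
    unsafe { (*(data as *const Waker)).wake_by_ref() }
}
unsafe fn ref_drop(_: *const ()) {}

static REF_VTABLE: RawWakerVTable =
    RawWakerVTable::new(ref_clone, ref_wake, ref_wake_by_ref, ref_drop);

// --- Owned waker: `data` is a pointer obtained from `Arc::<Waker>::into_raw`. ---
unsafe fn owned_clone(data: *const ()) -> RawWaker {
    unsafe { Arc::increment_strong_count(data as *const Waker) };
    RawWaker::new(data, &OWNED_VTABLE)
}
unsafe fn owned_wake(data: *const ()) {
    // Takes over the reference count held by this waker, then releases it.
    let waker = unsafe { Arc::from_raw(data as *const Waker) };
    waker.wake_by_ref();
}
unsafe fn owned_wake_by_ref(data: *const ()) {
    unsafe { (*(data as *const Waker)).wake_by_ref() }
}
unsafe fn owned_drop(data: *const ()) {
    drop(unsafe { Arc::from_raw(data as *const Waker) });
}

static OWNED_VTABLE: RawWakerVTable =
    RawWakerVTable::new(owned_clone, owned_wake, owned_wake_by_ref, owned_drop);

/// Runs `cb` with a temporary `Waker` that merely borrows `inner`.
pub fn with_borrowed_waker<T>(inner: &Waker, cb: impl FnOnce(&Waker) -> T) -> T {
    let raw = RawWaker::new(inner as *const Waker as *const (), &REF_VTABLE);
    // SAFETY: the temporary waker never outlives `inner`, and clones own their data.
    let waker = unsafe { Waker::from_raw(raw) };
    cb(&waker)
}

// Usage: a waker that counts how often it is woken.
pub struct Counter(pub AtomicUsize);
impl Wake for Counter {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

pub fn count_wakes() -> usize {
    let counter = Arc::new(Counter(AtomicUsize::new(0)));
    let inner = Waker::from(counter.clone());
    // Wake once through the borrow, then clone (allocating) and keep the clone.
    let cloned = with_borrowed_waker(&inner, |w| {
        w.wake_by_ref();
        w.clone()
    });
    cloned.wake_by_ref();
    cloned.wake();
    counter.0.load(Ordering::SeqCst)
}
```

A future that is ready on first poll, or that only calls wake_by_ref, never touches the allocator in this scheme. The cost is paid only by futures that store the waker for later.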

Future wrapper

With all the boilerplate set up, we can now create an automatic wrapper for our traits. CGlue already has provisions for this; we just need to add a future module to cglue_gen::ext with the following implementation:

pub trait Future {
	type Output;

	#[custom_impl(
		// Types within the C interface other than self and additional wrappers.
		{
			cx: &CRefWaker,
			out: &mut MaybeUninit<Self::Output>,
		},
		// Unwrapped return type
		bool,
		// Conversion in trait impl to C arguments (signature names are expected).
		{
			let mut out_v = MaybeUninit::uninit();
			let out = &mut out_v;
			let cx = CRefWaker::from(cx.waker());
			let cx = &cx;
		},
		// This is the body of C impl minus the automatic wrapping.
		{
			cx.with_waker(|waker| {
				let mut cx = Context::from_waker(waker);
				match this.poll(&mut cx) {
					Poll::Ready(v) => {
						out.write(v);
						true
					}
					_ => false
				}
			})
		},
		// This part is processed in the trait impl after the call returns (impl_func_ret).
		{
			if ret {
				Poll::Ready(unsafe { out_v.assume_init() })
			} else {
				Poll::Pending
			}
		},
	)]
	fn poll(self: Pin<&mut Self>, cx: &mut Context) -> Poll<Self::Output>;
}

This allows us to transparently wrap Waker with CRefWaker without the caller having to care about it. It may seem complex, but all the above attribute does is replace the default CGlue codegen with different function parameters and bodies at strategic places. Here's the expansion:

First, the C wrapper:

extern "C" fn cglue_wrapped_poll<
	CGlueC: /* ... */,
	CGlueCtx: /* ... */,
	CGlueAOutput,
>(
	cont: ::core::pin::Pin<&mut CGlueC>,
	/* Note the first block being placed here */
	cx: &crate::task::CRefWaker,
	out: &mut ::core::mem::MaybeUninit<CGlueAOutput>,
) 
	/* Note the second attr param is placed here */
	-> bool
where
	CGlueC::ObjType: for<'cglue_b> Future<Output = CGlueAOutput>,
{
	let (this, ret_tmp, cglue_ctx) = cont.cobj_pin_mut();
	let ret = {
		/* Note the fourth block being placed here */
		cx.with_waker(|waker| {
			let mut cx = ::core::task::Context::from_waker(waker);
			match this.poll(&mut cx) {
				::core::task::Poll::Ready(v) => {
					out.write(v);
					true
				}
				_ => false,
			}
		})
	};
	ret
}

And then, the Rust side:

#[inline(always)]
fn poll(
	self: ::core::pin::Pin<&mut Self>,
	cx: &mut ::core::task::Context,
) -> ::core::task::Poll<Self::Output> {
	let __cglue_vfunc = self.get_vtbl().poll;
	/* Note the third block being placed here */
	let mut out_v = ::core::mem::MaybeUninit::uninit();
	let out = &mut out_v;
	let cx = crate::task::CRefWaker::from(cx.waker());
	let cx = &cx;
	/* These lines are auto generated */
	let cont = self.ccont_pin_mut();
	let cx = cx;
	let out = out;
	let mut ret = __cglue_vfunc(cont, cx, out);
	{
		/* Note the fifth block being placed here */
		if ret {
			::core::task::Poll::Ready(unsafe { out_v.assume_init() })
		} else {
			::core::task::Poll::Pending
		}
	}
}
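
The reason for the bool plus MaybeUninit pair is that Poll is an ordinary Rust enum with no guaranteed C layout, so the result has to travel as a flag and an out-parameter. Here is a minimal sketch of that round trip on its own, without CGlue. The functions poll_countdown and poll_via_out_param are hypothetical, and a plain counter stands in for a real future.

```rust
use std::mem::MaybeUninit;
use std::task::Poll;

// C-ABI-friendly shape: returns `true` only after `out` has been written.
// The "future" here is just a counter that becomes ready on the third poll.
pub extern "C" fn poll_countdown(polls: &mut u32, out: &mut MaybeUninit<u64>) -> bool {
    *polls += 1;
    if *polls >= 3 {
        out.write(42);
        true
    } else {
        false
    }
}

// Rust-side wrapper that turns the flag and out-parameter back into `Poll`.
pub fn poll_via_out_param(polls: &mut u32) -> Poll<u64> {
    let mut out = MaybeUninit::uninit();
    if poll_countdown(polls, &mut out) {
        // SAFETY: the callee returns `true` only after writing `out`.
        Poll::Ready(unsafe { out.assume_init() })
    } else {
        Poll::Pending
    }
}
```

The safety of assume_init rests entirely on the callee's contract, which is why the generated wrapper and the generated trait impl have to agree on what the bool means.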

Putting it all together

With everything built, usage becomes as simple as the following:

// Any future
async fn hii() -> u64 {
	42
}

// Turn it into CGlue object
let obj = trait_obj!(hii() as Future);

// Pass it to an executor
assert_eq!(pollster::block_on(obj), 42);

In addition, we get to do the same with streams and sinks from the futures crate:

let items = [42, 43, 42];

let obj = trait_obj!(futures::stream::iter(items) as Stream);

assert_eq!(pollster::block_on(obj.collect::<Vec<_>>()), items);

Now, of course, there are caveats. For one, you will not be able to successfully pass a tokio-based future created in one shared library back to the main executable's tokio executor, because tokio relies on thread-local storage, which is not shared across libraries. Any generic future should work, though, and perhaps one day the crabi gods will bless us with a universal stable ABI.

CGlue 0.3 is out now. Opting into the task and futures features enables transparent wrappers for the aforementioned traits. Remaining work involves native support for AFIT (async fn in traits) and separating Send and !Send types, at which point CGlue's async story should be complete.