Skip to content

Fix profiling async functions #91

@szostid

Description

@szostid

async functions may resume and continue execution any time. This crate currently does not account for this in any way.

For example, this:

use std::time::Duration;
use tracing_subscriber::layer::SubscriberExt;

fn main() {
    let layer = tracing_tracy::TracyLayer::default();
    tracing::subscriber::set_global_default(tracing_subscriber::registry().with(layer)).unwrap();

    profiling::function_scope!();

    tokio::runtime::Builder::new_current_thread()
        .enable_time()
        .build()
        .unwrap()
        .block_on(test_future());
}

#[profiling::function]
fn do_computation() {
    std::thread::sleep(Duration::from_millis(250));
}

#[profiling::function]
async fn background_task() {
    tokio::time::sleep(Duration::from_millis(500)).await;
}

#[profiling::function]
async fn test_future() {
    tokio::spawn(background_task());

    do_computation();
    tokio::time::sleep(Duration::from_millis(500)).await;
    do_computation();
}

Emits those spans:

Image

But I believe it should look more like this:

Image

This ensures that futures are displayed in profilers only when they are being actively executed.

Solutions

Tracing currently solves this with their tracing::instrument macro. I've implemented a simple solution, which works for me, but is obviously flawed. Currently, for the tracing backend I just append that instrument macro to the function attributes. Library developers, however, could not depend on this to always work (as the user may use any other backend), and they'd be forced not to use profiling::function on async functions anyway.

I believe the correct solution would be to replicate what tracing does in their instrument macro. They detect async functions (this part, im kinda uncertain about; they account for impl Future, async fn, #[async_trait], and I don't even know what to support and what could be skipped) and wrap them in a type, which ensures that the span is only entered when a future is polled:

impl<T: Future> Future for Instrumented<T> {
    type Output = T::Output;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let (span, inner) = self.project().span_and_inner_pin_mut();
        let _enter = span.enter();
        inner.poll(cx)
    }
}

The crate can still stay zero-overhead without enabling any backends. Additionally, the Instrumented functions could be made optional too, further eliminating the potential overhead from this addition.

I'm happy to develop a solution like this, but first I'd like to know about any potential issues that I might have forgotten.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions