Intra-frame parallelism and reproducible ordering #778

@wdconinc

In EIC, we tend to think of collections as sets rather than lists/vectors: their ordering is merely a result of the order in which they are filled, and it cannot be guaranteed because it is not defined in the data model. In our event reconstruction code development we have some infrastructure to identify changes in the outputs between feature and target branch artifacts, and that infrastructure is sensitive to changes in ordering.

Due to concurrency in the framework, frames are written out of order, which is expected. However, we are also exploring intra-frame parallelism, which leads to vector members and collections being filled in an order that is not necessarily sequential and therefore not reproducible (think: std::for_each(std::execution::par_unseq, hits.begin(), hits.end(), [&c](auto& h){ c.addToHits(h); }) for hits in a cluster).

Is there any way to fill podio vector members and collections out of sequence, yet still end up with a reproducible ordering on disk? I'm thinking orderable data types would need a hash or compare function, declared in the data model, to define the ordering. That function could then be used to fill an ordered container, which would be serialized in that order when saving to file.
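To make the idea concrete, here is a minimal sketch in plain C++ of what I have in mind. It does not use any podio API: Hit, HitLess, simulateParallelFill and canonicalize are hypothetical names, and the scheduling-dependent arrival order is simulated with a seeded shuffle rather than real threads, just to keep the sketch self-contained. It also assumes energies are never NaN, so that the comparison is a valid strict weak ordering.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <random>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a data model type; this is not a podio class.
struct Hit {
  std::uint64_t cellID;
  float energy;
};

// Hypothetical ordering that the data model could declare for an orderable type.
// Ties on cellID are broken by energy, so two hits that compare equal are
// indistinguishable and the sorted sequence is unique.
struct HitLess {
  bool operator()(const Hit& a, const Hit& b) const {
    return std::tie(a.cellID, a.energy) < std::tie(b.cellID, b.energy);
  }
};

// Simulate the arrival order of a parallel fill: the same hits, in an order
// that depends on "scheduling" (here: on the seed).
std::vector<Hit> simulateParallelFill(std::vector<Hit> hits, unsigned seed) {
  std::mt19937 rng(seed);
  std::shuffle(hits.begin(), hits.end(), rng);
  return hits;
}

// Impose the declared ordering once, just before the hits would be written.
std::vector<Hit> canonicalize(std::vector<Hit> hits) {
  std::sort(hits.begin(), hits.end(), HitLess{});
  assert(std::is_sorted(hits.begin(), hits.end(), HitLess{}));
  return hits;
}
```

With something like this, canonicalize(simulateParallelFill(hits, 1)) and canonicalize(simulateParallelFill(hits, 2)) contain the hits in the same order, regardless of how they arrived. Sorting once before writing, rather than keeping an ordered container during the fill, is only one possible design; it keeps the fill itself free of ordering constraints.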
