
Archive.tscrunch() is extremely slow #16

@rossjjennings


This is a very old issue that I discussed with @mtlam years ago but somehow never actually resolved.

Despite the fact that it does basically the same thing, ar.tscrunch() is hundreds of times slower than ar.scrunch('T'). The culprit is this set of nested for loops.
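To illustrate the kind of slowdown involved, here is a minimal sketch comparing a loop-based time average with a vectorized one. This is not the PyPulse source: the function names are hypothetical, the axis order (subint, pol, chan, bin) is an assumption, and the average is a plain unweighted mean.

```python
import numpy as np

def tscrunch_loops(data):
    """Average over the subint axis with explicit Python loops (slow)."""
    nsub, npol, nchan, nbin = data.shape  # assumed axis order
    out = np.zeros((npol, nchan, nbin))
    for p in range(npol):
        for c in range(nchan):
            for b in range(nbin):
                total = 0.0
                for s in range(nsub):
                    total += data[s, p, c, b]
                out[p, c, b] = total / nsub
    return out

def tscrunch_vectorized(data):
    """Same unweighted average, done in one NumPy call."""
    return data.mean(axis=0)

# Small synthetic array standing in for archive data.
rng = np.random.default_rng(0)
data = rng.normal(size=(8, 2, 16, 64))

# Both approaches agree to floating-point precision.
assert np.allclose(tscrunch_loops(data), tscrunch_vectorized(data))
```

The two functions return the same values, but the loop version makes one Python-level array access per element, which is where a large slowdown on real archives would come from.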

Normally you can work around this by avoiding the former in favor of the latter, but ar.getLevels() (used by ar.getPulsarCalibrator()) calls self.tscrunch() internally, and as a result is much slower than it needs to be. I ran into this in the quicklook notebook, and found that patching PyPulse to replace self.tscrunch() with self.scrunch('T') in getPulsarCalibrator() improved the runtime of the relevant cell from ~45 s to ~72 ms, and the runtime of the entire notebook from ~1 minute to ~15 s.

I'm not sure what the best way to fix this is. The simplest thing to do would be to apply the patch I described above, but that leaves the current behavior of tscrunch() as a trap for the unwary. Possibly tscrunch() should be removed altogether, but it does have some functionality that's not replicated by scrunch('T') -- namely, the ability to scrunch by an arbitrary factor. The current behavior is somewhat broken, though, in that when the factor doesn't evenly divide the number of subints, the last subint in the resulting scrunched archive will invariably consist of all zeros.
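One possible direction, sketched under assumptions: a vectorized scrunch by an arbitrary factor that averages the leftover subints into a final, shorter group instead of leaving it as zeros. The function name tscrunch_by_factor is hypothetical, the subint axis is assumed to be axis 0, and the average is unweighted, so this is not a drop-in replacement for the existing method.

```python
import numpy as np

def tscrunch_by_factor(data, factor):
    """Average consecutive groups of `factor` subints along axis 0.

    If `factor` does not evenly divide the number of subints, the last
    group is averaged over however many subints remain.
    """
    data = np.asarray(data, dtype=float)
    nsub = data.shape[0]
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    # Start index of each group: 0, factor, 2*factor, ...
    starts = np.arange(0, nsub, factor)
    # Sum each group, including the shorter final one.
    sums = np.add.reduceat(data, starts, axis=0)
    # Number of subints that went into each group.
    counts = np.diff(np.append(starts, nsub))
    # Reshape counts so they broadcast over the remaining axes.
    return sums / counts.reshape((-1,) + (1,) * (data.ndim - 1))

# Five subints scrunched by 2 give groups [0, 1], [2, 3], [4].
example = np.arange(5.0).reshape(5, 1, 1, 1)
result = tscrunch_by_factor(example, 2)
print(result.ravel())  # [0.5 2.5 4. ]
```

Whether a partial final group should be averaged, dropped, or weighted differently is a design choice; the sketch only shows that the all-zeros result can be avoided without loops.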
