
Conversation

@SrdanProdanovic

What does this PR do?

Add a parallel multicore version, a test that runs it, and scripts to monitor it.

Fixes # (issue)
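For context, monitoring a run like this can be as simple as sampling per-core CPU utilisation while compression is in progress. The snippet below is only a rough sketch using psutil and is not necessarily the script shipped in this PR:

```python
import time

import psutil  # third-party dependency, assumed: pip install psutil


def monitor_cores(interval: float = 1.0, duration: float = 10.0) -> None:
    """Print per-core CPU utilisation, sampled every `interval` seconds."""
    end = time.time() + duration
    while time.time() < end:
        # psutil blocks for `interval` seconds and returns one percentage per core
        per_core = psutil.cpu_percent(interval=interval, percpu=True)
        print(" | ".join(f"cpu{i}: {p:5.1f}%" for i, p in enumerate(per_core)))


if __name__ == "__main__":
    monitor_cores(interval=1.0, duration=5.0)
```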

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Was this discussed/approved via a Github issue? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Who can review?

If you know how to use git blame, that is the easiest way; otherwise, here is a rough guide of who to tag.
Please tag fewer than 3 people.

LLMLingua/LongLLMLingua:

Documentation: @SiyunZhao


@cornzz

cornzz commented Dec 12, 2024

Hi @SrdanProdanovic, this looks cool! Do you have some benchmark numbers? Did you actually manage to speed up the compression process? I see that you parallelized the calculation of the context probabilities when compressing multiple contexts. I have not used LLMLingua-2 with multiple contexts yet, but I'd still be interested in how much faster you got it to work.
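For readers following along, the general idea of parallelising across contexts could look roughly like the sketch below; `compress_one_context` is a made-up placeholder, not the actual function added in this PR:

```python
from multiprocessing import Pool


def compress_one_context(context: str) -> str:
    # Placeholder for the per-context work (e.g. computing token probabilities
    # with LLMLingua-2 and dropping low-scoring tokens). Here it is a no-op.
    return context


def compress_contexts_parallel(contexts: list[str], processes: int = 4) -> list[str]:
    # One worker per context (up to `processes`); results keep the input order.
    with Pool(processes=processes) as pool:
        return pool.map(compress_one_context, contexts)


if __name__ == "__main__":
    demo = ["first context ...", "second context ...", "third context ..."]
    print(compress_contexts_parallel(demo, processes=2))
```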

I have experimented with implementing multiprocessing in `__compress` myself, i.e. trying to parallelize the compression of a single context, but didn't have much luck. I believe the overhead introduced by multiprocessing was too large in my case...
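To make the overhead point concrete, here is a toy comparison (the chunk-scoring function is a made-up stand-in, not the real `__compress` internals): with many small work units, process start-up and pickling can easily outweigh the work itself.

```python
import time
from multiprocessing import Pool


def score_chunk(chunk: list[int]) -> int:
    # Stand-in for scoring a small slice of one context's tokens.
    return sum(x * x for x in chunk)


def run_serial(chunks: list[list[int]]) -> list[int]:
    return [score_chunk(c) for c in chunks]


def run_parallel(chunks: list[list[int]], processes: int = 4) -> list[int]:
    with Pool(processes=processes) as pool:
        return pool.map(score_chunk, chunks)


if __name__ == "__main__":
    chunks = [list(range(1_000)) for _ in range(64)]  # many small work units

    t0 = time.perf_counter()
    run_serial(chunks)
    t1 = time.perf_counter()
    run_parallel(chunks)
    t2 = time.perf_counter()

    # With work units this small, the parallel version is often slower:
    # the pool has to spawn worker processes and pickle every chunk.
    print(f"serial:   {t1 - t0:.4f}s")
    print(f"parallel: {t2 - t1:.4f}s")
```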

@SrdanProdanovic
Author

SrdanProdanovic commented Jan 3, 2025 via email

