Add scorefile output convenience function for PyRosetta #563
At the recent Bootcamp, several people expressed interest in the ability to create scorefile info from PyRosetta. Add a convenience function which allows easy creation of JD2-like scorefiles through the PyRosetta interface.
IIRC, there are similar functions in the pyrosetta.distributed module. May be nice to make sure they do the same thing/call the same code under the hood.
From what I can tell from a quick grep through the code, the use case for pyrosetta.distributed scorefile handling is quite a bit more complicated, and doesn't seem to be aimed at a simple scorefile output. I'm not sure how much overlap there is. (Thoughts, @klimaj?)
lenghts = lengths[-n_score:]
self.assertEqual( min(lenghts), max(lengths) )
Looks like a typo for lengths
pyrosetta.rosetta.core.pose.setPoseExraScore(pose_clone, "extra_real", 3.14159 )
pyrosetta.rosetta.core.pose.setPoseExraScore(pose_clone, "extra_string", "TAG_VALUE" )
Should be setPoseExtraScore, just missing the t
Probably a bit outside the exact scope of this PR, but it feels like a good moment to bring up: one thing I’ve repeatedly found myself doing over the years is parsing Rosetta score files into a more structured format, usually JSON. Maybe when we write a score file, we could also write a companion .json file containing the same data in a dictionary-style structure (e.g., top-level keys as structure names, each mapping to a dict of score terms → values). Thoughts?
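For reference, the kind of ad-hoc parsing described above can be sketched in a few lines of plain Python. This is only an illustration, not code from this PR: the function name scorefile_to_dict is hypothetical, and it assumes the conventional whitespace-separated layout in which every record line starts with "SCORE:", the first such line is the header, and the structure name sits in a column called "description".

```python
import json


def scorefile_to_dict(text):
    """Parse classic scorefile text into {structure name: {score term: value}}.

    Hypothetical helper; assumes "SCORE:"-prefixed, whitespace-separated lines.
    """
    header = None
    records = {}
    for line in text.splitlines():
        if not line.startswith("SCORE:"):
            continue  # skip SEQUENCE: lines, blank lines, etc.
        fields = line.split()[1:]
        if header is None or fields == header:
            header = fields  # first (or repeated) header line
            continue
        tag = f"row_{len(records)}"  # fallback key if no description column
        row = {}
        for name, raw in zip(header, fields):
            if name == "description":
                tag = raw
                continue
            try:
                row[name] = float(raw)
            except ValueError:
                row[name] = raw  # keep non-numeric values as strings
        records[tag] = row
    return records


if __name__ == "__main__":
    sample = (
        "SEQUENCE: \n"
        "SCORE: total_score fa_atr description\n"
        "SCORE: -10.5 -20.0 pose_0001\n"
    )
    print(json.dumps(scorefile_to_dict(sample), indent=2))
```

Note that this sketch treats any value that parses as a number as a float, so a string-valued score that happens to look numeric would lose its original type.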
There are quite a few mechanisms to get a scores dictionary in the
I think this PR is similar to the idea of the PyJobDistributor's Finally,
As another quick comment, arbitrary Python types can now be serialized into strings using the @roccomoretti instead of retrieving data from
Jared added the ability to make JSON-formatted scorefiles from command-line Rosetta, via the This PR uses that framework to allow you to output either the conventional (default) or JSON formats (with
bool use_json
);

/// @brief write the given data to a scorefile, autodetecting the format.
"autodetecting the format" sounds a bit misleading to me; maybe "use command line option to determine output format"?
}

return write_scorefile(tag, score_map, string_map, use_json);
}
Would it be possible to specify options to dump both "classic" and JSON score files? The rationale is this: JSON could be used down the line as input for tools, but plain text is more readable for humans.
As I mentioned in previous comments: maybe dumping the JSON version unconditionally is preferable from a practical point of view?
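To make the suggestion concrete, here is a minimal pure-Python sketch of writing the same record in both forms. It is not the interface in this PR: write_both_formats, the .sc/.json file naming, the "decoy" key, and the one-JSON-object-per-line layout are all assumptions made for illustration.

```python
import json
from pathlib import Path


def write_both_formats(stem, tag, scores):
    """Append one record to <stem>.sc (plain text) and <stem>.json (JSON lines).

    Hypothetical helper; file names and the "decoy" key are assumptions.
    """
    names = list(scores)
    text_path = Path(f"{stem}.sc")
    new_file = not text_path.exists()
    with text_path.open("a") as out:
        if new_file:
            # Header is written once, when the text file is first created.
            out.write("SCORE: " + " ".join(names) + " description\n")
        values = " ".join(str(scores[name]) for name in names)
        out.write(f"SCORE: {values} {tag}\n")
    with Path(f"{stem}.json").open("a") as out:
        # One self-contained JSON object per line, easy for tools to stream.
        out.write(json.dumps({"decoy": tag, **scores}) + "\n")
```

Both files are built from the same dictionary, so the human-readable and machine-readable outputs cannot drift apart.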
poses_to_scorefile,
dump_file,
dump_scored_pdb,
dump_pdb,
dump_multimodel_pdb,
dump_cif,
dump_mmtf,
create_score_function,
get_fa_scorefxn,
get_score_function,
get_scorefile_info,
Do we really want to import these by default? I’m open to it if we genuinely expect everyone to need them, but otherwise it’s probably better not to expand the default imports.
My hesitation comes from past experience: we’ve been burned by this approach before. Over time, people start relying on these defaults, and that makes refactoring much harder.
(Also: https://xkcd.com/1172/)
except ImportError:
    _skip_xz = True
Could you please add the standard lines we include in all PyRosetta tests?
They ensure that no unexpected output accumulates and that runs are deterministic.
(I know the test code already calls init(), but in this case uniformity is probably preferable to purity.)
init(extra_options = "-constant_seed") # WARNING: option '-constant_seed' is for testing only! MAKE SURE TO REMOVE IT IN PRODUCTION RUNS!!!!!
import os; os.chdir('.test.output')
Thanks,
At the recent Bootcamp, several people (namely @LouisaMe09 and @zyajahuggan) expressed interest in the ability to create scorefile info from PyRosetta. Add a convenience function which allows easy creation of JD2-like scorefiles through the PyRosetta interface (pyrosetta.poses_to_scorefile()).
I've also added a function (pyrosetta.io.get_scorefile_info()) which gets what would be reported to the scorefile as a Python dictionary.
Additionally, this PR also cleans up some of the scorefile writing interface at the C++ level.