Python bindings for querying SAFRAN in batched mode #10
Abstract
This PR adds the ability to run SAFRAN inference in batched mode from Python 3.x. It is based on the code from `Main.cpp`, where the parts related to initialization and inference are decoupled to implement a simple Python API using SWIG. Currently, the Python API only supports prediction of tails using the `"applymax"` action (since this is what was required for my application), but it should be straightforward to support prediction of heads and other actions as well.
Example usage
where `['head_i', 'rel_i', 'tail_i']` are RDF triples in the SAFRAN train/validation/test file format, and `result` has the following format:
where `rule_id_ij` is the ID (0-indexed line number in the `path_to_rules` file) of the rule that predicted the `tail_ij` entity with confidence `confidence_ij`.
Installation
Install SAFRAN first: reference. Then, depending on your OS:
Ubuntu
MacOS (homebrew)
On MacOS, you need a C++ compiler that supports OpenMP via the `-fopenmp` flag. Currently, the default `clang` does not seem to support it, so you need a workaround such as using `llvm` or `gcc` from brew (src). You also need C++17 support in your C++ compiler to compile SAFRAN and the Python bindings.
Changelog
A few things were changed in the SAFRAN library itself to allow smoother interoperation between the wrapper and the C++ code, namely:
- `Explanation` is now an abstract class, decoupling the SQLite-specific implementation from the `Explanation` interface. There are two implementations of `Explanation`: `SQLiteExplanation` (used in `Main.cpp`) and `InMemoryExplanation` (used in the Python wrapper).
- `RuleApplication` now supports "hot-swapping" `TesttripleReader`s with `RuleApplication::updateTTR`, so that it can be run in batched mode.
- `TesttripleReader` separates initialization from reading from file, to allow reading test triples from multiple sources (e.g. from memory).
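To make the three changes above easier to picture, here is a rough Python analogy of the design. It is not the actual C++ code or the SWIG-generated API: the method names (`insert`, `read_from_file`, `read_from_memory`, `update_ttr`, `apply`) are hypothetical, the tab-separated file layout is an assumption, and inference is replaced by a stub that echoes the known tail.

```python
from abc import ABC, abstractmethod


class Explanation(ABC):
    """Interface for storing predictions, independent of the storage backend."""

    @abstractmethod
    def insert(self, triple, tail, rule_id, confidence):
        ...


class InMemoryExplanation(Explanation):
    """Keeps predictions in a dict instead of an SQLite database."""

    def __init__(self):
        self.records = {}

    def insert(self, triple, tail, rule_id, confidence):
        self.records.setdefault(triple, []).append((tail, confidence, rule_id))


class TesttripleReader:
    """Construction does no I/O; triples are loaded by an explicit call."""

    def __init__(self):
        self.triples = []

    def read_from_file(self, path):
        # Assumption: one tab-separated (head, relation, tail) triple per line.
        with open(path, encoding="utf-8") as f:
            self.triples = [tuple(line.rstrip("\n").split("\t"))
                            for line in f if line.strip()]

    def read_from_memory(self, triples):
        self.triples = [tuple(t) for t in triples]


class RuleApplication:
    """Holds the long-lived state; only the reader changes between batches."""

    def __init__(self, explanation):
        self.explanation = explanation
        self.reader = None

    def update_ttr(self, reader):
        # "Hot-swap" the source of test triples without re-initializing.
        self.reader = reader

    def apply(self):
        # Stub standing in for real rule application: echoes the known tail
        # with a dummy rule ID and confidence.
        for head, rel, tail in self.reader.triples:
            self.explanation.insert((head, rel, tail), tail, 0, 1.0)


# One RuleApplication instance serves several batches.
app = RuleApplication(InMemoryExplanation())
for batch in ([("a", "r", "b")], [("c", "r", "d")]):
    reader = TesttripleReader()
    reader.read_from_memory(batch)
    app.update_ttr(reader)
    app.apply()
```

The point of the split is that expensive setup happens once, while each batch only needs a fresh reader and leaves its results in an in-memory store that the wrapper can hand back to Python.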