In a discussion with @edmcman here:
mahaloz/DAILA#32 (comment)
We have both independently noticed that VarBERT is producing fewer variables than expected on binaries such as /bin/ls. It is currently unclear which of the following (or which combination of them) is the cause:
- A bug in how we are passing data to VarBERT
- A bad choice of model to use at scale
- An incorrect model
We should investigate this to improve results in real-world use. It may also be worth revisiting our Ghidra testcases.
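
As a possible starting point for the investigation, here is a minimal sketch for quantifying the gap per function. It assumes we can collect the decompiler's original variable names and a mapping of original name to predicted name for each function. The helper names `rename_coverage` and `missing_variables` are hypothetical and are not part of VarBERT or DAILA.

```python
from typing import Dict, List


def rename_coverage(original: List[str], predicted: Dict[str, str]) -> float:
    """Return the fraction of original variables that received a different name.

    `original` is the list of variable names in one function (assumed input).
    `predicted` maps an original name to the model's predicted name (assumed input).
    A variable counts as renamed only if a prediction exists and differs.
    """
    if not original:
        return 0.0
    renamed = sum(1 for name in original if predicted.get(name, name) != name)
    return renamed / len(original)


def missing_variables(original: List[str], predicted: Dict[str, str]) -> List[str]:
    """Return the original variables that got no prediction or an unchanged name."""
    return [name for name in original if predicted.get(name, name) == name]


if __name__ == "__main__":
    # Made-up example data, only to show the call shape.
    names = ["v1", "v2", "v3", "v4"]
    preds = {"v1": "buf", "v2": "v2"}
    print(rename_coverage(names, preds))
    print(missing_variables(names, preds))
```

Comparing these numbers between the data we send to the model and the names we get back for the same function might help separate a data-passing bug from a model problem.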