Why not directly use the Mask2Former models listed in MODEL_ZOO.md?

From what I understand, the only difference between the mask proposal network in this repo and Mask2Former is the number of classes predicted (binary vs. N dataset classes), and N-class predictions can be converted to binary predictions. Are there any other differences? Just curious: have you run any ablations to verify whether binary prediction is actually better?
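
To make the conversion concrete, here is a minimal sketch of what I mean. It assumes each mask query yields N class logits followed by one trailing "no object" logit, which may not match the actual head in either repo; the function names are hypothetical and not taken from any codebase.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def class_logits_to_binary(logits):
    """Collapse N+1 class logits into [p_object, p_no_object].

    Assumption: the last entry is the "no object" slot, so the
    probability of "some object" is the total mass on the first N classes.
    """
    probs = softmax(logits)
    p_no_object = probs[-1]
    return [1.0 - p_no_object, p_no_object]

# One mask query with three dataset classes plus a trailing "no object" slot.
query_logits = [2.0, 0.5, -1.0, 0.0]
p_object, p_no_object = class_logits_to_binary(query_logits)
```

The mask prediction itself would be left untouched; only the per-query class score is collapsed.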