Have the quantitative evaluation metrics for `Colorful Image Colorization` been verified (e.g., classification accuracy)?