Self-supervised object-centric representation learning for computer vision and natural language understanding models