Towards Transparent and Grounded Visual AI Systems