Event Structure In Vision And Language