Deep learning-based multi-modal image analysis for enhanced situation awareness and environmental perception