Autonomous driving systems are often criticized as black-box models, which raises transparency and trust concerns. To address these concerns, we propose RAG-SafeAdapt, a retrieval-augmented vision-language model designed for complex traffic scenarios. This end-to-end framework integrates the CARLA simulator with Retrieval-Augmented Generation (RAG) and multimodal knowledge bases. It combines visual and language inputs and uses Responsibility-Sensitive Safety (RSS) principles and Vision-Language Models (VLMs) to generate game-theory-based safety recommendations. RAG-SafeAdapt supports safety analysis and interpretable decision-making and is compatible with deployment on platforms such as NVIDIA Orin. Evaluation on datasets including Berkeley DeepDrive eXplanation (BDD-X) and NuScenes-QA demonstrates improved decision explainability and generalization across diverse driving scenarios. Experimental results show superior zero-shot generalization, enhanced transparency, and a reduced collision rate of 0.35%. The framework addresses key challenges in navigation clarity, sensor precision, and adaptability, and fosters trust in autonomous driving through optimized real-time safety and human-readable explanations.