Overview pyspark_xray is a diagnostic tool, in the form of Python library, for pyspark developers to debug and troubleshoot PySpark applications locally, specifically it enables local debugging of PySpark RDD or DataFrame transformation functions that runs on slave nodes. The purpose of developing pyspark_xray is to create a development framework that…