Architecting AI Workflows with Apache Spark

Authors

  • Hina Gandhi, New York, USA

DOI:

https://doi.org/10.47363/JAICC/TechFusion2025/2025(4)5

Keywords:

Architecting AI, Apache Spark

Abstract

This session traces the evolution of big data systems - from Hadoop's batch-driven model to modern distributed architectures - and explores how AI-driven approaches can enhance and optimize Apache Spark. We'll break down Spark's internal design, including the roles of the Driver, DAG Scheduler, Task Scheduler, and Executors, to show how large-scale workloads are processed efficiently across clusters. Using real-world examples like cost aggregation pipelines, the talk highlights how Spark overcomes Hadoop's limitations while still facing challenges around configuration complexity, data skew, and resource management. Finally, we'll discuss how reinforcement learning can be applied to Spark to enable dynamic scheduling, smarter partitioning, and adaptive resource allocation, transforming Spark into a self-optimizing data processing engine.
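To make the abstract's mention of data skew and smarter partitioning concrete, below is a minimal sketch of key salting, a common technique for spreading a skewed aggregation key across partitions. It is written in plain Python (no Spark dependency) so it runs standalone; the function name, salt count, and the cost-aggregation scenario are illustrative assumptions, not details from the talk.

```python
import random
from collections import defaultdict

def salted_aggregate(records, num_salts=4):
    """Two-stage sum aggregation with key salting.

    Stage 1: append a random salt to each key, so records for a
    "hot" key are spread across num_salts sub-keys (in Spark, this
    spreads the work across partitions instead of one straggler task).
    Stage 2: strip the salt and merge the partial sums.
    """
    # Stage 1: partial aggregation over salted keys.
    partial = defaultdict(float)
    for key, cost in records:
        salted_key = (key, random.randrange(num_salts))
        partial[salted_key] += cost

    # Stage 2: merge partial sums back onto the original keys.
    totals = defaultdict(float)
    for (key, _salt), subtotal in partial.items():
        totals[key] += subtotal
    return dict(totals)

# A skewed cost workload: one hot customer dominates the records.
records = [("hot_customer", 1.0)] * 1000 + [("cold_customer", 2.0)] * 10
print(salted_aggregate(records))
```

The same idea in Spark would typically be expressed as a `groupBy` over a concatenated key-plus-salt column followed by a second `groupBy` on the original key; the two-stage shape shown here is the essence of it.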

Author Biography

  • Hina Gandhi, New York, USA

Published

2025-11-28

How to Cite

Architecting AI Workflows with Apache Spark. (2025). Journal of Artificial Intelligence & Cloud Computing, 4(6), 1-1. https://doi.org/10.47363/JAICC/TechFusion2025/2025(4)5
