Scaling and Securing Spark on Kubernetes at Bloomberg – Ilan Filonenko, Bloomberg
In managing its Data Science Platform, Bloomberg has always focused on providing tenants with secure, reliable, and scalable solutions for their machine learning workflows and ETL pipelines. In adapting Kubernetes to support a diverse set of machine learning workloads, we decided to also support Apache Spark with native Kubernetes integration. In this talk, we’ll discuss how we designed a scalable and resilient External Shuffle Service for Dynamic Resource Allocation, a pluggable interface for secure worker creation, and a token renewal service that handles privacy and security across Spark jobs. These topics address multi-tenancy, data security and privacy, and elastic resource scalability in the context of running Spark natively on Kubernetes, with an emphasis on disaggregated compute.
Join us for KubeCon + CloudNativeCon in Shanghai June 24 – 26 and San Diego November 18 – 21! Learn more at kubecon.io. The conference features presentations from developers and end users of Kubernetes, Prometheus, Envoy and all of the other CNCF-hosted projects.