Ray: Distributed computing for all, Part 2
Deploying and running code on cloud-based clusters
Thomas Reid · 9 min read
This is the second instalment in my two-part series on the Ray library, a Python framework for distributed and parallel computing. Part 1 covered how to parallelise CPU-intensive Python jobs on your local PC by distributing the workload across all available cores, resulting in marked improvements in runtime. I’ll leave a link to Part 1 at the end of this article.
This part continues the theme but takes it a step further: we use Ray to distribute Python workloads across multi-server clusters.
If you’ve come to this without having read Part 1, the TL;DR is that Ray is an open-source distributed computing framework designed to make it easy to scale Python programs from a laptop to a cluster with minimal code changes. That alone should be enough to pique your interest. In my own test, I took a relatively simple Python program that finds prime numbers and cut its runtime by a factor of ten by adding just four lines of code.
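To give a feel for how small the change can be, here is a minimal sketch of a prime-counting job split into Ray tasks. It is not the program from Part 1, which isn't reproduced here; the helper names (`is_prime`, `count_primes`, `split_range`, `count_primes_parallel`) and the default chunk count are my own assumptions for illustration. The Ray calls themselves (`ray.init`, `ray.remote`, `.remote()` and `ray.get`) are the library's core task API.

```python
def is_prime(n: int) -> bool:
    """Trial division: deliberately CPU-bound so parallelism pays off."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def count_primes(lo: int, hi: int) -> int:
    """Count the primes in the half-open range [lo, hi)."""
    return sum(1 for n in range(lo, hi) if is_prime(n))


def split_range(limit: int, chunks: int) -> list[tuple[int, int]]:
    """Split [0, limit) into at most `chunks` contiguous half-open ranges."""
    if limit <= 0:
        return []
    step = -(-limit // chunks)  # ceiling division
    return [(lo, min(lo + step, limit)) for lo in range(0, limit, step)]


def count_primes_parallel(limit: int, chunks: int = 8) -> int:
    """Hypothetical helper: count primes below `limit` using Ray tasks."""
    # Imported here so the plain helpers above still work without Ray installed.
    import ray

    ray.init(ignore_reinit_error=True)       # start (or reuse) a local Ray instance
    remote_count = ray.remote(count_primes)  # wrap the plain function as a Ray task
    refs = [remote_count.remote(lo, hi) for lo, hi in split_range(limit, chunks)]
    return sum(ray.get(refs))                # block until every task has returned
```

Calling `count_primes_parallel(1_000_000)` should give the same answer as `count_primes(0, 1_000_000)`, with the work spread over whatever cores Ray can see. How much faster it runs depends on your machine and on how evenly the chunks divide the work.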
Where can you run Ray clusters?
Ray clusters can be set up on the following:
- AWS and GCP Cloud, although unofficial integrations exist for other…