Clusters

A group of autoscaling compute resources centered around the Ray plasma object store

All computation in Anyscale runs in clusters, just as in open-source (OSS) Ray. A cluster is a group of autoscaling compute resources centered around the Ray plasma object store. When a Ray instance is initialized, it attaches to exactly one cluster, and all computation in that Ray instance runs only in the attached cluster.

In open-source Ray, any Python script can act as a driver that kicks off work in its Ray cluster. The Ray cluster might be local (for development) or in a cloud (for larger workloads).

In Anyscale, things are no different. No code changes are required for things to "just work". Switching from an open-source Ray cluster to one of Anyscale's managed clusters is as easy as changing the RAY_ADDRESS environment variable from the IP address of a cluster to anyscale://my_cluster, or passing that address as an argument to ray.init. Both have the same effect. A cluster can be created either when Ray is initialized or separately from the command line. All remote invocations in a job (tasks, actors, or Ray-based libraries) run on the specified cluster.
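The two ways of pointing a driver at a cluster described above can be sketched as follows. This is a minimal illustration, not the official client code: the helper names (anyscale_address, use_env_address, connect) are hypothetical, my_cluster is the placeholder name from the text, and actually connecting assumes Ray and the Anyscale client are installed and authenticated.

```python
import os
from typing import Optional


def anyscale_address(cluster_name: str) -> str:
    """Build the anyscale:// address for a named cluster (hypothetical helper)."""
    return f"anyscale://{cluster_name}"


def use_env_address(cluster_name: str) -> None:
    """Option 1: point RAY_ADDRESS at the cluster before Ray is initialized."""
    os.environ["RAY_ADDRESS"] = anyscale_address(cluster_name)


def connect(cluster_name: Optional[str] = None):
    """Option 2: pass the address directly to ray.init.

    With no cluster name, ray.init() is called without an address and
    falls back to RAY_ADDRESS if that variable is set.
    """
    import ray  # imported lazily so the helpers above work without Ray installed

    if cluster_name is not None:
        return ray.init(address=anyscale_address(cluster_name))
    return ray.init()
```

Either use_env_address("my_cluster") followed by connect(), or connect("my_cluster") on its own, should attach the driver to the same cluster; the rest of the script stays unchanged.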

When a cluster is created, the following configuration files define it:

Cluster Compute: defines cloud resource types and limitations
Cluster Environment: defines application dependencies
Runtime Environment: defines the Python environment used for the driver, tasks, and actors
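Of the three configurations listed above, the runtime environment is the one most often set from the driver script. The sketch below builds one as a plain dictionary using the working_dir, pip, and env_vars keys from open-source Ray's runtime_env option. The helper name build_runtime_env and the example values are hypothetical, and how Cluster Compute and Cluster Environment are supplied is not shown here.

```python
from typing import Dict, List, Optional


def build_runtime_env(
    working_dir: str,
    pip_packages: Optional[List[str]] = None,
    env_vars: Optional[Dict[str, str]] = None,
) -> dict:
    """Assemble a runtime environment dictionary (hypothetical helper)."""
    runtime_env: dict = {"working_dir": working_dir}
    if pip_packages:
        # Python packages installed for the driver, tasks, and actors
        runtime_env["pip"] = list(pip_packages)
    if env_vars:
        # Environment variables made visible to the driver, tasks, and actors
        runtime_env["env_vars"] = dict(env_vars)
    return runtime_env


# Example values are placeholders, not recommendations.
example_env = build_runtime_env(".", pip_packages=["requests"], env_vars={"STAGE": "dev"})

# Passing it at initialization would look like this (requires Ray and a reachable cluster):
#   ray.init("anyscale://my_cluster", runtime_env=example_env)
```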

In addition, an initial number of CPUs, GPUs, and Ray bundles can be specified. The cluster will immediately scale to accommodate the requested resources, bypassing normal upscaling speed constraints. These resources are also pinned and exempt from downscaling.
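To make the idea of requesting CPUs, GPUs, and bundles concrete, here is a minimal sketch using request_resources from open-source Ray's autoscaler SDK, which accepts a CPU count and a list of resource bundles. Whether Anyscale takes its initial resource request through this same call is an assumption. The helper names and the worker shape (one GPU per bundle) are hypothetical.

```python
from typing import Dict, List, Optional


def gpu_worker_bundles(num_workers: int, cpus_per_worker: int) -> List[Dict[str, int]]:
    """Describe each desired worker as a bundle with one GPU (hypothetical helper)."""
    return [{"CPU": cpus_per_worker, "GPU": 1} for _ in range(num_workers)]


def request_initial_resources(
    num_cpus: Optional[int] = None,
    bundles: Optional[List[Dict[str, int]]] = None,
) -> None:
    """Ask the autoscaler to scale up to fit the given resources.

    Must be called from a driver that is already connected to the cluster.
    """
    from ray.autoscaler.sdk import request_resources  # lazy import; needs Ray installed

    request_resources(num_cpus=num_cpus, bundles=bundles)
```

For example, request_initial_resources(num_cpus=16, bundles=gpu_worker_bundles(2, 4)) would ask for 16 CPUs plus two bundles of 4 CPUs and 1 GPU each.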


Last updated 4 years ago
