Migrate from OSS

Migrating your Ray projects onto Anyscale

When moving your Ray project onto Anyscale, you just need to make a few small changes.

Pre-requisites

Migration steps

Create an Anyscale project

  1. From within the directory that contains your project files, run anyscale init and give your project a name. This directory is now associated with the Anyscale project you created.

Convert your application dependencies

  1. If you require Debian packages, create a cluster environment. You can do this in the UI, API, or SDK. Note the name of the cluster environment build that gets created, for example my_cluster_env:1.

  2. Your pip and conda dependencies can be declared in a runtime environment just like in OSS. If they already are, you do not need to make any changes.

  3. Your project directory is uploaded to the cluster automatically. This lets you connect to the cluster and import any of your local modules.

You can also specify a working directory using the working_dir parameter in a runtime environment. This lets you customize what files to sync to your cluster.
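
As a minimal sketch of the dependency steps above, a runtime environment can be written as a plain dictionary and passed when connecting. This assumes the same runtime_env dictionary used in OSS Ray, with its working_dir and pip keys. The helper names, the package list, and the anyscale://my-cluster address are placeholders for illustration.

```python
def build_runtime_env(working_dir=".", pip_packages=None):
    """Return a dict in the shape Ray's runtime_env option expects."""
    runtime_env = {"working_dir": working_dir}
    if pip_packages:
        runtime_env["pip"] = list(pip_packages)
    return runtime_env


def connect(address, runtime_env):
    """Connect to a cluster with the given runtime environment.

    ray is imported lazily so build_runtime_env stays usable without it.
    """
    import ray

    return ray.init(address=address, runtime_env=runtime_env)


# Placeholder packages; replace with your project's real dependencies.
runtime_env = build_runtime_env(".", ["requests", "pandas"])

# Hypothetical usage once a cluster name has been chosen:
# connect("anyscale://my-cluster", runtime_env)
```

If your OSS code already passes a runtime_env like this, it should carry over unchanged.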

Convert your compute configuration

You can create a cluster compute configuration using the UI, API, or SDK. It defines the cloud provider settings and the node types your cluster runs on.

  1. Take the provider config from your cluster config and set those values in the cluster compute config.

  2. Take the available_node_types config and set those values in the cluster compute config.

  3. Advanced options for AWS can be placed in the aws field in the cluster compute config or in the "Advanced configurations" box when using the UI.
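
The mapping in the steps above can be sketched as a small conversion function. The input keys (provider, head_node_type, available_node_types, node_config, InstanceType, min_workers, max_workers) follow the OSS Ray cluster config for AWS. The output field names are hypothetical and chosen only for illustration; check the cluster compute schema in the Anyscale UI, API, or SDK for the actual names.

```python
def to_compute_config(cluster_config):
    """Pull the provider and node-type settings out of an OSS cluster config.

    The returned field names are illustrative, not the Anyscale schema.
    """
    provider = cluster_config["provider"]
    node_types = cluster_config["available_node_types"]
    head_name = cluster_config["head_node_type"]

    def instance_type(name):
        return node_types[name]["node_config"]["InstanceType"]

    workers = [
        {
            "name": name,
            "instance_type": instance_type(name),
            "min_workers": node.get("min_workers", 0),
            "max_workers": node.get("max_workers", 0),
        }
        for name, node in node_types.items()
        if name != head_name
    ]
    return {
        "region": provider["region"],
        "head_node_type": {"name": head_name, "instance_type": instance_type(head_name)},
        "worker_node_types": workers,
    }


# Example OSS cluster config, written as the dict you would get from its YAML.
oss_cluster_config = {
    "provider": {"type": "aws", "region": "us-west-2"},
    "head_node_type": "head",
    "available_node_types": {
        "head": {"node_config": {"InstanceType": "m5.large"}},
        "cpu_worker": {
            "node_config": {"InstanceType": "m5.2xlarge"},
            "min_workers": 0,
            "max_workers": 4,
        },
    },
}

compute_config = to_compute_config(oss_cluster_config)
```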

Connect to an Anyscale cluster

Now you only need to connect to an Anyscale cluster instead of a cluster that you manage yourself. There are two ways to do this:

  • No code change: set your RAY_ADDRESS environment variable to anyscale://my-cluster?cluster_env=my_cluster_env:1&cluster_compute=my_compute_config

  • One-line code change: in your source code, replace the address in your Ray connect call with an anyscale:// address, and pass in the cluster environment and cluster compute configuration created in the previous steps by name. For example: ray.client("anyscale://my-cluster").cluster_env("my_cluster_env:1").cluster_compute("my_compute_config").connect()

To reuse an existing cluster, or to deploy a long-running service that you plan to modify later, provide a cluster name in the address. For example, if you set RAY_ADDRESS=anyscale://my-cluster in your environment, or pass the address explicitly in code with ray.init("anyscale://my-cluster"), your script connects to the cluster with that name if it already exists and creates it otherwise.
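
The address format used in this section can be assembled programmatically. The sketch below is a hypothetical helper, not part of Ray or the Anyscale SDK; it only builds the string, with the cluster name followed by optional cluster_env and cluster_compute query parameters as in the RAY_ADDRESS example.

```python
from urllib.parse import urlencode


def build_anyscale_address(cluster_name, cluster_env=None, cluster_compute=None):
    """Build an anyscale:// address with optional query parameters."""
    params = {}
    if cluster_env:
        params["cluster_env"] = cluster_env
    if cluster_compute:
        params["cluster_compute"] = cluster_compute

    address = f"anyscale://{cluster_name}"
    if params:
        # Keep ':' unescaped so build names such as my_cluster_env:1 stay readable.
        address += "?" + urlencode(params, safe=":")
    return address


address = build_anyscale_address("my-cluster", "my_cluster_env:1", "my_compute_config")
print(address)
# anyscale://my-cluster?cluster_env=my_cluster_env:1&cluster_compute=my_compute_config
```

The resulting string can be exported as RAY_ADDRESS or passed as the address when connecting.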

Concepts

  • In Anyscale, the monolithic cluster config is split into two configurations: a cluster environment and a cluster compute configuration. Cluster environments are automatically built into an image for quick and easy reuse across clusters.

  • Anyscale will manage the cluster's lifecycle for you. It will launch clusters when needed and shut them down if they have not been used for a while (this timeout is configurable, down to the second). You can also manually modify the cluster using the UI, API, or SDK.

  • The Anyscale API is centered on your code. That is why the primary entry point into Anyscale is to connect through the Python SDK and then run a job or deploy a service. The UI is organized around the same concepts. You can also monitor your clusters for debugging, cost tracking, and advanced uses.

Example

Before

After

No changes to the deploy.py file, just changes to how you run it.
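
One way to change only how the script is run is to set RAY_ADDRESS for the child process that executes it. The sketch below assumes deploy.py connects without a hard-coded address, so that Ray picks the address up from the environment. The helper names are hypothetical.

```python
import os
import subprocess
import sys


def env_for_cluster(address, base_env=None):
    """Return a copy of the environment with RAY_ADDRESS pointing at the cluster."""
    env = dict(os.environ if base_env is None else base_env)
    env["RAY_ADDRESS"] = address
    return env


def run_script(script, address):
    """Run an unmodified script with RAY_ADDRESS set for that process only."""
    return subprocess.run(
        [sys.executable, script],
        env=env_for_cluster(address),
        check=True,
    )


# Hypothetical usage:
# run_script("deploy.py", "anyscale://my-cluster")
```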
