[Deprecated] Builder API

API reference

This documentation covers using the Anyscale Ray Client through the ray.client API, which will be deprecated. If you're using Ray 1.5 or higher, follow the instructions in the Anyscale Ray Client API Reference instead.

To connect to a cluster with the Anyscale client builder, start by running ray.client("anyscale://cluster_name").connect(). Anyscale automatically detects the Ray and Python versions on your laptop and starts a compatible cluster whose cluster_env uses the same Ray and Python versions.

Connection Options

To create or connect to a Ray cluster, use ray.client(url). This method returns a client object, which you can optionally configure with builder methods before calling .connect() to initialize the connection with the Ray cluster. The URL can be provided explicitly in the code or by setting the RAY_ADDRESS environment variable. For example:

1)

# script.py
import ray
ray.client().connect()

and then run RAY_ADDRESS=anyscale://cluster_name python script.py, which is equivalent to

2)

# script.py
import ray
ray.client("anyscale://cluster_name").connect()

and then run python script.py
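
The fallback described above can be pictured with a small stand-in. The function `resolve_address` below is a hypothetical helper, not part of Ray or Anyscale, and the rule that an explicit URL takes precedence over RAY_ADDRESS is an assumption made for illustration only.

```python
import os
from typing import Optional


def resolve_address(url: Optional[str] = None) -> str:
    """Hypothetical helper: choose the cluster URL from the argument or the environment."""
    if url:
        # Assumption: a URL passed in code takes precedence over the environment.
        return url
    address = os.environ.get("RAY_ADDRESS")
    if not address:
        raise ValueError("Pass a URL or set the RAY_ADDRESS environment variable.")
    return address
```

With RAY_ADDRESS=anyscale://cluster_name set, `resolve_address()` and `resolve_address("anyscale://cluster_name")` return the same string, which mirrors the equivalence of the two scripts.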

The Connection String

The connection string for ray.client() specifies which Ray cluster to use. To connect to Anyscale, this URL begins with "anyscale://". You can add parameters to the connection string to modify the connection's configuration. The available options are:

ray.client("anyscale://{cluster_name}" +
                      "?update={True|False}" +
                      "&autosuspend={timeInMinutes}"  +
                      "&cluster_env={cluster_env}" +
                      "&cluster_compute={cluster_compute}"
                      ).connect()

Note that each of these parameters corresponds to an option described in the builder API below.
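
As a rough sketch of how such a connection string could be assembled programmatically, the helper below joins the four documented parameters into a query string. The function `anyscale_url` is hypothetical and not part of the Anyscale or Ray API; it only builds the string that would be passed to ray.client().

```python
from typing import Optional
from urllib.parse import urlencode


def anyscale_url(
    cluster_name: str,
    *,
    update: Optional[bool] = None,
    autosuspend: Optional[int] = None,
    cluster_env: Optional[str] = None,
    cluster_compute: Optional[str] = None,
) -> str:
    """Hypothetical helper: assemble an anyscale:// connection string."""
    options = {
        "update": update,
        "autosuspend": autosuspend,  # idle time in minutes
        "cluster_env": cluster_env,
        "cluster_compute": cluster_compute,
    }
    # Keep only the options the caller actually set.
    params = {key: str(value) for key, value in options.items() if value is not None}
    url = f"anyscale://{cluster_name}"
    if params:
        # safe=":" keeps the revision separator in names such as "my_cluster_env:2".
        url += "?" + urlencode(params, safe=":")
    return url


print(anyscale_url("cluster_name", update=True, autosuspend=10))
# prints: anyscale://cluster_name?update=True&autosuspend=10
```

Only string names fit in the URL; dictionary values for cluster_env or cluster_compute have to go through the builder methods instead.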

Builder Methods

The object returned by ray.client(url) can be subsequently modified by the methods documented below:

def connect(self) -> None:
    """Connect to Anyscale using previously specified options.

    Examples:
        >>> ray.client("anyscale://cluster_name").connect()

    WARNING: using a new cluster_compute/cluster_env when connecting to an
    active cluster will not work unless the user passes `update=True`. e.g.:
        >>> ray.client("anyscale://cluster_name?update=True").connect()
    """
def cluster_compute(
    self, cluster_compute: Union[str, CLUSTER_COMPUTE_DICT_TYPE]
) -> "ClientBuilder":
    """Set the Anyscale cluster compute to use for the cluster.

    Args:
        cluster_compute: Name of the cluster compute
            or a dictionary to build a new cluster compute.
            For example "my-cluster-compute".

    Examples:
        >>> ray.client("anyscale://cluster_name?cluster_compute=my_cluster_compute").connect()
        >>> ray.client("anyscale://cluster_name").cluster_compute("my_cluster_compute").connect()
        >>> ray.client("anyscale://cluster_name").cluster_compute({"cloud_id": "1234", ... }).connect()

    WARNING:
        If you want to pass a dictionary cluster_compute please pass it using
        the `.cluster_compute()` API. Passing it in the URL format will not work.
    """
def cluster_env(
    self, cluster_env: Union[str, CLUSTER_ENV_DICT_TYPE]
) -> "ClientBuilder":
    """Set the Anyscale cluster environment to use for the cluster.

    IMPORTANT: the Python minor version of the manually specified cluster
    environment must match the local Python version, and the Ray version must
    also be compatible with the one on the client. For example, if your local
    laptop environment uses Ray 1.4 and Python 3.8, then the cluster environment
    must also use Ray 1.4 and Python 3.8.

    Args:
        cluster_env: Name (and optionally revision) of
            the cluster environment or a dictionary to build a new cluster environment.
            For example "my_cluster_env:2" where the revision would be 2.
            If no revision is specified, use the latest revision.
            NOTE: if you pass a dictionary it will always rebuild a new cluster environment
            before starting the cluster.

    Examples:
        >>> ray.client("anyscale://cluster_name?cluster_env=prev_created_cluster_env:2").connect()
        >>> ray.client("anyscale://cluster_name").cluster_env("prev_created_cluster_env:2").connect()
        >>> ray.client("anyscale://cluster_name").cluster_env({"base_image": "anyscale/ray-ml:1.1.0-gpu"}).connect()

    WARNING:
        If you want to pass a dictionary cluster_env please pass it using
        the `.cluster_env()` API. Passing it in the URL format will not work.
    """
def env(self, runtime_env: Dict[str, Any]) -> "ClientBuilder":
    """Sets the custom user specified runtime environment dict.

    Args:
        runtime_env (Dict[str, Any]): a python dictionary with runtime environment
            specifications.

    Examples:
        >>> ray.client("anyscale://cluster_name").env({"pip": "./requirements.txt"}).connect()
        >>> ray.client("anyscale://cluster_name").env(
        ...     {"working_dir": "/tmp/bla", "pip": ["chess"]}
        ... ).connect()
        >>> ray.client("anyscale://cluster_name").env({"conda": "conda.yaml"}).connect()
    """

def download_results(self, *, remote_dir: str, local_dir: str) -> None:
    """Specify a directory to download results from the cluster head node.

    IMPORTANT: the data is downloaded immediately after this call.
        `download_results` must not be called with `connect()`. See examples below.

    Args:
        remote_dir (str): the result dir on the head node.
        local_dir (str): the local path to download the results to.

    Examples:
        >>> ray.client("anyscale://cluster_name").download_results(
        ...       local_dir="~/ray_results", remote_dir="/home/ray/proj_output")
        >>> ray.client("anyscale://").download_results(
        ...       local_dir="~/ray_results", remote_dir="/home/ray/proj_output")
        >>> anyscale.download_results(
        ...       local_dir="~/ray_results", remote_dir="/home/ray/proj_output")
    """
def cloud(self, cloud_name: str) -> "ClientBuilder":
    """Set the name of the cloud to be used.

    This sets the name of the cloud that your connect cluster will be started
    in by default. This is completely ignored if you pass in a cluster compute config.

    Args:
        cloud_name (str): Name of the cloud to start the cluster in.

    Examples:
        >>> ray.client("anyscale://cluster_name").cloud("aws_test_account").connect()
    """
def project_dir(
    self, local_dir: str, name: Optional[str] = None
) -> "ClientBuilder":
    """Set the project directory path on the user's laptop.

    This sets the project code directory. If not specified, the project
    directory will be autodetected based on the current working directory.
    If no Anyscale project is found, a "scratch" project will be used.
    In general the project directory will be synced to all nodes in the
    cluster as required by Ray, except for when the user passes
    "working_dir" in `.env()` in which case we sync the latter instead.

    Args:
        local_dir (str): path to the project directory.
        name (str): optional name to use if the project doesn't exist.

    Examples:
        >>> ray.client("anyscale://cluster_name").project_dir("~/my-proj-dir").connect()
    """
def request_resources(
    self,
    *,
    num_cpus: Optional[int] = None,
    num_gpus: Optional[int] = None,
    bundles: Optional[List[Dict[str, float]]] = None,
) -> "ClientBuilder":
    """Configure the initial resources to scale to.

    The cluster will immediately attempt to scale to accommodate the
    requested resources, bypassing normal upscaling speed constraints.
    The requested resources are pinned and exempt from downscaling.

    Args:
        num_cpus (int): number of cpus to request.
        num_gpus (int): number of gpus to request.
        bundles (List[Dict[str, float]]): resource bundles to
            request. Each bundle is a dict of resource_name to quantity
            that can be allocated on a single machine. Note that the
            ``num_cpus`` and ``num_gpus`` args simply desugar into
            ``[{"CPU": 1}] * num_cpus`` and ``[{"GPU": 1}] * num_gpus``
            respectively.

    Examples:
        >>> ray.client("anyscale://cluster_name").request_resources(num_cpus=200, num_gpus=30).connect()
        >>> ray.client("anyscale://cluster_name").request_resources(
        ...     num_cpus=8,
        ...     bundles=[{"GPU": 8}, {"GPU": 8}, {"GPU": 1}],
        ... ).connect()
    """
def autosuspend(
    self,
    enabled: bool = True,
    *,
    hours: Optional[int] = None,
    minutes: Optional[int] = None,
) -> "ClientBuilder":
    """Configure or disable cluster autosuspend behavior.

    The cluster is automatically suspended after the specified idle period.
    By default, clusters auto-terminate after one hour of idle time.

    Args:
        enabled (bool): whether autosuspend is enabled.
        hours (int): specify idle time in hours.
        minutes (int): specify idle time in minutes. This is added to the
            idle time in hours.

    Examples:
        >>> ray.client("anyscale://cluster_name").autosuspend(False).connect()
        >>> ray.client("anyscale://cluster_name?autosuspend=10").connect()  # 10 minutes
        >>> ray.client("anyscale://cluster_name").autosuspend(hours=1, minutes=30).connect()
    """
def run_mode(self, run_mode: Optional[str] = None) -> "ClientBuilder":
    """Re-exec the driver program in the remote cluster.

    By setting ``run_mode("background")``, you can tell Anyscale
    to run the program driver remotely in the head node instead of executing
    locally. This allows you to e.g., close your laptop during development
    and have the program continue executing in the cluster.


    You can also change the run mode by setting the ANYSCALE_BACKGROUND=1
    or ANYSCALE_LOCAL_DOCKER=1 environment variables. Changing the run mode
    is only supported for script execution. Attempting to change the run
    mode in a notebook or Python shell will raise an error.

    Args:
        run_mode (str): either None or "background".

    Examples:
        >>> ray.client("anyscale://cluster_name").run_mode("background").connect()
    """
def job_name(self, job_name: Optional[str] = None) -> "ClientBuilder":
    """Sets the job_name so the user can identify it in the UI.
       This name is only used for display purposes in the UI.

    Args:
        job_name (str): the name of this job, which will be shown in the UI.

    Example:
        >>> ray.client("anyscale://cluster_name").job_name("production_job").connect()
    """
def namespace(self, namespace: str) -> "ClientBuilder":
    """Sets the namespace in the job config of the started job.

    Args:
        namespace (str): the name to give to this namespace.

    Example:
        >>> ray.client("anyscale://cluster_name").namespace("training_namespace").connect()
    """

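The request_resources docstring above says that num_cpus and num_gpus desugar into lists of single-resource bundles. The sketch below spells out that expansion. The function `desugar_request` is a hypothetical illustration, not Anyscale's implementation, and concatenating the expanded CPU and GPU bundles with any explicit bundles is an assumption.

```python
from typing import Dict, List, Optional


def desugar_request(
    num_cpus: Optional[int] = None,
    num_gpus: Optional[int] = None,
    bundles: Optional[List[Dict[str, float]]] = None,
) -> List[Dict[str, float]]:
    """Hypothetical helper: expand the arguments into one flat list of bundles."""
    expanded: List[Dict[str, float]] = []
    # Equivalent to [{"CPU": 1}] * num_cpus, but with a separate dict per bundle.
    expanded += [{"CPU": 1} for _ in range(num_cpus or 0)]
    # Equivalent to [{"GPU": 1}] * num_gpus.
    expanded += [{"GPU": 1} for _ in range(num_gpus or 0)]
    # Assumption: explicit bundles are appended unchanged.
    expanded += list(bundles or [])
    return expanded


print(desugar_request(num_cpus=2, bundles=[{"GPU": 8}]))
# prints: [{'CPU': 1}, {'CPU': 1}, {'GPU': 8}]
```

The distinction matters because each bundle must fit on a single machine: num_gpus=8 asks for eight one-GPU bundles that may land on different nodes, while bundles=[{"GPU": 8}] asks for one node with eight GPUs.
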
ClientContext

Calling ray.client("anyscale://...").connect() returns a ClientContext object. This object provides information about the cluster (Ray version, Python version, and a URL to the Ray dashboard). The ClientContext also provides a disconnect() method that will disconnect from the current cluster. Finally, you may also use ClientContext as a context manager, which will automatically call disconnect() for you after the with block is complete.

context = ray.client("anyscale://cluster1").connect()
... # Computation in Cluster 1
context.disconnect() # Manually disconnect

# Start client using a `with` statement
with ray.client("anyscale://cluster2").connect() as c2:
    ... # Computation in Cluster 2
# Automatically disconnect from cluster 2 after exiting the `with` block
