Known Issues
Known issues in Anyscale and Ray
pip packages fail to install within a cluster environment
If you see this message in Cluster Environment build logs:
[INFO] 6/23/2021, 4:20:16 PM: Running setup.py install for <SOME PIP PACKAGE>: finished with status 'error'
[ERROR] 6/23/2021, 4:20:16 PM: ERROR: Command errored out with exit status 1:
...
ImportError: Something is wrong with the numpy installation. While importing we detected an older version of numpy in ['/home/ray/anaconda3/lib/python3.8/site-packages/numpy']. One method of fixing this is to repeatedly uninstall numpy until none is found, then reinstall this version.
Workaround
Remove <SOME PIP PACKAGE> from the pip section of the cluster environment.
Add the following to post build commands:
/home/ray/anaconda3/bin/python -m pip uninstall -y numpy
rm -rf /home/ray/anaconda3/lib/python3.<7 OR 8>/site-packages/numpy
/home/ray/anaconda3/bin/pip install numpy
/home/ray/anaconda3/bin/pip install --upgrade --no-cache-dir <SOME PIP PACKAGE>
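The four post build commands above differ only in the Python minor version and the name of the failing package. The following is a minimal sketch for generating them, assuming the /home/ray/anaconda3 layout shown in the build log. The function name numpy_fix_commands and the package name "some-package" are hypothetical and not part of Anyscale.

```python
from typing import List

# Assumption: the Anaconda prefix matches the path shown in the build log above.
ANACONDA_PREFIX = "/home/ray/anaconda3"


def numpy_fix_commands(package: str, python_minor: int) -> List[str]:
    """Return the post build commands for the numpy workaround.

    `package` is the pip package that failed to build; `python_minor` is the
    minor version of the cluster environment's Python (7 or 8).
    """
    if python_minor not in (7, 8):
        raise ValueError("the workaround above covers Python 3.7 and 3.8 only")
    python = f"{ANACONDA_PREFIX}/bin/python"
    pip = f"{ANACONDA_PREFIX}/bin/pip"
    site_packages = f"{ANACONDA_PREFIX}/lib/python3.{python_minor}/site-packages"
    return [
        f"{python} -m pip uninstall -y numpy",                 # remove the registered numpy
        f"rm -rf {site_packages}/numpy",                       # delete any leftover files
        f"{pip} install numpy",                                # reinstall a clean numpy
        f"{pip} install --upgrade --no-cache-dir {package}",   # then build the failing package
    ]


if __name__ == "__main__":
    # "some-package" is a placeholder for the package that failed to install.
    print("\n".join(numpy_fix_commands("some-package", 8)))
```

The printed lines can be pasted into the post build commands of the cluster environment.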
Tensorboard support requires port forwarding
Problem
Future versions of Anyscale will support TensorBoard out of the box. Until then, use the port forwarding workaround below.
Workaround
After you run training on your cluster, the log files are on the cluster at /home/ray/ray_results.
Use the Anyscale CLI to SSH into your cluster, including a port forwarding option for TensorBoard:
anyscale ssh -o -L6006:localhost:6006
Inside the resulting cluster, launch tensorboard.
Open a browser on your local machine to http://localhost:6006 and use the TensorBoard UI.
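If the page does not load, it can help to confirm that the forwarded port is listening on your local machine. The following is a minimal sketch using only the Python standard library; the helper name port_is_open is hypothetical. It only checks that something accepts connections on local port 6006 (normally the SSH tunnel), not that tensorboard itself is running on the cluster.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 6006, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError if the connection is refused or times out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Local port 6006 is listening; open http://localhost:6006 in a browser.")
    else:
        print("Nothing is listening on local port 6006; check that the SSH session is still open.")
```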
push -a does not update worker nodes
Problem
The anyscale push -a command is expected to copy code from the working directory to all nodes in the cluster. However, it only copies to the cluster’s head node.
Note that anyscale push is deprecated. Using ray.init("anyscale://") to interact with Ray ensures that your code is distributed to each node.
Resolution
This issue is mitigated in Ray 1.4 with runtime environments, which automatically upload and sync to all nodes in a cluster.
The deprecated -a option will not be available in future releases of Anyscale. Ray 1.4 supports file mounts in Ray Core, which is the supported method for keeping files synchronized between Anyscale head and worker nodes.
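As an illustration of the runtime environment approach, the following is a minimal sketch, not a definitive setup. It assumes a Ray version in which ray.init() accepts a runtime_env argument with a working_dir key, and that the anyscale:// address from this page resolves in your installation. The helper names build_runtime_env and connect are hypothetical.

```python
from typing import Any, Dict


def build_runtime_env(working_dir: str = ".") -> Dict[str, Any]:
    """Describe the local directory that Ray should upload to the cluster nodes."""
    return {"working_dir": working_dir}


def connect(address: str = "anyscale://") -> None:
    """Connect to the cluster and ship the working directory along with the job."""
    # Imported here so build_runtime_env() can be used without Ray installed.
    import ray

    # Assumption: this Ray version accepts `runtime_env` in ray.init().
    ray.init(address, runtime_env=build_runtime_env())
```

Calling connect() from the project directory would then take the place of anyscale push -a, because the working directory travels with the job instead of being copied to the head node only.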
Workarounds
To ensure your code gets to all of the nodes, use one of the following strategies:
Use anyscale up
This command will restart your cluster and ensure the code is distributed to all nodes.
Leverage anyscale connect to update clusters
anyscale connect ensures that all of your local code is shipped to the cluster.
Use push and copy method
This more intricate method is provided as a workaround for updating code in running clusters. It is not recommended.
Use anyscale push to send your local files to the head node.
Use anyscale ssh to connect to the cluster interactively.
Use scp to copy files among the hosts in the cluster.