@rueian
Contributor

@rueian rueian commented Dec 7, 2025

Description

For reasons not yet understood, the availability_zone restriction causes the aws_cluster_launcher_full tests to fail with SSH timeouts when the cluster launcher tries to set up, over SSH, the cluster it has just launched.

After commenting out availability_zone, the tests pass: https://buildkite.com/ray-project/release/builds/71069
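
A rough sketch of what the change looks like, assuming the launcher YAML follows the usual Ray AWS cluster config layout with a `provider` section. The region and zone values below are illustrative placeholders and are not taken from this PR:

```yaml
# Sketch only: the region and zone values are illustrative assumptions.
provider:
    type: aws
    region: us-west-2
    # Commented out so the launcher is no longer pinned to specific zones.
    # availability_zone: us-west-2a,us-west-2b
```

With the key commented out, zone selection within the region is left to the launcher and AWS instead of being restricted to the listed zones.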

Related issues

Fixes anyscale#552

@rueian rueian force-pushed the update-aws-cluster-launcher-example branch 3 times, most recently from 6313d0c to bd01c7d Compare December 9, 2025 01:03
@rueian rueian changed the title [core][ci] replace the old ray-ml:latest-gpu image with ray:latest-gpu in the aws cluster launcher yaml [core][ci] comment out the old image id which causing an ssh timeout issue in the aws cluster launcher yaml Dec 9, 2025
…out issue in the aws cluster launcher yaml

Signed-off-by: Rueian Huang <rueiancsie@gmail.com>
@rueian rueian force-pushed the update-aws-cluster-launcher-example branch from bd01c7d to 24f8f11 Compare December 9, 2025 02:06
@rueian rueian changed the title [core][ci] comment out the old image id which causing an ssh timeout issue in the aws cluster launcher yaml [core][ci] comment out the availability_zone which causes an ssh timeout issue in the aws cluster launcher yaml Dec 9, 2025
@rueian rueian added the go label ("add ONLY when ready to merge, run all tests") Dec 9, 2025

Development

Successfully merging this pull request may close these issues.

Release test aws_cluster_launcher_full.v1 failed