About Anyscale:
Anyscale provides a development platform intended to simplify distributed computing. This enables software developers of all skill levels to build applications that run at any scale from a laptop to the data center.
We're commercializing Ray, a popular open source framework for distributed computing with an ecosystem of libraries for scalable machine learning.
Anyscale is based in San Francisco, CA.
Available Roles:
* Cloud Setup: Experience with AWS/GCP/Azure resources, permissions, and access controls; IAM, roles, certificates, subnets, load balancers, Terraform.
* Networking Expert: Experience building stable networking infra, Cloud DNS, gRPC, NGINX/Proxies, HTTPS, TLS, routing, Kubernetes, Load Balancers, Go.
* Staff/Senior Staff Infra engineer to be the Tech Lead of the overall infrastructure team:
Personality: Great people skills. Grows the people around them; humble, easygoing, and positive; leads by example; supports and amplifies the team. Excellent communicator, good listener, and quick learner who holds a very high quality bar for the team. Leads from the front: knows when to say no or push back, and how to push things forward. Makes things happen and leads cross-team efforts. Very strong execution. Very hands-on; comfortable working at a low level and writing code daily.
Experience: Tech lead of an infra team; expert in cluster management, cluster orchestration, Kubernetes (K8s), observability/metrics, building 0->1, autoscaling, public cloud providers (GCP/AWS/GKE/EKS), and networking; very good design and architecture skills. Ideally has previous B2B experience.
* Infra Engineer: Infra generalist. Experience with Kubernetes, gRPC, Go, cluster management, autoscaling, observability/metrics, Docker container management, VMs.
About the role:
The Infrastructure team builds the foundational blocks of Anyscale's serverless infrastructure end-to-end. This includes cluster orchestration, autoscaling, logging, metrics, billing, and a multi-cloud, multi-region architecture that provides a reliable and scalable managed Ray experience for Anyscale customers.
We are also responsible for providing our customers with a serverless experience where they do not have to worry about managing resources, connecting to the clusters, starting/managing clusters, or even having the right environment for their workloads. We give the user an infinite laptop experience, as if their laptop can scale seamlessly to a cluster without the user noticing. They just run the application on their laptop and our infrastructure manages everything under the hood. The Infrastructure team’s work spans both open source Ray and the proprietary Anyscale products.
A snapshot of previous projects:
- Optimizing the startup time of clusters and autoscaling
- Tracking the usage of cluster resources
- Provisioning and managing the life cycle of Ray clusters managed by Anyscale
- Building a client for Ray to enable the infinite laptop experience
- Building billing infrastructure, cluster monitoring, and cost tracking
- Building AWS- and GCP-based cloud providers to manage the lifecycle of nodes on each cloud
- Improving Spot instance support