Datacentre Operations Engineer

Radiant Nuclear · Dallas-Fort Worth

Ashby Posted Jun 5, 2026 First seen Jun 5, 2026

About Us

We’re a fast-growing GPU-as-a-Service provider, delivering scalable, high-performance compute infrastructure purpose-built for AI and HPC workloads. Operating across global data centres, we run mission-critical environments where uptime, throughput, and ultra-low latency are non-negotiable.

Role Overview

We’re looking for a qualified, experienced Datacentre/Hardware Engineer to run our multi-million-dollar HPC infrastructure based in Dallas-Fort Worth, US. You’ll be well versed in managing and optimising datacentres, dealing promptly with hardware failures, optimising environmental performance, and deploying new hardware and services in a 24x7x365 operation. You’ll be hands-on with high-performance HPC compute and will operate with the utmost diligence, professionalism and focus to ensure the equipment underpinning our services operates at peak performance.

Key Responsibilities

  • Troubleshooting and Support: Quickly diagnose and resolve hardware and network issues to maximise uptime.

  • Respond to critical hardware alerts via our monitoring and observability platform, and contribute to ongoing service improvement to strengthen our monitoring capability.

  • RMA and Support: Manage vendor relationships, handling RMAs and support requests within Ori’s Service Level Objectives (SLOs) to meet customer contract SLAs.

  • Data Center Management: Guide data center acquisition, setup, and ongoing maintenance, fostering compliance and leveraging strong vendor partnerships.

  • Fully own acquisition of hardware assets from the point of purchase and delivery, through lifecycle management and disposal - all while owning asset management within ORI’s CMDB system.

  • Hardware Installation and Maintenance: Deploy and maintain HPC and AI hardware for uninterrupted operations, including performing low-level system maintenance such as hardware troubleshooting, firmware updates, and replacement of components as needed.

  • Datacenter Environment Technologies: Oversee cooling, power distribution, and other critical data center technologies to maintain high operational standards.

  • Capacity Planning and Resource Allocation: Support strategic planning to align infrastructure capabilities with current and projected demands.

  • Develop and maintain datacentre/hardware management SOPs, ensuring continual alignment with ORI’s governance and compliance requirements.

  • Apply ITSM frameworks: Incident, Major Incident, Change Management, and service improvement.

  • Operate and support services 24x7x365 for production environments, including participation in the on-call rotation.

  • Contribute to incident postmortems and root cause analysis, document learnings, and automate remediations.

  • Mentor junior engineers and act as an operational requirements consultant to other departments.

  • Communicate technical decisions clearly to non-technical stakeholders and customers

  • Uphold a culture of: do, document, automate

  • Willing to cross-train and upskill in Infrastructure/Platform SRE practices.

  • Willing to travel across North America to support future datacentre onboarding and deployments.

Essential Skills & Experience

  • Degree in Computer Science, or 10 years of industry experience.

  • 3+ years of experience in data center operations, HPC, or related roles.

  • Proven track record working with NVIDIA GPU-based HPC systems or equivalent, high-performance storage, and networking.

  • Expertise in hardware installation, network configuration, and low-level system maintenance, including hardware troubleshooting and firmware management.

  • Knowledge of data center environment technologies, including cooling and power distribution.

  • Experience in data center design, greenfield deployments, and operations.

  • Strong understanding of hardware and spares management, with the ability to handle RMAs and support cases within defined SLOs to meet SLA requirements.

  • Solid understanding of HPC and AI workloads.

  • Strong problem-solving abilities and the resilience to thrive in a fast-paced environment.

  • Excellent communication skills and ability to collaborate with cross-functional, internationally dispersed teams.

  • Strong grasp of ITSM and service operation best practices

  • Excellent communication and mentorship skills

  • Comfortable interfacing with internal stakeholders and external customers

  • Bonus: Specific vendor-endorsed qualifications from Supermicro or Dell for HGX-based systems.

Preferred Qualifications

  • Knowledge of large-scale private cloud deployments and capacity planning.

  • Qualifications in HVAC management and deployments

  • Certifications in relevant areas such as hardware and networking.

  • ITIL Foundation level qualification or equivalent experience