Role Definition
The Architect will:
- Own critical and irreversible architectural decisions
- Lead hands-on recovery of production failures
- Ensure long-term survivability, scalability, and resilience of infrastructure
⚠️ This is a strictly on-site role, requiring continuous physical presence in the data center environment, including direct access to hardware, consoles, and secure facilities.
Experience (Mandatory)
- 15–20 years of enterprise IT infrastructure experience
- Proven ownership of large-scale data centers (thousands of servers/workloads)
- Demonstrated accountability for:
  - High-risk architectural decisions
  - Major production outages
  - Root cause analysis and recovery execution
- Hands-on involvement in critical incident management and recovery
- Mandatory 100% on-site presence for operational and recovery activities
Hardware & Data Center Management (Expert Level – Mandatory)
Enterprise Hardware Platforms
- Design, deployment, and lifecycle management of:
  - Rack, blade, and hyper-converged systems
- Deep expertise in:
  - BIOS, firmware, drivers, compatibility matrices
  - BMC / iDRAC / iLO and out-of-band management
- Execution of:
  - Large-scale hardware refresh and expansion programs
  - Decommissioning and lifecycle governance
- Advanced troubleshooting of:
  - Disk, memory, CPU, NIC, HBA failures
Data Center Operations & Governance
- Hands-on management of:
  - Racks, power, cooling, cabling, and space planning
- Coordination with:
  - Facilities teams
  - Vendors and field engineers
- Ownership of:
  - Hardware standards and build templates
  - Asset lifecycle, warranties, and support contracts
Hands-On Technical Expertise (Mandatory)
Containers, Kubernetes & OpenShift
- Deep operational expertise in Kubernetes/OpenShift environments
- Hands-on recovery of:
  - Control plane failures under live production traffic
- Strong understanding of:
  - Scheduler behavior
  - etcd failure scenarios
  - API server performance bottlenecks
- Design for:
  - High availability (node, rack, DC failures as baseline scenarios)
Virtualization & SDDC
- Expert-level design and recovery of virtualization platforms
- Strong knowledge of:
  - Micro-segmentation
  - Distributed firewalls
  - Performance tuning
- Ownership of:
  - Emergency upgrades
  - Rollbacks
  - Lifecycle decisions
Networking & Hybrid Cloud
- Carrier-grade networking design and troubleshooting
- Hybrid cloud architecture with resilience against:
  - Connectivity failures
  - Identity/control-plane issues
- Ability to operate independently of vendors/system integrators
Certifications (Highly Preferred / Mandatory)
- Red Hat Certified Architect (RHCA) – OpenShift & Infrastructure tracks
- Certified Kubernetes Administrator (CKA)
- Certified Kubernetes Security Specialist (CKS)
- VMware VCIX (DCV + NV) or equivalent
- Cloud Architect Certification:
  - AWS Certified Solutions Architect – Professional or Microsoft Certified: Azure Solutions Architect Expert
- Expert-level networking certification (CCIE or equivalent)
Required Technical Skills
Virtualization & HCI
- Nutanix HCI & AHV (mandatory)
- VMware vSphere, NSX, vSAN
- Hypervisor performance tuning
- Data center infrastructure architecture
Cloud & Automation
- Private cloud architecture
- Infrastructure as Code (Terraform, Ansible)
- CI/CD integration
- API-driven infrastructure
Storage & Networking
- Software-defined storage:
  - Ceph, Portworx, Nutanix Storage
- Software-defined networking
- Load balancers & ingress controllers
Experience Requirements
- 10+ years in infrastructure/virtualization architecture
- 5+ years in Kubernetes/container platforms
- Strong expertise in:
  - Nutanix HCI
  - VMware enterprise environments
- Experience in:
  - Large enterprise / telecom / banking environments
- Proven track record in:
  - Data center transformation
  - Cloud migration programs