  • Posted 17 hours ago

Job Description

The Kubernetes Architect (OpenShift) is responsible for designing, building, and operating secure, scalable, and highly available Red Hat OpenShift Container Platform (OCP) environments across hybrid and multi-cloud infrastructures. The role bridges business requirements and technical execution by leading platform architecture, enabling application modernization, establishing governance and automation standards, and acting as a trusted technical advisor to engineering teams and stakeholders.

Core Responsibilities

OpenShift Architecture and Components

  • Control Plane: Manages cluster state, scheduling, and orchestration. Consists of API server, etcd, controller manager, and scheduler.
  • Worker Nodes: Run application workloads, scaled via MachineSets and configured via machine config pools for lifecycle management.
  • Operators: Automate the deployment, scaling, and management of complex applications and platform services.
  • Integrated Registry: Stores and manages container images, supporting lifecycle policies and image pruning.
  • Authentication and Authorization: Implements RBAC, OAuth, and integration with external identity providers (LDAP, Active Directory).
  • Networking: OVN-Kubernetes as the default CNI, supporting overlay networking, service mesh, ingress/egress, and network policies.
  • Persistent Storage: OpenShift Data Foundation (ODF) for block, file, and object storage, supporting dynamic provisioning and multi-cloud integration.

Kubernetes Fundamentals and Advanced Concepts

  • Pod and Service Management: Understanding pod lifecycle, service discovery, and load balancing.
  • Custom Resource Definitions (CRDs): Extending Kubernetes APIs for custom automation and platform services.
  • Controllers and Reconciliation Loops: Ensuring desired state through declarative configuration and automated remediation.
  • Admission Controllers and Webhooks: Enforcing policies and validating resource creation.
  • Resource Governance: Implementing quotas, limit ranges, and priority classes for workload management.

Platform Automation, Operators, and Custom Resources

  • Operator SDK: Building, testing, and deploying custom Operators using Go, Ansible, or Helm.
  • Operator Lifecycle Manager (OLM): Managing Operator installation, upgrades, and dependencies.
  • Custom Resource Definitions (CRDs): Defining APIs for custom automation and platform services.
  • Automated Health Checks and Metrics: Integrating Prometheus and Grafana for observability.
  • Declarative Infrastructure: Using YAML manifests and GitOps workflows for repeatable deployments.

Networking, Service Mesh, and Ingress/Egress Design

  • OVN-Kubernetes: Overlay networking, distributed routing, and support for hybrid clusters (Linux/Windows).
  • Service Mesh (Istio): Microservices communication, traffic management, and observability.
  • Ingress/Egress Controllers: Managing external access, TLS termination, and routing policies.
  • Network Policies: Implementing fine-grained access controls for pods and services.
  • IPsec Encryption: Securing intra-cluster communication.

Storage, Persistent Volumes, and Data Services

OpenShift Data Foundation (ODF)

  • Block, File, and Object Storage: Supporting databases, logging, monitoring, and application data.
  • Dynamic Provisioning: Using CSI drivers for automated volume management.
  • Multi-cloud Object Gateway: Abstracting storage across AWS S3, Azure Blob, GCP, and on-premises resources.
  • Backup and Disaster Recovery: Implementing Velero and multi-region strategies for data protection.

Monitoring, Observability, Logging, and SRE Practices

  • Prometheus and Grafana: Metrics collection, dashboarding, and alerting.
  • Thanos Querier: Aggregating and deduplicating metrics from multiple Prometheus instances for centralized querying.
  • Logging Stack: Fluentd, Loki, Elasticsearch, and Kibana for log aggregation and analysis.
  • Service Level Objectives (SLOs): Defining and tracking reliability metrics.
  • Incident Response and Forensics: Integrating audit logs and monitoring tools for rapid issue resolution.

High Availability, Scalability, and Disaster Recovery

  • Multi-AZ Deployments: Distributing control plane and worker nodes across availability zones.
  • Cluster Autoscaling: Dynamic scaling of compute resources based on workload demand.
  • Pod Disruption Budgets: Ensuring application availability during maintenance.
  • Disaster Recovery: Backing up etcd, restoring clusters, and implementing failover mechanisms.

Identity, Access Management, and Governance

  • RBAC and OAuth: Managing user and service account permissions.
  • Integration with LDAP/Active Directory: Centralized identity management and group synchronization.
  • Security Context Constraints (SCCs): Enforcing pod-level security policies.
  • Audit Logging: Tracking access and changes for compliance.
  • Policy-as-Code: Using OPA/Gatekeeper for automated policy enforcement.

Compliance, Auditing, and Regulatory Readiness

  • Compliance Operator: Automated scanning and remediation for CIS, NIST, PCI-DSS, HIPAA, and other benchmarks.
  • Tailored Profiles: Customizing compliance checks for client-specific requirements.
  • Audit Trails: Persistent storage of scan results and remediation actions.
  • Manual and Automated Remediation: Applying fixes via MachineConfig and KubeletConfig.

Container Runtime, Image Management, and Registries

  • CRI-O and Docker: Managing container runtimes.
  • Internal and External Registries: OpenShift integrated registry, Quay, Docker Hub, Artifactory.
  • Image Pruning and Lifecycle Policies: Automating cleanup of unused images to optimize storage.
  • Vulnerability Scanning: Integrating the Quay Container Security Operator and Trivy for image scanning.

Application Modernization and Migration Strategies

  • Migration Toolkit for Applications (MTA): Assessing container suitability, analyzing source code, and automating migration paths.
  • Bulk Assessment and Automated Refactoring: Reducing manual effort and technical debt.
  • CI/CD Integration: Generating deployment artifacts for automated pipelines.
  • Modernization Planning: Prioritizing applications based on business impact and migration effort.

Enterprise Integration: Middleware, Messaging, Databases

  • AMQ Streams (Kafka): Event-driven architectures, message brokering, and stream processing.
  • Operators for Databases and Middleware: Automating deployment and management of PostgreSQL, MongoDB, JBoss, and other services.
  • Kafka Connect and MirrorMaker: Integrating with external systems and multi-cluster replication.
  • Service Mesh Integration: Managing microservices communication and observability.

Business and Consulting Skills

Stakeholder Communication and Solution Design

  • Stakeholder Engagement: Translating business requirements into technical solutions, managing expectations, and facilitating decision-making.
  • Solution Design: Architecting resilient, scalable, and secure platforms tailored to client needs.
  • Cost Optimization and Cloud Economics: Advising clients on pricing models, reserved instances, and resource utilization to minimize costs.
  • Service Level Agreements (SLAs): Defining and managing SLAs, support models, and shared responsibility matrices.
  • Compliance Readiness and Audit Support: Guiding clients through regulatory compliance and audit processes.

Application Modernization and Migration Consulting

  • Modernization Planning: Assessing application portfolios, prioritizing migration efforts, and defining strategies.
  • Migration Toolkit for Applications (MTA): Automating assessment, refactoring, and deployment of legacy applications.
  • Integration with Enterprise Systems: Designing patterns for middleware, messaging, and databases.

Cost Optimization Strategies

  • Hardware Overcommit: Maximizing resource utilization for virtualized workloads.
  • Reserved Instances and Savings Plans: Leveraging cloud provider programs for predictable costs.
  • Unified Billing and Financial Planning: Aligning platform consumption with organizational budgets.

Required Skills & Experience

  • Strong hands-on experience architecting and operating Red Hat OpenShift in production.
  • Deep knowledge of Kubernetes architecture, networking, storage, and security.
  • Experience with GitOps, CI/CD pipelines, and infrastructure automation.
  • Proven ability to design for high availability, scalability, and disaster recovery.
  • Strong communication skills with experience working across engineering and business teams.

Certifications, Training, and Career Development Paths

Red Hat OpenShift Certification Tracks

  • Red Hat Certified OpenShift Administrator (EX280)
  • Red Hat Certified Architect (RHCA)
  • Specialist Certifications: OpenShift Automation and Integration (EX380), Advanced Cluster Security (EX430), Data Foundation (EX370), Virtualization (EX316).
  • Kubernetes certifications (CKA, CKAD, CKS)
  • Cloud Services Specializations: ROSA (AWS), ARO (Azure), hybrid cloud deployments.

More Info

Job ID: 139221253