Data Center Operations Specialist
Role Summary
The Data Center Operations Specialist maintains operational standards to maximize system reliability and performance at the designated data centre, and is responsible for keeping data centre operations smooth, reliable, and efficient. The specialist supports, coordinates, and monitors all CelcomDigi personnel and vendor activities to ensure compliance with established procedures. Areas of responsibility include break-and-fix management of DC infrastructure, coordinating vendor requests, reviewing quotations, managing procurement processes, and ensuring proper documentation for effective equipment repairs and maintenance. The position also involves developing and implementing comprehensive data centre facility plans covering layout, power distribution, cooling systems, firefighting, UPS, rectifier, battery backup, and security measures. The role requires close collaboration with various teams and stakeholders to plan and execute data centre expansions, relocations, and installations while ensuring the scalability and efficiency of the data centre infrastructure.
Responsibilities
- End-to-End Infrastructure Governance: Govern the overall health and performance of all 35 TOC/DC CelcomDigi sites (xCelcom & xDigi), covering the full power chain (TNB supply, transformers, UPS, PDU, generators), cooling, and fire protection systems, leveraging reporting (RI) and BMS insights.
- Change Management Governance: Review, scrutinize, and approve Technical MOPs and processes for all planned activities in data centres, ensuring that risks are mitigated and that project, planning, and IFM teams comply with operational standards.
- Incident and Crisis Management Oversight: Govern incident management to ensure it meets agreed SLAs for response and resolution; lead coordination, war-room engagement, and stakeholder communication; and ensure timely RCA and service restoration.
- Corrective Maintenance Governance: Oversee end-to-end corrective maintenance by: 1. Validating SOW to ensure accurate root cause resolution without over-scoping. 2. Governing cost through BOQ review against market benchmarks and contract rate cards. 3. Managing CUW/PR/PO processes with budget control. 4. Ensuring IFM/vendor delivery, quality assurance, and formal sign-off (POR).
- Preventive Maintenance and SLA Compliance: Ensure all preventive maintenance activities are carried out in strict adherence to SLA scope and frequency across electrical, cooling, fire protection, and standby power systems (UPS, DC system, and generator set).
- Vendor and IFM Partner Performance Governance: Monitor and enforce IFM/vendor performance against SLAs, KPIs, and contractual obligations, ensuring timely closure of issues, service quality, and continuous improvement.
- Regulatory and Audit Compliance: Ensure all infrastructure, maintenance activities, and certifications comply with regulatory requirements (ST, DOSH, Bomba, DOE, etc.) and maintain audit readiness at all times.
- Modernization and Resilience Enhancement: Drive and support infrastructure modernization, refresh, and upgrade initiatives to eliminate risks, improve efficiency, and enhance overall data centre resilience.
- Automation and Future-Ready Operations: Lead initiatives to automate infrastructure monitoring, reporting, and inventory management, aligning TOC/DC operations with next-generation technologies and best practices.
- Cost Optimization and Operational Efficiency Governance: Govern overall operational spending across TOC/DC by enforcing cost controls, validating expenditures, and ensuring alignment with approved budgets. Continuously identify and drive cost optimization initiatives (e.g., maintenance efficiency, energy optimization, vendor rationalization, automation) to improve operational efficiency without compromising reliability and SLA performance.
Requirements
- Bachelor’s Degree in Engineering (Electrical/Mechanical) or equivalent.
- At least 5 years of working experience in Engineering (Electronic/Electrical), Engineering (Mechanical), or equivalent.
- Certification in Data Centre Management (e.g., CDCP, CDCS, or Uptime Institute’s ATS/ATD) and in the ITIL framework for service management will be an added advantage.
- Experience in data centre maintenance and familiarity with infrastructure elements such as HV/LV Power Chains, UPS & Rectifier, Battery Systems, Diesel Generators, Chiller & Precision Cooling (CRAC/CRAH/CRV), Fire Suppression, and BMS/DCIM Monitoring Platforms.