Key Responsibilities:
Facility Operations & Maintenance
– Oversee day-to-day operations of all critical facility systems
– Ensure continuous operation of electrical, mechanical, and environmental systems
– Implement preventive and predictive maintenance programs
– Maintain accurate documentation of infrastructure and maintenance activities
Power Infrastructure Management
– Manage electrical systems including UPS, generators, switchgear, and PDUs
– Ensure redundancy and reliability of power supply
– Coordinate power system testing (e.g., load bank and generator testing)
– Monitor power usage and optimize energy distribution
Cooling & Environmental Control
– Oversee cooling systems (CRAC/CRAH/HVAC units)
– Maintain optimal temperature and humidity levels
– Implement energy-efficient cooling strategies
– Monitor environmental conditions and respond to deviations
Safety & Compliance
– Ensure compliance with health, safety, and environmental regulations
– Manage fire detection and suppression systems
– Conduct regular safety drills and risk assessments
– Ensure adherence to applicable certification standards
Physical Security
– Oversee physical access control systems
– Manage surveillance systems and security protocols
– Ensure secure access to critical infrastructure areas
– Coordinate with security teams and external providers
Vendor & Contractor Management
– Manage third-party service providers for maintenance and repairs
– Define and enforce SLAs and performance standards
– Coordinate on-site activities and ensure compliance with safety procedures
– Oversee procurement of facility equipment and services
Capacity & Space Management
– Plan and manage space allocation within the data center
– Support rack layout, power density planning, and future expansion
– Ensure efficient utilization of floor space and infrastructure
Energy Efficiency & Sustainability
– Monitor and improve energy efficiency metrics (e.g., PUE)
– Implement sustainability initiatives and reduce environmental impact
– Optimize power and cooling usage to reduce costs
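For reference, PUE (Power Usage Effectiveness) is the ratio of total facility energy to the energy consumed by IT equipment, so a value closer to 1.0 means less overhead from cooling and power distribution. The short sketch below shows the calculation; the function name and the meter readings are hypothetical and only illustrate the arithmetic, not any particular monitoring tool.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Return Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical monthly meter readings, for illustration only.
monthly_pue = pue(total_facility_kwh=1_500_000, it_equipment_kwh=1_000_000)
print(f"PUE = {monthly_pue:.2f}")  # prints "PUE = 1.50"
```

In this example, every 1 kWh delivered to IT equipment costs an additional 0.5 kWh in cooling, power conversion, and other facility overhead.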
Incident & Emergency Management
– Act as primary lead for facility-related incidents (power, cooling, fire, etc.)
– Develop and maintain emergency response procedures
– Conduct root cause analysis and implement corrective actions
– Ensure rapid recovery from facility disruptions
Required Skills & Qualifications:
– Hands-on experience with data center power and cooling systems
Preferred Certifications:
Experience: