Berkay Çelik

Site Reliability Engineer - DevOps Engineer - Cloud Architect
Istanbul, TR.

About

Site Reliability Engineer with over 4 years of experience optimizing cloud infrastructure and implementing DevOps best practices. Track record of achieving 99.99% uptime for complex enterprise systems, reducing infrastructure costs by 30%, and cutting deployment time by 60%. Skilled in Azure, AWS, Kubernetes, Terraform, and MLOps, with a focus on scalable and resilient cloud solutions.

Work

AMADEUS

Senior DevOps Engineer

Istanbul, Istanbul, Türkiye

Summary

Managed SRE operations for a high-volume travel platform, leading cloud migration efforts and architecting MLOps solutions.

Highlights

Managed SRE operations for a travel platform processing over 1.5B annual transactions, achieving 99.99% uptime through Prometheus/Grafana monitoring and automated incident response.

Led a critical Azure cloud migration for 200+ microservices on OpenShift, reducing infrastructure costs by 30% and deployment time by 60% using Terraform IaC and CI/CD pipelines.

Implemented a comprehensive observability framework with 50+ custom Prometheus metrics and Grafana dashboards, resulting in a 40% decrease in Mean Time To Resolution (MTTR).

Architected an MLOps platform utilizing Kubeflow and MLflow for AI-powered recommendations, streamlining model release cycles from weeks to just 2 days.

Collaborated cross-functionally with 25+ engineers across 4 time zones to establish robust SRE best practices, achieving 95% error budget compliance.

Automated infrastructure provisioning using Azure ARM templates and Ansible, reducing manual tasks by 70%.

ICRON

Cloud Engineer

Istanbul, Istanbul, Türkiye

Summary

Orchestrated cloud deployments for a supply chain platform, automating infrastructure and modernizing CI/CD pipelines.

Highlights

Orchestrated complex cloud deployments for a supply chain platform serving 50+ enterprise clients, managing Azure infrastructure and reducing associated costs by 25%.

Automated infrastructure provisioning using Terraform and Ansible, cutting deployment time from 4 hours to 30 minutes.

Designed and implemented a robust RBAC framework with 20+ custom roles, reducing access provisioning time by 75% and ensuring SOC2 compliance.

Implemented a comprehensive monitoring stack with Prometheus/Grafana, improving system availability from 97% to 99.5% and reducing MTTR by 40%.

Modernized CI/CD pipelines using Azure DevOps, implementing blue-green deployments to enable zero-downtime releases.

IBM

Site Reliability Engineer

Istanbul, Istanbul, Türkiye

Summary

Optimized performance for enterprise applications and established SRE practices.

Highlights

Optimized performance for 20+ enterprise applications using Instana APM, reducing P1 incident resolution time from 2 hours to 45 minutes.

Deployed and managed Kubernetes clusters on Red Hat OpenShift with auto-scaling and self-healing capabilities, increasing system uptime to 99.5%.

Established and enforced SRE practices, including golden signals monitoring and error budgets, which led to a 40% reduction in Mean Time To Resolution (MTTR).

AYSTEK Smart Software

DevOps Engineer

Istanbul, Istanbul, Türkiye

Summary

Migrated applications to AWS, developed Terraform modules, and optimized serverless architecture.

Highlights

Migrated over 15 applications from on-premises to AWS (EC2, RDS, S3), cutting monthly costs by 30% and saving $8K per month.

Developed comprehensive Terraform modules for AWS infrastructure, automating the deployment of over 50 resources and accelerating development cycles.

Implemented robust CI/CD pipelines using Jenkins and AWS CodeDeploy, achieving a 95% deployment success rate for critical applications.

Optimized serverless architecture built on AWS Lambda and API Gateway, reducing application response times from 800 ms to 200 ms.

Ozyegin University

Teaching Assistant

Istanbul, Istanbul, Türkiye

Summary

Delivered lab sessions, developed automated grading systems, and mentored teaching assistants.

Highlights

Delivered over 20 lab sessions focused on optimization techniques to 150+ students, achieving a 92% student satisfaction rating.

Developed an automated grading system using Python and VBA, reducing grading time by 60%.

Mentored and trained 15+ teaching assistants through structured programs, enhancing their instructional capabilities and team performance.

Education

Ozyegin University
Istanbul, Istanbul, Türkiye

Master of Science

Artificial Intelligence

Courses

Focus: Machine Learning, Deep Learning, MLOps

Ozyegin University
Istanbul, Istanbul, Türkiye

Bachelor of Science

Computer Science

Grade: 3.2/4.0

Certificates

Microsoft Azure Administrator (AZ-104)

Issued By

Microsoft

Skills

Cloud Platforms

Azure (AZ-104 Certified), AWS (EC2, RDS, Lambda, S3, SageMaker), Google Cloud Platform.

Container/Orchestration

Kubernetes, Docker, OpenShift, Helm, ArgoCD, Kubeflow.

Infrastructure as Code

Terraform, Ansible, Azure ARM Templates, CloudFormation, Pulumi.

CI/CD Tools

Azure DevOps, Jenkins, GitHub Actions, GitLab CI, Tekton.

Monitoring/Observability

Prometheus, Grafana, ELK Stack, Datadog, New Relic, Instana APM.

Programming Languages

Python, Go, Bash, PowerShell, JavaScript, SQL.

MLOps/AI Tools

MLflow, Kubeflow, TensorFlow, PyTorch, A/B Testing.

Methodologies

Agile, Scrum, GitOps, SRE Practices, DevSecOps.

Projects

Cryptocurrency Price Prediction with Sentiment Analysis

Summary

Engineered deep learning models (Informer, Autoformer) to predict BTC/ETH prices using Twitter sentiment analysis.

Blockchain E-Commerce Platform

Summary

Built a decentralized e-commerce platform using React, Node.js, and Solidity smart contracts.