
Hello, I am Adam Schweitzer, an outgoing and energetic IT professional with experience in incident management, production support, observability, and team leadership.
My current work focuses on keeping technology platforms reliable across thousands of stores, with a strong emphasis on Back of House software alongside POS, e-commerce, and AWS-based application environments. I am passionate about improving system reliability, strengthening observability, and helping teams respond effectively to production incidents.
I help keep production systems stable, observable, and ready for scale.
I am a Site Reliability Engineer focused on incident management, application reliability, observability, and operational readiness across large restaurant technology environments. My work supports software used by thousands of stores, primarily Back of House systems, along with POS, e-commerce, and other critical restaurant platforms.
My day-to-day work includes supporting high-severity incident response, creating Datadog monitors and dashboards, analyzing SQL data, improving runbooks, and partnering with engineering, product, SRE, and operations teams to improve service health.
I also support AWS-based application environments, including ECS, Lambda, RDS, S3, SQS, and CloudWatch. My focus is on detecting issues earlier, reducing repeat incidents, improving troubleshooting paths, and helping teams make better decisions with production data.
What I Focus On
Production Reliability
Improving system stability through monitoring, alerting, incident analysis, and proactive issue detection.
Incident Management
Supporting high-severity incident response, coordinating technical teams, documenting timelines, and driving clear corrective actions.
Observability
Creating Datadog monitors, dashboards, and service health views that help teams identify problems faster.
Operational Readiness
Building runbooks, improving support processes, reviewing incident patterns, and helping teams prepare for production support needs.
Data-Driven Troubleshooting
Using SQL, logs, metrics, and system behavior to investigate issues, confirm impact, and support root cause analysis.
Current Areas of Work
- Supporting Back of House software used across thousands of stores
- Monitoring POS, e-commerce, and related restaurant technology platforms
- Creating Datadog monitors and dashboards for production visibility
- Supporting AWS services, including ECS, Lambda, RDS, S3, SQS, and CloudWatch
- Writing SQL queries for troubleshooting, incident review, and operational reporting
- Partnering with engineering and operations teams during incidents
- Improving runbooks, alert quality, and response workflows
- Mentoring junior team members and strengthening incident response practices
My Approach
Strong reliability work comes from clear visibility, fast response, accurate data, and practical process improvement. My focus is not only on resolving incidents but also on understanding why they happened, reducing repeat issues, and helping teams operate with more confidence in production.
Resume
Click below to download a copy of my current resume.
