Intelligent Conversation and Communications Cloud (IC3) powers billions of real-time customer conversations across Microsoft's first-party (Teams, Skype, Azure Communication Services) and second-party (Dynamics) solutions. IC3 enables reliable, high-quality audio/video calling, meeting, and messaging services that work every time, from anywhere, seamlessly across all customer touchpoints. IC3’s mission is to make conversations on M365 platforms more intelligent in real time, empowering best-in-class productivity tools for the modern workplace, where every call, meeting, or chat makes the next one better. This role is about advancing engineering excellence, leveraging AI to solve reliability challenges, and making a direct impact on millions of customers, transforming Microsoft’s quality engineering landscape.
Requirements
- Experience coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python.
- 4+ years of experience with cloud services.
- Experience building or operating observability platforms (monitoring, logging, tracing) and applying AI/ML to anomaly detection or root cause analysis.
- Experience designing and implementing automated solutions that reduce manual effort.
- Familiarity with cloud platforms (Azure/AWS/GCP) and microservices architectures.
- Knowledge of AI/ML concepts and practical experience integrating AI-driven features into engineering workflows is highly desired.
- Understanding of distributed systems and microservices architecture.
Responsibilities
- Design and develop large-scale distributed services using modern engineering practices.
- Architect systems with well-defined interfaces and leverage telemetry data for decision-making.
- Ensure services are modular, secure, reliable, diagnosable, monitored, and reusable.
- Improve test coverage, implement integration tests, and resolve problem areas.
- Build reusable engineering tools that boost service health, reduce operational overhead, and empower teams with actionable insights.
- Enhance observability across business-critical services to accelerate detection and diagnosis of issues.
- Strengthen on-call effectiveness by modernizing incident response workflows and leveraging intelligent systems.
Other
- 3 days per week in the office.
- Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role.
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
- Experience working in large-scale enterprise environments.
- Eagerness to learn about building reliable and performant systems.