NVIDIA is looking to define the next era of computing by leveraging AI and GPU technology. The Cloud Gaming team needs a Data Engineer to develop and maintain Kubernetes-based GPU-accelerated data processing services to optimize query response times and drive platform engagement.
Requirements
- Experience building infrastructure for the extraction, transformation, and loading (ETL) of data from various sources using AWS, Azure, SQL, or other technologies, plus working knowledge of Databricks, Splunk, or Snowflake.
- Strong coding skills, including the ability to write readable, testable, maintainable, and extensible code (primarily Python).
- Strong Kubernetes experience, on-premises and/or with a cloud service provider (CSP), developing containerized microservices.
- Hands-on experience in performance tuning and troubleshooting of Spark applications.
- Familiarity with metrics collection, health monitoring, and observability tools.
- Strong experience in data cleaning, aggregation, transformation, and extraction.
- Experience with active ML production pipelines (MLflow, Kubeflow) is a plus.
Responsibilities
- Optimize distributed computing infrastructure by analyzing cost and right-sizing for latency and performance.
- Identify benchmarks and establish metrics for tracking on dashboards and alerting.
- Improve and sustain a robust, scalable, and reliable live data processing service.
- Develop reusable framework deployments for data ingestion, processing, and analysis.
- Build data systems and pipelines, ensuring that data sources, ingestion components, transformation functions, and destinations are well understood before implementation.
- Prepare data for prescriptive and predictive modeling by ensuring the data is complete, cleansed, and governed by the necessary rules.
- Train data engineers, data scientists, and production engineers on adopting data processing workflows.
Other
- Minimum of 5 years of work experience in a similar domain.
- Strong ability to drive continuous improvement of systems and processes.
- Prior experience with data processing at scale on NVIDIA GPUs.
- Proven ability to work in a fast-paced environment where strong organizational skills are essential.
- Expertise in optimizing Spark applications and microservices for enhanced performance and scalability.