The Amazon Artificial General Intelligence (AGI) Data Services organization is looking for a Language Engineer to help advance the state of the art in natural language processing and machine learning. The role calls for experience in dataset construction, linguistic annotation, dialog/semantic schemas, and automatic processing of large datasets.
Requirements
- Experience with language annotation and other forms of data markup
- Experience with scripting languages, such as Python
- Experience working with speech and text language data in multiple languages
- Practical familiarity with machine learning and language modeling
- Practical knowledge of version control and agile development
- Familiarity with database queries and data analysis tools (SQL, R, MATLAB, etc.)
Responsibilities
- Design data collection/creation tasks in response to science needs: author instructions, define and implement quality targets and mechanisms, provide day-to-day coordination of data collection efforts (including planning, scheduling, and reporting), and be responsible for the final deliverables
- Analyze and extract language-related insights from large amounts of data
- Build tools or tool prototypes for data analysis or data authoring, using Python or another scripting language
- Use modeling tools to bootstrap or test new functionalities
- Collaborate with scientists and software engineers to evaluate performance of language models
- Handle competing requests from a range of data customers
Other
- Master’s or higher degree in a relevant field (computational linguistics or equivalent field with computational analysis)
- 2+ years of experience in computational linguistics or language data processing
- Excellent communication skills, strong organizational skills, and close attention to detail
- Comfortable working in a fast-paced, highly collaborative, dynamic work environment
- Work safely and cooperatively with other employees, supervisors, and staff