Prime Time Consulting is seeking a Data Scientist to support an NLP project focused on accurate, automatic tokenization of language data from spoken or written sources. The goal is to develop automated solutions for annotating language data with part-of-speech information and to enhance existing models by evaluating their performance against human-generated annotations.
Requirements
- Foundations (mathematical, computational, statistical)
- Data processing (data management and curation, data description and visualization, workflow and reproducibility)
- Modeling, inference, and prediction (data modeling and assessment, domain-specific considerations)
- Programming (skill in at least one high-level language, e.g. Python)
- Statistical analysis (e.g. variability, sampling error, inference, hypothesis testing, EDA, application of linear models)
- Data management (e.g. data cleaning and transformation)
- Data mining, data modeling and assessment, artificial intelligence, and/or software engineering
Responsibilities
- Develop automated solutions for annotating language data with part-of-speech information.
- Enhance existing models by evaluating their performance against human-generated annotations for both speech and text.
- Devise strategies for extracting meaning and value from large datasets.
- Make and communicate principled conclusions from data using elements of mathematics, statistics, computer science, and application-specific knowledge.
- Through analytic modeling, statistical analysis, programming, and/or another appropriate scientific method, develop and implement qualitative and quantitative methods for characterizing, exploring, and assessing large datasets in various states of organization, cleanliness, and structure that account for the unique features and limitations inherent in Government data holdings.
- Translate practical mission needs and analytic questions related to large datasets into technical requirements and, conversely, assist others with drawing appropriate conclusions from the analysis of such data.
- Make informed recommendations regarding competing technical solutions by maintaining awareness of the constantly shifting Government collection, processing, storage and analytic capabilities and limitations.
Other
- Bachelor’s Degree with 10 years of relevant experience
- Associate’s Degree with 12 years of relevant experience
- Bachelor’s Degree must be in Mathematics, Applied Mathematics, Statistics, Applied Statistics, Machine Learning, Data Science, Operations Research, or Computer Science. A degree in a related field (Computer Information Systems, Engineering), in the physical/hard sciences (e.g. physics, chemistry, biology, astronomy), or in another science discipline with a substantial computational component (i.e. behavioral, social, or life) may be considered if it included a concentration of coursework (5 or more courses) in advanced mathematics (typically 300 level or higher, such as linear algebra, probability and statistics, machine learning) and/or computer science (e.g. algorithms, programming, data structures, data mining, artificial intelligence).
- Effectively communicate complex technical information to non-technical audiences.
- Position requires active Security Clearance with appropriate Polygraph