The Statistician / Data Scientist III role is intended to support the SISPV leadership by preparing reports, proposing process improvements, and developing sophisticated analytic models, Triage capabilities, and dashboard metrics for CE/CV Investigators. This involves discovering patterns in past data to predict future events, processing text data, integrating disparate data sources, and creating advanced visualizations to aid in decision-making and prioritization.
Requirements
- Familiarity with data mining and the ability to develop, manipulate, and maintain databases.
- Demonstrated experience using software packages used for advanced statistical analysis of operational data and tools for data visualization.
- Demonstrated experience using COTS statistical software (SPSS, SAS, MATLAB, etc.) for advanced statistical analysis of operational data and for data visualization.
Responsibilities
- Prepare annual reports summarizing findings and lessons learned related to SEAD 3 reportable anomalies (reported and unreported), the percentage of filers who were compliant, final actions referred to CE/CV for investigation, and trends in customer feedback.
- Prepare monthly reports on the status of performance metrics for delivery to the SISPV leadership.
- Propose process improvements to the SISPV Branch Chief, drawing on lessons learned.
- Use commercial off-the-shelf (COTS) software in combination with all available CE/CV reportable information (polygraphs, eReport, walk-ins, referrals, Rap Back, DNI CE alerts, emails, and EAFR) to produce sophisticated analytic models, Triage capabilities, and dashboard metrics for CE/CV Investigators.
- Discover patterns in past data to predict the outcome of future events, using techniques including statistical modeling, classification and analysis, clustering, optimization and simulation, and customer segmentation.
- Text Mining: Extract and provide information stored in text documents and databases, using techniques including document classification, natural language processing, information extraction, and search.
- Data Infrastructure: Prepare, clean, and integrate disparate data sources, and build Extract, Transform, Load (ETL) and data pipelines optimized for advanced analytics.
Other
- Active Top Secret clearance.
- Clear and concise writing skills.
- Positive, engaging communication skills.
- Excellent organizational skills to achieve required timelines.
- Minimum of 7-10 years' experience.