Equifax is looking to leverage data science to develop and deploy innovative solutions, enhance productivity, and solve real market problems through advanced analytics, predictive modeling, and AI/Machine Learning.
Requirements
- Theoretical and practical understanding of algorithm time and space complexity, and a proven ability to apply this knowledge to develop efficient and scalable data science solutions
- 5+ years of experience with Python, TensorFlow, SQL (strong skills and scripting experience), and Spark, with advanced experience in data manipulation libraries (e.g., Pandas, Dask, Spark DataFrames)
- Experience with model performance evaluation and predictive model optimization for accuracy and efficiency
- Experience with large-scale data processing in distributed environments
- Experience working on big data platforms (e.g., Google Cloud, AWS, Snowflake, Hadoop) is a plus
- Extensive experience with NLP (Natural Language Processing), LLMs (Large Language Models) and/or Generative AI
- SQL Mastery & Optimization: Design, write, and optimize highly complex SQL queries for data extraction, transformation, and analysis, often dealing with massive datasets.
Responsibilities
- Apply subject matter expertise in data structures, analytics, algorithms/models, and strong computer science fundamentals to lead data preparation, analytics, and development of deployable solutions across multiple projects
- Collect, analyze and interpret large data assets to define and build multiple innovative solution components leveraging business and technical expertise. Lead the analytical strategy on critical technical capabilities
- Contribute to the evaluation of new data sources, provide recommendations on their value, and design code to improve Equifax's productivity, enhancing and updating code where needed.
- Serve as lead technical data scientist for multiple technical and business domains, collaborating with other teams to develop predictive models, risk assessments, fraud detection, recommendation engines, and similar solutions, while asking questions and encouraging enhanced solutions
- Analyze and prepare complex and new data sources and incorporate them into analytical solutions.
- Research innovative data solutions (in distributed cloud computing and in constrained and unconstrained optimization) to solve real market problems
- Design, develop, and implement advanced NLP and LLM solutions, including text classification, summarization, and NER (Named Entity Recognition), powered by state-of-the-art embedding models like Gemini and BERT.
Other
- Be an integral part of the Data Science Lab team that works closely with internal clients in all phases of prototype development and deployment
- Communicate results and the strategic impact of the work to senior management and external stakeholders
- Evaluate the technical work of experienced data scientists, guiding them on deliverable quality and accuracy
- Serve as SME consultant for COE / Business Unit / Regions and share best practices globally
- Strong skills in communicating analytical results to technical and non-technical audiences alike