Nearsure is seeking a skilled Data Scientist to work on projects involving data gathering, cleaning, transformation, and model building. The role leverages the Google stack to extract insights, build predictive models, and optimize data pipelines.
Requirements
- 5+ years of SQL experience with large datasets, including hands-on ELT and query optimization in Google BigQuery (partitioning, clustering, UDFs, and relational data modeling best practices).
- 5+ years of hands-on experience with Python for data analysis and model training, with a focus on using Google Colab.
- 3+ years of experience with data manipulation and analysis libraries such as Pandas and NumPy.
- 2+ years of experience building predictive models (classification, regression, time series) using Python ML libraries.
- 2+ years of experience applying embeddings and similarity search (e.g., cosine similarity) to relatedness and semantic tasks.
- 2+ years of experience working in the Google Cloud ecosystem (e.g., BigQuery; familiarity with Vertex AI is a plus).
- 2+ years of experience with Git and collaborative development practices (code reviews, basic testing).
Responsibilities
- Extract, transform, and analyze large datasets using SQL (Google BigQuery) and Python.
- Develop and maintain data pipelines to ensure efficient data processing.
- Utilize Google Colab and Python libraries like Pandas and NumPy for data manipulation and analysis.
- Implement vector-based representations of data and apply cosine similarity for relatedness measures.
- Design and develop interactive dashboards using Google Looker for data visualization.
- Work with large language models (LLMs) and APIs, including the Gemini API, for data prediction and transformation.
- Conduct manual data labeling and validation, including web research.
Other
- Bachelor's degree in Computer Science, Statistics, Mathematics, or a related field.
- 5+ years of experience delivering data science projects end-to-end (problem framing, data preparation, modeling, evaluation, and communication).
- An advanced level of English is required for this role, as you will work with US clients. Effective communication in English is essential to delivering the best solutions to our clients and expanding your horizons.
- Entry-level knowledge of Salesforce is required for this role.
- Ability to perform manual data labeling/validation and web research when needed.