MasterControl is building its next-generation data platform, which will leverage AI/ML techniques to help redefine how customers bring life-saving and life-changing products to market. To enable this, they are looking for a Data Engineering Intern to support their Data Platform.
Requirements
- Experience working with distributed data technologies (e.g., Hadoop, Spark, Flink, Kafka) for building efficient, large-scale data pipelines.
- Software engineering proficiency in at least one high-level programming language (Java, Scala, Python, or equivalent).
- Experience building stream-processing applications using Apache Flink, Spark Streaming, Apache Storm, Kafka Streams, or similar frameworks.
- Knowledge of multi-dimensional modeling, such as star schemas, snowflake schemas, and normalized and denormalized models.
- Knowledge of flexible, scalable data models that address a wide variety of consumption patterns (random access, sequential access), including optimizations such as bucketing, aggregation, and sharding.
- Expertise in one or more NoSQL databases (Neo4j, MongoDB, Cassandra, HBase, DynamoDB, Bigtable, etc.).
- Experience with the full Software Development Life Cycle (SDLC), including design, development, and release of products.
Responsibilities
- Build efficient, large-scale data pipelines using distributed data technologies (e.g., Hadoop, Spark, Flink, Kafka).
- Develop software in at least one high-level programming language (Java, Scala, Python, or equivalent).
- Build stream-processing applications using Apache Flink, Spark Streaming, Apache Storm, Kafka Streams, or similar frameworks.
- Apply multi-dimensional modeling techniques, such as star schemas, snowflake schemas, and normalized and denormalized models.
- Design flexible, scalable data models that address a wide variety of consumption patterns (random access, sequential access), including optimizations such as bucketing, aggregation, and sharding.
- Work with one or more NoSQL databases (Neo4j, MongoDB, Cassandra, HBase, DynamoDB, Bigtable, etc.).
- Participate in the full Software Development Life Cycle (SDLC), including design, development, and release of products.
Other
- This position is specifically designed for Northeastern University students as part of the university's cooperative education (Co-op) program. Candidates must be enrolled at Northeastern University and eligible to participate in the official Co-op program for the designated term.
- You are curious: always learning new technologies, rapidly synthesizing new information, and understanding “the why” before “the what.”
- You are self-directed and capable of operating amid ambiguity.
- You have excellent communication, interpersonal, collaboration, and teamwork skills.
- You are passionate about creatively solving business problems.