MasterControl is looking to build its next-generation data platform, which leverages AI/ML techniques to help customers bring life-saving and life-changing products to market.
Requirements
- Experience working with distributed data technologies (e.g., Hadoop, Spark, Flink, Kafka)
- Software engineering proficiency in at least one high-level programming language (Java, Scala, Python, or equivalent)
- Experience building stream-processing applications using Apache Flink, Spark Streaming, Apache Storm, Kafka Streams, or similar frameworks
- Knowledge of multi-dimensional modeling techniques such as star schemas, snowflake schemas, and normalized and denormalized models
- Knowledge of flexible, scalable data models addressing a wide variety of consumption patterns
- Expertise in one or more NoSQL databases (Neo4j, MongoDB, Cassandra, HBase, DynamoDB, Bigtable, etc.)
- Experience with data engineering and data pipeline development
Responsibilities
- Working with distributed data technologies (e.g., Hadoop, Spark, Flink, Kafka) to build efficient, large-scale data pipelines
- Building stream-processing applications using Apache Flink, Spark Streaming, Apache Storm, Kafka Streams, or similar frameworks
- Designing and implementing data models and architectures to support various consumption patterns
- Working with NoSQL databases (Neo4j, MongoDB, Cassandra, HBase, DynamoDB, Bigtable, etc.)
- Applying multi-dimensional modeling techniques such as star schemas, snowflake schemas, and normalized and denormalized models
- Designing flexible, scalable data models that address a wide variety of consumption patterns
Other
- 2+ years of data engineering experience
- Currently authorized to work in the United States on a full-time basis
- Excellent communication, interpersonal, collaboration, and team skills
- Passionate about creatively solving business problems
- Must be enrolled at Northeastern University and eligible to participate in the official Co-op program for the designated term