Full-Time Data Engineer
The Office of Sustainability seeks to transform MIT into a powerful model that generates just, equitable, scalable, and applicable solutions for responding to the unprecedented challenges of a changing planet. To achieve our mission, we seek to advance a collaborative process that engages and elevates a diverse set of voices to foster operational excellence, education, research, and innovation on our campus. We are looking for a team member who can lead projects that produce actionable data insights and develop tools to help MIT advance its commitment to climate and sustainability.
This position will report to the Director of the Office of Sustainability and work collaboratively with the members of the Office of Sustainability team, as well as stakeholders from a mix of administrative and academic units across the Institute. Our teammates describe our culture as caring, innovative, and impactful. We share a common desire to solve problems and to address environmental issues and injustices.
The Data Engineer is responsible for updating and building the data architecture needed to track and report on all campus sustainability activities, prioritizing the climate mitigation and resiliency, waste, and food goals outlined in Fast Forward: MIT’s Climate Action Plan for the Decade. More specifically, this position will initially be responsible for formalizing the data acquisition and analytical structure, policies, and procedures. This individual will curate and organize data from multiple sources and upload it for access on the Sustainability Data Pool, a first-of-its-kind centralized data repository available to all members of the MIT community. The time periods and units of measurement of the data will be topic- and data-dependent.
Principal Duties and Responsibilities:
- Data Management
- Learn about historic and existing MIT and Office of Sustainability processes for obtaining and managing data (e.g., the IS&T data warehouse); refine processes where needed and develop new protocols that help to ensure efficient and robust data sourcing, ingestion, and management within the office.
- Partner with Office of Sustainability project managers and student researchers to identify core data gaps, what is needed to achieve a reasonable level of dataset completeness, and the external offices or sources from which to seek the data.
- Coordinate a data curation and management process in alignment with and leveraging IS&T campus data management practices, strategies, and software.
- Collaborate with and provide support to Office of Sustainability project management team members in the process of seeking and obtaining data from the source office, vendor, etc.
- Implement and complete processes to ingest data and integrate it into the central campus sustainability data repository known as the MIT Sustainability Data Pool.
- Collaborate with data providers to ensure that data ingestion and processing maintain the quality, controls, and integrity expected by the sharing office (e.g., seek input on any duplicated lines or other anomalies).
- Effectively organize, clean, integrate and prepare large, varied datasets, architect specialized database and computing environments, and communicate results.
- Ensure quality and integrity of data across campus sustainability topic areas (e.g., energy, water, materials, food, transportation, and waste).
- When a data topic is deemed a priority by the Director, collaborate with staff and students to model data to enhance data quality, impute missing values, detect anomalies, identify important relationships, and/or generate predictions and forecasts.
- Determine when statistical learning techniques (machine learning) can be applied and where they would add value. Execute these techniques and clearly communicate limitations to stakeholders.
- Generate automated reports and communications necessary for city compliance, stakeholder transparency, operational performance, and decision-making. Reports include, but are not limited to, those supporting MIT’s greenhouse gas inventory, transportation trends, and waste management.
- Prepare materials and communicate analytical findings in the appropriate media and level of detail for MIT leadership, department management, institutional partners, the MIT community, and the broader public. Communications may include, but are not limited to, an annual sustainability performance report, web-based dashboards, charts, or infographics.
- Other duties as required.
- Bachelor’s degree from a 4-year college or university in Computer Science, Engineering, Business, Math, Policy, or a related field is required. Master’s degree preferred.
- A minimum of 7 years of related work experience.
- Demonstrated knowledge of traditional relational databases (SQL), big data technologies (Hadoop, Spark), and computer programming experience (e.g. experience with APIs).
- At least 5 years of experience with an open-source data science programming language (e.g., R or Python).
- At least 3 years of experience with data visualization software such as Tableau.
- Demonstrated experience collaborating with others to obtain, ingest, organize, and clean data sets to enable analysis.
- Strong data visualization skills.
- Demonstrated evidence as a team player.
- Demonstrated commitment to the values of justice, equity, diversity, and inclusion within the climate and sustainability field.
- Interest in and/or demonstrated impact working on and integrating racial, economic, and climate justice initiatives.
- Demonstrated self-awareness, cultural competency, and inclusivity, and the ability to work with colleagues and stakeholders across diverse cultures and backgrounds and to serve the needs of diverse populations.
- Big data experience (Spark, Hive, Hadoop) a plus.
- Experience working with energy and/or sustainability metrics preferred.
- Ability to collaborate and work effectively with others and function well as part of a team.
- Experience working in higher education a plus.