Data Architecture and Engineering for Data Scientists

Expand your expertise beyond modelling by learning the foundations of data architecture and engineering. Gain the skills to design and maintain efficient pipelines that provide reliable, high-quality data for analysis and machine learning. This course will strengthen your ability to manage the full data lifecycle and support production-grade projects.

Why Does This Course Matter?

Strong data science relies on robust pipelines and well-structured data. Without this foundation, models fail to scale, insights are delayed, and projects risk producing unreliable results. This course gives professionals the tools to design architectures and workflows that ensure quality, governance, and smooth integration into production environments.

Who Is This Course For?

  • Data scientists looking to broaden into data engineering
  • Analysts and technical professionals building end-to-end data products
  • Teams aiming to improve data quality and streamline pipelines
  • Practitioners interested in the link between data engineering and MLOps
  • Professionals responsible for delivering reliable and scalable data solutions

What Will I Learn?

  • Principles of ETL and ELT, with practical methods to design automated pipelines
  • Knowledge of when and how to use data warehouses, lakes, and lakehouses
  • Skills in applying data validation, cleaning, and cataloguing practices
  • Understanding of the modern data stack and orchestration tools such as Apache Airflow
  • Ability to connect data engineering principles with MLOps for better model deployment

Course Topics:

  • Design and build automated pipelines for reliable data movement
  • Compare different storage systems and their use cases
  • Establish governance and quality frameworks for trustworthy data
  • Explore tools and techniques within the modern data stack
  • Link data engineering practices with the operationalisation of machine learning models
  • Engage in practical activities to design scalable data workflows
  • Receive a resource pack for extending your learning beyond the course

Want to run this course in-house? Enquire about in-house delivery.

Meet the tutor

Dr Michael Mortenson

Dr Michael Mortenson is the Course Director for the MSc in Business Analytics at Warwick Business School and leads the AI Research Group at the Gillmore Centre for Financial Technology. He holds a PhD in Analytics and an MSc in eBusiness. His specialisms include natural language processing and large language models, computer vision, operational research (OR), and data engineering. Alongside his academic work, he has significant practical experience of OR, data science, and AI projects for companies such as Amazon Web Services and Vodafone.