Data Science Track

Develop and implement a set of techniques or analytics applications to transform raw data into meaningful information using data-oriented programming languages and visualization software. Apply data mining, data modeling, natural language processing, and machine learning to extract and analyze information from large structured and unstructured datasets. Visualize, interpret, and report data findings. Predict outcomes and report confidence in such predictions. 

This track will help you develop core IT skills, including programming, probability and statistics, and machine learning, and complete the specific data science courses needed for a first job as a data science analyst.

A number of career paths then open up for you. These include specialization in natural language processing or computer vision, to name but a few.

Data science is traditionally divided into statistical and machine learning based approaches. This track introduces students to both.

  • In their second year, students take courses such as Intro to Machine Learning, Big Data Processing, Data Visualisation, and Applied Deep Learning.
  • The second year culminates in a combined Capstone/Internship project.

Course Curriculum

Second Year Courses
DS1 Data Analysis
This course focuses on the life cycle of statistical data processing projects, from data preparation through statistical analysis to answering real-world questions. Students will learn how to use mainstream Python data analysis packages, such as SciPy and scikit-learn, to solve real-world problems using real datasets.
DS2 Intro to Machine Learning
This course introduces the fundamentals of machine learning. In particular, it covers popular supervised and unsupervised learning techniques, such as decision trees, clustering, and classification, together with their performance evaluation metrics. It shows how these techniques can be applied to make better decisions. The course discusses case studies that provide good models for such applications.
DS3 Big Data Processing
This course introduces the concepts and basic technologies required to store, transform, and process/ingest Big Data in real-time and batch modes. Popular open source and proprietary Big Data environments for storage and processing/ingestion will be reviewed. In particular, students will learn about the Apache Hadoop ecosystem, HBase, Cassandra, and MongoDB, as well as real-time, batch, and lambda processing with Kafka, Spark, and Apache Beam. Students will carry out real-world Big Data processing projects using Spark.
DS4 Data Visualisation
This course introduces the theory behind good data visualization and its link to the task at hand. It covers major visualization idioms for tabular, network, field, and geometric datasets. It also introduces Python-based visualization libraries. Students will select and apply the right visualization for real-world datasets in response to given tasks.
DS5 Applied Deep Learning
In this course, students will learn the foundations of deep neural networks and understand how to build them using different architectures, such as convolutional neural networks, recurrent neural networks, and hybrid architectures, in order to extract latent representations of input data in a way that maximizes performance on prediction, recognition, or classification tasks. Students will be equipped with the skills to lead successful deep learning projects in different fields, including computer vision and natural language processing, and to achieve good performance using Python deep learning frameworks such as Keras, TensorFlow, and/or PyTorch.
CA01 Preparation for Capstone
The course is run as a seminar. It aims to help students (I) identify a real-world project idea for their Capstone or joint Capstone/Internship and (II) identify and prepare for an appropriate certification.
CA02 Capstone
The Capstone is a two-term effort. Students can either (I) pursue the development of a large-scale real-world project to consolidate the knowledge and skills acquired throughout the program and enrich their employment portfolio, or (II) prepare for and take a certification exam. Students who opt to develop a project can do so as a combined Capstone/Internship and/or benefit from the UM6P StartGate ecosystem for startup incubation and acceleration; they will have to submit a final report. Students who choose to prepare for certification will have to submit the results of the certification exam. All students will have to report progress periodically to the Capstone coordinator.