Data Analytics with Python

In this course, students will learn how to use statistical machine learning techniques to understand the relationship between customer demographics and purchasing behavior and to develop a model for predicting future product sales volumes. Students will learn how to apply the Python programming language to solve data analytics and machine learning problems.

Predicting Profitability and Customer Preferences Using R

In this course, you will learn to use statistical machine learning techniques to predict brand preferences based on customer characteristics, and then you will develop a model to recommend new products based on customer purchases. You will learn and use the R statistical programming language and the RStudio analytics environment.

Big Data with Spark: Text Mining and Sentiment Analysis

In this course, students will work as data analysts for Alert Analytics, a data analytics consulting firm. Their first client has developed a suite of smartphone medical apps for use by aid workers in developing countries, and the client needs to limit support to a single smartphone model and operating system. The students’ job is to conduct the required analysis. Students will run a series of Hadoop streaming jobs on the Amazon Web Services (AWS) Elastic MapReduce (EMR) platform and use Python to develop a predictive model that infers user sentiment toward devices from the lexical content of web pages extracted from the Common Crawl of the World Wide Web.

After completing the smartphone project, students will take on a more complex project that explores the various constructs of text mining and text context. The business scope and success criterion for this project are straightforward: can the contents of a tweet be classified as having a certain level of sentiment or not? Students will perform this work using Apache Spark. Because the data is too large to process locally, they will complete the project on Microsoft Azure using Databricks, which runs a Spark cluster in the cloud.