Data Collection, Processing, and Analysis eLearning Bundle Course

Course Overview

This eLearning bundle consists of these 3 courses:

  • Learning Data Analysis with R
  • Data Mining with Python: Implementing Classification and Regression
  • Learning Path: Statistics and Data Mining for Data Science

Course Topics

Learning Data Analysis with R – 6 hours and 7 minutes

R is a programming language and software environment for statistical computing and graphics, supported by the R Foundation for Statistical Computing. The R language is widely used among statisticians and data miners for developing statistical software and for data analysis. This video course teaches viewers to conduct data analysis in practical contexts with R, using core language packages and tools. The end goal is to give analysts and data scientists a comprehensive course on how to manipulate and analyse small and large data sets with R. It introduces how CRAN works and demonstrates why viewers should use it. You will start with the most basic importing techniques, move on to downloading compressed data from the web, and then learn more advanced ways to import even the most difficult datasets. Next, you will create static plots and then learn how to plot spatial data on interactive web platforms such as Google Maps and OpenStreetMap. Finally, you will apply what you have learned to real-world examples of data analysis. This video course lays the foundations for deeper applications of data analysis and paves the way for advanced data science.

Data Mining with Python: Implementing Classification and Regression – 2 hours and 3 minutes

Python is a dynamic programming language used in a wide range of domains by programmers who find it simple yet powerful. In today’s world, everyone wants to gain insights from the deluge of data coming their way. Data mining provides a way of finding these insights, and Python is one of the most popular languages for data mining, providing both power and flexibility in analysis. Python has become the language of choice for data scientists for data analysis, visualization, and machine learning.

In this course, you will discover the key concepts of data mining and learn how to apply different data mining techniques to find the valuable insights hidden in real-world data. You will also tackle some notorious data mining problems to get a concrete understanding of these techniques.

We begin by introducing you to the important data mining concepts and the Python libraries used for data mining. You will understand the process of cleaning data and the steps involved in filtering out noise so that the available data can be used for accurate analysis. You will also build your first intelligent application that makes predictions from data. Then you will learn about classification and regression techniques such as logistic regression, the k-NN classifier, and SVM, and implement them in real-world scenarios such as predicting house prices and the number of TV show viewers.
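To give a feel for one of the techniques named above, here is a minimal sketch of a k-NN classifier written with only the Python standard library. It is not taken from the course: the function name `knn_predict`, the toy housing data, and the choice of k = 3 are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label,
    # then sort so the closest points come first.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Count the labels of the k closest neighbours and return the most common one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (floor area in square metres, number of rooms) -> price band.
points = [(50, 2), (55, 2), (60, 3), (120, 5), (130, 5), (140, 6)]
labels = ["low", "low", "low", "high", "high", "high"]

prediction_small = knn_predict(points, labels, (58, 2))
prediction_large = knn_predict(points, labels, (125, 5))
```

In practice the features would usually be scaled first, since an unscaled feature with a large range (here, floor area) dominates the distance.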

By the end of this course, you will be able to apply the concepts of classification and regression using Python and implement them in a real-world setting.

Statistics and Data Mining for Data Science – 5 hours and 51 minutes

Data science is an ever-evolving field with rapidly growing popularity. It draws on techniques and theories from statistics and computer science, most importantly machine learning, databases, and visualization. If you wish to enter the world of statistics and data mining, this practical video course will walk you through both the basics and the advanced concepts, step by step.

Packt’s Video Learning Path is a series of individual video products put together in a logical and stepwise manner such that each video builds on the skills learned in the video before it.

The highlights of this Learning Path are:

  • Learn when to use different statistical techniques, how to set up different analyses, and how to interpret the results
  • Apply statistical and data mining techniques to analyze and interpret results using CHAID, linear regression, and neural networks

This Learning Path begins by explaining the steps to analyse data and to identify which summary statistics are relevant to the type of data you are summarizing. You will then learn several procedures, such as how to run and interpret frequencies and how to create various graphs. You will also be introduced to the ideas of inferential statistics, probability, and hypothesis testing.

Next, you will learn how to perform and interpret the results of basic statistical analyses such as chi-square, independent and paired sample t-tests, one-way ANOVA, post-hoc tests, and bivariate correlations, along with graphical displays such as clustered bar charts, error bar charts, and scatter plots. You will then learn how to use different statistical techniques, set up different analyses, and interpret the results.
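As a rough illustration of what an independent-samples t-test computes, the sketch below calculates the test statistic and degrees of freedom by hand. It uses the Welch (unequal variances) variant, which is an assumption made here and not necessarily the form taught in the course. The function name `welch_t_statistic` and the sample values are hypothetical, and no p-value is computed because the standard library has no t-distribution.

```python
import math
import statistics


def welch_t_statistic(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for Welch's two-sample t-test."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance uses the sample (n - 1) denominator.
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    n_a, n_b = len(sample_a), len(sample_b)

    # Squared standard error contributed by each group.
    se_a, se_b = var_a / n_a, var_b / n_b
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)

    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical measurements for two independent groups.
group_a = [1, 2, 3]
group_b = [2, 3, 4]
t_value, dof = welch_t_statistic(group_a, group_b)
```

The resulting statistic would then be compared against a t-distribution with `dof` degrees of freedom, which statistical packages do automatically.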

Moving ahead, this Learning Path compares and contrasts statistics and data mining and then provides an overview of the various types of projects data scientists usually encounter. Next, you will be introduced to the three methods (statistical, decision tree, and machine learning) with which you can perform predictive modeling. Finally, you will explore segmentation modeling to learn the art of cluster analysis and will work with association modeling to perform market basket analysis.
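Market basket analysis rests on a few simple ratios. The sketch below computes support, confidence, and lift for one rule over a handful of made-up shopping baskets. The function names and the basket contents are assumptions for illustration only and do not come from the course material.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of `consequent` given `antecedent`.

    Raises ZeroDivisionError if the antecedent never occurs.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """Confidence relative to the baseline frequency of `consequent`."""
    return confidence(transactions, antecedent, consequent) / support(
        transactions, consequent
    )


# Hypothetical shopping baskets, each stored as a set of item names.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
rule_lift = lift(baskets, {"bread"}, {"milk"})
```

A lift below 1, as in this toy data, suggests the two items appear together slightly less often than they would if purchases were independent.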

By the end of this Learning Path, you will have a firm grasp of data analysis, data mining, and statistical analysis and be able to apply these powerful techniques to your own data with ease.