Introduction to Data Science

The Introduction to Data Science class will survey the foundational topics in data science, namely: Data Manipulation, Data Analysis with Statistics and Machine Learning, Data Communication with Information Visualization, and Data at Scale -- Working with Big Data.

Created by George Afful
Last updated Tue, 13-Oct-2020
English, Part Time, USA
Curriculum for this course
11 Lessons 00:08:50 Hours
Introduction
  • The Data Scientist Nanodegree Program 00:01:47
  • Introduction to Data Science 00:00:59
  • What is a Data Scientist 00:00:16
  • Quiz: Exercise: What is a Data Scientist 00:00:16
  • What Does a Data Scientist Do? 00:00:39
  • Pi-Chuan: Introduction 00:00:15
  • Pi-Chuan: What is Data Science? 00:00:44
  • Basic Data Scientist Skills 00:01:14
  • Simpson's Paradox
  • Problems Solved By Data Science 00:02:40
  • Data Science Programming Tools
Description

Overview

The Introduction to Data Science class will survey the foundational topics in data science, namely:

  • Data Manipulation
  • Data Analysis with Statistics and Machine Learning
  • Data Communication with Information Visualization
  • Data at Scale -- Working with Big Data

The class will focus on breadth and present the topics briefly instead of focusing on a single topic in depth. This will give you the opportunity to sample and apply the basic techniques of data science.

This course is also a part of our Data Analyst Nanodegree.



Why Take This Course?

You will have an opportunity to work through a data science project end to end, from analyzing a dataset to visualizing and communicating your data analysis.

By working on the class project, you will encounter and come to understand the skills needed to become a data scientist yourself.

Syllabus

Lesson 1: Introduction to Data Science

  • Introduction to Data Science
  • What is a Data Scientist
  • Pi-Chuan (Data Scientist @ Google): What is Data Science?
  • Gabor (Data Scientist @ Twitter): What is Data Science?
  • Problems Solved by Data Science
  • Pandas
  • Dataframes
  • Create a New Dataframe

Lesson 2: Data Wrangling

  • What is Data Wrangling?
  • Acquiring Data
  • Common Data Formats
  • What are Relational Databases?
  • Aadhaar Data
  • Aadhaar Data and Relational Databases
  • Introduction to Database Schemas
  • APIs
  • Data in JSON Format
  • How to Access an API Efficiently
  • Missing Values
  • Easy Imputation
  • Impute using Linear Regression
  • Tip of the Imputation Iceberg

Lesson 3: Data Analysis

  • Statistical Rigor
  • Kurt (Data Scientist @ Twitter): Why is Stats Useful?
  • Introduction to Normal Distribution
  • T Test
  • Welch T Test
  • Non-Parametric Tests
  • Non-Normal Data
  • Stats vs. Machine Learning
  • Different Types of Machine Learning
  • Prediction with Regression
  • Cost Function
  • How to Minimize Cost Function
  • Coefficient of Determination

Lesson 4: Data Visualization

  • Effective Information Visualization
  • Napoleon's March on Russia
  • Don (Principal Data Scientist @ AT&T): Communicating Findings
  • Rishiraj (Principal Data Scientist @ AT&T): Communicating Findings Well
  • Visual Encodings
  • Perception of Visual Cues
  • Plotting in Python
  • Data Scales
  • Visualizing Time Series Data

Lesson 5: MapReduce

  • Big Data and MapReduce
  • Basics of MapReduce
  • Mapper
  • Reducer
  • MapReduce with Aadhaar Data
  • MapReduce with Subway Data

Taught by

Dave Holtz

Free
Includes:
  • 00:08:50 hours of on-demand video
  • 11 Lessons
  • Full lifetime access
  • Access on mobile and TV