Data Science Course Content:
Module 1: Introduction
What is Data Science?
Common Terms in Analytics
Analytics vs. Data warehousing, OLAP, MIS Reporting
Relevance in industry and need of the hour
Types of problems and business objectives in various industries
How leading companies are harnessing the power of analytics
Critical success drivers
Overview of analytics tools & their popularity
Analytics Methodology & problem solving framework
List of steps in Analytics projects
Identify the most appropriate solution design for the given problem statement
Project plan for Analytics project & key milestones based on effort estimates
Build Resource plan for analytics project
Why Python for data science?
Module 2: Core Python
Overview of Python - Starting with Python
Introduction to installation of Python
Introduction to Python Editors & IDEs (Canopy, PyCharm, Jupyter, Rodeo, IPython, etc.)
Python Syntax
Variables & Data Types
Operators
Conditional Statements
Working With Numbers & Strings
Collections API
Lists
Tuples
Dictionaries
Date and Time
Functions & Modules
File handling
Exception Handling
OOP Concepts in Python
Regular Expressions
Module 3: Python Libraries for Data Science
NumPy
SciPy
pandas
scikit-learn
statsmodels
NLTK
Module 4: Python Modules for Access, Import/Export Data
Importing Data from various sources (CSV, TXT, Excel, Access, etc.)
Database Input (Connecting to database)
Viewing Data objects - subsetting, methods
Exporting Data to various formats
Important Python modules: pandas, Beautiful Soup
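The import, subsetting and export topics in this module could be sketched as follows. This is only an illustration, assuming pandas is installed; the in-memory CSV text and its column names (customer_id, city, spend) are hypothetical stand-ins for a real file.

```python
import io
import pandas as pd

# Hypothetical CSV content; in practice this would be a file path.
raw_csv = io.StringIO(
    "customer_id,city,spend\n"
    "1,Pune,120.5\n"
    "2,Delhi,80.0\n"
    "3,Pune,45.25\n"
)

# Import: pandas.read_csv accepts a path or any file-like object.
df = pd.read_csv(raw_csv)

# View and subset: keep only the rows for one city.
pune = df[df["city"] == "Pune"]

# Export: with no path given, to_csv returns the CSV text as a string.
exported = pune.to_csv(index=False)
print(exported)
```

Passing a file name to to_csv instead would write the same content to disk.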
Module 5: Data Manipulation, Cleansing and Munging
Cleansing Data with Python
Data Manipulation steps (Sorting, filtering, duplicates, merging, appending, subsetting, derived variables, sampling, Data type conversions, renaming, formatting etc.)
Data manipulation tools (Operators, Functions, Packages, control structures, Loops, arrays etc.)
Python Built-in Functions (Text, numeric, date, utility functions)
Python User Defined Functions
Stripping out extraneous information
Normalizing data
Formatting data
Important Python modules for data manipulation (pandas, NumPy, re, math, string, datetime, etc.)
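Several of the manipulation steps listed in this module (duplicates, type conversion, merging, renaming, derived variables, sorting) can be chained in pandas. The sketch below assumes pandas is installed; both tables and all column names are made up for illustration.

```python
import pandas as pd

# Hypothetical order and customer tables.
orders = pd.DataFrame({
    "order_id": [101, 102, 102, 103],
    "customer_id": [1, 2, 2, 3],
    "amount": ["250", "100", "100", "75"],  # stored as text on purpose
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": [" asha ", "RAVI", "Meena"],
})

clean = (
    orders
    .drop_duplicates()                                        # remove the repeated order 102
    .assign(amount=lambda d: pd.to_numeric(d["amount"]))      # data type conversion
    .merge(customers, on="customer_id", how="left")           # merging
    .rename(columns={"amount": "order_amount"})               # renaming
)
clean["name"] = clean["name"].str.strip().str.title()         # stripping and formatting text
clean["is_large"] = clean["order_amount"] >= 100              # derived variable
clean = clean.sort_values("order_amount", ascending=False).reset_index(drop=True)  # sorting

print(clean)
```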
Module 6: Data Analysis and Visualization
Introduction to exploratory data analysis
Descriptive statistics, Frequency Tables and summarization
Univariate Analysis (Distribution of data & Graphical Analysis)
Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
Creating Graphs - Bar/pie/line chart/histogram/boxplot/scatter/density, etc.
Important Packages for Exploratory Analysis (NumPy Arrays, Matplotlib, seaborn, pandas, scipy.stats, etc.)
Data visualization with Tableau
Module 7: Statistics
Basic Statistics - Measures of Central Tendency and Variance
Building blocks - Probability Distributions - Normal distribution - Central Limit Theorem
Inferential Statistics - Sampling - Concept of Hypothesis Testing
Statistical Methods - Z/t-tests (One sample, independent, paired), ANOVA, Correlations and Chi-square
Important modules for statistical methods: NumPy, SciPy, pandas
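As a small worked example of the one-sample t-test named above, the statistic can be computed by hand with the standard library. The sample values and the hypothesized mean of 50 are invented for illustration; in practice scipy.stats.ttest_1samp would return the same statistic together with a p-value.

```python
import math
import statistics

def one_sample_t(sample, popmean):
    """Return the t statistic and degrees of freedom for H0: mean == popmean."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)      # sample standard deviation (n - 1 denominator)
    se = sd / math.sqrt(n)             # standard error of the mean
    return (mean - popmean) / se, n - 1

# Hypothetical measurements.
sample = [52, 48, 55, 51, 49, 53, 50, 54]
t_stat, dof = one_sample_t(sample, 50)
print(round(t_stat, 3), dof)
```

The statistic is then compared with the critical value of the t distribution for the given degrees of freedom.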
Module 8: Predictive Modeling
Concept of a model in analytics and how it is used
Common terminology used in analytics & modeling process
Popular modeling algorithms
Types of Business problems - Mapping of Techniques
Different Phases of Predictive Modeling
Module 9: Data Exploration for Modeling
Need for structured exploratory data
EDA framework for exploring the data and identifying any problems with the data (Data Audit Report)
Identify missing data
Identify outliers
Visualize the data trends and patterns
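A data audit of the kind described in this module might start with a missing-value count and a simple outlier rule. The sketch below assumes pandas is installed and uses the common 1.5 x IQR rule, which is one convention among several and is not prescribed by the syllabus; the data is hypothetical.

```python
import pandas as pd

# Hypothetical data with one missing age and one extreme income.
df = pd.DataFrame({
    "age": [25, 31, None, 42, 29, 35],
    "income": [30, 32, 31, 29, 33, 400],
})

# Identify missing data: number of missing values per column.
missing_counts = df.isna().sum()

# Identify outliers: values beyond 1.5 * IQR from the quartiles.
q1 = df["income"].quantile(0.25)
q3 = df["income"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = df[(df["income"] < lower) | (df["income"] > upper)]

print(missing_counts)
print(outliers)
```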
Module 10: Data Preparation
Need for Data preparation
Consolidation/Aggregation - Outlier treatment - Flat Liners - Missing values - Dummy creation - Variable Reduction
Variable Reduction Techniques - Factor Analysis & PCA
Module 11: Solving Segmentation Problems
Introduction to Segmentation
Types of Segmentation (Subjective Vs Objective, Heuristic Vs. Statistical)
Heuristic Segmentation Techniques (Value Based, RFM Segmentation and Life Stage Segmentation)
Behavioral Segmentation Techniques (K-Means Cluster Analysis)
Cluster evaluation and profiling - Identify cluster characteristics
Interpretation of results - Implementation on new data
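The K-Means clustering step in this module can be illustrated with a short NumPy implementation of the standard assign-and-update loop. This is a teaching sketch under simple assumptions (random initial centers drawn from the data, fixed iteration cap), not a replacement for scikit-learn's KMeans; the two-column customer data is hypothetical.

```python
import numpy as np

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(points, k, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(points, centers)
        # Move each center to the mean of its points (keep it if the cluster is empty).
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(points, centers), centers

# Hypothetical customers described by two behavioral measures.
points = np.array([[1.0, 2.0], [1.5, 1.8], [1.0, 1.0],
                   [8.0, 8.0], [9.0, 9.5], [8.5, 9.0]])
labels, centers = kmeans(points, k=2)
print(labels)
print(centers)
```

Cluster profiling then amounts to summarizing each variable within each label, for example by comparing the rows of centers.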
Module 12: Linear Regression
Introduction - Applications
Assumptions of Linear Regression
Building Linear Regression Model
Understanding standard metrics (Variable significance, R-square/Adjusted R-square, Global hypothesis, etc.)
Assess the overall effectiveness of the model
Validation of Models (Re-running vs. Scoring)
Standard Business Outputs (Decile Analysis, Error distribution (histogram), Model equation, drivers etc.)
Interpretation of Results - Business Validation - Implementation on new data
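To make the R-square and Adjusted R-square metrics concrete, the sketch below fits a one-variable linear regression by ordinary least squares with NumPy and computes both. The x and y values are invented; a course exercise would more likely use statsmodels or scikit-learn, which report the same quantities.

```python
import numpy as np

# Hypothetical predictor and response.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Design matrix with an intercept column, solved by least squares.
X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = coef

# R-square: share of the variance in y explained by the model.
pred = X @ coef
ss_res = np.sum((y - pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

# Adjusted R-square penalizes the number of predictors p.
n, p = len(y), 1
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)

print(f"y = {intercept:.2f} + {slope:.2f} * x, R2 = {r2:.4f}, adjusted R2 = {adj_r2:.4f}")
```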
Module 13: Logistic Regression
Introduction - Applications
Linear Regression Vs. Logistic Regression Vs. Generalized Linear Models
Building Logistic Regression Model (Binary Logistic Model)
Understanding standard model metrics (Concordance, Variable significance, Hosmer-Lemeshow Test, Gini, KS, Misclassification, ROC Curve, etc.)
Validation of Logistic Regression Models (Re-running vs. Scoring)
Standard Business Outputs (Decile Analysis, ROC Curve, Probability Cut-offs, Lift charts, Model equation, Drivers or variable importance, etc.)
Interpretation of Results - Business Validation - Implementation on new data
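Three of the model metrics listed above (Concordance, Gini and Misclassification) can be computed directly from predicted probabilities. The sketch below uses plain Python on made-up labels and scores; the 0.5 cut-off is an assumption chosen for illustration, and Gini is derived from the usual relation Gini = 2 * AUC - 1.

```python
def concordance(y_true, scores):
    """Share of (event, non-event) pairs where the event has the higher score, plus the tied share."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    pairs = len(pos) * len(neg)
    concordant = sum(p > n for p in pos for n in neg)
    tied = sum(p == n for p in pos for n in neg)
    return concordant / pairs, tied / pairs

# Hypothetical actual outcomes and model probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.5, 0.1, 0.8, 0.3]

conc, ties = concordance(y_true, scores)
auc = conc + 0.5 * ties          # area under the ROC curve
gini = 2 * auc - 1

# Misclassification rate at an assumed probability cut-off of 0.5.
cutoff = 0.5
predicted = [1 if s >= cutoff else 0 for s in scores]
misclassification = sum(p != y for p, y in zip(predicted, y_true)) / len(y_true)

print(conc, gini, misclassification)
```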
Module 14: Time Series Forecasting
Introduction - Applications
Time Series Components (Trend, Seasonality, Cyclicity and Level) and Decomposition
Classification of Techniques (Pattern-based vs. Pattern-less)
Basic Techniques - Averages, Smoothing, etc.
Advanced Techniques - AR Models, ARIMA, etc.
Understanding Forecasting Accuracy - MAPE, MAD, MSE, etc.
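The accuracy measures named in this module, together with a simple moving-average forecast, can be written in a few lines of plain Python. The series below is hypothetical, MAD is taken here to mean the mean absolute deviation of the forecast errors, and MAPE as written assumes no actual value is zero.

```python
def moving_average_forecast(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    return sum(series[-window:]) / window

def mad(actual, forecast):
    """Mean absolute deviation of the forecast errors."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mse(actual, forecast):
    """Mean squared error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

# Hypothetical actual values and forecasts for four periods.
actual = [100, 110, 120, 130]
forecast = [90, 115, 120, 143]

print(mad(actual, forecast), mse(actual, forecast), round(mape(actual, forecast), 2))
print(moving_average_forecast(actual, window=3))
```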