Course Outline

Introduction to Data Analysis and Big Data

  • What Makes Big Data "Big"?
    • Velocity, Volume, Variety, Veracity (the four Vs)
  • Limits to Traditional Data Processing
  • Distributed Processing
  • Statistical Analysis
  • Types of Machine Learning Analysis
  • Data Visualization

Big Data Roles and Responsibilities

  • Administrators
  • Developers
  • Data Analysts

Languages Used for Data Analysis

  • Python
    • Why Python for Data Analysis?
    • Manipulating, processing, cleaning, and crunching data

Approaches to Data Analysis

  • Statistical Analysis
    • Time series analysis
    • Forecasting with correlation and regression models
    • Inferential statistics (estimation)
    • Descriptive statistics in Big Data sets (e.g., calculating the mean)
  • Machine Learning
    • Supervised vs unsupervised learning
    • Classification and clustering
    • Estimating cost of specific methods
    • Filtering

Big Data Infrastructure

  • Data Storage
    • Relational databases (SQL)
      • MySQL
      • Postgres
      • Oracle
    • Understanding the nuances of non-relational databases
      • Hierarchical databases
      • Object-oriented databases
      • Document-oriented databases
      • Graph-oriented databases
      • Other

The Future of Big Data

Summary and Next Steps

Requirements

  • A general understanding of math
  • A general understanding of programming
  • A general understanding of databases

Audience

  • Developers / programmers
  • IT consultants

Duration

  • 21 hours
