• Data Science Foundations
  • I PREFACE
  • Welcome to the General Data Science – Exploratory Data Analysis (EDA) Layer
    • 📘 What You’ll Learn
  • Setting Up Your Analysis Environment
    • Explanation
    • 🔹 Who This Is For
    • Install Python
    • Install R and RStudio
    • 🔹 Install VSCode (Recommended Editor)
      • ✅ Python Extension (by Microsoft)
      • ✅ R Extension
    • Setup with venv for Python
    • Setup with renv for R
  • Verifying Your Setup
  • How to Navigate This Guide
    • 🔁 Tips for Side-by-Side Learning
    • ✅ What’s Next?
  • II DATA EXPLORATION
  • 1 How do you create a project directory ready for analysis?
    • 1.1 Explanation
    • 1.2 Bash (Terminal)
    • 1.3 Python Code
    • 1.4 R Code
  • 2 How do you install basic data science tools and libraries for Python and R?
    • 2.1 Explanation
    • 2.2 Python Tools
    • 2.3 R Tools
  • 3 What are common sources of datasets for Python and R?
    • 3.1 Explanation
    • 3.2 Python Package-Based Datasets
    • 3.3 R Package-Based Datasets
    • 3.4 Online Public Data Sources
  • 4 How do you save a dataset in Python and R?
    • 4.1 Explanation
    • 4.2 Python Code
    • 4.3 R Code
  • 5 How do you load a pre-cleaned dataset in Python and R?
    • 5.1 Explanation
    • 5.2 Python Code
    • 5.3 R Code
  • 6 How do you rename column names in Python and R?
    • 6.1 Explanation
    • 6.2 Python Code
    • 6.3 R Code
  • 7 How do you examine the structure and types of variables in Python and R?
    • 7.1 Explanation
      • ✅ Common Data Types in Python and R
    • 7.2 Python Code
    • 7.3 R Code
  • 8 How do you check for missing values in Python and R?
    • 8.1 Explanation
    • 8.2 Python Code
    • 8.3 R Code
  • 9 How do you get summary statistics for numeric variables in Python and R?
    • 9.1 Explanation
    • 9.2 Python Code
    • 9.3 R Code
  • 10 How do you filter rows based on a condition in Python and R?
    • 10.1 Explanation
    • 10.2 Python Code
    • 10.3 R Code
  • 11 How do you sort rows based on a variable in Python and R?
    • 11.1 Explanation
    • 11.2 Python Code
    • 11.3 R Code
  • 12 How do you create a new variable in Python and R?
    • 12.1 Explanation
    • 12.2 Python Code
    • 12.3 R Code
  • 13 How do you detect and remove duplicate rows in Python and R?
    • 13.1 Explanation
    • 13.2 Python Code
    • 13.3 R Code
  • 14 How do you export a cleaned dataset in Python and R?
    • 14.1 Explanation
    • 14.2 Python Code
    • 14.3 R Code
  • 15 How do you convert variable types in Python and R?
    • 15.1 Explanation
    • 15.2 Python Code
    • 15.3 R Code
  • 16 How do you group and summarize data in Python and R?
    • 16.1 Explanation
    • 16.2 Python Code
    • 16.3 R Code
  • 17 How do you drop or reorder columns in Python and R?
    • 17.1 Explanation
    • 17.2 Python Code
    • 17.3 R Code
  • 18 How do you subset specific columns in Python and R?
    • 18.1 Explanation
    • 18.2 Python Code
    • 18.3 R Code
  • 19 How do you sample rows randomly in Python and R?
    • 19.1 Explanation
    • 19.2 Python Code
    • 19.3 R Code
  • 20 How do you take a random sample from a large dataset in Python and R?
    • 20.1 Explanation
      • 🎨 Why Switch to the Diamonds Dataset?
    • 20.2 Python Code
    • 20.3 R Code
  • EDA Summary
    • 🧱 What You’ve Accomplished
    • 📈 What Comes After EDA?
    • 🚀 Continue Learning with CDI
  • Explore More Guides

General Data Science
Exploratory Data Analysis (EDA)

Last updated: July 06, 2025