Q&A 18 How do you subset specific columns in Python and R?
18.1 Explanation
You may want to work with only a few columns at a time, whether for visualization, inspection, or modeling. Selecting just those columns reduces clutter and keeps the focus on key variables.
18.2 Python Code
import pandas as pd
# Load dataset
df = pd.read_csv("data/iris.csv")
# Select specific columns
subset = df[["sepal_length", "sepal_width", "species"]]
print(subset.head())

   sepal_length  sepal_width species
0           5.1          3.5  setosa
1           4.9          3.0  setosa
2           4.7          3.2  setosa
3           4.6          3.1  setosa
4           5.0          3.6  setosa
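The double-bracket form above is one of several equivalent ways to pick columns in pandas. The sketch below compares a few of them. It builds a small in-memory DataFrame instead of reading data/iris.csv so that it runs on its own. The petal_length and petal_width columns and their values are assumptions based on the standard iris dataset, and names such as cols and by_loc are illustrative only.

```python
import pandas as pd

# Stand-in for data/iris.csv so the sketch is self-contained.
# Assumption: the file also has petal_length and petal_width columns,
# as in the standard iris dataset.
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 4.7],
    "sepal_width": [3.5, 3.0, 3.2],
    "petal_length": [1.4, 1.4, 1.3],
    "petal_width": [0.2, 0.2, 0.2],
    "species": ["setosa", "setosa", "setosa"],
})

cols = ["sepal_length", "sepal_width", "species"]

# Three ways to select the same columns by name
by_brackets = df[cols]             # list of names inside the indexing brackets
by_loc = df.loc[:, cols]           # label-based: all rows, named columns
by_filter = df.filter(items=cols)  # keeps only the listed columns

# One name in single brackets gives a Series;
# a one-element list gives a one-column DataFrame.
one_series = df["species"]
one_frame = df[["species"]]

# Select by data type instead of by name
numeric_only = df.select_dtypes(include="number")

print(by_loc.head())
print(type(one_series).__name__, type(one_frame).__name__)
print(list(numeric_only.columns))
```

Which form to use is mostly a matter of taste. The .loc form is convenient when you also want to filter rows in the same step, and select_dtypes helps when the column names are not known in advance.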
18.3 R Code
library(readr)
library(dplyr)
# Load dataset
df <- read_csv("data/iris.csv")
# Select specific columns
subset <- df %>%
  select(sepal_length, sepal_width, species)
head(subset)

# A tibble: 6 × 3
  sepal_length sepal_width species
         <dbl>       <dbl> <chr>
1          5.1         3.5 setosa
2          4.9         3   setosa
3          4.7         3.2 setosa
4          4.6         3.1 setosa
5          5           3.6 setosa
6          5.4         3.9 setosa
✅ Subsetting lets you focus your analysis on the most relevant columns.