Q&A 19 How do you sample rows randomly in Python and R?
19.1 Explanation
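Random sampling is useful for quick checks on large datasets and for creating training/test splits. The examples below draw five rows at random for inspection and fix the random seed (random_state in pandas, set.seed() in R) so that the same rows come back on every run. The two languages use different random number generators, so the same seed selects different rows in each.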
19.2 Python Code
import pandas as pd
# Load dataset
df = pd.read_csv("data/iris.csv")
# Random sample of 5 rows
sampled = df.sample(n=5, random_state=42)
print(sampled)
     sepal_length  sepal_width  petal_length  petal_width     species
73            6.1          2.8           4.7          1.2  versicolor
18            5.7          3.8           1.7          0.3      setosa
118           7.7          2.6           6.9          2.3   virginica
78            6.0          2.9           4.5          1.5  versicolor
76            6.8          2.8           4.8          1.4  versicolor
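The training/test split mentioned in the explanation can be sketched with the same sample() method. This is a minimal sketch rather than the only approach: the small DataFrame, the names train and test, and the 80/20 ratio are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in data; in practice this would be the loaded dataset.
df = pd.DataFrame({
    "x": range(10),
    "label": ["a", "b"] * 5,
})

# Draw 80% of the rows for training; random_state makes the draw reproducible.
train = df.sample(frac=0.8, random_state=42)

# The rows that were not drawn form the test set.
test = df.drop(train.index)

print(len(train), len(test))  # 8 2
```

Dropping by index works here because the index labels are unique. If the index contains duplicates, calling df.reset_index(drop=True) first avoids removing more rows than intended.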
19.3 R Code
library(readr)
library(dplyr)
# Load dataset
df <- read_csv("data/iris.csv")
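# Random sample of 5 rows (set.seed() makes the draw reproducible)
# Note: sample_n() is superseded in dplyr >= 1.0.0; slice_sample(n = 5) is the current equivalent
set.seed(42)
sampled <- df %>%
  sample_n(5)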
sampled
# A tibble: 5 × 5
  sepal_length sepal_width petal_length petal_width species
         <dbl>       <dbl>        <dbl>       <dbl> <chr>
1          5.3         3.7          1.5         0.2 setosa
2          5.6         2.9          3.6         1.3 versicolor
3          6.1         2.8          4.7         1.2 versicolor
4          6.7         3            5.2         2.3 virginica
5          5.6         2.8          4.9         2   virginica
✅ Sampling lets you explore or test your data without inspecting every row. Both examples above still read the whole file into memory before sampling.