Q&A 15 How do you convert variable types in Python and R?
15.1 Explanation
Sometimes a column in your data is stored as the wrong type: for example, a numeric column read in as text, or a categorical variable treated as a plain string. The wrong type can affect grouping, plotting, or modeling.
In this example, we’ll convert:
- The species column to a categorical variable
- A numeric column to string (for labeling)
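To see why a wrong type matters, here is a small sketch using made-up values (the `toy` data frame and its `value` column are hypothetical and not part of the iris data). Numbers stored as text sort character by character rather than numerically; `pd.to_numeric` is one way to convert them back.

```python
import pandas as pd

# Hypothetical toy data: numbers that were read in as text
toy = pd.DataFrame({"value": ["9", "10", "2"]})

# As text, the values sort character by character
text_sorted = toy["value"].sort_values().tolist()

# Convert the text to numbers, then sort again
toy["value_num"] = pd.to_numeric(toy["value"])
numeric_sorted = toy["value_num"].sort_values().tolist()

print(text_sorted)     # ['10', '2', '9']
print(numeric_sorted)  # [2, 9, 10]
```

The same kind of mismatch can show up in plots and models, which is why it helps to check column types right after loading.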
15.2 Python Code
import pandas as pd
# Load dataset
df = pd.read_csv("data/iris.csv")
# Convert species to categorical
df["species"] = df["species"].astype("category")
# Convert sepal_length to string (optional use case)
df["sepal_length_str"] = df["sepal_length"].astype(str)
# Confirm types
print(df.dtypes.head())
sepal_length     float64
sepal_width      float64
petal_length     float64
petal_width      float64
species         category
dtype: object
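As a quick look at what the categorical conversion does, the following sketch uses a small made-up Series (the values are illustrative, not read from `data/iris.csv`). A pandas categorical stores the distinct labels once and represents each row as an integer code, much like the levels of an R factor.

```python
import pandas as pd

# Hypothetical sample of species labels
species = pd.Series(["setosa", "virginica", "setosa", "versicolor"])
species_cat = species.astype("category")

# Distinct labels, sorted alphabetically by default
categories = species_cat.cat.categories.tolist()

# Integer code for each row, pointing into the list of categories
codes = species_cat.cat.codes.tolist()

print(categories)  # ['setosa', 'versicolor', 'virginica']
print(codes)       # [0, 2, 0, 1]
```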
15.3 R Code
library(readr)
library(dplyr)
# Load dataset
df <- read_csv("data/iris.csv")
# Convert species to factor
df <- df %>%
  mutate(species = as.factor(species))
# Convert sepal_length to character
df <- df %>%
  mutate(sepal_length_str = as.character(sepal_length))
# Confirm structure
str(df)
tibble [150 × 6] (S3: tbl_df/tbl/data.frame)
$ sepal_length : num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
$ sepal_width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
$ petal_length : num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
$ petal_width : num [1:150] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
$ species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
$ sepal_length_str: chr [1:150] "5.1" "4.9" "4.7" "4.6" ...
✅ Converting variable types ensures that each column behaves correctly in your analysis or visualization.