Spark Scala String Padding Functions

The lpad and rpad functions pad a string column to a specified length by adding characters to the left or right side. They're commonly used for zero-padding numbers, aligning text output, and formatting fixed-width fields.

def lpad(str: Column, len: Int, pad: String): Column

def rpad(str: Column, len: Int, pad: String): Column

Both functions take a string column, a target length, and a padding string. If the input is shorter than len, it gets padded. If it's already longer, it gets truncated to len characters.
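The rule above can be modelled in plain Scala without a Spark session. This is a minimal sketch for building intuition, not Spark's implementation: PadModel, fill, lpadModel, and rpadModel are hypothetical names, and repeating a multi-character pad and cutting off the excess is an assumption of the sketch. It also counts UTF-16 code units through String.length, which may differ from Spark's character counting for some non-ASCII text.

```scala
object PadModel {
  // Build exactly n characters of filler by repeating pad and cutting off the excess.
  // An empty pad yields no filler in this model (an assumption of the sketch).
  private def fill(pad: String, n: Int): String =
    if (n <= 0 || pad.isEmpty) "" else (pad * (n / pad.length + 1)).take(n)

  // Inputs at least len long keep their first len characters;
  // shorter inputs get filler on the left.
  def lpadModel(s: String, len: Int, pad: String): String =
    if (s.length >= len) s.take(len) else fill(pad, len - s.length) + s

  // Same rule, with the filler appended on the right.
  def rpadModel(s: String, len: Int, pad: String): String =
    if (s.length >= len) s.take(len) else s + fill(pad, len - s.length)

  def main(args: Array[String]): Unit = {
    println(lpadModel("42", 6, "0"))    // 000042
    println(rpadModel("Bob", 6, "."))   // Bob...
    println(lpadModel("Hello", 3, "*")) // Hel
  }
}
```

Note that in both directions the truncation branch is identical: only the padding side differs.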

Zero-Padding Numbers with lpad

A common use case for lpad is formatting numeric IDs with leading zeros:

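import org.apache.spark.sql.functions.{col, lpad, rpad}
import spark.implicits._ // assumes a SparkSession named spark, as in spark-shell

val df = Seq(
  ("INV", 42),
  ("INV", 1587),
  ("INV", 3),
  ("INV", 99012),
  ("INV", 678),
).toDF("prefix", "number")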

val df2 = df
  .withColumn("padded_number", lpad(col("number").cast("string"), 6, "0"))

df2.show(false)
// +------+------+-------------+
// |prefix|number|padded_number|
// +------+------+-------------+
// |INV   |42    |000042       |
// |INV   |1587  |001587       |
// |INV   |3     |000003       |
// |INV   |99012 |099012       |
// |INV   |678   |000678       |
// +------+------+-------------+

Note that lpad operates on strings, which is why the example casts the numeric column to string before padding.
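The padded number is often combined with the prefix into a single identifier. The following is a sketch of one way to do that with concat and lit; the InvoiceIds object, the withInvoiceId helper, the invoice_id column name, the "-" separator, and the local SparkSession are all assumptions made for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat, lit, lpad}

object InvoiceIds {
  // Joins the prefix and the zero-padded number into one string column.
  // Expects columns named "prefix" and "number", as in the example above.
  def withInvoiceId(df: DataFrame): DataFrame =
    df.withColumn(
      "invoice_id",
      concat(col("prefix"), lit("-"), lpad(col("number").cast("string"), 6, "0"))
    )

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only.
    val spark = SparkSession.builder().master("local[*]").appName("invoice-ids").getOrCreate()
    import spark.implicits._

    val df = Seq(("INV", 42), ("INV", 1587), ("INV", 99012)).toDF("prefix", "number")
    withInvoiceId(df).select("invoice_id").as[String].collect().foreach(println)
    // INV-000042
    // INV-001587
    // INV-099012

    spark.stop()
  }
}
```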

Comparing lpad and rpad

Here's a side-by-side comparison showing both functions with space and dot padding:

val df = Seq(
  "Alice",
  "Bob",
  "Charlotte",
  "Dan",
  "Eve",
).toDF("name")

val df2 = df
  .withColumn("left_padded", lpad(col("name"), 12, " "))
  .withColumn("right_padded", rpad(col("name"), 12, " "))
  .withColumn("left_dot", lpad(col("name"), 12, "."))
  .withColumn("right_dot", rpad(col("name"), 12, "."))

df2.show(false)
// +---------+------------+------------+------------+------------+
// |name     |left_padded |right_padded|left_dot    |right_dot   |
// +---------+------------+------------+------------+------------+
// |Alice    |       Alice|Alice       |.......Alice|Alice.......|
// |Bob      |         Bob|Bob         |.........Bob|Bob.........|
// |Charlotte|   Charlotte|Charlotte   |...Charlotte|Charlotte...|
// |Dan      |         Dan|Dan         |.........Dan|Dan.........|
// |Eve      |         Eve|Eve         |.........Eve|Eve.........|
// +---------+------------+------------+------------+------------+

Truncation When the String Is Already Longer

When len is shorter than the input string, both lpad and rpad truncate from the right: they return the first len characters, regardless of which side they would otherwise pad.

val df = Seq(
  "Hello",
  "Spark",
  "Pad",
).toDF("value")

val df2 = df
  .withColumn("lpad_short", lpad(col("value"), 3, "*"))
  .withColumn("rpad_short", rpad(col("value"), 3, "*"))

df2.show(false)
// +-----+----------+----------+
// |value|lpad_short|rpad_short|
// +-----+----------+----------+
// |Hello|Hel       |Hel       |
// |Spark|Spa       |Spa       |
// |Pad  |Pad       |Pad       |
// +-----+----------+----------+

This means lpad and rpad can double as a simple truncation mechanism when you need strings capped at a maximum length.
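The same behavior can be a hazard when padding identifiers, because a value wider than len silently loses its trailing characters. One possible guard, sketched here, pads only when the value is shorter than the target width. The PadWithoutTruncation object, the lpadNoTruncate helper, and the local SparkSession are hypothetical and exist only for this illustration.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.{col, length, lpad, when}

object PadWithoutTruncation {
  // Pads only when the value is shorter than width; values that are already
  // width characters or longer pass through unchanged.
  def lpadNoTruncate(c: Column, width: Int, pad: String): Column =
    when(length(c) >= width, c).otherwise(lpad(c, width, pad))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("pad-guard").getOrCreate()
    import spark.implicits._

    val df = Seq(42, 1234567).toDF("number")
    val asString = col("number").cast("string")

    df.select(
      lpad(asString, 6, "0").as("plain"),
      lpadNoTruncate(asString, 6, "0").as("guarded")
    ).as[(String, String)].collect().foreach(println)
    // (000042,000042)
    // (123456,1234567)

    spark.stop()
  }
}
```

Whether an over-long value should pass through, be rejected, or be truncated depends on the downstream format, so treat the guard as one option rather than a default.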

Handling Nulls

When lpad or rpad encounters a null value, the result is null:

val df = Seq(
  "Alice",
  null,
  "Charlie",
  null,
  "Eve",
).toDF("name")

val df2 = df
  .withColumn("padded", lpad(col("name"), 10, "."))

df2.show(false)
// +-------+----------+
// |name   |padded    |
// +-------+----------+
// |Alice  |.....Alice|
// |null   |null      |
// |Charlie|...Charlie|
// |null   |null      |
// |Eve    |.......Eve|
// +-------+----------+
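If a fixed-width output must never contain nulls, one option is to substitute a default with coalesce before padding. This is a sketch under assumptions: the PadNulls object, the padOrDefault helper, the "N/A" placeholder, and the local SparkSession are illustrative choices, not part of the lpad API.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.{coalesce, col, lit, lpad}

object PadNulls {
  // Replaces nulls with a default string, then left-pads the result.
  def padOrDefault(c: Column, width: Int, pad: String, default: String): Column =
    lpad(coalesce(c, lit(default)), width, pad)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("pad-nulls").getOrCreate()
    import spark.implicits._

    val df = Seq("Alice", null, "Eve").toDF("name")

    df.select(padOrDefault(col("name"), 10, ".", "N/A").as("padded"))
      .as[String].collect().foreach(println)
    // .....Alice
    // .......N/A
    // .......Eve

    spark.stop()
  }
}
```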

For other string formatting functions, see lower and upper for case conversion, initcap for title case, or trim, ltrim, and rtrim for removing unwanted characters from the ends of strings.

Example Details

Created: 2026-03-15 06:00:00 PM

Last Updated: 2026-03-15 06:00:00 PM