
Spark Scala Repeat and Space

The repeat function duplicates a string column a specified number of times. The space SQL function generates a string of n spaces and is shorthand for repeat(" ", n). Together they cover common needs such as building separators, indenting text, and padding output.

The repeat function is defined as:

def repeat(str: Column, n: Int): Column

It returns a new string formed by concatenating str with itself n times. If n is zero or negative, the result is an empty string. If str is null, the result is null.
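The edge cases above can be checked directly. The following is a minimal sketch, assuming a local SparkSession created just for the check; the column names s, twice, zero, and negative are illustrative and do not come from the examples in this article.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, repeat}

// Assumption: a throwaway local session for this sketch.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("repeat-edge-cases")
  .getOrCreate()
import spark.implicits._

// One non-null value and one null value.
val edge = Seq("ab", null).toDF("s")
  .select(
    col("s"),
    repeat(col("s"), 2).as("twice"),     // normal case
    repeat(col("s"), 0).as("zero"),      // n = 0
    repeat(col("s"), -1).as("negative")  // n < 0
  )

val rows = edge.collect()
// rows(0): "ab" gives "abab", then an empty string for both n = 0 and n = -1
// rows(1): a null input gives null in every derived column
```

Note that a null input stays null even when n is zero; the empty-string rule only applies to non-null strings.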

Repeating String Values

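// Assumes an active SparkSession named spark, as in spark-shell.
import org.apache.spark.sql.functions._
import spark.implicits._

val df = Seq(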
  ("Warning", 3),
  ("Go", 5),
  ("Stop", 1),
  ("Hello", 2),
).toDF("word", "times")

val df2 = df
  .withColumn("repeated", repeat(col("word"), 3))

df2.show(false)
// +-------+-----+---------------------+
// |word   |times|repeated             |
// +-------+-----+---------------------+
// |Warning|3    |WarningWarningWarning|
// |Go     |5    |GoGoGo               |
// |Stop   |1    |StopStopStop         |
// |Hello  |2    |HelloHelloHello      |
// +-------+-----+---------------------+

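Note that in the signature above, the n parameter is a fixed Int, not a column reference, so every row gets the same repeat count. The times column in this example is ignored, which is why "Go" is repeated three times rather than five.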
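If the count should come from a column such as times, one option is to go through the SQL form of repeat with expr, since the SQL function accepts any integer expression as its second argument. This is a sketch of that workaround, assuming a local SparkSession; it is not the only way to do it.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

// Assumption: a throwaway local session for this sketch.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("repeat-per-row")
  .getOrCreate()
import spark.implicits._

val df = Seq(
  ("Warning", 3),
  ("Go", 5),
  ("Stop", 1),
  ("Hello", 2)
).toDF("word", "times")

// The repeat count is read from the times column, row by row.
val perRow = df.withColumn("repeated", expr("repeat(word, times)"))

val repeated = perRow.select("repeated").as[String].collect().toSeq
// Seq("WarningWarningWarning", "GoGoGoGoGo", "Stop", "HelloHello")
```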

Building Separators and Dividers

A handy use for repeat is generating separator lines or visual dividers from a single character:

val df = Seq(
  "-",
  "=",
  "*",
  "#",
).toDF("char")

val df2 = df
  .withColumn("separator", repeat(col("char"), 20))

df2.show(false)
// +----+--------------------+
// |char|separator           |
// +----+--------------------+
// |-   |--------------------|
// |=   |====================|
// |*   |********************|
// |#   |####################|
// +----+--------------------+
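A separator often needs to match the width of the text it sits under. The sketch below underlines each title with a line of "=" of the same length, using the SQL form of repeat through expr so the count can be computed per row. The titles DataFrame and its column names are hypothetical, and a local SparkSession is assumed.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

// Assumption: a throwaway local session for this sketch.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("repeat-underline")
  .getOrCreate()
import spark.implicits._

val titles = Seq("Summary", "Intro").toDF("title")

// length(title) gives the character count, which becomes the repeat count.
val underlined = titles
  .withColumn("underline", expr("repeat('=', length(title))"))

val underlines = underlined.select("underline").as[String].collect().toSeq
// Seq("=======", "=====")
```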

Generating Whitespace with space

The space function isn't available directly in org.apache.spark.sql.functions, but you can call it through expr. It returns a string of n space characters, equivalent to repeat(lit(" "), n):

val df = Seq(
  "Alice",
  "Bob",
  "Charlotte",
).toDF("name")

val df2 = df
  .withColumn("indented", concat(expr("space(4)"), col("name")))

df2.show(false)
// +---------+-------------+
// |name     |indented     |
// +---------+-------------+
// |Alice    |    Alice    |
// |Bob      |    Bob      |
// |Charlotte|    Charlotte|
// +---------+-------------+
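Because space is called through expr, its argument can be any integer SQL expression, not only a literal. The sketch below indents labels by two spaces per nesting level; the tree DataFrame and its label and depth columns are hypothetical, and a local SparkSession is assumed. It also checks the stated equivalence with repeat.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, concat, expr}

// Assumption: a throwaway local session for this sketch.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("space-indent")
  .getOrCreate()
import spark.implicits._

val tree = Seq(
  ("root", 0),
  ("child", 1),
  ("leaf", 2)
).toDF("label", "depth")

// Two spaces of indentation per depth level, computed per row.
val indented = tree
  .withColumn("indented", concat(expr("space(depth * 2)"), col("label")))

val lines = indented.select("indented").as[String].collect().toSeq
// Seq("root", "  child", "    leaf")

// space(4) and repeat(' ', 4) produce the same string.
val sameAsRepeat = spark.range(1)
  .select(expr("space(4) = repeat(' ', 4)").as("same"))
  .as[Boolean]
  .head()
```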

Handling Nulls

When repeat encounters a null input, the result is null, not an empty string:

val df = Seq(
  ("Alice", null.asInstanceOf[String]),
  ("Bob", "Hey"),
  (null, "World"),
  (null, null.asInstanceOf[String]),
).toDF("name", "greeting")

val df2 = df
  .withColumn("name_repeated", repeat(col("name"), 2))
  .withColumn("greeting_repeated", repeat(col("greeting"), 3))

df2.show(false)
// +-----+--------+-------------+-----------------+
// |name |greeting|name_repeated|greeting_repeated|
// +-----+--------+-------------+-----------------+
// |Alice|null    |AliceAlice   |null             |
// |Bob  |Hey     |BobBob       |HeyHeyHey        |
// |null |World   |null         |WorldWorldWorld  |
// |null |null    |null         |null             |
// +-----+--------+-------------+-----------------+
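If downstream code expects an empty string instead of null, one common approach is to wrap the call in coalesce. This is a minimal sketch under the assumption that an empty string is the right default for your data; the session setup is local and illustrative.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{coalesce, col, lit, repeat}

// Assumption: a throwaway local session for this sketch.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("repeat-null-default")
  .getOrCreate()
import spark.implicits._

val df = Seq("Alice", null).toDF("name")

// repeat returns null for the null row; coalesce swaps that for "".
val safe = df
  .withColumn("name_repeated", coalesce(repeat(col("name"), 2), lit("")))

val out = safe.select("name_repeated").as[String].collect().toSeq
// Seq("AliceAlice", "")
```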

For other string formatting functions, see lpad and rpad for padding strings to a fixed length, concat and concat_ws for joining strings together, or overlay for replacing characters at a specific position.

Example Details

Created: 2026-03-27 10:54:49 PM

Last Updated: 2026-03-27 10:54:49 PM