Spark Scala Btrim

btrim strips characters from both ends of a string. It's an alternative name for trim, familiar from PostgreSQL, and it's useful when you're writing Spark SQL expressions or prefer the more explicit name.

btrim is a Spark SQL function. Before Spark 3.5.0 it isn't exposed in the org.apache.spark.sql.functions object, so you call it through expr():

def btrim(str): Column — via expr()

def btrim(str, trimStr): Column — via expr()

The first form removes leading and trailing spaces. The second form removes all characters found in trimStr from both ends of the string.

The btrim function first appeared in Spark 3.2.0.

Here's a basic example trimming whitespace from city names:

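// Assumes a SparkSession named `spark` is in scope, as in spark-shell.
import org.apache.spark.sql.functions.expr
import spark.implicits._

val df = Seq(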
  "   San Francisco   ",
  "  New York  ",
  "Chicago",
  "   Austin  ",
  "  Portland   ",
).toDF("city")

val df2 = df
  .withColumn("trimmed", expr("btrim(city)"))

df2.show(false)
// +-------------------+-------------+
// |city               |trimmed      |
// +-------------------+-------------+
// |   San Francisco   |San Francisco|
// |  New York         |New York     |
// |Chicago            |Chicago      |
// |   Austin          |Austin       |
// |  Portland         |Portland     |
// +-------------------+-------------+

Trimming specific characters

Pass a second argument to remove specific characters instead of spaces. trimStr is treated as a set of characters, not as a substring: any leading or trailing character that appears in trimStr is removed, and trimming stops at the first character that doesn't.

val df = Seq(
  "---San Francisco---",
  "--New York--",
  "-Chicago-",
  "---Austin---",
  "--Portland--",
).toDF("city")

val df2 = df
  .withColumn("trimmed", expr("btrim(city, '-')"))

df2.show(false)
// +-------------------+-------------+
// |city               |trimmed      |
// +-------------------+-------------+
// |---San Francisco---|San Francisco|
// |--New York--       |New York     |
// |-Chicago-          |Chicago      |
// |---Austin---       |Austin       |
// |--Portland--       |Portland     |
// +-------------------+-------------+
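
To make the "set of characters" behavior concrete, here is a minimal plain-Scala sketch that models what btrim(str, trimStr) does. It is not Spark's implementation; the object and function names (BtrimModel, btrimModel) are made up for illustration, and the model works on Java chars, which is enough for simple inputs like the ones in this article.

```scala
object BtrimModel {
  // Illustrative model of btrim(str, trimStr); NOT Spark's actual code.
  // trimStr is used as a set of characters, not as a substring.
  def btrimModel(str: String, trimStr: String): String = {
    if (str == null || trimStr == null) return null // null in, null out
    val inSet = (c: Char) => trimStr.indexOf(c) >= 0
    // First position holding a character outside the trim set.
    val start = str.indexWhere(c => !inSet(c))
    if (start < 0) "" // every character was in the trim set
    else {
      // Last position holding a character outside the trim set.
      val end = str.lastIndexWhere(c => !inSet(c))
      str.substring(start, end + 1)
    }
  }

  def main(args: Array[String]): Unit = {
    println(btrimModel("xyxHelloyx", "xy")) // Hello
    println(btrimModel("-a-b-", "-"))       // a-b (interior characters stay)
  }
}
```

With a trimStr of 'xy', the leading "xyx" and trailing "yx" are both removed even though neither is the literal substring "xy". The equivalent Spark expression would be expr("btrim(col, 'xy')").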

Null and empty string handling

When the input is null, btrim returns null. An empty string passes through unchanged.

val df = Seq(
  ("Alice", "  alice@example.com  "),
  ("Bob",   "bob@example.com"),
  ("Carol", null),
  ("Dave",  ""),
).toDF("name", "email")

val df2 = df
  .withColumn("trimmed", expr("btrim(email)"))

df2.show(false)
// +-----+---------------------+-----------------+
// |name |email                |trimmed          |
// +-----+---------------------+-----------------+
// |Alice|  alice@example.com  |alice@example.com|
// |Bob  |bob@example.com      |bob@example.com  |
// |Carol|null                 |null             |
// |Dave |                     |                 |
// +-----+---------------------+-----------------+
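
One case the table above doesn't show is a whitespace-only string: btrim reduces it to an empty string, not to null. If you'd rather treat blank values as missing, one option is to wrap btrim in the SQL nullif function. The sketch below assumes a local SparkSession; the helper name trimToNull and the output column name are hypothetical, not part of Spark.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.expr

object BtrimBlankToNull {
  // Hypothetical helper (not part of Spark): trims `column` with btrim and
  // turns an empty result into null via the SQL nullif function.
  // `column` is assumed to be a simple, trusted column name.
  def trimToNull(df: DataFrame, column: String, out: String): DataFrame =
    df.withColumn(out, expr(s"nullif(btrim($column), '')"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("btrim-blank-to-null")
      .getOrCreate()
    import spark.implicits._

    // Padded value, whitespace-only value, empty string, and null.
    val df = Seq("  alice@example.com  ", "   ", "", null).toDF("email")

    // Only the first row keeps a value; the other three end up null.
    trimToNull(df, "email", "cleaned").show(false)

    spark.stop()
  }
}
```

Whether blank should mean null depends on your data; if an empty string is meaningful in your schema, plain btrim is the safer choice.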

btrim vs trim

btrim and trim do the same thing: both remove characters from both sides of a string. The difference is availability. trim is in the org.apache.spark.sql.functions object and can be called directly on columns, while btrim is a SQL function that, before Spark 3.5.0, requires expr(). If you're building column expressions in Scala, trim is more convenient. If you're writing SQL strings, or coming from PostgreSQL where btrim is a built-in, either works.
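
As a quick check of that equivalence, the sketch below puts the two side by side for both the whitespace form and the trim-character form. It assumes a local SparkSession on Spark 3.2.0 or later; the object name, the compare helper, and the column names are illustrative only.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, expr, trim}

object BtrimVsTrim {
  // Illustrative helper: builds trim and btrim columns side by side.
  def compare(spark: SparkSession): DataFrame = {
    import spark.implicits._
    val df = Seq("--New York--", "  Chicago  ").toDF("city")
    df
      .withColumn("trim_ws", trim(col("city")))            // Scala API, spaces
      .withColumn("btrim_ws", expr("btrim(city)"))         // SQL function, spaces
      .withColumn("trim_dash", trim(col("city"), "-"))     // Scala API, '-' set
      .withColumn("btrim_dash", expr("btrim(city, '-')"))  // SQL function, '-' set
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("btrim-vs-trim")
      .getOrCreate()

    // Each trim_* column should match its btrim_* counterpart row by row.
    compare(spark).show(false)

    spark.stop()
  }
}
```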

Related functions

For trimming only the left or right side, see ltrim and rtrim. For replacing specific substrings, see replace. For character-by-character substitution, see translate.
