Job Board
Consulting

Spark Scala Elt

elt returns the n-th value from a list of column expressions, where n is a 1-based index. It's handy when you have an integer column that points at one of several sibling columns (or literal values) and you want to materialize the chosen value into a single result column.

elt is a Spark SQL function. It isn't available directly in the org.apache.spark.sql.functions object, so you call it through expr():

def elt(n, input1, input2, ...): Column — via expr()

The first argument is the 1-based index. The remaining arguments are the candidate values — they can be column references, literals, or any mix of the two. If n is out of range or null, elt returns null (assuming spark.sql.ansi.enabled is false, which is the default — when ANSI mode is on, an out-of-range index throws ArrayIndexOutOfBoundsException instead).

Here's a basic example using elt to pick a region label from three sibling columns based on an index:

val df = Seq(
  (1, "North America", "Europe", "Asia"),
  (2, "North America", "Europe", "Asia"),
  (3, "North America", "Europe", "Asia"),
  (1, "North America", "Europe", "Asia"),
).toDF("region_idx", "region_1", "region_2", "region_3")

val df2 = df
  .withColumn("region", expr("elt(region_idx, region_1, region_2, region_3)"))

df2.show(false)
// +----------+-------------+--------+--------+-------------+
// |region_idx|region_1     |region_2|region_3|region       |
// +----------+-------------+--------+--------+-------------+
// |1         |North America|Europe  |Asia    |North America|
// |2         |North America|Europe  |Asia    |Europe       |
// |3         |North America|Europe  |Asia    |Asia         |
// |1         |North America|Europe  |Asia    |North America|
// +----------+-------------+--------+--------+-------------+

Mapping integer codes to string labels

Because the candidate values can be string literals, elt is also useful as a compact lookup for small, fixed enums — for example, mapping a numeric tier code to a tier name:

val df = Seq(
  ("Alice",  1),
  ("Bob",    2),
  ("Carol",  3),
  ("Dave",   4),
  ("Eve",    0),
).toDF("name", "tier")

val df2 = df
  .withColumn("tier_label", expr("elt(tier, 'Bronze', 'Silver', 'Gold')"))

df2.show(false)
// +-----+----+----------+
// |name |tier|tier_label|
// +-----+----+----------+
// |Alice|1   |Bronze    |
// |Bob  |2   |Silver    |
// |Carol|3   |Gold      |
// |Dave |4   |null      |
// |Eve  |0   |null      |
// +-----+----+----------+

Note that Dave's tier of 4 and Eve's tier of 0 both fall outside the 1-based range of the three candidates, so elt returns null for those rows. For larger or more dynamic lookups, a join against a lookup DataFrame is usually a better fit — elt shines when the candidate list is short and known at query time.

Null and out-of-range index handling

When the index is null or points past the end of the candidate list, elt returns null:

val df = Seq(
  ("order-1", Some(1)),
  ("order-2", Some(2)),
  ("order-3", None),
  ("order-4", Some(5)),
).toDF("order_id", "status_idx")

val df2 = df
  .withColumn("status", expr("elt(status_idx, 'pending', 'shipped', 'delivered')"))

df2.show(false)
// +--------+----------+-------+
// |order_id|status_idx|status |
// +--------+----------+-------+
// |order-1 |1         |pending|
// |order-2 |2         |shipped|
// |order-3 |null      |null   |
// |order-4 |5         |null   |
// +--------+----------+-------+

If you want a fallback value instead of null for out-of-range indices, wrap the result in coalesce with a default literal.

For looking up a value by position from an array column rather than a list of separate columns, see element_at. For conditional column selection where each row picks a value based on a predicate rather than an index, when / otherwise is the more natural fit.

Example Details

Created: 2026-04-28 11:03:39 PM

Last Updated: 2026-04-28 11:03:39 PM