
Spark Scala date_from_unix_date and unix_date

date_from_unix_date converts a day count (days since 1970-01-01) into a calendar date, and unix_date does the reverse — turning a date into the number of days since the epoch. They're useful when your data uses integer day offsets for compact storage or interoperability with other systems.
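
The same day-count arithmetic can be reproduced outside Spark with java.time.LocalDate, which is a convenient way to sanity-check values by hand. This is a plain-Scala sketch and does not call the Spark functions themselves; the value names are illustrative only:

```scala
import java.time.LocalDate

// Day offsets are relative to 1970-01-01; negative values fall before the epoch.
val epoch     = LocalDate.ofEpochDay(0)      // 1970-01-01
val dayBefore = LocalDate.ofEpochDay(-1)     // 1969-12-31
val newYear24 = LocalDate.ofEpochDay(19723)  // 2024-01-01

// The reverse direction: a calendar date back to its day offset (a Long).
val offset = LocalDate.of(2000, 1, 1).toEpochDay  // 10957
```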

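Both are Spark SQL functions. Spark 3.5.0 added direct wrappers to the org.apache.spark.sql.functions object; on earlier versions they aren't available there, so you call them through expr(), which works on every version that has the functions: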

def date_from_unix_date(days): Column — via expr()

def unix_date(date): Column — via expr()

Both functions first appeared in version 3.1.0.
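
Because these are SQL functions, any API that parses a SQL expression string can reach them, not only expr(). The sketch below assumes a local SparkSession and uses a hypothetical temp view name (offsets) purely for illustration:

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration; in spark-shell, `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("unix-date-sketch").getOrCreate()
import spark.implicits._

val df = Seq(0, 19723).toDF("days_since_epoch")

// selectExpr parses each string as a SQL expression, just like expr().
val viaSelectExpr = df.selectExpr(
  "days_since_epoch",
  "date_from_unix_date(days_since_epoch) AS date"
)

// The same functions in a plain SQL query against a temp view.
df.createOrReplaceTempView("offsets")
val viaSql = spark.sql(
  "SELECT days_since_epoch, " +
    "unix_date(date_from_unix_date(days_since_epoch)) AS round_trip FROM offsets"
)
```

Which form to use is mostly a matter of taste: expr() fits inside withColumn chains, while selectExpr and spark.sql read more naturally when most of the logic is already SQL.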

Converting day counts to dates

Pass an integer column representing days since 1970-01-01 to date_from_unix_date and it returns the corresponding date. Day 0 is the epoch itself, day 1 is the day after, and so on:

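import org.apache.spark.sql.functions.expr
import spark.implicits._ // needed for toDF; both imports are already in scope in spark-shell

val df = Seq(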
  0,
  1,
  365,
  18000,
  19723,
  20000,
).toDF("days_since_epoch")

val df2 = df
  .withColumn("date", expr("date_from_unix_date(days_since_epoch)"))

df2.show(false)
// +----------------+----------+
// |days_since_epoch|date      |
// +----------------+----------+
// |0               |1970-01-01|
// |1               |1970-01-02|
// |365             |1971-01-01|
// |18000           |2019-04-14|
// |19723           |2024-01-01|
// |20000           |2024-10-04|
// +----------------+----------+

The output column has type date, not a string, so you can pass it directly to other date functions like date_add, date_format, or datediff.
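
As a small illustration of that point, the sketch below feeds the converted column straight into date_add and date_format. The column names week_later and month_label are made up for this example, and a local SparkSession is assumed:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, date_add, date_format, expr}

// Local session for illustration; in spark-shell, `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("unix-date-chain").getOrCreate()
import spark.implicits._

val enriched = Seq(19723).toDF("days_since_epoch")
  .withColumn("date", expr("date_from_unix_date(days_since_epoch)"))
  // `date` is a real DateType column, so date functions accept it without a cast.
  .withColumn("week_later", date_add(col("date"), 7))
  .withColumn("month_label", date_format(col("date"), "yyyy-MM"))

enriched.printSchema() // the `date` and `week_later` columns are reported as date
```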

Converting dates to day counts

unix_date goes the other way — given a date column, it returns the number of days since 1970-01-01 as an integer:

import org.apache.spark.sql.functions.{col, expr, to_date}

val df = Seq(
  "1970-01-01",
  "1970-01-02",
  "2000-01-01",
  "2024-01-01",
  "2024-12-31",
).toDF("date_str")

val df2 = df
  .withColumn("date", to_date(col("date_str")))
  .withColumn("days_since_epoch", expr("unix_date(date)"))

df2.show(false)
// +----------+----------+----------------+
// |date_str  |date      |days_since_epoch|
// +----------+----------+----------------+
// |1970-01-01|1970-01-01|0               |
// |1970-01-02|1970-01-02|1               |
// |2000-01-01|2000-01-01|10957           |
// |2024-01-01|2024-01-01|19723           |
// |2024-12-31|2024-12-31|20088           |
// +----------+----------+----------------+

unix_date expects a date column, not a string — use to_date first if your data is stored as strings.
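
When the strings are not in the default yyyy-MM-dd layout, to_date also accepts a format pattern. A minimal sketch, assuming day-first strings such as 31/12/2024 and a local SparkSession:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, expr, to_date}

// Local session for illustration; in spark-shell, `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("unix-date-parse").getOrCreate()
import spark.implicits._

val parsed = Seq("31/12/2024", "01/01/1970").toDF("date_str")
  // Parse with an explicit pattern first; unix_date then sees a DateType column.
  .withColumn("date", to_date(col("date_str"), "dd/MM/yyyy"))
  .withColumn("days_since_epoch", expr("unix_date(date)"))

parsed.show(false)
```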

Round-tripping between the two

The two functions are inverses of each other. Converting a day count to a date and back yields the original integer:

val df = Seq(
  ("Alice",  19723),
  ("Bob",    19815),
  ("Carol",  19900),
  ("Dave",   20100),
).toDF("name", "days_since_epoch")

val df2 = df
  .withColumn("signup_date", expr("date_from_unix_date(days_since_epoch)"))
  .withColumn("round_trip",  expr("unix_date(signup_date)"))

df2.show(false)
// +-----+----------------+-----------+----------+
// |name |days_since_epoch|signup_date|round_trip|
// +-----+----------------+-----------+----------+
// |Alice|19723           |2024-01-01 |19723     |
// |Bob  |19815           |2024-04-02 |19815     |
// |Carol|19900           |2024-06-26 |19900     |
// |Dave |20100           |2025-01-12 |20100     |
// +-----+----------------+-----------+----------+

This is the typical pattern when integrating with systems that store dates as integer day offsets — read the integer, convert it to a real date for analysis, and convert back when writing out.
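
One way that read, analyze, write-back pattern might look in practice is sketched below: the offset is converted to a date, shifted by 30 days with date_add, and converted back to an integer. The trial_end columns are hypothetical names invented for the example, and a local SparkSession is assumed:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, date_add, expr}

// Local session for illustration; in spark-shell, `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("unix-date-roundtrip").getOrCreate()
import spark.implicits._

val shifted = Seq(("Alice", 19723)).toDF("name", "days_since_epoch")
  // 1. Read the integer offset and turn it into a real date.
  .withColumn("signup_date", expr("date_from_unix_date(days_since_epoch)"))
  // 2. Do the date logic on the DateType column.
  .withColumn("trial_end", date_add(col("signup_date"), 30))
  // 3. Convert back to an integer offset for the downstream system.
  .withColumn("trial_end_days", expr("unix_date(trial_end)"))
  .select("name", "days_since_epoch", "trial_end_days")

shifted.show(false)
```

For a fixed shift like this one, adding 30 to the integer directly gives the same result; the detour through a date pays off once the logic involves months, weekdays, or formatting.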

Null handling

Both functions return null when their input is null:

val df = Seq(
  ("Alice", Some(19723)),
  ("Bob",   None),
  ("Carol", Some(20000)),
).toDF("name", "days_since_epoch")

val df2 = df
  .withColumn("date", expr("date_from_unix_date(days_since_epoch)"))

df2.show(false)
// +-----+----------------+----------+
// |name |days_since_epoch|date      |
// +-----+----------------+----------+
// |Alice|19723           |2024-01-01|
// |Bob  |null            |null      |
// |Carol|20000           |2024-10-04|
// +-----+----------------+----------+
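
The example above covers date_from_unix_date. A minimal sketch for the other direction, assuming a nullable string column parsed with to_date and a local SparkSession, could look like this:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, expr, to_date}

// Local session for illustration; in spark-shell, `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("unix-date-nulls").getOrCreate()
import spark.implicits._

val withNulls = Seq[(String, Option[String])](
  ("Alice", Some("2024-01-01")),
  ("Bob",   None)
).toDF("name", "date_str")
  .withColumn("date", to_date(col("date_str")))
  // A null date yields a null day count rather than an error.
  .withColumn("days_since_epoch", expr("unix_date(date)"))

withNulls.show(false)
```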

For converting Unix timestamps (seconds since the epoch, not days), see unix_timestamp and from_unixtime. For parsing date strings into date columns, see to_date and to_timestamp.
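
To make the days-versus-seconds distinction concrete, the plain-Scala sketch below relates a day offset to the epoch seconds of midnight UTC on that date. It uses java.time rather than the Spark functions, and the value names are illustrative only:

```scala
import java.time.{LocalDate, ZoneOffset}

val days: Long = 19723L                    // day offset for 2024-01-01
val secondsPerDay: Long = 24L * 60L * 60L  // 86400

// Midnight UTC of that date, expressed in seconds since the epoch.
val seconds = LocalDate.ofEpochDay(days).atStartOfDay(ZoneOffset.UTC).toEpochSecond

// Both routes agree: 19723 days * 86400 s/day = 1704067200 s.
val agrees = seconds == days * secondsPerDay
```

A value around 20,000 is therefore plausibly a day count, while a value around 1.7 billion is plausibly a seconds-based timestamp.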

Example Details

Created: 2026-05-12 10:30:59 PM

Last Updated: 2026-05-12 10:30:59 PM