
Spark Scala Current Date and Timestamp

Spark provides a small family of functions for getting the current date, time, and session timezone — useful for tagging records with a load time, calculating ages, or filtering on "today". They all return the value at the start of query evaluation, so every call within a single query sees the same value.
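
As a quick sketch of the "ages" and "today" uses (the users DataFrame and its signup_date and birth_date columns are hypothetical, and the imports shown in the first example below are assumed):

val signedUpToday = users.filter(col("signup_date") === current_date())

val withAge = users
  .withColumn("age_in_days", datediff(current_date(), col("birth_date")))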

The two most common are current_date and current_timestamp:

def current_date(): Column

def current_timestamp(): Column

current_date returns a date column (no time component), and current_timestamp returns a timestamp column with microsecond precision. Spark also exposes localtimestamp, which returns the current timestamp without a time zone:

def localtimestamp(): Column

The localtimestamp function first appeared in version 3.3.0 and is defined in org.apache.spark.sql.functions.

Here are all three side by side:

import org.apache.spark.sql.functions._
import spark.implicits._ // spark is the active SparkSession; both imports are assumed in the snippets below

val df = Seq("row_1").toDF("example")

val df2 = df
  .withColumn("current_date", current_date())
  .withColumn("current_timestamp", current_timestamp())
  .withColumn("localtimestamp", localtimestamp())

df2.show(false)
// +-------+------------+--------------------------+--------------------------+
// |example|current_date|current_timestamp         |localtimestamp            |
// +-------+------------+--------------------------+--------------------------+
// |row_1  |2026-04-30  |2026-04-30 18:41:54.360634|2026-04-30 18:41:54.360634|
// +-------+------------+--------------------------+--------------------------+

df2.printSchema()
// root
//  |-- example: string (nullable = true)
//  |-- current_date: date (nullable = false)
//  |-- current_timestamp: timestamp (nullable = false)
//  |-- localtimestamp: timestamp_ntz (nullable = false)

The displayed values look identical, but the schema reveals the difference: current_timestamp is timestamp (a moment in absolute time, displayed in the session timezone), while localtimestamp is timestamp_ntz (a wall-clock timestamp with no timezone attached). If your session timezone changes, a timestamp value will display differently while a timestamp_ntz value stays the same.
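
One way to see this in action (a sketch; the timezone names are arbitrary choices for the demo, and timestamp_ntz requires Spark 3.3+): parse the same wall-clock string as both types, then flip the session timezone and show the frame again.

spark.conf.set("spark.sql.session.timeZone", "America/New_York")

val ts = Seq("2026-04-30 12:00:00").toDF("s").select(
  expr("cast(s as timestamp)").as("ts"),         // an absolute instant
  expr("cast(s as timestamp_ntz)").as("ts_ntz")) // a plain wall-clock value

ts.show(false)
// ts and ts_ntz both display 2026-04-30 12:00:00

spark.conf.set("spark.sql.session.timeZone", "UTC")
ts.show(false)
// ts now displays 2026-04-30 16:00:00 (the same instant, rendered in UTC);
// ts_ntz still displays 2026-04-30 12:00:00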

SQL aliases: curdate and now

Spark SQL ships a couple of aliases for these functions that aren't directly available in org.apache.spark.sql.functions. To use them from Scala, call them through expr():

def curdate(): Column — via expr()

def now(): Column — via expr()

The curdate function first appeared in version 3.4.0 and is an alias for current_date. now is an alias for current_timestamp (since 1.6.0). They behave identically to their counterparts:

val df = Seq("row_1").toDF("example")

val df2 = df
  .withColumn("current_date", current_date())
  .withColumn("curdate",      expr("curdate()"))
  .withColumn("current_timestamp", current_timestamp())
  .withColumn("now",              expr("now()"))

df2.show(false)
// +-------+------------+----------+--------------------------+--------------------------+
// |example|current_date|curdate   |current_timestamp         |now                       |
// +-------+------------+----------+--------------------------+--------------------------+
// |row_1  |2026-04-30  |2026-04-30|2026-04-30 18:41:55.020831|2026-04-30 18:41:55.020831|
// +-------+------------+----------+--------------------------+--------------------------+

For new code in Scala, prefer current_date and current_timestamp — they're the canonical names and don't require expr(). The aliases exist mainly for compatibility with SQL written against other engines.

Reading the session timezone

current_timezone returns the timezone Spark is using to interpret timestamp values. It's a SQL-only function, so it also goes through expr():

def current_timezone(): Column — via expr()

The current_timezone function first appeared in version 3.1.0.

val df = Seq("row_1").toDF("example")

val df2 = df
  .withColumn("session_timezone", expr("current_timezone()"))

df2.show(false)
// +-------+----------------+
// |example|session_timezone|
// +-------+----------------+
// |row_1  |America/New_York|
// +-------+----------------+

This reflects the value of spark.sql.session.timeZone, which defaults to the JVM's timezone if unset. Knowing what it is matters when you're converting between timestamp and timestamp_ntz, or when reading data that was written under a different timezone.
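
From Scala you can read (or override) that setting through the runtime config. A minimal sketch:

// Read the effective session timezone; if the key was never set explicitly,
// Spark reports the JVM default.
val tz = spark.conf.get("spark.sql.session.timeZone")
println(tz) // e.g. America/New_York

// Override it for the rest of the session.
spark.conf.set("spark.sql.session.timeZone", "UTC")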

Tagging records with a load time

The most common use of these functions is stamping a load date or processing time onto rows as they enter a table:

val df = Seq(
  ("ord_1001", "Alice"),
  ("ord_1002", "Bob"),
  ("ord_1003", "Carol"),
).toDF("order_id", "customer")

val df2 = df
  .withColumn("loaded_date", current_date())
  .withColumn("loaded_at",   current_timestamp())

df2.show(false)
// +--------+--------+-----------+--------------------------+
// |order_id|customer|loaded_date|loaded_at                 |
// +--------+--------+-----------+--------------------------+
// |ord_1001|Alice   |2026-04-30 |2026-04-30 18:41:55.195612|
// |ord_1002|Bob     |2026-04-30 |2026-04-30 18:41:55.195612|
// |ord_1003|Carol   |2026-04-30 |2026-04-30 18:41:55.195612|
// +--------+--------+-----------+--------------------------+

Notice every row gets the same timestamp. That's by design: Spark evaluates these functions once per query, so they're safe to use as deterministic markers within a write. If you need a unique identifier per row, reach for monotonically_increasing_id or uuid instead.
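
For contrast, here's a sketch of per-row identifiers on the same frame: monotonically_increasing_id lives in org.apache.spark.sql.functions, while uuid is SQL-only and goes through expr() like the aliases above.

val df3 = df2
  .withColumn("row_id",   monotonically_increasing_id())
  .withColumn("row_uuid", expr("uuid()"))

df3.select("order_id", "row_id", "row_uuid").show(false)
// row_id is unique per row but not consecutive; uuid() produces a random
// value per row, so it is non-deterministic across retries.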
