Spark Scala Hour, Minute, and Second Extraction

Spark provides hour, minute, and second for pulling the time-of-day components out of a timestamp column. They're the time-side counterparts to year, month, and day and are useful for bucketing events by hour of day, filtering work hours, or building time-based features.

All three functions accept a timestamp, date, or string formatted as a timestamp and return an integer.
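Because the results are plain integers, they can be compared numerically. As a minimal sketch of the work-hours filter mentioned above, the snippet below keeps only events between 09:00 and 17:00. The window itself and the names events and workHours are assumptions chosen for illustration; in spark-shell the session and implicits lines can be dropped.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, hour}

val spark = SparkSession.builder().master("local[*]").appName("work-hours").getOrCreate()
import spark.implicits._

val events = Seq(
  "2026-01-15 08:59:59",
  "2026-01-15 09:00:00",
  "2026-01-15 16:59:59",
  "2026-01-15 17:00:00"
).toDF("event_time")

// Assumed working window: 09:00:00 up to, but not including, 17:00:00.
val workHours = events.filter(
  hour(col("event_time")) >= 9 && hour(col("event_time")) < 17
)

workHours.show(false)
// +-------------------+
// |event_time         |
// +-------------------+
// |2026-01-15 09:00:00|
// |2026-01-15 16:59:59|
// +-------------------+
```

Comparing on hour alone makes the upper bound exclusive at the top of the hour; a window ending at, say, 17:30 would also need a condition on minute.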

Hour, Minute, and Second

def hour(e: Column): Column

def minute(e: Column): Column

def second(e: Column): Column

hour returns 0 through 23, minute returns 0 through 59, and second returns 0 through 59. The input column can be a timestamp, a date (in which case the time components are all 0), or a string in yyyy-MM-dd HH:mm:ss format — Spark will parse it for you.

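// Needed outside spark-shell, which already imports both.
import org.apache.spark.sql.functions._
import spark.implicits._

val df = Seq(
  "2026-01-15 09:05:30",
  "2026-01-15 14:30:45",
  "2026-01-15 23:59:59",
  "2026-01-15 00:00:00",
).toDF("event_time")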

val df2 = df
  .withColumn("hour",   hour(col("event_time")))
  .withColumn("minute", minute(col("event_time")))
  .withColumn("second", second(col("event_time")))

df2.show(false)
// +-------------------+----+------+------+
// |event_time         |hour|minute|second|
// +-------------------+----+------+------+
// |2026-01-15 09:05:30|9   |5     |30    |
// |2026-01-15 14:30:45|14  |30    |45    |
// |2026-01-15 23:59:59|23  |59    |59    |
// |2026-01-15 00:00:00|0   |0     |0     |
// +-------------------+----+------+------+

Midnight (00:00:00) returns 0 for all three components, and the last second of the day (23:59:59) returns the maximum values. There's no separate "12-hour" variant — hour always returns the 24-hour value.
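If a 12-hour clock value is needed, it can be derived from the 24-hour result. The following is one possible sketch using plain arithmetic; the column names hour12 and am_pm are made up for this example and are not part of any Spark API.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, hour, when}

val spark = SparkSession.builder().master("local[*]").appName("hour12").getOrCreate()
import spark.implicits._

val df = Seq(
  "2026-01-15 00:00:00",
  "2026-01-15 09:05:30",
  "2026-01-15 12:00:00",
  "2026-01-15 23:59:59"
).toDF("event_time")

val h = hour(col("event_time"))

val df12 = df
  // Maps 0 -> 12, 1..12 -> 1..12, 13..23 -> 1..11.
  .withColumn("hour12", (h + 11) % 12 + 1)
  // Hours 0 through 11 are AM, 12 through 23 are PM.
  .withColumn("am_pm", when(h < 12, "AM").otherwise("PM"))

df12.show(false)
// +-------------------+------+-----+
// |event_time         |hour12|am_pm|
// +-------------------+------+-----+
// |2026-01-15 00:00:00|12    |AM   |
// |2026-01-15 09:05:30|9     |AM   |
// |2026-01-15 12:00:00|12    |PM   |
// |2026-01-15 23:59:59|11    |PM   |
// +-------------------+------+-----+
```

When only a display string is needed, date_format with a 12-hour pattern is an alternative; the arithmetic version keeps the result as an integer.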

Bucketing Events by Time of Day

A common use case for hour is bucketing event timestamps into named periods like morning, afternoon, and evening. Combine it with when / otherwise:

val df = Seq(
  "2026-01-15 09:05:30",
  "2026-01-15 14:30:45",
  "2026-01-15 18:45:12",
  "2026-01-15 23:59:59",
).toDF("event_time")

val df2 = df
  .withColumn("hour", hour(col("event_time")))
  .withColumn("bucket",
    when(col("hour") < 6,  "night")
     .when(col("hour") < 12, "morning")
     .when(col("hour") < 18, "afternoon")
     .otherwise("evening")
  )

df2.show(false)
// +-------------------+----+---------+
// |event_time         |hour|bucket   |
// +-------------------+----+---------+
// |2026-01-15 09:05:30|9   |morning  |
// |2026-01-15 14:30:45|14  |afternoon|
// |2026-01-15 18:45:12|18  |evening  |
// |2026-01-15 23:59:59|23  |evening  |
// +-------------------+----+---------+

This pattern works well for behavioral analytics — grouping by time-of-day bucket often shows clearer patterns than grouping by exact hour.
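As a rough illustration of that grouping step, the sketch below counts events per bucket using the same thresholds as above. The sample rows and the names events and counts are assumptions for the example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, hour, when}

val spark = SparkSession.builder().master("local[*]").appName("bucket-counts").getOrCreate()
import spark.implicits._

val events = Seq(
  "2026-01-15 09:05:30",
  "2026-01-15 10:15:00",
  "2026-01-15 14:30:45",
  "2026-01-15 18:45:12",
  "2026-01-15 23:59:59"
).toDF("event_time")

val counts = events
  .withColumn("hour", hour(col("event_time")))
  .withColumn("bucket",
    when(col("hour") < 6, "night")
      .when(col("hour") < 12, "morning")
      .when(col("hour") < 18, "afternoon")
      .otherwise("evening")
  )
  .groupBy("bucket")
  .count()
  .orderBy("bucket") // alphabetical, only to make the output order stable

counts.show(false)
// +---------+-----+
// |bucket   |count|
// +---------+-----+
// |afternoon|1    |
// |evening  |2    |
// |morning  |2    |
// +---------+-----+
```

Buckets with no events, such as night here, simply do not appear in the result.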

Null Handling

All three functions return null when the input is null, so they're safe to use on columns with missing timestamps:

val df = Seq(
  Some("2026-01-15 09:05:30"),
  None,
  Some("2026-01-15 14:30:45"),
).toDF("event_time")

val df2 = df
  .withColumn("hour",   hour(col("event_time")))
  .withColumn("minute", minute(col("event_time")))
  .withColumn("second", second(col("event_time")))

df2.show(false)
// +-------------------+----+------+------+
// |event_time         |hour|minute|second|
// +-------------------+----+------+------+
// |2026-01-15 09:05:30|9   |5     |30    |
// |null               |null|null  |null  |
// |2026-01-15 14:30:45|14  |30    |45    |
// +-------------------+----+------+------+

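If the input is a string that can't be parsed as a timestamp, the result is also null rather than an error, provided ANSI mode is off (the default through Spark 3.x). With spark.sql.ansi.enabled set to true, which is the default from Spark 4.0, the implicit cast to timestamp fails with an error instead.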
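Since a null result can mean either a missing value or an unparseable string, it can help to tell the two apart. The sketch below is one way to flag bad strings; it assumes ANSI mode is off and sets spark.sql.ansi.enabled to false explicitly so the behavior does not depend on the Spark version's default. The unparseable column name is hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, hour}

val spark = SparkSession.builder().master("local[*]").appName("bad-timestamps").getOrCreate()
import spark.implicits._

// Assumption: non-ANSI behavior, where an invalid cast yields null instead of an error.
spark.conf.set("spark.sql.ansi.enabled", "false")

val df = Seq(
  Some("2026-01-15 09:05:30"),
  Some("not a timestamp"),
  None
).toDF("event_time")

val flagged = df
  .withColumn("hour", hour(col("event_time")))
  // True only when a value was present but could not be parsed.
  .withColumn("unparseable",
    col("event_time").isNotNull && hour(col("event_time")).isNull)

// Only the "not a timestamp" row is flagged; the missing value is not.
flagged.show(false)
```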

For the date-side equivalents, see year, month, and day. For getting the current timestamp to extract from, see current_date and current_timestamp.
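As a small sketch of that last combination, the snippet below extracts the components from current_timestamp. The values depend on when the query runs and on the session time zone (spark.sql.session.timeZone), so no output is shown; the single-row DataFrame from spark.range(1) is just a convenient carrier for the example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, current_timestamp, hour, minute, second}

val spark = SparkSession.builder().master("local[*]").appName("now-parts").getOrCreate()

val now = spark.range(1)
  .select(current_timestamp().as("now"))
  .withColumn("hour",   hour(col("now")))
  .withColumn("minute", minute(col("now")))
  .withColumn("second", second(col("now")))

now.show(false)
```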

Example Details

Created: 2026-05-03 12:12:49 PM

Last Updated: 2026-05-03 12:12:49 PM