Spark Scala Make Interval Functions: make_interval, make_dt_interval, and make_ym_interval

The make_interval family builds interval values from numeric columns. Use make_interval for general-purpose intervals that mix years, months, days, and time-of-day components; use make_ym_interval when you need a strict year-month interval (e.g., subscription terms); and use make_dt_interval when you need a strict day-time interval (e.g., job durations). All three are Spark SQL functions, and the examples here call them through expr(). Spark 3.5.0 also added direct wrappers to org.apache.spark.sql.functions, but expr() works on every version that has the functions.

make_interval

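make_interval constructs a general interval from up to seven numeric arguments. Every argument is optional and defaults to zero, but arguments are positional, so only trailing ones can be omitted: make_interval(1, 6) means one year and six months.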

def make_interval([years[, months[, weeks[, days[, hours[, mins[, secs]]]]]]]): Column — via expr()

The make_interval function first appeared in version 3.0.0 and is the most flexible of the three: a single value can carry calendar components (years, months), days, and time-of-day components (hours, minutes, seconds). Internally the value is stored as months, days, and microseconds. Weeks are converted to days (one week = seven days) before the interval is built.

val df = Seq(
  (1, 6, 2, 3, 4, 30, 15.5),
  (0, 3, 0, 0, 12, 0,  0.0),
  (2, 0, 0, 0, 0,  0,  0.0),
  (0, 0, 1, 0, 0,  0,  0.0),
).toDF("years", "months", "weeks", "days", "hours", "mins", "secs")

val df2 = df
  .withColumn(
    "interval",
    expr("make_interval(years, months, weeks, days, hours, mins, secs)")
  )

df2.show(false)
// +-----+------+-----+----+-----+----+----+--------------------------------------------------------+
// |years|months|weeks|days|hours|mins|secs|interval                                                |
// +-----+------+-----+----+-----+----+----+--------------------------------------------------------+
// |1    |6     |2    |3   |4    |30  |15.5|1 years 6 months 17 days 4 hours 30 minutes 15.5 seconds|
// |0    |3     |0    |0   |12   |0   |0.0 |3 months 12 hours                                       |
// |2    |0     |0    |0   |0    |0   |0.0 |2 years                                                 |
// |0    |0     |1    |0   |0    |0   |0.0 |7 days                                                  |
// +-----+------+-----+----+-----+----+----+--------------------------------------------------------+

Notice that the two weeks and three days in the first row combine into 17 days in the interval. Weeks have no independent representation; they are just shorthand for days.
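Because the arguments are positional, leaving off trailing ones is the only way to rely on the defaults. The following is a minimal sketch of that behavior, assuming a throwaway local SparkSession (in spark-shell, spark already exists); the names defaults, years_months, and weeks_days are illustrative only. The intervals are cast to strings so the result is easy to read.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

// Assumption: a throwaway local session; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("make-interval-defaults")
  .getOrCreate()

// One-row DataFrame so the literal calls have something to evaluate against.
val defaults = spark.range(1).select(
  // Only years and months supplied; weeks through secs default to 0.
  expr("CAST(make_interval(1, 6) AS STRING)").as("years_months"),
  // Years and months passed as 0 so that weeks and days can be reached positionally.
  expr("CAST(make_interval(0, 0, 2, 3) AS STRING)").as("weeks_days")
)

defaults.show(false)
// years_months: 1 years 6 months
// weeks_days:   17 days
```

To set only a later component such as hours, every earlier argument still has to be written out as 0.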

Adding intervals to timestamps

The most common reason to build an interval is to add it to a timestamp or date. Once you have an interval column, plain arithmetic works:

val df = Seq(
  ("Alice", "2026-01-15 09:00:00"),
  ("Bob",   "2026-03-01 12:30:00"),
  ("Carol", "2026-05-20 18:45:00"),
).toDF("name", "start_at")

val df2 = df
  .withColumn("start_at", to_timestamp(col("start_at")))
  .withColumn(
    "trial_ends_at",
    col("start_at") + expr("make_interval(0, 0, 2, 0, 0, 0, 0)")
  )
  .withColumn(
    "renewal_at",
    col("start_at") + expr("make_interval(1, 0, 0, 0, 0, 0, 0)")
  )

df2.show(false)
// +-----+-------------------+-------------------+-------------------+
// |name |start_at           |trial_ends_at      |renewal_at         |
// +-----+-------------------+-------------------+-------------------+
// |Alice|2026-01-15 09:00:00|2026-01-29 09:00:00|2027-01-15 09:00:00|
// |Bob  |2026-03-01 12:30:00|2026-03-15 12:30:00|2027-03-01 12:30:00|
// |Carol|2026-05-20 18:45:00|2026-06-03 18:45:00|2027-05-20 18:45:00|
// +-----+-------------------+-------------------+-------------------+

The two-week trial extends start_at by 14 days. The one-year renewal lands on the same calendar day a year later, because interval arithmetic follows calendar rules rather than fixed day counts. When the target day does not exist, the result is clamped to the last day of the month: adding one year to February 29, 2024 yields February 28, 2025, not March 1.
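Month-end behavior is easiest to see with dates that have no counterpart in the target month. The sketch below, assuming a local SparkSession and using the illustrative names edge, plus_1_year, and plus_1_month, adds a year and a month to a leap day and to January 31.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, expr, to_timestamp}

// Assumption: a throwaway local session; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("interval-month-end")
  .getOrCreate()
import spark.implicits._

val edge = Seq("2024-02-29 09:00:00", "2025-01-31 09:00:00")
  .toDF("start_at")
  .withColumn("start_at", to_timestamp(col("start_at")))
  // Trailing arguments are omitted, so these are exactly 1 year and 1 month.
  .withColumn("plus_1_year", col("start_at") + expr("make_interval(1)"))
  .withColumn("plus_1_month", col("start_at") + expr("make_interval(0, 1)"))

edge.show(false)
// 2024-02-29 09:00:00 -> plus_1_year 2025-02-28 09:00:00, plus_1_month 2024-03-29 09:00:00
// 2025-01-31 09:00:00 -> plus_1_year 2026-01-31 09:00:00, plus_1_month 2025-02-28 09:00:00
```

The day of month is kept whenever it exists in the target month and is otherwise pulled back to that month's last day, which matters for renewals that start on the 29th through the 31st.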

make_ym_interval

make_ym_interval builds an ANSI SQL year-month interval — a typed value that only carries years and months, with no day or time component.

def make_ym_interval([years[, months]]): Column — via expr()

The make_ym_interval function first appeared in version 3.2.0. Unlike make_interval, the result has a strict INTERVAL YEAR TO MONTH type, which is useful when modeling things that genuinely have no time-of-day meaning — billing terms, contract lengths, age in years and months.

val df = Seq(
  ("Basic",      1, 0),
  ("Pro",        1, 6),
  ("Enterprise", 3, 0),
  ("Monthly",    0, 1),
).toDF("plan", "years", "months")

val df2 = df
  .withColumn("term", expr("make_ym_interval(years, months)"))

df2.printSchema()
// root
//  |-- plan: string (nullable = true)
//  |-- years: integer (nullable = false)
//  |-- months: integer (nullable = false)
//  |-- term: interval year to month (nullable = false)

df2.show(false)
// +----------+-----+------+----------------------------+
// |plan      |years|months|term                        |
// +----------+-----+------+----------------------------+
// |Basic     |1    |0     |INTERVAL '1-0' YEAR TO MONTH|
// |Pro       |1    |6     |INTERVAL '1-6' YEAR TO MONTH|
// |Enterprise|3    |0     |INTERVAL '3-0' YEAR TO MONTH|
// |Monthly   |0    |1     |INTERVAL '0-1' YEAR TO MONTH|
// +----------+-----+------+----------------------------+

The schema shows the column type as interval year to month, distinct from the general calendar interval produced by make_interval, which printSchema reports simply as interval. The display format '1-6' reads as "one year, six months".
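A year-month interval can be added directly to a date column, and the result stays a date. The following sketch assumes a local SparkSession and a hypothetical start_date of 2026-01-15 for each plan; subs and ends_on are illustrative names, not part of the original example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, expr, to_date}

// Assumption: a throwaway local session; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("ym-interval-on-dates")
  .getOrCreate()
import spark.implicits._

val subs = Seq(
  ("Basic",   "2026-01-15", 1, 0),
  ("Pro",     "2026-01-15", 1, 6),
  ("Monthly", "2026-01-15", 0, 1)
).toDF("plan", "start_date", "years", "months")
  .withColumn("start_date", to_date(col("start_date")))
  // date + year-month interval yields a date, shifted by whole months.
  .withColumn("ends_on", col("start_date") + expr("make_ym_interval(years, months)"))

subs.printSchema()
subs.show(false)
// ends_on: 2027-01-15 (Basic), 2027-07-15 (Pro), 2026-02-15 (Monthly)
```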

make_dt_interval

make_dt_interval is the day-time counterpart: it builds an ANSI SQL day-time interval from days, hours, minutes, and seconds. There are no year or month components.

def make_dt_interval([days[, hours[, mins[, secs]]]]): Column — via expr()

The make_dt_interval function first appeared in version 3.2.0. Use it for durations that are naturally measured in elapsed time rather than calendar units — job runtimes, response times, time spent in a state.

val df = Seq(
  ("upload",   0, 0,  45, 30.0),
  ("encode",   0, 2,  15,  0.0),
  ("publish",  1, 0,   0,  0.0),
  ("transfer", 0, 0,   0, 12.5),
).toDF("job", "days", "hours", "mins", "secs")

val df2 = df
  .withColumn(
    "duration",
    expr("make_dt_interval(days, hours, mins, secs)")
  )

df2.printSchema()
// root
//  |-- job: string (nullable = true)
//  |-- days: integer (nullable = false)
//  |-- hours: integer (nullable = false)
//  |-- mins: integer (nullable = false)
//  |-- secs: double (nullable = false)
//  |-- duration: interval day to second (nullable = true)

df2.show(false)
// +--------+----+-----+----+----+-------------------------------------+
// |job     |days|hours|mins|secs|duration                             |
// +--------+----+-----+----+----+-------------------------------------+
// |upload  |0   |0    |45  |30.0|INTERVAL '0 00:45:30' DAY TO SECOND  |
// |encode  |0   |2    |15  |0.0 |INTERVAL '0 02:15:00' DAY TO SECOND  |
// |publish |1   |0    |0   |0.0 |INTERVAL '1 00:00:00' DAY TO SECOND  |
// |transfer|0   |0    |0   |12.5|INTERVAL '0 00:00:12.5' DAY TO SECOND|
// +--------+----+-----+----+----+-------------------------------------+

The resulting type is interval day to second, displayed as 'D HH:MM:SS'. The secs parameter accepts fractional values, so sub-second precision is preserved down to microseconds.
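Day-time intervals also work in timestamp arithmetic in both directions. As a rough sketch, assuming a local SparkSession with default settings (in particular, spark.sql.legacy.interval.enabled left at false) and hypothetical started_at values, adding a duration gives a timestamp and subtracting two timestamps gives a day-time interval back.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, expr, to_timestamp}

// Assumption: a throwaway local session; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("dt-interval-arithmetic")
  .getOrCreate()
import spark.implicits._

val jobs = Seq(
  ("upload", "2026-05-17 10:00:00", 0, 0, 45, 30.0),
  ("encode", "2026-05-17 10:00:00", 0, 2, 15, 0.0)
).toDF("job", "started_at", "days", "hours", "mins", "secs")
  .withColumn("started_at", to_timestamp(col("started_at")))
  // timestamp + day-time interval yields a timestamp.
  .withColumn("finished_at", col("started_at") + expr("make_dt_interval(days, hours, mins, secs)"))
  // timestamp - timestamp yields a day-time interval under default settings.
  .withColumn("elapsed", col("finished_at") - col("started_at"))

jobs.printSchema()
jobs.show(false)
// finished_at: 2026-05-17 10:45:30 (upload), 2026-05-17 12:15:00 (encode)
```

The elapsed column should match the duration that was added, which makes make_dt_interval a natural fit for round-tripping measured durations.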

Choosing between the three

  • Use make_ym_interval when the value only has year and month components and you want the schema to enforce that (billing terms, ages).
  • Use make_dt_interval when the value only has day-and-below components and you want the schema to enforce that (durations, elapsed time).
  • Use make_interval when you need to mix both — for example, "1 year, 6 months, and 3 days" — or when you just want a quick calendar interval to add to a timestamp.
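The schema-level difference behind these choices can be checked by building one value with each function and comparing the column types. This is a minimal sketch assuming a local SparkSession; kinds, general, year_month, and day_time are illustrative names.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr
import org.apache.spark.sql.types.{CalendarIntervalType, DayTimeIntervalType, YearMonthIntervalType}

// Assumption: a throwaway local session; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("interval-types")
  .getOrCreate()

val kinds = spark.range(1).select(
  expr("make_interval(1, 6, 0, 3)").as("general"),        // 1 year, 6 months, 3 days
  expr("make_ym_interval(1, 6)").as("year_month"),        // years and months only
  expr("make_dt_interval(3, 4, 30, 15.5)").as("day_time") // days down to seconds
)

kinds.printSchema()

// Each function produces a different Spark SQL data type.
val generalIsCalendar = kinds.schema("general").dataType == CalendarIntervalType
val ymIsYearMonth     = kinds.schema("year_month").dataType == YearMonthIntervalType()
val dtIsDayTime       = kinds.schema("day_time").dataType == DayTimeIntervalType()
println(s"$generalIsCalendar $ymIsYearMonth $dtIsDayTime")
```

Only the two ANSI types restrict which components a value can hold, so they are the ones to reach for when the schema itself should document the unit.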

For building dates and timestamps from component columns rather than intervals, see make_date and make_timestamp. For adding or subtracting fixed numbers of days to a date, see date_add and date_sub. For computing the difference between two dates in months, see months_between.

Example Details

Created: 2026-05-17 11:06:50 PM

Last Updated: 2026-05-17 11:06:50 PM