Spark Scala Signum and Sign
signum returns -1.0 for negative numbers, 0.0 for zero, and 1.0 for positive numbers. It's useful when you care about the direction of a value but not its magnitude: flagging gains vs. losses, classifying deltas, or branching on the sign of a difference. Spark SQL also provides sign, an alias for signum, along with the related functions negative and positive, which apply a sign rather than detect one. All three are called through expr() on this page.
def signum(e: Column): Column
def signum(columnName: String): Column
signum returns a DoubleType column regardless of the numeric input type, so integer inputs still produce -1.0, 0.0, or 1.0. Pass either a Column or a column name as a String.
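// Assumes a SparkSession named `spark` is in scope, as in spark-shell.
import org.apache.spark.sql.functions.{col, expr, signum}
import spark.implicits._

val df = Seq(
  -42,
  -7,
  0,
  13,
  256,
).toDF("value")

// Integer input, DoubleType output
val df2 = df
  .withColumn("sign", signum(col("value")))

df2.show(false)
// +-----+----+
// |value|sign|
// +-----+----+
// |-42  |-1.0|
// |-7   |-1.0|
// |0    |0.0 |
// |13   |1.0 |
// |256  |1.0 |
// +-----+----+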
The same behavior applies to floating-point inputs — the output is still -1.0, 0.0, or 1.0:
val df = Seq(
  -3.14,
  -0.5,
  0.0,
  2.71828,
  99.999,
).toDF("value")

val df2 = df
  .withColumn("sign", signum(col("value")))

df2.show(false)
// +-------+----+
// |value  |sign|
// +-------+----+
// |-3.14  |-1.0|
// |-0.5   |-1.0|
// |0.0    |0.0 |
// |2.71828|1.0 |
// |99.999 |1.0 |
// +-------+----+
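To make the two overloads and the DoubleType claim concrete, here is a minimal sketch meant for spark-shell or a Scala script. The session setup, the app name, and the column names by_column and by_name are illustrative assumptions, not part of the examples above.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, signum}
import org.apache.spark.sql.types.DoubleType

// Local session for illustration only; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("signum-overloads") // hypothetical app name
  .getOrCreate()
import spark.implicits._

val values = Seq(-42, 0, 13).toDF("value")

val signed = values
  .withColumn("by_column", signum(col("value"))) // Column overload
  .withColumn("by_name", signum("value"))        // String overload

// Look up the result types: both should be DoubleType even though
// `value` itself is an integer column.
val byColumnType = signed.schema("by_column").dataType
val byNameType = signed.schema("by_name").dataType

signed.printSchema()
```

The String overload is only a convenience for referring to an existing column by name. Use the Column overload when the argument is an expression, such as a difference between two columns.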
Classifying the direction of a change
A common use case is reducing a numeric delta to a three-way classification: down, unchanged, or up. Apply signum to the difference between two columns and you get a categorical signal you can pivot on, group by, or join against.
val df = Seq(
  ("Alice", 100.00, 95.50),
  ("Bob", 80.25, 92.75),
  ("Carol", 60.00, 60.00),
  ("Dave", 45.10, 50.00),
  ("Eve", 110.00, 88.00),
).toDF("name", "expected", "actual")

val df2 = df
  .withColumn("diff", col("actual") - col("expected"))
  .withColumn("direction", signum(col("actual") - col("expected")))

df2.show(false)
// +-----+--------+------+-----------------+---------+
// |name |expected|actual|diff             |direction|
// +-----+--------+------+-----------------+---------+
// |Alice|100.0   |95.5  |-4.5             |-1.0     |
// |Bob  |80.25   |92.75 |12.5             |1.0      |
// |Carol|60.0    |60.0  |0.0              |0.0      |
// |Dave |45.1    |50.0  |4.899999999999999|1.0      |
// |Eve  |110.0   |88.0  |-22.0            |-1.0     |
// +-----+--------+------+-----------------+---------+
The Dave row shows a classic floating-point artifact: 50.0 - 45.1 is 4.899999999999999 rather than 4.9. The exact magnitude doesn't matter to signum though — anything above zero collapses to 1.0. If you need magnitude precision rather than just the sign, see abs or round.
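Because the direction column only ever holds three values, it works well as a grouping key. The sketch below counts how many rows moved down, stayed flat, or moved up, reusing the same sample data. The names scores and directionCounts and the session setup are assumptions made for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, signum}

// Local session for illustration only; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("signum-groupby") // hypothetical app name
  .getOrCreate()
import spark.implicits._

val scores = Seq(
  ("Alice", 100.00, 95.50),
  ("Bob", 80.25, 92.75),
  ("Carol", 60.00, 60.00),
  ("Dave", 45.10, 50.00),
  ("Eve", 110.00, 88.00)
).toDF("name", "expected", "actual")

// Reduce each delta to -1.0 / 0.0 / 1.0, then count rows per direction.
val directionCounts = scores
  .withColumn("direction", signum(col("actual") - col("expected")))
  .groupBy("direction")
  .count()
  .orderBy("direction")

directionCounts.show(false)
```

With these five rows, two fall below expected, one matches exactly, and two exceed it. Dave's 4.899999999999999 lands in the same bucket as a clean 4.9 would.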
sign: the SQL alias for signum
Spark SQL also exposes the function under the name sign, which is a direct alias for signum. It isn't available in the Scala functions object, so you call it through expr():
sign(expr) — via expr()
Both functions produce identical results:
val df = Seq(
  -5,
  0,
  7,
).toDF("value")

val df2 = df
  .withColumn("signum", signum(col("value")))
  .withColumn("sign", expr("sign(value)"))

df2.show(false)
// +-----+------+----+
// |value|signum|sign|
// +-----+------+----+
// |-5   |-1.0  |-1.0|
// |0    |0.0   |0.0 |
// |7    |1.0   |1.0 |
// +-----+------+----+
In Scala code, prefer signum — it's the native Column function and avoids the string-based SQL parsing of expr(). The sign name is mainly useful if you're porting SQL queries verbatim.
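For the porting case, one option is to register the DataFrame as a temporary view and run the SQL text unchanged. This is a sketch under stated assumptions: the view name readings and the aliases signum_result and sign_result are hypothetical, and the session setup is for illustration only.

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("sign-sql") // hypothetical app name
  .getOrCreate()
import spark.implicits._

val values = Seq(-5, 0, 7).toDF("value")

// Expose the DataFrame to SQL under a hypothetical view name.
values.createOrReplaceTempView("readings")

// Both names are called from SQL text, so a ported query needs no rewriting.
val viaSql = spark.sql(
  """SELECT value,
    |       signum(value) AS signum_result,
    |       sign(value)   AS sign_result
    |FROM readings""".stripMargin)

viaSql.show(false)
```

This keeps the original query readable as SQL, at the cost of losing compile-time checking on the column expressions.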
negative and positive
Two related SQL-only utilities are negative and positive. They aren't sign-detection functions like signum — they're sign-application functions. negative(x) returns the negated value of x (equivalent to unary -), and positive(x) returns x unchanged (equivalent to unary +). Both are SQL-only:
negative(expr) — via expr()
positive(expr) — via expr()
val df = Seq(
  -8,
  -1,
  0,
  3,
  42,
).toDF("value")

val df2 = df
  .withColumn("negative", expr("negative(value)"))
  .withColumn("positive", expr("positive(value)"))

df2.show(false)
// +-----+--------+--------+
// |value|negative|positive|
// +-----+--------+--------+
// |-8   |8       |-8      |
// |-1   |1       |-1      |
// |0    |0       |0       |
// |3    |-3      |3       |
// |42   |-42     |42      |
// +-----+--------+--------+
In Scala, you almost never need these — -col("value") flips the sign and col("value") leaves it alone. They exist mostly so that SQL queries using explicit +expr or -expr prefixes have a function form to fall back on.
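The Scala-side equivalents can be compared directly against the SQL function. The sketch below puts negative(value) next to the unary minus on Column and the negate function from org.apache.spark.sql.functions. The column names sql_negative, unary_minus, and negate_fn are illustrative assumptions, as is the session setup.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, expr, negate}

// Local session for illustration only; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("negate-sketch") // hypothetical app name
  .getOrCreate()
import spark.implicits._

val values = Seq(-8, -1, 0, 3, 42).toDF("value")

val flipped = values
  .withColumn("sql_negative", expr("negative(value)")) // SQL function via expr()
  .withColumn("unary_minus", -col("value"))            // unary minus on a Column
  .withColumn("negate_fn", negate(col("value")))       // functions.negate

flipped.show(false)
```

All three derived columns should hold the same values, so the choice comes down to readability. The unary minus is usually the shortest to write.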
Null handling
When the input is null, all four functions return null:
val df = Seq(
  Some(-5),
  Some(0),
  None,
  Some(12),
).toDF("value")

val df2 = df
  .withColumn("signum", signum(col("value")))
  .withColumn("negative", expr("negative(value)"))

df2.show(false)
// +-----+------+--------+
// |value|signum|negative|
// +-----+------+--------+
// |-5   |-1.0  |5       |
// |0    |0.0   |0       |
// |null |null  |null    |
// |12   |1.0   |-12     |
// +-----+------+--------+
Related functions
For the magnitude of a number with the sign stripped, see abs. For conditional branching on the result of signum (e.g., mapping -1.0/0.0/1.0 to labels like "down"/"flat"/"up"), see when and otherwise.
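The label mapping mentioned above might look like the following sketch. The column names delta, direction, and label, the sample values, and the session setup are assumptions for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, signum, when}

// Local session for illustration only; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("signum-labels") // hypothetical app name
  .getOrCreate()
import spark.implicits._

val deltas = Seq(-4.5, 0.0, 12.5).toDF("delta")

val labeled = deltas
  .withColumn("direction", signum(col("delta")))
  .withColumn(
    "label",
    when(col("direction") < 0, "down")  // -1.0
      .when(col("direction") > 0, "up") //  1.0
      .otherwise("flat")                //  0.0
  )

labeled.show(false)
```

One caveat with this shape: a null delta yields a null direction, neither comparison is true, and the row falls through to otherwise and is labeled "flat". Add an explicit isNull branch first if nulls should stay null.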