The Reference You Need
Spark Scala Examples
Simple Spark Scala examples to help you quickly complete your data ETL pipelines. Save time digging through the Spark Scala function API and instead get right to the code you need...
-
signum and sign in Spark Scala: Get the Sign of a Numeric DataFrame Column
signum returns -1.0 for negative numbers, 0.0 for zero, and 1.0 for positive numbers. It's useful when you care about the direction of a value but not its magnitude: flagging gains vs. losses, classifying deltas, or branching on the sign of a difference. Spark SQL also provides sign, an alias for signum, along with negative and positive, which return the negated value and the unchanged value respectively.
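As a rough illustration of those semantics, here is a plain-Scala sketch (no Spark session) built on scala.math.signum, which follows the same -1.0 / 0.0 / 1.0 convention. The object name SignumSketch and the classify helper are hypothetical; on a DataFrame the corresponding call would be signum(col("delta")) from org.apache.spark.sql.functions.

```scala
object SignumSketch {
  // Hypothetical helper: label a delta by its direction only, ignoring magnitude.
  def classify(delta: Double): String = math.signum(delta) match {
    case -1.0 => "loss"
    case 1.0  => "gain"
    case _    => "flat" // zero lands here (NaN would too)
  }

  def main(args: Array[String]): Unit =
    Seq(-12.5, 0.0, 3.0).foreach { d =>
      println(s"$d -> ${math.signum(d)} -> ${classify(d)}")
    }
}
```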
-
factorial in Spark Scala: Compute the Factorial of an Integer Column in a DataFrame
The factorial function returns the factorial of an integer column — the product of all positive integers up to and including the input value (n! = n × (n-1) × ... × 2 × 1). It's useful anywhere you need to compute permutations, combinations, or other counting expressions inline in a DataFrame.
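The definition above can be sketched in plain Scala as follows. This is an illustration of the arithmetic, not Spark's implementation; FactorialSketch and choose are hypothetical names. The Option result mirrors the fact that Spark's factorial returns null for inputs outside 0 to 20, because 21! no longer fits in a 64-bit long.

```scala
object FactorialSketch {
  // n! for 0 <= n <= 20; None outside that range, since 21! overflows a Long.
  def factorial(n: Int): Option[Long] =
    if (n < 0 || n > 20) None
    else Some((1 to n).foldLeft(1L)(_ * _))

  // Combinations "n choose k" built from factorials: n! / (k! * (n - k)!).
  def choose(n: Int, k: Int): Option[Long] =
    for {
      fn  <- factorial(n)
      fk  <- factorial(k)
      fnk <- factorial(n - k)
    } yield fn / (fk * fnk)
}
```

For example, FactorialSketch.choose(5, 2) counts the ways to pick 2 items out of 5.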
-
degrees and radians in Spark Scala: Convert Between Angle Units on DataFrame Columns
The degrees and radians functions convert DataFrame columns between the two ways of measuring angles. radians turns degrees into radians; degrees does the reverse. They're the unit-conversion helpers you reach for whenever your data is in degrees but you need to feed it into Spark's trig functions, which all expect radians.
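A minimal plain-Scala sketch of the same conversions, using math.toRadians and math.toDegrees as stand-ins for Spark's radians and degrees column functions (AngleSketch is a hypothetical name):

```scala
object AngleSketch {
  // 90 degrees expressed in radians (about 1.5708).
  val rightAngleRad: Double = math.toRadians(90.0)

  // Converting back recovers the original angle, up to floating-point error.
  val backToDegrees: Double = math.toDegrees(rightAngleRad)

  // Trig functions expect radians, so convert before calling them.
  val sineOfRightAngle: Double = math.sin(rightAngleRad)
}
```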
-
Trigonometric Functions in Spark Scala: sin, cos, tan and More on DataFrame Columns
Spark Scala exposes a full set of trigonometric functions as DataFrame column functions, most of them mirroring java.lang.Math: the basics (sin, cos, tan), their inverses (asin, acos, atan, atan2), the reciprocals (cot, csc, sec), and the hyperbolic functions (sinh, cosh, tanh) with their inverses (asinh, acosh, atanh). Every angle going in or coming out is in radians, not degrees; use the radians function to convert if your data is in degrees.
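To make the reciprocal functions and atan2 concrete, here is a small plain-Scala sketch. The helper names cot, sec, and csc are written out by hand for illustration (TrigSketch is hypothetical); Spark ships them as column functions of the same names.

```scala
object TrigSketch {
  // Reciprocal functions written out explicitly.
  def cot(x: Double): Double = 1.0 / math.tan(x)
  def sec(x: Double): Double = 1.0 / math.cos(x)
  def csc(x: Double): Double = 1.0 / math.sin(x)

  // atan2 takes (y, x) and resolves the quadrant; the result is in radians,
  // so it is converted to degrees here for readability (about 135.0).
  val angleDegrees: Double = math.toDegrees(math.atan2(1.0, -1.0))
}
```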
-
exp and expm1 in Spark Scala: Exponential Functions on DataFrame Columns
Spark provides two exponential functions: exp computes e^x and expm1 computes e^x - 1. They're the inverses of log and log1p respectively, and you'll reach for them whenever you need to undo a log transform, compute compound growth, or work with continuous decay.
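The inverse relationships can be checked with a short plain-Scala sketch. It uses scala.math rather than Spark columns, and ExpSketch and compound are hypothetical names chosen for illustration.

```scala
object ExpSketch {
  val x = 0.25

  // exp undoes log, and expm1 undoes log1p (up to floating-point error).
  val roundTrip: Double   = math.exp(math.log(x))
  val roundTrip1p: Double = math.expm1(math.log1p(x))

  // Continuous compounding: principal * e^(rate * years).
  def compound(principal: Double, rate: Double, years: Double): Double =
    principal * math.exp(rate * years)

  // For tiny inputs, exp(x) - 1 loses digits that expm1 keeps.
  val tiny = 1e-12
  val naive: Double    = math.exp(tiny) - 1.0
  val accurate: Double = math.expm1(tiny)
}
```

The last two values show why expm1 exists as a separate function: subtracting 1 after the fact throws away precision that cannot be recovered.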
-
log, log2, log10, log1p, and ln in Spark Scala: Logarithms in a DataFrame
Spark provides a family of logarithm functions: log for natural log or an arbitrary base, log2 and log10 for the two most common bases, log1p for accurate results near zero, and ln (SQL-only) as an alias for the natural log. They all return Double and return null for inputs outside their domain, such as zero or negative values passed to log, rather than raising errors.
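A plain-Scala sketch of that null-instead-of-error behavior, with Option standing in for a nullable column value. LogSketch, safeLn, and logBase are hypothetical names; the arbitrary-base version uses the standard change-of-base identity.

```scala
object LogSketch {
  // None for non-positive inputs, instead of NaN or -Infinity.
  def safeLn(x: Double): Option[Double] =
    if (x > 0) Some(math.log(x)) else None

  // Arbitrary base via change of base: log_b(x) = ln(x) / ln(b).
  def logBase(base: Double, x: Double): Option[Double] =
    for (lb <- safeLn(base); lx <- safeLn(x)) yield lx / lb
}
```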
-
sqrt, cbrt, and pow in Spark Scala: Square Roots, Cube Roots, and Powers in a DataFrame
The sqrt, cbrt, and pow functions compute square roots, cube roots, and arbitrary powers of numeric columns. They return Double regardless of input type and behave like Java's Math.sqrt, Math.cbrt, and Math.pow — including how they handle negative inputs and special values like NaN.
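Since these functions follow the Java Math semantics, a plain-Scala sketch is enough to show the edge cases (RootsSketch is a hypothetical name; scala.math delegates to java.lang.Math):

```scala
object RootsSketch {
  val root: Double         = math.sqrt(16.0)      // 4.0
  val negativeRoot: Double = math.sqrt(-1.0)      // NaN: no real square root
  val cubeRoot: Double     = math.cbrt(-8.0)      // cube roots of negatives are defined
  val power: Double        = math.pow(2.0, 10.0)  // 1024.0

  // A negative base with a non-integer exponent also yields NaN,
  // so pow(x, 1.0 / 3.0) is not a substitute for cbrt on negative values.
  val fractional: Double = math.pow(-8.0, 1.0 / 3.0)
}
```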
-
ceil, floor, and rint in Spark Scala: Rounding to Integers in a DataFrame
The ceil, floor, and rint functions round a numeric column to an integer. ceil rounds up toward positive infinity, floor rounds down toward negative infinity, and rint rounds to the nearest integer using banker's rounding for exact halves. ceil and floor also accept a scale argument to round to a specific number of decimal places.
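The three rounding directions are easiest to compare side by side. The sketch below uses the scala.math equivalents on a handful of sample values rather than a DataFrame column; RoundToIntSketch is a hypothetical name.

```scala
object RoundToIntSketch {
  val samples: Seq[Double] = Seq(-2.5, -1.2, 1.2, 2.5, 3.5)

  // ceil: toward positive infinity.
  val ceiled: Seq[Double] = samples.map(math.ceil)   // -2.0, -1.0, 2.0, 3.0, 4.0

  // floor: toward negative infinity.
  val floored: Seq[Double] = samples.map(math.floor) // -3.0, -2.0, 1.0, 2.0, 3.0

  // rint: nearest integer, with exact halves going to the even neighbor.
  val rinted: Seq[Double] = samples.map(math.rint)   // -2.0, -1.0, 1.0, 2.0, 4.0
}
```

Note how 2.5 and 3.5 round to 2.0 and 4.0 under rint: both land on the even neighbor.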
-
round and bround in Spark Scala: Rounding Numeric Columns in a DataFrame
The round and bround functions round numeric columns to a given number of decimal places. They differ in how they handle exact halves: round rounds half away from zero (the most common convention), while bround uses banker's rounding, which rounds exact halves to the nearest even digit to reduce bias in large aggregations.
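The two tie-breaking rules can be compared in plain Scala with BigDecimal, using RoundingMode.HALF_UP as a stand-in for round and RoundingMode.HALF_EVEN as a stand-in for bround. This is an analogy under that assumption, not a call into Spark; HalfRoundingSketch is a hypothetical name.

```scala
import scala.math.BigDecimal.RoundingMode

object HalfRoundingSketch {
  // Exact decimal halves, parsed from strings to avoid binary floating-point noise.
  val halves: Seq[BigDecimal] = Seq("0.5", "1.5", "2.5", "-2.5").map(BigDecimal(_))

  // Half away from zero: 1, 2, 3, -3
  val halfUp: Seq[BigDecimal] = halves.map(_.setScale(0, RoundingMode.HALF_UP))

  // Half to even (banker's rounding): 0, 2, 2, -2
  val halfEven: Seq[BigDecimal] = halves.map(_.setScale(0, RoundingMode.HALF_EVEN))
}
```

The inputs sum to 2.0. The half-even results also sum to 2, while the half-up results sum to 3, which is the kind of drift banker's rounding is meant to reduce.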
-
abs in Spark Scala: Compute the Absolute Value of a Numeric Column in a DataFrame
The abs function returns the absolute value of a numeric column — the value with its sign stripped. It works on any numeric type (integers, longs, doubles, decimals) and is most often used when you care about the magnitude of a number but not its direction.