Example Spark Scala Code and Functions

The Reference You Need

Spark Scala Examples

Simple Spark Scala examples to help you quickly complete your data ETL pipelines. Save time digging through the Spark Scala function API and get right to the code you need...

width_bucket in Spark Scala: Equiwidth Histogram Buckets in a DataFrame

2026-06-02 11:00:54 PM

width_bucket assigns a numeric value to an equiwidth histogram bucket given a range and a bucket count. It's the right tool when you need to bin continuous values into fixed-size groups — age brackets, price tiers, score ranges — without writing a chain of when expressions.
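
As a rough sketch of the bucketing described above, the snippet below bins a few ages into ten equal-width buckets over the range 0 to 100. The DataFrame, the `age` column, and the range are made up for illustration, and the function is called through `expr` on the assumption that the SQL form of `width_bucket` is available in your Spark version.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().master("local[*]").appName("width-bucket-sketch").getOrCreate()
import spark.implicits._

// Hypothetical sample data: three ages as doubles
val ages = Seq(5.0, 35.0, 99.0).toDF("age")

// Ten buckets of width 10 over [0, 100); bucket numbers start at 1
val bucketed = ages.withColumn("bucket", expr("width_bucket(age, 0, 100, 10)"))
bucketed.show()

val buckets = bucketed.select("bucket").as[Long].collect().toSeq
```

With these inputs, 5.0 should land in bucket 1, 35.0 in bucket 4, and 99.0 in bucket 10.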

pmod in Spark Scala: Positive Modulo for DataFrame Columns

2026-06-01 10:02:42 PM

The pmod function returns the positive remainder of dividing one column by another. Unlike the standard % operator, which mirrors the sign of the dividend, pmod keeps the result non-negative whenever the divisor is positive. That makes it the right tool for hash bucketing and any cyclic indexing where a negative remainder would point you at the wrong bucket.
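
To make the difference from `%` concrete, here is a minimal sketch that computes both on the same column. The column name `n` and the divisor 3 are illustrative choices, not taken from the original post.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().master("local[*]").appName("pmod-sketch").getOrCreate()
import spark.implicits._

// Hypothetical sample data, including negative dividends
val nums = Seq(-7, -1, 0, 5).toDF("n")

val compared = nums.select(
  $"n",
  ($"n" % 3).as("remainder"),     // follows the sign of the dividend
  pmod($"n", lit(3)).as("bucket") // stays in 0..2 for a positive divisor
)
compared.show()

val remainders = compared.select("remainder").as[Int].collect().toSeq
val pmodBuckets = compared.select("bucket").as[Int].collect().toSeq
```

For -7 the `%` column holds -1 while the `pmod` column holds 2, which is the value you would want as a bucket index.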

hypot in Spark Scala: Compute the Hypotenuse of Two DataFrame Columns

2026-05-31 10:53:11 PM

The hypot function computes sqrt(a² + b²) for two numeric inputs without the intermediate overflow or underflow that a naive implementation would produce. It's the standard tool for distances between points, vector magnitudes, and anywhere the Pythagorean theorem applies.
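
A small sketch of the distance use case, assuming two illustrative columns `dx` and `dy` that hold the legs of a right triangle:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().master("local[*]").appName("hypot-sketch").getOrCreate()
import spark.implicits._

// Hypothetical sample data: horizontal and vertical offsets
val offsets = Seq((3.0, 4.0), (5.0, 12.0)).toDF("dx", "dy")

// hypot computes sqrt(dx^2 + dy^2) in one call
val withDistance = offsets.withColumn("distance", hypot($"dx", $"dy"))
withDistance.show()

val distances = withDistance.select("distance").as[Double].collect().toSeq
```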

signum and sign in Spark Scala: Get the Sign of a Numeric DataFrame Column

2026-05-31 10:33:12 PM

signum returns -1.0 for negative numbers, 0.0 for zero, and 1.0 for positive numbers. It's useful when you care about the direction of a value but not its magnitude: flagging gains vs. losses, classifying deltas, or branching on the sign of a difference. Spark SQL also offers sign as an alias for signum, plus the related negative and positive functions, which negate a column and return it unchanged, respectively.
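
The sketch below applies `signum` to a made-up `delta` column and calls the `sign` alias through `expr`, since the SQL form is the one assumed to be available everywhere.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().master("local[*]").appName("signum-sketch").getOrCreate()
import spark.implicits._

// Hypothetical sample data: a loss, no change, and a gain
val deltas = Seq(-12.5, 0.0, 3.0).toDF("delta")

val signed = deltas
  .withColumn("direction", signum($"delta"))
  .withColumn("direction_sql", expr("sign(delta)")) // SQL alias for signum
signed.show()

val directions = signed.select("direction").as[Double].collect().toSeq
val directionsSql = signed.select("direction_sql").as[Double].collect().toSeq
```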

factorial in Spark Scala: Compute the Factorial of an Integer Column in a DataFrame

2026-05-30 10:30:27 PM

The factorial function returns the factorial of an integer column — the product of all positive integers up to and including the input value (n! = n × (n-1) × ... × 2 × 1). It's useful anywhere you need to compute permutations, combinations, or other counting expressions inline in a DataFrame.
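
Here is a minimal sketch on an illustrative column `n`. It also includes 21 to show the upper limit: the result is a 64-bit integer, so Spark returns null for inputs above 20 (and for negative inputs) rather than overflowing.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().master("local[*]").appName("factorial-sketch").getOrCreate()
import spark.implicits._

// Hypothetical sample data; 21 is outside the supported range of 0 to 20
val ns = Seq(0, 5, 20, 21).toDF("n")

val withFactorial = ns.withColumn("n_factorial", factorial($"n"))
withFactorial.show()

// Collected as Option[Long] because out-of-range inputs produce null
val factorials = withFactorial.select("n_factorial").as[Option[Long]].collect().toSeq
```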

degrees and radians in Spark Scala: Convert Between Angle Units on DataFrame Columns

2026-05-28 10:28:24 PM

The degrees and radians functions convert DataFrame columns between the two ways of measuring angles. radians turns degrees into radians; degrees does the reverse. They're the unit-conversion helpers you reach for whenever your data is in degrees but you need to feed it into Spark's trig functions, which all expect radians.
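
A short sketch of the typical pattern, assuming an illustrative `angle_deg` column: convert to radians, feed the result to a trig function, and convert back to check the round trip.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().master("local[*]").appName("degrees-radians-sketch").getOrCreate()
import spark.implicits._

// Hypothetical sample data: angles measured in degrees
val angles = Seq(0.0, 90.0, 180.0).toDF("angle_deg")

val converted = angles
  .withColumn("angle_rad", radians($"angle_deg"))  // degrees -> radians
  .withColumn("cosine", cos($"angle_rad"))         // trig functions expect radians
  .withColumn("back_to_deg", degrees($"angle_rad")) // radians -> degrees
converted.show()

val rows = converted.select("angle_deg", "cosine", "back_to_deg").as[(Double, Double, Double)].collect().toSeq
```

The cosine column should be close to 1, 0, and -1, with the usual floating-point noise at 90 degrees.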

Trigonometric Functions in Spark Scala: sin, cos, tan and More on DataFrame Columns

2026-05-26 10:57:35 PM

Spark Scala exposes a full set of trigonometric functions as DataFrame column functions, most of them backed by java.lang.Math: the basics (sin, cos, tan), their inverses (asin, acos, atan, atan2), the reciprocals (cot, csc, sec), and the hyperbolic functions (sinh, cosh, tanh) with their inverses (asinh, acosh, atanh). Angles are always in radians, not degrees, whether they are passed in or returned; use the radians function to convert if your data is in degrees.
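
As one example of the inverse functions, the sketch below uses `atan2` to turn made-up `x` and `y` coordinates into an angle. The result comes back in radians, so it is wrapped in `degrees` for readability.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().master("local[*]").appName("trig-sketch").getOrCreate()
import spark.implicits._

// Hypothetical sample data: points in the first and second quadrants
val points = Seq((1.0, 1.0), (-1.0, 1.0)).toDF("x", "y")

val withAngle = points
  .withColumn("angle_rad", atan2($"y", $"x")) // note the argument order: y first, then x
  .withColumn("angle_deg", degrees($"angle_rad"))
withAngle.show()

val anglesDeg = withAngle.select("angle_deg").as[Double].collect().toSeq
```

The two points should come out at roughly 45 and 135 degrees.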

exp and expm1 in Spark Scala: Exponential Functions on DataFrame Columns

2026-05-25 10:38:01 PM

Spark provides two exponential functions: exp computes e^x and expm1 computes e^x - 1. They're the inverses of log and log1p respectively, and you'll reach for them whenever you need to undo a log transform, compute compound growth, or work with continuous decay.
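
The inverse relationship can be sketched as a round trip: apply `log1p`, then undo it with `expm1`. The column names and sample values are illustrative, and the 5% growth rate is an arbitrary assumption.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().master("local[*]").appName("exp-sketch").getOrCreate()
import spark.implicits._

// Hypothetical sample data, including a value very close to zero
val values = Seq(0.0, 1e-10, 9.0).toDF("x")

val roundTrip = values
  .withColumn("logged", log1p($"x"))            // log(1 + x)
  .withColumn("restored", expm1($"logged"))     // e^logged - 1, undoing log1p
  .withColumn("growth", exp(lit(0.05) * $"x"))  // continuous growth factor e^(0.05 * x)
roundTrip.show(truncate = false)

val pairs = roundTrip.select("x", "restored").as[(Double, Double)].collect().toSeq
```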

log, log2, log10, log1p, and ln in Spark Scala: Logarithms in a DataFrame

2026-05-24 10:34:33 PM

Spark provides a family of logarithm functions: log for natural log or an arbitrary base, log2 and log10 for the two most common bases, log1p for accurate results near zero, and ln (SQL-only) as an alias for the natural log. They all return Double and return null rather than raising errors for inputs outside their domain: non-positive values, or values at or below -1 for log1p.
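
A minimal sketch of the family on one illustrative column, including two non-positive inputs to show the null behavior:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().master("local[*]").appName("log-sketch").getOrCreate()
import spark.implicits._

// Hypothetical sample data: two inputs outside the domain, one inside
val values = Seq(-1.0, 0.0, 8.0).toDF("x")

val logs = values
  .withColumn("natural", log($"x"))      // natural log
  .withColumn("base2", log2($"x"))
  .withColumn("base10", log10($"x"))
  .withColumn("base4", log(4.0, $"x"))   // arbitrary base: base first, then the column
logs.show()

// Collected as Option[Double] because non-positive inputs produce null
val base2 = logs.select("base2").as[Option[Double]].collect().toSeq
```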

sqrt, cbrt, and pow in Spark Scala: Square Roots, Cube Roots, and Powers in a DataFrame

2026-05-24 10:14:05 PM

The sqrt, cbrt, and pow functions compute square roots, cube roots, and arbitrary powers of numeric columns. They return Double regardless of input type and behave like Java's Math.sqrt, Math.cbrt, and Math.pow — including how they handle negative inputs and special values like NaN.
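
The sketch below runs all three on a made-up column that includes a negative value, where the Java-like behavior shows: `sqrt` yields NaN while `cbrt` still returns a real root.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().master("local[*]").appName("roots-sketch").getOrCreate()
import spark.implicits._

// Hypothetical sample data: a negative value and a perfect cube
val values = Seq(-8.0, 27.0).toDF("x")

val roots = values
  .withColumn("square_root", sqrt($"x")) // NaN for negative input, as with Math.sqrt
  .withColumn("cube_root", cbrt($"x"))   // defined for negative input
  .withColumn("squared", pow($"x", 2.0))
roots.show()

val rootRows = roots.select("square_root", "cube_root", "squared").as[(Double, Double, Double)].collect().toSeq
```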