Spark Scala Bin and Conv

bin returns the binary string representation of a long integer. conv is the more general tool: it converts a number string from one base to another, covering decimal, hex, octal, binary, or any other base from 2 to 36. Use them when you're working with bit flags, color codes, or any data where the representation matters as much as the value.

Converting longs to binary with bin

def bin(e: Column): Column

def bin(columnName: String): Column

bin takes a long integer column and returns its binary representation as a string. The two overloads are equivalent — one takes a Column, the other takes a column name. There's no leading zero padding; the output is just the minimal binary digits needed.

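import org.apache.spark.sql.functions.{bin, col}
import spark.implicits._ // assumes a SparkSession named spark is in scope, as in spark-shell

val df = Seq(
  0L,
  5L,
  12L,
  255L,
  1024L,
).toDF("value")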

val df2 = df
  .withColumn("binary", bin(col("value")))

df2.show(false)
// +-----+-----------+
// |value|binary     |
// +-----+-----------+
// |0    |0          |
// |5    |101        |
// |12   |1100       |
// |255  |11111111   |
// |1024 |10000000000|
// +-----+-----------+

5 becomes 101, 12 becomes 1100, and 255 (the largest value that fits in an unsigned byte) becomes eight ones.
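
Because bin emits only the minimal digits, fixed-width output needs explicit padding. Here is a minimal sketch, assuming an 8-bit width is what you want and using Spark's lpad. The local SparkSession setup and the binary8 column name are illustrative choices, not part of the example above.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{bin, lpad}

// Local session for the sketch; in spark-shell a session named `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("bin-lpad-sketch").getOrCreate()
import spark.implicits._

val padded = Seq(0L, 5L, 12L, 255L).toDF("value")
  // bin("value") is the column-name overload; lpad left-fills with "0" up to 8 characters.
  .withColumn("binary8", lpad(bin("value"), 8, "0"))

padded.show(false)
// +-----+--------+
// |value|binary8 |
// +-----+--------+
// |0    |00000000|
// |5    |00000101|
// |12   |00001100|
// |255  |11111111|
// +-----+--------+
```

lpad also truncates: a value whose binary form is longer than the target width would be cut to its first 8 characters, so pick a width that covers your largest value.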

Negative numbers and nulls

bin operates on the underlying 64-bit two's complement representation, so negative numbers come out as 64-bit binary strings. Null inputs produce null outputs.

val df = Seq(
  ("Alice", Some(7L)),
  ("Bob",   None),
  ("Carol", Some(-1L)),
  ("Dave",  Some(42L)),
).toDF("name", "flags")

val df2 = df
  .withColumn("flags_binary", bin(col("flags")))

df2.show(false)
// +-----+-----+----------------------------------------------------------------+
// |name |flags|flags_binary                                                    |
// +-----+-----+----------------------------------------------------------------+
// |Alice|7    |111                                                             |
// |Bob  |null |null                                                            |
// |Carol|-1   |1111111111111111111111111111111111111111111111111111111111111111|
// |Dave |42   |101010                                                          |
// +-----+-----+----------------------------------------------------------------+

-1 becomes 64 ones — that's the two's complement representation of -1 in a 64-bit signed integer. If you only care about positive values, this is rarely an issue, but it's worth knowing if you might pass negative numbers in.
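
If negative values can show up and you only care about a fixed number of low bits, one option is to mask the column before calling bin. The sketch below assumes the flags fit in the low 8 bits. The 0xFF mask, the local SparkSession setup, and the low_byte_binary column name are illustrative assumptions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{bin, col}

// Local session for the sketch; in spark-shell a session named `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("bin-mask-sketch").getOrCreate()
import spark.implicits._

val masked = Seq(
  ("Alice", Some(7L)),
  ("Bob",   None),
  ("Carol", Some(-1L)),
  ("Dave",  Some(42L))
).toDF("name", "flags")
  // Keep only the low 8 bits (0xFF = 255) before converting to binary.
  .withColumn("low_byte_binary", bin(col("flags").bitwiseAND(0xFFL)))

masked.show(false)
// Alice -> 111, Bob -> null, Carol -> 11111111, Dave -> 101010
```

After masking, -1 yields eight ones instead of 64, and nulls still pass through as null.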

Converting between arbitrary bases with conv

def conv(num: Column, fromBase: Int, toBase: Int): Column

conv is the general-purpose base converter. Pass a string column containing a number, the base it's currently in, and the base you want it converted to. Both bases can be anywhere from 2 to 36.

The input is a string column, not a numeric one, because digits above 9 (such as A-F in hex) can't be stored in a numeric column. The result is a string column for the same reason.

val df = Seq(
  "0",
  "10",
  "255",
  "1024",
  "65535",
).toDF("decimal")

val df2 = df
  .withColumn("binary", conv(col("decimal"), 10, 2))
  .withColumn("hex",    conv(col("decimal"), 10, 16))
  .withColumn("octal",  conv(col("decimal"), 10,  8))

df2.show(false)
// +-------+----------------+----+------+
// |decimal|binary          |hex |octal |
// +-------+----------------+----+------+
// |0      |0               |0   |0     |
// |10     |1010            |A   |12    |
// |255    |11111111        |FF  |377   |
// |1024   |10000000000     |400 |2000  |
// |65535  |1111111111111111|FFFF|177777|
// +-------+----------------+----+------+

The same input produces different representations depending on the target base. Hex output uses uppercase A-F.
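
Two small variations, sketched under the assumption that you want lowercase hex or a base above 16. Wrapping conv in lower gives lowercase digits, and a target base of 36 uses the full 0-9 and A-Z digit set. The local SparkSession setup and the column names hex_lower and base36 are illustrative.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, conv, lower}

// Local session for the sketch; in spark-shell a session named `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("conv-bases-sketch").getOrCreate()
import spark.implicits._

val other = Seq("255", "65535").toDF("decimal")
  // conv emits uppercase digits; lower() converts them afterwards.
  .withColumn("hex_lower", lower(conv(col("decimal"), 10, 16)))
  // Base 36 is the largest base conv supports.
  .withColumn("base36", conv(col("decimal"), 10, 36))

other.show(false)
// +-------+---------+------+
// |decimal|hex_lower|base36|
// +-------+---------+------+
// |255    |ff       |73    |
// |65535  |ffff     |1EKF  |
// +-------+---------+------+
```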

Converting hex back to decimal or binary

conv is symmetric — flip the fromBase and toBase arguments to go the other direction. Hex strings are common in color codes, memory addresses, and binary data dumps.

val df = Seq(
  "FF",
  "1A",
  "DEADBEEF",
  "100",
).toDF("hex")

val df2 = df
  .withColumn("decimal", conv(col("hex"), 16, 10))
  .withColumn("binary",  conv(col("hex"), 16,  2))

df2.show(false)
// +--------+----------+--------------------------------+
// |hex     |decimal   |binary                          |
// +--------+----------+--------------------------------+
// |FF      |255       |11111111                        |
// |1A      |26        |11010                           |
// |DEADBEEF|3735928559|11011110101011011011111011101111|
// |100     |256       |100000000                       |
// +--------+----------+--------------------------------+

Note that 100 in hex is 256 in decimal — the input is interpreted in the source base, not as the literal string of digits.
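
As a worked example of the color-code use case, here is a sketch that splits a #RRGGBB string into its three channels. It assumes every input has exactly that seven-character shape. The sample colors, the local SparkSession setup, and the column names r, g, and b are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, conv, substring}

// Local session for the sketch; in spark-shell a session named `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("conv-color-sketch").getOrCreate()
import spark.implicits._

val rgb = Seq("#FF8800", "#1A2B3C").toDF("color")
  // substring is 1-based: position 2 skips the leading '#'.
  // conv returns a string, so cast to int to get a numeric channel value.
  .withColumn("r", conv(substring(col("color"), 2, 2), 16, 10).cast("int"))
  .withColumn("g", conv(substring(col("color"), 4, 2), 16, 10).cast("int"))
  .withColumn("b", conv(substring(col("color"), 6, 2), 16, 10).cast("int"))

rgb.show(false)
// +-------+---+---+---+
// |color  |r  |g  |b  |
// +-------+---+---+---+
// |#FF8800|255|136|0  |
// |#1A2B3C|26 |43 |60 |
// +-------+---+---+---+
```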

Converting binary strings

conv also handles binary strings — useful when you have a stored representation and want to recover the original number or convert it to a more compact form.

val df = Seq(
  "10",
  "1010",
  "11111111",
  "100000000000",
).toDF("binary")

val df2 = df
  .withColumn("decimal", conv(col("binary"), 2, 10))
  .withColumn("hex",     conv(col("binary"), 2, 16))

df2.show(false)
// +------------+-------+---+
// |binary      |decimal|hex|
// +------------+-------+---+
// |10          |2      |2  |
// |1010        |10     |A  |
// |11111111    |255    |FF |
// |100000000000|2048   |800|
// +------------+-------+---+
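
Since conv returns a string, recovering a usable number takes one more cast. The sketch below parses binary strings into a long column and then feeds that back through bin as a round trip. The local SparkSession setup and the column names as_long and back_to_binary are illustrative.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{bin, col, conv}

// Local session for the sketch; in spark-shell a session named `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("conv-roundtrip-sketch").getOrCreate()
import spark.implicits._

val roundTrip = Seq("1010", "00001010", "11111111").toDF("binary")
  // Parse base 2 into a decimal string, then cast to a real long.
  .withColumn("as_long", conv(col("binary"), 2, 10).cast("long"))
  // bin works on the long directly and emits the minimal digits.
  .withColumn("back_to_binary", bin(col("as_long")))

roundTrip.show(false)
// +--------+-------+--------------+
// |binary  |as_long|back_to_binary|
// +--------+-------+--------------+
// |1010    |10     |1010          |
// |00001010|10     |1010          |
// |11111111|255    |11111111      |
// +--------+-------+--------------+
```

The round trip drops leading zeros, so the zero-padded input in the second row does not come back character for character.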

bin vs conv

For non-negative values, bin(x) gives the same result as conv(x.cast("string"), 10, 2): both produce the minimal binary representation. The differences:

  • Input type: bin takes a long column directly. conv requires a string column.
  • Negative numbers: bin uses 64-bit two's complement, so negative numbers become 64-character strings. conv accepts a leading minus sign, but with a positive toBase it renders the result as an unsigned 64-bit value; a negative toBase requests signed output instead (the Spark SQL documentation gives conv('-10', 16, -10) = -16).
  • Flexibility: conv can target any base from 2 to 36; bin only produces binary.

If you have a numeric column and just want binary, use bin. If you need any other base or you're starting from a string, use conv.
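
The two routes to binary can be checked side by side. This is a sketch for non-negative longs only. The local SparkSession setup and the column names via_bin, via_conv, and same are illustrative.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{bin, col, conv}

// Local session for the sketch; in spark-shell a session named `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("bin-vs-conv-sketch").getOrCreate()
import spark.implicits._

val compared = Seq(0L, 5L, 255L, 1024L).toDF("value")
  .withColumn("via_bin", bin(col("value")))
  // conv works on strings, so cast the long to a decimal string first.
  .withColumn("via_conv", conv(col("value").cast("string"), 10, 2))
  .withColumn("same", col("via_bin") === col("via_conv"))

compared.show(false)
// Every row in this sample has same = true.
```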

For converting integers and strings to hexadecimal specifically, see hex and unhex. For other low-level encoding tasks, see base64 and unbase64 and decode and encode.

Example Details

Created: 2026-04-27 10:48:49 PM

Last Updated: 2026-04-27 10:48:49 PM