The Reference You Need
Spark Scala Examples
Simple Spark Scala examples to help you quickly complete your data ETL pipelines. Save time digging through the Spark Scala function API and get right to the code you need.
-
String Length Functions in Spark Scala: length, bit_length, and octet_length
Spark provides three functions for measuring string size: length counts characters, octet_length counts bytes, and bit_length counts bits. For ASCII text they all agree, but they diverge once you have Unicode characters — which matters whenever you're validating input lengths or working with encoded data.
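A minimal sketch of the difference, assuming a local SparkSession and made-up sample data (the accented string is just illustrative):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{bit_length, col, length, octet_length}

val spark = SparkSession.builder().master("local[*]").appName("string-lengths").getOrCreate()
import spark.implicits._

// "résumé" is 6 characters, but the two accented characters each take 2 bytes in UTF-8
val df = Seq("abc", "résumé").toDF("s")

val measured = df.select(
  col("s"),
  length(col("s")).as("chars"),       // character count
  octet_length(col("s")).as("bytes"), // UTF-8 byte count
  bit_length(col("s")).as("bits")     // byte count * 8
)
measured.show()
// "abc"    -> 3 chars, 3 bytes, 24 bits
// "résumé" -> 6 chars, 8 bytes, 64 bits
```

octet_length is the one to reach for when enforcing storage limits defined in bytes rather than characters.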
-
Split Function: Split Strings into Arrays in Spark Scala DataFrames
split breaks a string column on a delimiter or regular expression pattern and returns an ArrayType column. It's the go-to function whenever you need to turn a delimited string — like a CSV field, a tag list, or a log line — into individual elements you can work with.
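A quick sketch, assuming a local SparkSession and a made-up comma-delimited column:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, split}

val spark = SparkSession.builder().master("local[*]").appName("split-example").getOrCreate()
import spark.implicits._

val df = Seq("spark,scala,etl").toDF("tags")

// The second argument is a regular expression, so escape characters like "." or "|"
val parts = df.select(split(col("tags"), ",").as("tag_array"))
parts.show(truncate = false)
// [spark, scala, etl]
```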
-
Substring Functions: substring and substring_index in Spark Scala DataFrames
substring and substring_index are two complementary ways to extract a portion of a string column. Use substring when you know the character position; use substring_index when you want to split on a delimiter.
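A sketch of both approaches side by side, assuming a local SparkSession and a made-up hostname column:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, substring, substring_index}

val spark = SparkSession.builder().master("local[*]").appName("substring-example").getOrCreate()
import spark.implicits._

val df = Seq("www.example.com").toDF("host")

val extracted = df.select(
  substring(col("host"), 1, 3).as("prefix"),           // 1-based position, length 3 -> "www"
  substring_index(col("host"), ".", 2).as("left_two"), // text before the 2nd "." -> "www.example"
  substring_index(col("host"), ".", -1).as("tld")      // negative count works from the right -> "com"
)
extracted.show(truncate = false)
```

Note that substring positions are 1-based, not 0-based.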
-
Lpad and Rpad: Pad Strings in Spark Scala DataFrames
The lpad and rpad functions pad a string column to a specified length by adding characters to the left or right side. They're commonly used for zero-padding numbers, aligning text output, and formatting fixed-width fields.
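A minimal sketch, assuming a local SparkSession and made-up sample columns:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lpad, rpad}

val spark = SparkSession.builder().master("local[*]").appName("pad-example").getOrCreate()
import spark.implicits._

val df = Seq(("42", "id")).toDF("num", "label")

val padded = df.select(
  lpad(col("num"), 5, "0").as("zero_padded"),   // "00042"
  rpad(col("label"), 6, ".").as("right_padded") // "id...."
)
padded.show()
```

Both functions truncate strings that are already longer than the target length, which is worth checking before relying on them for fixed-width output.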
-
InitCap: Convert Strings to Title Case in Spark Scala DataFrames
The initcap function converts a string column to title case — capitalizing the first letter of each word and lowercasing the rest. It's useful for normalizing names, addresses, and other text where consistent capitalization matters.
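A quick sketch, assuming a local SparkSession and a made-up name column:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, initcap}

val spark = SparkSession.builder().master("local[*]").appName("initcap-example").getOrCreate()
import spark.implicits._

val df = Seq("jOHN q. SMITH").toDF("name")

val titled = df.select(initcap(col("name")).as("clean_name"))
titled.show()
// "John Q. Smith"
```

Words are delimited by whitespace, so hyphenated or apostrophe-containing names (e.g. "mary-anne") only get their first letter capitalized.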
-
Lower and Upper: Convert String Case in Spark Scala DataFrames
The lower and upper functions convert string columns to lowercase and uppercase respectively. They're commonly used to normalize data for case-insensitive comparisons and consistent formatting.
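A minimal sketch showing both functions plus the case-insensitive-comparison use case, assuming a local SparkSession and made-up email columns:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lower, upper}

val spark = SparkSession.builder().master("local[*]").appName("case-example").getOrCreate()
import spark.implicits._

val df = Seq(("Alice@Example.COM", "alice@example.com")).toDF("a", "b")

val compared = df.select(
  lower(col("a")).as("lower_a"),
  upper(col("a")).as("upper_a"),
  // normalize both sides before comparing for a case-insensitive match
  (lower(col("a")) === lower(col("b"))).as("same_ignoring_case")
)
compared.show(truncate = false)
```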
-
SBT Assembly Jar Naming in Shell Scripting
When using sbt in Spark Scala projects, it can be useful to access the name of the assembly jar that sbt will create from within your bash or zsh shell scripts. Thankfully, extracting the value is rather straightforward.
-
Building MapType Columns in Spark Scala DataFrames for Enhanced Data Structuring
Using a MapType in Spark Scala DataFrames can be helpful, as it provides a flexible logical structure for solving problems such as machine learning feature engineering, data exploration, serialization, data enrichment, and denormalization. Thankfully, building these map columns from existing data within a DataFrame is very straightforward.
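One common pattern is packing several existing columns into a single map column with the map() function. A sketch, assuming a local SparkSession and made-up feature columns:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lit, map}

val spark = SparkSession.builder().master("local[*]").appName("maptype-example").getOrCreate()
import spark.implicits._

val df = Seq(("u1", 0.25, 0.75)).toDF("user_id", "clicks_norm", "views_norm")

// map() takes alternating key/value columns; lit() turns the literal key names into columns
val features = df.select(
  col("user_id"),
  map(
    lit("clicks_norm"), col("clicks_norm"),
    lit("views_norm"), col("views_norm")
  ).as("features")
)
features.printSchema() // features: map<string,double>
```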
-
Converting a Map Type to a JSON String in Spark Scala
Using a MapType in Spark Scala DataFrames provides a more flexible logical structure, supporting hierarchical data and arbitrary data attributes; converting that map to a JSON string makes it easy to persist or pass downstream.
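The to_json function handles the conversion directly. A sketch, assuming Spark 2.4 or later (where to_json accepts MapType columns), a local SparkSession, and made-up preference columns:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lit, map, to_json}

val spark = SparkSession.builder().master("local[*]").appName("map-to-json").getOrCreate()
import spark.implicits._

// build a map column from plain columns, then serialize it to a JSON string
val prefs = Seq(("u1", "dark", "en")).toDF("user_id", "theme", "lang")
  .select(
    col("user_id"),
    map(lit("theme"), col("theme"), lit("lang"), col("lang")).as("prefs")
  )

val asJson = prefs.select(col("user_id"), to_json(col("prefs")).as("prefs_json"))
asJson.show(truncate = false)
// prefs_json: {"theme":"dark","lang":"en"}
```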
-
Spark Scala isin Function Examples
The isin function is defined on a Spark Column and is used to filter rows in a DataFrame or Dataset.
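A minimal sketch, assuming a local SparkSession and a made-up country column:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder().master("local[*]").appName("isin-example").getOrCreate()
import spark.implicits._

val df = Seq(("a", "US"), ("b", "FR"), ("c", "CA")).toDF("id", "country")

// keep rows whose country matches any value in the list
val northAmerica = df.where(col("country").isin("US", "CA"))

// negate with ! for a NOT IN filter
val everywhereElse = df.where(!col("country").isin("US", "CA"))
```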