Separating logic from Spark engine-unit testing

Let's start by separating logic from the Spark engine.

In this section, we will cover the following topics:

  • Creating a component with logic
  • Unit testing of that component
  • Using the case class from the model class for our domain logic

Let's look at the logic first and then the simple test.

We have a BonusVerifier object with only one method, qualifyForBonus, which takes our UserTransaction model class. According to the logic in the following code, we load user transactions and keep only the users that qualify for a bonus. To test this through Spark, we would need to create a SparkSession, build mock data for an RDD or DataFrame, and then exercise the whole Spark API. Since this is plain logic, we will test it in isolation instead. The logic is as follows:

package com.tomekl007.chapter_6

import com.tomekl007.UserTransaction

object BonusVerifier {
  // User IDs that are eligible for a bonus
  private val superUsers = List("A", "X", "100-million")

  // Qualifies only if the user is a super user and the amount is strictly above 100
  def qualifyForBonus(userTransaction: UserTransaction): Boolean = {
    superUsers.contains(userTransaction.userId) && userTransaction.amount > 100
  }
}

We have a list of super users with the A, X, and 100-million user IDs. If userTransaction.userId is in the superUsers list and userTransaction.amount is higher than 100, then the user qualifies for a bonus; otherwise, they don't. In the real world, bonus qualification logic will be even more complex, so it is very important to test it in isolation.
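Because qualifyForBonus is a plain function from UserTransaction to Boolean, it can serve as a predicate wherever a filter is accepted. The following is a minimal sketch that applies it to an ordinary Scala List instead of an RDD (Spark's RDD.filter accepts the same kind of T => Boolean predicate). The FilterSketch object and the sample transactions are invented for illustration, and the two definitions from the text are repeated so that the snippet is self-contained:

```scala
// Definitions repeated from the text so that the sketch is self-contained.
case class UserTransaction(userId: String, amount: Int)

object BonusVerifier {
  private val superUsers = List("A", "X", "100-million")

  def qualifyForBonus(userTransaction: UserTransaction): Boolean =
    superUsers.contains(userTransaction.userId) && userTransaction.amount > 100
}

// Hypothetical driver object; the sample data below is invented.
object FilterSketch {
  val transactions: List[UserTransaction] = List(
    UserTransaction("A", 150),
    UserTransaction("X", 99),
    UserTransaction("some_new_user", 100000)
  )

  // The method is passed as a predicate; no Spark classes are involved.
  val qualified: List[UserTransaction] =
    transactions.filter(BonusVerifier.qualifyForBonus)

  def main(args: Array[String]): Unit =
    qualified.foreach(println)
}
```

Only the transaction for user A remains in qualified: X is a super user but spent less than the threshold, and some_new_user is not in the super user list.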

Our test works with the UserTransaction model, which carries a userId and an amount. The following example shows our domain model classes, which are shared between the Spark integration tests and the unit tests that run without Spark:

package com.tomekl007

import java.util.UUID

case class UserData(userId: String, data: String)

case class UserTransaction(userId: String, amount: Int)

case class InputRecord(uuid: String = UUID.randomUUID().toString(), userId: String)
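Case classes keep such tests short because they provide structural equality, copy, and default arguments out of the box. The following sketch illustrates this with two of the model classes, repeated here so that it is self-contained; the ModelSketch object and its values are hypothetical and not part of the original code:

```scala
import java.util.UUID

// Model classes repeated from the text.
case class UserTransaction(userId: String, amount: Int)
case class InputRecord(uuid: String = UUID.randomUUID().toString(), userId: String)

// Hypothetical object used only to demonstrate case class behavior.
object ModelSketch {
  // Structural equality: instances with the same field values are equal.
  val sameFields: Boolean = UserTransaction("X", 101) == UserTransaction("X", 101)

  // copy builds a new instance with one field changed.
  val lowered: UserTransaction = UserTransaction("X", 101).copy(amount = 99)

  // uuid has a default value; since it is the first parameter,
  // userId is passed by name to let the default apply.
  val record: InputRecord = InputRecord(userId = "A")
}
```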

We need to create our UserTransaction for user ID X and the amount 101, as shown in the following example:

package com.tomekl007.chapter_6

import com.tomekl007.UserTransaction
// FunSuite is in org.scalatest in ScalaTest 3.0.x;
// newer releases provide org.scalatest.funsuite.AnyFunSuite instead.
import org.scalatest.FunSuite

class SeparatingLogic extends FunSuite {

  test("test complex logic separately from spark engine") {
    //given
    val userTransaction = UserTransaction("X", 101)

    //when
    val res = BonusVerifier.qualifyForBonus(userTransaction)

    //then
    assert(res)
  }
}

We then pass userTransaction to qualifyForBonus, and the result should be true: this user qualifies for a bonus.

Now, let's write a test for the negative use case, as follows:

test(testName = "test complex logic separately from spark engine - non qualify") {
  //given
  val userTransaction = UserTransaction("X", 99)

  //when
  val res = BonusVerifier.qualifyForBonus(userTransaction)

  //then
  assert(!res)
}

Here, we have a user, X, who spends 99, so the result should be false. When we run the test, it passes.

We have covered two cases, but real-world scenarios have many more. For example, consider a userId that is not in the superUsers list: some_new_user spends a lot of money, in our case 100000. The test for this case is as follows:

test(testName = "test complex logic separately from spark engine - non qualify2") {
  //given
  val userTransaction = UserTransaction("some_new_user", 100000)

  //when
  val res = BonusVerifier.qualifyForBonus(userTransaction)

  //then
  assert(!res)
}

We assume that such a user should not qualify, no matter how much they spend. Rules like this make the logic a bit complex, which is why we check them with unit tests.
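As the number of cases grows, they can be collected into a table and checked in one pass. The following is a minimal sketch that uses plain Scala assertions rather than ScalaTest so that it runs on its own. The BonusCasesSketch object is hypothetical, and the row for an amount of exactly 100 is an added boundary case that follows from the strict > 100 comparison in qualifyForBonus:

```scala
// Definitions repeated from the text so that the sketch is self-contained.
case class UserTransaction(userId: String, amount: Int)

object BonusVerifier {
  private val superUsers = List("A", "X", "100-million")

  def qualifyForBonus(userTransaction: UserTransaction): Boolean =
    superUsers.contains(userTransaction.userId) && userTransaction.amount > 100
}

// Hypothetical table-driven check; each row pairs an input with its expected result.
object BonusCasesSketch {
  val cases: List[(UserTransaction, Boolean)] = List(
    UserTransaction("X", 101) -> true,
    UserTransaction("X", 99) -> false,
    UserTransaction("X", 100) -> false, // boundary: the amount must be strictly above 100
    UserTransaction("some_new_user", 100000) -> false
  )

  // Collects every row whose actual result differs from the expected one.
  def failures: List[(UserTransaction, Boolean)] =
    cases.filter { case (tx, expected) => BonusVerifier.qualifyForBonus(tx) != expected }

  def main(args: Array[String]): Unit =
    assert(failures.isEmpty, s"unexpected results: $failures")
}
```

Adding a new scenario then means adding one row to cases, and a failing row is reported together with its input.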

Our tests are very fast, so we can check that everything works as expected without introducing Spark at all. In the next section, we'll test the logic with integration testing using SparkSession.
