© Jason Lee Hodges 2019
J. L. Hodges, Software Engineering from Scratch, https://doi.org/10.1007/978-1-4842-5206-2_11

11. What Is Software Engineering?

Jason Lee Hodges, Draper, UT, USA

It is my hope that what you have learned in this book so far will come of great use to you as you grow technically and pursue a career in software engineering. You’ve learned about expressions and variable assignments, data types, basic program control flow, functions, classes, and several programming paradigms. These concepts are all crucial in their relevance to the practical application of programming. But what is software engineering and how is it different from programming?

Software engineering is the practice of stringently applying quantitative and disciplined principles and standards to the process of programming. This becomes increasingly important as the programs that you write grow in scale, complexity, and impact, and as collaboration becomes mandatory. In such scenarios, it is not enough to simply program, or "code"; instead, you must engineer. This is the important distinction between a software engineer and a programmer in terms of career title.

Consider, for example, that you have been charged with implementing a system to move people between floors in a very large building. In this theoretical scenario, let's pretend that the elevator has not yet been invented, but you decide to take a shot at a similar solution. At first glance, an automated pulley system seems the obvious answer. Fulfilling the request seems fairly simple: it requires only a passenger platform, an electric motor, a rope, and a pulley at the top and bottom of the floors you want to move the platform between. So, you create this simple system, run it a few times to make sure it works, and then open it up for others to use. It works beautifully and provides an amazing amount of efficiency for the building and its few tenants. However, tenants of other buildings get word of these efficiency gains and move into your building hoping to realize the same gains in their day-to-day operations. Suddenly, your elevator is being used by more and more passengers every day. It also starts to take on more and more concurrent passengers as the building fills up with tenants and people begin to get impatient. In hindsight, it is easy to see that the rope of the elevator is eventually going to break as it wears down and too many passengers crowd the platform. If such an event were to occur, a lot of passengers would likely be injured.

Applying engineering principles to the design and implementation of this theoretical elevator would account for the potential problems encountered in this scenario. Instead of only seeing the obvious simple solution and diving straight into implementation, an engineer might have considered the cost of both materials and operating expenses, analyzed all the risks, thoroughly tested with different load capacities and construction materials, and designed fail-safe mechanisms (like brakes) in the event of an emergency. Once satisfied with the result, that engineer would have provided direct communication to the building administration about the weight limit and passenger capacity for safe use, posted documentation of those limits in the elevator itself, and ensured that the system was over-engineered to operate above those limits for a reasonable margin of safety in the event that they were breached. Obviously not every problem in software engineering is worth that much planning, testing, and effort. However, many problems intended to be solved by a software application tend to grow over time. The best thing to do when considering how much testing and architecture planning is required for a particular software problem is to consider the potential risks in the solution and measure their likelihood of occurrence vs. their impact. Figure 11-1 illustrates a grid that can plot these risks.
../images/476847_1_En_11_Chapter/476847_1_En_11_Fig1_HTML.png
Figure 11-1

An illustration of plotting risks in software design based on their relative likelihood of occurrence vs. their impact if they do occur. More architecture and testing should be applied to risks that end up in the top right quadrant

It is fairly obvious to see that projects that fall in the top right quadrant require the strictest engineering practices. Examples of such projects might include creating the software that powers self-driving cars, the software behind robotic surgical equipment, the coordination software for air traffic control at airports, or the software that powers millions of dollars worth of trades on a stock exchange every day. What is less obvious is what the other three quadrants might contain. Some of the risks you are evaluating might have varying levels of importance relative to your perspective. Software that produces optimal driving routes to deliver pizza might have a high likelihood of producing less than optimal routes depending on traffic or weather conditions. However, the impact of these sub-optimal routes is low. No lives are at stake and only a minimal amount of transaction cost per delivery is on the line. But, if you have created a company whose sole business is to sell this routing service to pizza companies, you might not have a business if you continue to produce sub-optimal routes. So, do not take the evaluation of software risks lightly. Consider all possibilities and mitigate any significant risks with the following engineering strategies.

Efficiency and Optimization

The first engineering strategy to minimize risk in your software is to ensure that your programs are optimized for efficient performance. As your programs scale in size, complexity, and usage, how will your program respond? Determining how your program responds to scale is important in understanding its operating cost. Having the ability to measure the performance or the cost of a piece of software is often key in determining whether a business can be profitable or even possible. For example, if your software analyzes customer feedback to determine the sentiment of customer interactions with your product, is it possible to analyze every piece of feedback? How many computers or servers must your software run on in order to analyze each response? If your business grows and you continue to get more responses, will you have to buy more server space to run your software? If you do buy more computers, how much will it cost you each year to pay for the salaries and equipment to maintain these machines?

As you might have gathered from that example, the performance of your software can be a big risk in creating and maintaining a viable software solution. So, how do you measure the performance of a program? Do you measure the time it takes to run? That would be a good first impression, but wouldn't the time it takes to run depend on the computer it is running on? In order to standardize performance measurement regardless of computer hardware, software engineers turn to a process of complexity analysis known as Big O notation for measuring algorithmic efficiency. An algorithm is simply a set of repeatable instructions to accomplish a particular task. Big O notation measures the order of magnitude of the complexity of a specific algorithm or set of operations. It is expressed in terms of n, the length of the input data that the algorithm in question uses to accomplish its task. The more operations that occur on the data, the more complex the algorithm. Take the procedure in Listing 11-1, for example, in which we simply wish to print each item in a list to the screen.
val items = List("apples", "oranges", "bananas")
items.foreach(println)
Listing 11-1

A procedure that prints each item from a list to the terminal

In this procedure, we perform an operation on every item in a list of length n. Thus the Big O notation of this algorithm is expressed as O(n) (pronounced "Oh of n," since it describes a function O with input n). Since we touch every item in the list, the computer performs the print function n times. This is known as linear complexity: as the list grows, so does the amount of work the program has to perform. If you were to plot this performance on a graph, it would look like Figure 11-2. As you can see, there is a one-to-one linear relationship between the number of operations performed and the amount of data being put into the program.
../images/476847_1_En_11_Chapter/476847_1_En_11_Fig2_HTML.png
Figure 11-2

A depiction of linear algorithmic complexity

Some programs perform many more operations on their input data. If, for example, we transformed each item in our list of data to uppercase before we printed it to the screen, we would be performing two operations per item. This would be described as O(2n). However, in software engineering, we are often more concerned with how quickly the “operations performed” part of the graph grows rather than the minute specifics of each operation. What we actually care about is whether this program can be completed in seconds, minutes, hours, days, years, or even decades. The leaps between these types of measurements are known as orders of magnitude. When measuring algorithmic complexity, the order of magnitude is what is most important to measure. If it’s in the range of seconds, I don’t really care if it’s 20 seconds or 30 seconds. That level of specificity is less useful to me because that can vary depending on the hardware and the type of input data. Thus, we tend to drop any coefficients in measurements like O(2n) and generalize them to O(n) because they are still linear in complexity (just with a slightly different slope). The order of magnitude measurement of an algorithm tells me whether it is worth pursuing for my solution or if I should try something else. Listing 11-2 provides another example of a procedure where we can measure Big O notation to see a different order of magnitude.
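Before moving on to Listing 11-2, the two-operations-per-item variant described above can be sketched as follows (a minimal sketch of my own, not one of the book's listings):

```scala
// Each item is uppercased (operation 1) and then printed (operation 2).
// That is technically O(2n) operations, but dropping the coefficient
// leaves the same order of magnitude: linear, O(n).
val items = List("apples", "oranges", "bananas")
val uppercased = items.map(item => item.toUpperCase) // n operations
uppercased.foreach(println)                          // n more operations
```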
val items = List("apples", "oranges", "bananas", "apples")
items.foreach(item => {
      var duplicateCount = -1
      items.foreach(innerItem => {
            if (item == innerItem) {
                   duplicateCount += 1
            }
      })
      println(s"Duplicates of ${item}: ${duplicateCount}")
})
Listing 11-2

A procedure with quadratic growth in algorithmic complexity

In this example, we loop through each item of the list and then check it against every other item in the list to see if there are any duplicates. If there are, we increment the duplicate count by one. The accumulator, duplicateCount, starts at −1 because we expect to always find the item in question in the list at least once (itself); if it is found again, we can consider it a duplicate. This procedure is an example of an algorithm whose cost grows quadratically: as the list grows, we have to check each item against even more items in the list. This is expressed in Big O notation as O(n²). There are also algorithms that grow at factorial speed and algorithms that grow logarithmically, halving the remaining work at each step. Finally, there are algorithms that, no matter how much data you throw at them, always perform the same fixed amount of work. This is known as constant complexity and is expressed as O(1). Figure 11-3 adds depictions of these different growth rates, or orders of magnitude, to our linear complexity graph along with how they are expressed in Big O notation.
../images/476847_1_En_11_Chapter/476847_1_En_11_Fig3_HTML.png
Figure 11-3

Common orders of magnitude in Big O notation complexity analysis
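As a concrete illustration of logarithmic growth, consider binary search over a sorted collection. This example is a sketch of my own and does not appear in the chapter's listings:

```scala
// Binary search: each comparison discards half of the remaining search
// space, so the number of comparisons grows as O(log n).
def binarySearch(sorted: Vector[Int], target: Int): Option[Int] = {
  var low = 0
  var high = sorted.length - 1
  while (low <= high) {
    val mid = low + (high - low) / 2
    if (sorted(mid) == target) return Some(mid)
    else if (sorted(mid) < target) low = mid + 1
    else high = mid - 1
  }
  None // target is not present in the collection
}
```

Doubling the size of the input adds only one more comparison in the worst case, which is why logarithmic algorithms scale so well.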

Understanding how to measure the algorithmic complexity of your programs is the first step in optimizing them. Of the algorithms depicted in this figure, algorithms with O(1) complexity are the fastest and O(n!) algorithms are the slowest. In the coming chapters, you are going to see examples of data structures and algorithms that you might end up using in many of your programs. These will be demonstrated along with their Big O notation that you should seek to memorize in order to be best equipped to optimize your code and reduce the risk of creating slow or cost-prohibitive programs.

Exercise 11-1

For each of the following code snippets, determine the algorithmic complexity using Big O notation given the list of input items from Listing 11-2.
  1. println(items(1))

  2. val longFruit = items.filter(fruit => fruit == "bananas")

  3. items.map(fruit => fruit.toUpperCase()).find(fruit => fruit.length == 6)

Testing

The next engineering strategy to apply to your projects is meticulous testing through automated test scripts. Automated testing can help you determine if there are bugs in your code by ensuring that you not only know the outcome of the “golden path,” or the expected normal path of your program, but also the outcome of some of the unexpected edge cases. For example, what happens when you don’t provide your program with any data? What happens if you overload it with data? What happens if you give it unrecognizable special characters? In addition to the benefits of this exhaustive edge case exploration, testing can help you identify quickly if making changes to one part of your code breaks any other part of your code. By maintaining thorough test coverage, you can feel confident that when you ship your large-scale project, all of it will work as expected.

There are a couple of common types of tests typically used within the software engineering industry. The first is what is known as a unit test. A unit test is a small test script that tests the outcome of one function (or one small block of code) within your program. A unit test relies on the code being tested to have no external dependencies, like network or database connections or calls to other functions. Unit tests work really well with pure functions since giving it a certain input should always yield the same output. The other type of common test is known as an integration test. Integration tests typically aim to determine how different parts of your program interact with one another. For example, by making a call to one function, how does that impact the state of your entire application? These are typically much more complicated to write and maintain over time and are usually much less performant.
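To see why pure functions pair so well with unit tests, compare a pure function with an impure one (a hypothetical sketch, not part of the NebulaOS project):

```scala
// Pure: the return value depends only on the inputs, so a test can
// assert on the output directly with no setup or mocking.
def add(a: Int, b: Int): Int = a + b

// Impure: printing is a side effect and the return type is Unit,
// which leaves a unit test nothing meaningful to assert against.
def addAndPrint(a: Int, b: Int): Unit = println(a + b)
```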

For the purpose of simplicity, we are going to focus on demonstrating only unit tests in this section, and we will be using our NebulaOS project as an example. Doing so requires some additional tooling that you will also find convenient for other parts of your development experience. Unit tests can be written simply as plain functions; however, there are very good testing suites maintained by the Scala community. One such suite is called ScalaTest. In order to use ScalaTest as a dependency in your project, you will need to download a dependency management and project build tool called SBT (Simple Build Tool). In addition to enabling you to pull down ScalaTest from a remote repository to use in your project, this tool will allow you to pull down any external, third-party library that you want, as long as you know where it is stored and what it is called. You can download SBT or read its documentation at www.scala-sbt.org. However, if you are using VS Code, you can simply install the SBT plugin. Figure 11-4 provides a screenshot of where to find the plugin. You will want the SBT plugin provided by Lightbend, the company that officially backs the Scala language.
../images/476847_1_En_11_Chapter/476847_1_En_11_Fig4_HTML.jpg
Figure 11-4

A screenshot depicting where to download the Simple Build Tool (SBT) for Scala from the VS Code Extension Marketplace

From your terminal, type sbt sbtVersion to verify that you have SBT installed correctly. If it returns a version number, then it has been successfully added to your computer. If you receive an error, go to the SBT web site and follow the installation instructions for your operating system. You may prefer to download SBT outside of the VS Code Extension Marketplace if you are having trouble. If you are certain that you have installed SBT correctly but you are still getting errors from the command line, you may need to add the sbt command to your environment variables (Windows) or PATH (Mac or Linux). Refer to the “Installing Everything You Need” chapter for more instructions on how to accomplish that.

Once you have the SBT plugin installed, you can open up a terminal, ensure that you are in your project directory, and then type sbt to initialize the Simple Build Tool's interactive shell (which is not unlike our Nebula OS shell). From this shell, also known as a command runner, several built-in SBT commands are available, including a compile command that can replace the scalac command while inside the shell. The two commands we will be using for our unit testing are the test command and the testQuick command. The first looks through all of our project files, finds any unit tests we have written, and executes them one time, printing the results of the tests to the screen. The second does the same thing but in "watch" mode, meaning it executes the tests and then listens for any changes to your files. Whenever you save one of your files, it immediately executes all of your tests again, continuing in a loop until we tell it to stop.

Tip

Have you gotten tired of recompiling all of your changes after each save of your program's files? SBT provides a ~compile command that will run in your terminal while you are coding, listen for any changes when you save your .scala files, and auto-compile any changed files for you. You might also consider adding the Scala (sbt) language server plugin provided by Lightbend, which provides auto-completion and error highlighting while the ~compile command is running.

In order for SBT to find your test files, you must organize your tests into a very specific folder structure that the command runner is expecting. From the base of your project folder, create the following folder structure: src/test/scala. From within that new scala folder, you can create as many test files as you like. Common convention is to make a test file for each .scala file that you create in your project. Most projects have a mirroring folder structure in src/main/scala so that it is easy to map each test file back to its corresponding production file. For the sake of convention, let's move our NebulaOS project files into that main/scala folder. Create a new test file in your test/scala folder called utilities.spec.scala. We will create tests in this file that call the functions we have defined in our main/scala/Utilities.scala file. Programming convention dictates that developers typically add either a .spec or a .test before the file extension to allow others to easily identify the file as a test file. In our utilities.spec.scala file, add the following lines from Listing 11-3.
utilities.spec.scala
import org.scalatest._
package os.nebula {
}
build.sbt
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.5" % "test"
Listing 11-3

Initial scaffolding for a test file

The first line in this test script imports all the assets from the ScalaTest module. However, we have not yet declared ScalaTest as a dependency. In the root of your project folder, create a file named build.sbt. In that file, add the line denoted under the build.sbt heading in Listing 11-3. This tells SBT to fetch the external dependency for ScalaTest and bring it into your project automatically. From your terminal, type sbt to start up the command runner. Once it is running and awaiting commands, type compile to compile your project. If your project compiles successfully, you will receive a success response, which tells you that SBT successfully fetched and installed ScalaTest from a remote repository on the Internet for us to use in our project. Now that we can use ScalaTest, let's write some unit tests. Listing 11-4 provides a simple example of a unit test.
import org.scalatest._
package os.nebula {
    class UtilitiesSpec extends FunSpec {
        describe("When calling the add command") {
            describe("and passing 2 + 2") {
                it("should equal 4") {
                    assertResult(4){ 2 + 2 }
                }
            }
        }
    }
}
Listing 11-4

A simple example of a unit test

After importing our dependency and declaring our package, we create a class that extends ScalaTest's FunSpec, which is simply one style of testing that we now have access to within our test file. ScalaTest provides several different styles to choose from, but this style will be very familiar to those who are used to Ruby's RSpec or JavaScript's Mocha testing suites. It allows us to write our tests in a flowing, natural-language fashion where we can describe the expected behavior of the program. Each individual unit test is called using the it function, which is wrapped in a describe block that organizes our unit tests by a description we provide. Within the body of the it function, we define an assertion. In this example, we are asserting that the expression we are testing, 2 + 2, should produce the result 4.

After adding this test to our test file, type test from your sbt command runner to run all the tests in our project. The command runner should return a result with the number of tests that were run, how many passed, and how many failed. In this simple scenario, one test should pass because 2 + 2 evaluated to 4, as our assertResult function expected. Let's swap out 2 + 2 for our Utilities.addCommand() function to see if we get the same result (provided in Listing 11-5).
describe("When calling the add command") {
    describe("and passing 2 + 2") {
        it("should equal 4") {
            assertResult(4){ Utilities.addCommand("2 + 2") }
        }
    }
 }
Listing 11-5

Testing the Utilities.addCommand function

Once you've modified your test to use the addCommand from our Utilities file, you can run your tests again. This time, use the testQuick command to allow SBT to listen for changes to our test files. You should receive a response from the test runner similar to Listing 11-6, which tells us that our test has failed. The test failed because, even though our function takes a string with addition commands, parses it, and adds the numbers together, the final step in our addCommand function is to print the result to the screen rather than return it. You can see that the assertion expected 4 but got a Unit value instead (the result of a function that returns nothing). This is an example of a side effect that makes a function difficult to test. If we refactor our program to return the result of adding the numbers together, pull the print function out of the addCommand function, and move it into the pattern matching expression in our nebula.scala file, our test will pass.
[info] UtilitiesSpec:
[info] When calling the add command
[info]   and passing 2 + 2
[info]   - should equal 4 *** FAILED ***
[info]     Expected 4, but got <(), the Unit value> (utilities.spec.scala:18)
Listing 11-6

An example of the message returned when a test fails
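Since the chapter does not reproduce Utilities.scala, the sketch below is only an assumption about what the refactored addCommand might look like; the essential change is returning the sum instead of printing it:

```scala
// Hypothetical refactor: parse a string like "2 + 2" and RETURN the sum.
// The parsing details here are assumptions; only the return-instead-of-print
// change is what the text prescribes.
object Utilities {
  def addCommand(input: String): Int =
    input.split("\\+").map(_.trim.toInt).sum
}
```

With a version shaped like this, assertResult(4){ Utilities.addCommand("2 + 2") } can pass, and nebula.scala can print the returned value from its pattern matching expression.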

For additional information about writing tests using ScalaTest, including how to create setup and tear down functions that run before and after each test and the different possible assertions that you can call, visit the official documentation at www.scalatest.org .

Exercise 11-2

For each function in the Utilities.scala file in our Nebula project:

  1. Ensure that the function is pure and returns a value that can be tested.

  2. Refactor the nebula.scala file to print the result of the functions from the command pattern matching expression.

  3. Write a test for each function and ensure it passes.

Architecture Planning

The next risk you should seek to mitigate in your software engineering endeavors is the risk of uncoordinated or unorganized code among collaborating engineers due to lack of planning. You want to ensure that no matter how many engineers are working on your project, or who they are, what is written in your project is consistent, readable, and reusable throughout your code base. Oftentimes, this requires either a product manager, a software architect, or both to coordinate with a team to ensure that all code written by the team conforms to a set of quality and consistency guidelines. Architects or product managers also might provide a pre-ordained plan for how to tackle the project in question.

In the previous chapter, you saw a basic diagram for how one might describe relationships between classes, abstract classes, and interfaces. A similar, but more formally specified, graphical representation of a project might be formulated by an architect using what is known as Unified Modeling Language (UML). Having a UML diagram in advance of the start of a project lets every engineer know what code they should be writing and provides a single, unified plan of attack for the overall software implementation. Defining your software in terms of UML ahead of time makes it easy for engineers to divide the work based on the boxes in the diagram, knowing that their piece of the puzzle will fit into the program as a whole as long as every other engineer sticks to the original specification. This allows an engineer to write and test their code in a stand-alone, modular, and independent way without waiting for other engineers to complete their code as a precursor dependency to their task. Removing such bottlenecks from the engineering process allows for much greater productivity. Figure 11-5 provides an example of a UML diagram for reference, although we will not be covering the formal specification in this book.
../images/476847_1_En_11_Chapter/476847_1_En_11_Fig5_HTML.png
Figure 11-5

An example of a UML diagram for reference. Notice the difference in lines and arrows. These each have specific meanings in the formal specification

In addition to coordinating using a UML diagram, architects might also mandate that all engineers stick to a specific style guide. That style guide might include rules such as
  1. Always specify a return statement; no implied returns.

  2. Always specify types; no implied types.

  3. Always use immutable variables and data structures.

  4. Curly braces open on the same line as the definition of the scope and close at the same level of indentation as the first line, with all lines in between indented at least one level further.
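Applied to a small function, rules like these might look as follows (a hypothetical sketch; your team's guide may differ):

```scala
// Explicit return statement, explicit types, an immutable val, and braces
// opening on the definition line, per the style rules above.
def doubleAll(values: List[Int]): List[Int] = {
  val doubled: List[Int] = values.map((v: Int) => v * 2)
  return doubled
}
```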

These style guides are obviously based on opinion rather than any hard-and-fast rule, but it is often important for teams to agree upon them ahead of time in order to maintain a consistent code base. Code that does not follow the guidelines will still compile and run, so how might a team enforce such rules? With large, complicated systems, it is all but mandatory to have some form of version control system in place to keep track of changes to the software over time. Examples of such version control systems include Git and SVN (you might have heard of GitHub, which hosts cloud-based repositories of code that use the Git version control system). When making changes to code using a version control system, a team can put in place a mandatory peer review before allowing new code to be checked in to an existing project. It is during this peer review process that style guides can be enforced for consistency. It is also a good opportunity to obtain feedback from peers on the team to ensure that your code is efficient and optimal, thus minimizing performance risk.

Software Deployment

The final risk to mitigate is the introduction of bugs into existing production software. For large enterprise software solutions that are deployed to a server farm, mitigating this risk involves using a specific deployment pattern to update production software over time without impacting existing users of that software. This pattern starts with testing code in a local or development environment, followed by a staging or quality assurance environment, and finishes by pushing to a production environment.

Local development environments are typically backed by databases with dummy data and partial code and services to minimize how much code a local computer might need to run (relative to the production server which might run a very large amount of code that could not be run completely on a local machine). The software engineer typically makes changes to this development environment, saves his or her changes in the version control system, and then passes the changes off to a peer for review. Once approved, the changes can then be sent to the staging environment for integration testing.

The staging environment is typically a mirror of the production environment. It has a separate database that is kept in sync with the production database to ensure a seamless experience when testing incoming changes from a development environment. The goal of the staging environment is to test the impact of changes from the development environment before they hit production. The staging environment, therefore, is used as a stopgap. If the development changes cause a negative effect on the staging environment, they can be rolled back to the previous version without ever affecting the production environment or the users of the production software. Typically, this step involves a quality assurance (QA) engineer who writes acceptance tests that must be satisfied before the staging environment can be promoted to production. These tests can be automated using software such as Selenium, but oftentimes they require manual testing as well. Once the QA engineer signs off, the software is then deployed to production.

For less critical software, deployment can follow a continuous integration process. Continuous integration (CI) is a deployment strategy wherein, once a set of changes is checked in via the version control system, an automated suite of tests kicks off to ensure the changes will not break any of the existing software. If all of the tests that have been set up in the CI test suite pass, the continuous integration system automatically deploys the new software to production. This type of deployment process is typically used for agile software development, wherein teams iterate rapidly on the software and need to deploy to production often.

Deciding which deployment process to choose typically involves understanding the risks and impacts to your software and business. Analyzing whether or not your software can be easily rolled back or backed up is an important consideration. Another is determining the amount of availability or uptime your software is required to have and whether there would be a significant impact to your business if your software were down for a particular amount of time due to maintenance or repairs during your deployment cycle. Understanding these principles along with fault tolerance, scalability, and distributed computing are all important in the modern era of software engineering.

Exercise 11-3

Look into additional information about the topics in this chapter:
  1. Look up the formal specification of UML. Refactor the diagram of Weapon classes in the previous chapter using formal UML.

  2. Look up different version control systems. Familiarize yourself with their commands and common usages.

  3. Look up the various continuous integration systems and try to gain an understanding of how they are implemented. Examples of these systems are Jenkins, Circle CI, and Travis CI.

Summary

In this chapter, you learned that the difference between programming and software engineering is the amount of stringent process that surrounds engineering in order to ensure resilient and quality programs. Those processes included several key concepts. First, you were introduced to Big O notation as a means for measuring complexity. Next, you were given an introduction to unit testing as a strategy for ensuring edge cases and new code changes do not break your existing code. After that, you were introduced to architecture planning strategies that include UML diagramming, version control, style guides, and peer reviews. Finally, you were given an overview of the software deployment life cycle that is used to maintain and update production programs over time. In the next chapter, we will expound upon the engineering skills you learned in this chapter to dive deep into common data structures found in theoretical computer science to help optimize your programs.
