In the previous chapter, we covered the most important security issues related to the OWASP Top 10 initiative, whose goal is, in their own words, "to raise awareness about application security by identifying some of the most critical risks facing organizations".
In this chapter, we're going to review the most common issues that a developer encounters in relation to an application's performance, and we'll also look at which techniques and tips are commonly suggested in order to obtain flexible, responsive, and well-behaved software, with a special emphasis on web performance. We will cover the following topics:
According to Jim Metzler and Steve Taylor, Application Performance Engineering (APE) covers the roles, skills, activities, practices, tools and deliverables applied at every phase of the application life cycle that ensure that an application will be designed, implemented and operationally supported to meet the non-functional performance requirements.
The keyword in the definition is non-functional. It is assumed that the application works, but some aspects, such as the time taken to perform a transaction or a file upload, should be considered from the very beginning of the life cycle.
So, the problem can, in turn, be divided into several parts:
There are many possible performance metrics that we could consider: physical/virtual memory usage, CPU utilization, network and disk operations, database access, execution time, startup time, and so on.
Each type of application will suggest a distinct set of targets to care about. Also, remember that performance tests should not be carried out until all integration tests are completed.
Finally, note that some tests are usually considered standard when measuring performance:
In these types of tests, it's important to clearly determine which loads to target and also to create a contingency plan for special situations (this is more common in websites when, for some reason, a peak in users per second is expected).
Fortunately, we can count on an entire set of tools in the IDE to carry out these tasks in many ways. As we saw in the first chapter, some of them are available directly when we launch an application in Visual Studio 2015 (all versions, including the Community Edition).
Refer to the A quick tip on execution and memory analysis of an assembly in Visual Studio 2015 section in Chapter 1, Inside the CLR, of this book for more details about these tools, including the Diagnostic Tools launched by default after any application's execution, showing Events, CPU Usage, and Memory Usage.
As a reminder, the next screenshot shows the execution of a simple application and the predefined analysis that Diagnostic Tools show at runtime:
However, keep in mind that some other tools might be useful as well, such as Fiddler, the traffic sniffer that plays an excellent role when analyzing web performance and request/response packets' contents.
Other tools are programmable, such as the Stopwatch class, which allows us to measure with precision the time that a block of code takes to execute, and we also have Performance Counters, available in .NET since the first versions, and Event Tracing for Windows (ETW).
Even in the system itself, we can find useful elements, such as the Event Log (for monitoring behavior, totally programmable from .NET), or external tools designed specifically for Windows, such as the Sysinternals suite, which we already mentioned in the first chapter. In this case, one of the most useful tools you'll find is PerfMon (Performance Monitor), although you may remember that we've mentioned FileMon and RegMon as well.
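As a quick illustration of that programmability, the following minimal sketch writes an entry to the Windows Application log through the System.Diagnostics.EventLog class (the source name MyPerformanceTests and the message are just examples; creating a new source requires administrative privileges):

// Requires the System.Diagnostics namespace
if (!EventLog.SourceExists("MyPerformanceTests"))
{
    // Creating a source needs administrative privileges (run once)
    EventLog.CreateEventSource("MyPerformanceTests", "Application");
}
// Write an informational entry that can later be reviewed in Event Viewer
EventLog.WriteEntry("MyPerformanceTests",
    "Image processing batch completed",
    EventLogEntryType.Information);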
The IDE, however, especially in the 2015 and 2017 versions, contains many more features to check execution and performance at runtime. Most of this functionality is available through the Debug menu options (some at runtime and others at edit time).
However, one of the most ready-to-use tools available in the editor is a new option called Performance Tips, which shows how much time a function took to complete, as demonstrated with the next piece of code.
Imagine that we have a simple method that reads file information from the disk and then selects those files whose names don't contain spaces. It could be something like this:
private static void ReadFiles(string path)
{
    DirectoryInfo di = new DirectoryInfo(path);
    var files = di.EnumerateFiles("*.jpg",
        SearchOption.AllDirectories).ToArray<FileInfo>();
    var filesWoSpaces = RemoveInvalidNames(files);
    //var filesWoSpaces = RemoveInvalidNamesParallel(files);
    foreach (var item in filesWoSpaces)
    {
        Console.WriteLine(item.FullName);
    }
}
The RemoveInvalidNames method uses another simple CheckFile method. Its code is as follows:
private static bool CheckFile(string fileName)
{
    // A file name is considered valid when it contains no spaces
    return !fileName.Contains(" ");
}

private static List<FileInfo> RemoveInvalidNames(FileInfo[] files)
{
    var validNames = new List<FileInfo>();
    foreach (var item in files)
    {
        if (CheckFile(item.Name))
        {
            validNames.Add(item);
        }
    }
    return validNames;
}
We could have inserted the CheckFile functionality inside RemoveInvalidNames, but applying the single responsibility principle has some advantages here, as we will see.
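Notice the commented-out call to RemoveInvalidNamesParallel in the ReadFiles method. That method's body is not shown here, but a minimal sketch of one possible implementation using PLINQ (just an assumption, not necessarily the version used in the book's sample code) could look like this:

private static List<FileInfo> RemoveInvalidNamesParallel(FileInfo[] files)
{
    // PLINQ spreads the same CheckFile filter across all available cores
    // (requires the System.Linq namespace)
    return files.AsParallel()
                .Where(file => CheckFile(file.Name))
                .ToList();
}

Since CheckFile is a separate method, both the sequential and the parallel versions can reuse it, which is one of the advantages of the single responsibility principle mentioned earlier.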
Since the selection of files will take some time, if we establish a breakpoint right before the foreach loop, we will be informed of the time in one of these tips:
Of course, the real value in these code fragments is that we can see the whole process and evaluate it. This is not only about the time it takes, but also about the behavior of the system. So, let's put another breakpoint at the end of the method and see what happens:
As you can see, the entire process took about 1.2 seconds. And the IDE reminds us that we can open Diagnostic Tools to check how this code behaved and have a detailed summary, as the next compound screenshot shows (note that you will see it in three different docked windows inside the tools):
In this manner, we don't need to explicitly create a Stopwatch instance to measure how long the process took.
These Performance Tips report the time spent, indicating that it is less than or equal to (<=) a certain amount. This means that they consider the overhead of the debugging process (symbol loading, and so on), excluding it from the measurement. Actually, the greatest accuracy is obtained on CLR v4.6 and Windows 10.
As for the CPU graph, it reflects all the available cores, and when you find a spike, even if it doesn't reach 100%, it is worth checking for the different types of problems that we will enumerate later (keep in mind that this feature is not available until debugging ends).
Actually, we can trace statements one by one and see exactly where most of the time is spent (and where we should revise our code in search of improvements).
If you reproduce this code on your machine, depending on the number of files read, you'll see that in the bottom window of the Diagnostic Tools menu, there is a list that shows every event generated and the time it took to be processed, as shown in the following screenshot:
Thanks to IntelliTrace, you can configure exactly how you want the debugger to behave, either in general or for a specific application. Just go to Tools | Options and select IntelliTrace Events (it has a separate entry in the tree view).
This allows the developer to select the types of events they're interested in. For instance, if we want to monitor Console events, we can select the ones we need to target in our application:
To test this, I coded a very simple Console application to show a couple of values and the number of rows and columns available:
Console.WriteLine("Largest number of Window Rows: " + Console.LargestWindowHeight); Console.WriteLine("Largest number of Window Columns: " + Console.LargestWindowWidth); Console.Read();
Once IntelliTrace is configured to show the activities of this application, named ConsoleApplication1, we can follow all its events in the Events window, later select an event of interest, and check Activate Historical Debugging on it:
Once we do that, the IDE relaunches the execution, and, now, the Autos, Locals, and Watch windows appear again but show the values that the application managed at that precise time during the execution.
In practice, it's like recording every step taken by the application at runtime, including the values of any variable, object, or component that we had previously selected as a target during the process (refer to the next screenshot):
Note that the information provided also includes an exact indication of the time spent by every event at runtime.
Moreover, other profiles for different aspects of our application are possible. We can configure them in the Debug menu under the Start Diagnostic Tools Without Debugging option.
Observe that profiles can be attached to distinct applications in the system, not just the one we're building. A new configuration page opens, and the Analysis Target option shows distinct types of applications, as you can see in the next screenshot.
It could be the current application (ConsoleApplication1), a Windows Store App (either running or already installed), a web page browsed on a Windows Phone, any other executable, or an ASP.NET application running on IIS:
And this is not all in relation to performance and IntelliTrace. If you select the Show All Tools link, more options are presented, which relate to distinct types of applications and technologies to be measured.
In this way, under the Not Applicable Tools link, we see other interesting features, such as the following:
The next screenshot shows these options:
As you can see, these options appear as Not Applicable since they don't make sense in a Console app.
Once we launch the profiler with the Start button, a wizard starts, and we have to select the type of target: CPU Sampling, Instrumentation (to measure function calls), .NET Memory Allocation, or Resource Contention Data (concurrency), which can detect threads waiting for other threads.
On the wizard's last screen, there is a checkbox that indicates whether we want to launch the profiling immediately afterwards. The application will then be launched, and when the execution is over, a profiling report is generated and presented in a new window:
We have several views available: Summary, Marks (which presents all the timing related to the execution), and Processes (obviously, showing information about any process involved in the execution).
This last option is especially interesting because of the results we obtain. Using the same ConsoleApplication1 project, I'm going to add a new method that creates a Task object and sleeps for 1500 ms:
private static void RunANewTask()
{
    Task task = Task.Run(() =>
    {
        Console.WriteLine("Task started at: " +
            DateTime.Now.ToLongTimeString());
        Thread.Sleep(1500);
        Console.WriteLine("Task ended at: " +
            DateTime.Now.ToLongTimeString());
    });
    Console.WriteLine("Task finished: " + task.IsCompleted);
    task.Wait(); // Blocked until the task finishes
}
If we activate the Processes option in the profiler, we're shown a bunch of options to analyze, and the generated report holds information to filter data in distinct ways depending on what we need: Time Call Tree, Hot Lines, Report Comparison (with exports), Filters, and even more.
For example, we can view the Call Stack at the time an event was collected by double-clicking on that event inside the Diagnostic Tools window:
Note how information related to Most Contended Resources and Most Contended Threads is presented, with a breakdown of each element monitored: either handles or thread numbers. This is one of the features that, although available in previous versions of Visual Studio, had to be managed via Performance Counters, as you can read in Maxim Goldin's article, Thread Performance - Resource Contention Concurrency Profiling in Visual Studio 2010, published in MSDN Magazine at https://msdn.microsoft.com/en-us/magazine/ff714587.aspx.
Besides the information shown in the screenshot, a lot of other views give us more data about the execution: Modules, Threads, Resources, Marks, Processes, Function Details, and so on.
The next capture shows what you will see if you follow these steps:
To summarize, you just learned how the IDE provides a wide set of modern, up-to-date tools; it's just a matter of deciding which one is the best fit for the analysis required.
As we saw in the previous chapter, modern browsers offer new and exciting possibilities to analyze web page behavior in distinct ways.
Since the initial landing time is assumed to be crucial in the user's perception, some of these features relate directly to performance (analyzing content, summarizing request times for every resource, presenting graphical information to catch potential problems at a glance, and so on).
The Network tab, usually present in most of the browsers, shows a detailed report of loading times for every element in the current page. In some cases, this report is accompanied by a graphical chart, indicating which elements took more time to complete.
In some cases, the names might vary slightly, but the functionality is similar. For instance, in Edge, you have a Performance tab, which records activity and generates detailed reports, including graphical information.
In Chrome, we find the Timeline tab, which records the page's performance and also presents a summary of the results.
Finally, in Firefox, we have an excellent set of tools to check the performance, starting with the Net tab, which analyzes the download time for every request and even presents a detailed summary when we pass the cursor over each element in the list, allowing us to filter these requests by categories: HTML, CSS, JS, images, plugins, and so on, as shown in the following screenshot:
Also, in Chrome, we find another interesting tab: Audits. Its purpose is to monitor distinct aspects of page behavior, such as the correct usage (and the impact) of CSS, combining JavaScript files to improve the overall performance (an operation called bundling and minification), and, in general, a complete list of issues that Chrome considers improvable, mainly in two aspects: Network Utilization and Web Page Performance. The next screenshot shows the final report on a simple page:
To end this review of performance features linked to browsers, also consider that in some browsers we find a Performance tab, specifically included to record load response times, or similar utilities, such as PageInsights in the case of Chrome and an equivalent one in Firefox (I would especially recommend Firefox Developer Edition for its highly useful features for a developer).
In this case, you can record a session in which Firefox gets all the required information to give a view of the performance, which you can later analyze in many forms:
Note that performance is mainly focused on JavaScript usage, but it is highly customizable for other aspects of a page's behavior.
Just like with any other software process, we can conceive performance-tuning as a cycle. During this cycle, we try to identify and get rid of any slow feature or bottleneck, up to the point at which the performance objective is reached.
The process goes through data collection (using the tools we've seen), analyzing the results, and changes in configuration, or sometimes in code, depending on the solution required.
After each cycle of changes is completed, you should retest and measure the code again in order to check whether the goal has been reached and your application has moved closer to its performance objectives. Microsoft's MSDN suggests a cycle process that we can extrapolate for several distinct scenarios or types of applications.
Keep in mind that software tuning often implies tuning the OS as well. You should not change the system's configuration in order to make a particular application perform correctly. Instead, try to recreate the final environment and the possible (or predictable) ways in which that environment is going to evolve.
Only when you are absolutely sure that your code is the best possible should you suggest changes in the system (memory increase, better CPUs, graphic cards, and so on).
The following graphic, taken from the official MSDN documentation, highlights this performance cycle:
As you probably know, the operating system uses Performance Counters (a feature installed by default) to check its performance and eventually notify the user about performance limitations or poor behavior.
Although they're still available, the new tools that we've seen in the IDE provide a much better and integrated method to check and analyze the application's performance.
The official documentation in MSDN gives us some clues that we can keep in mind in the process of bottleneck detection and divides the possible origins mainly into four categories (each one proposing a distinct management): CPU, memory, disk I/O, and network I/O.
For .NET applications, some recommendations are assumed correctly when identifying the possible bottlenecks:
Every possible bottleneck might have a distinct root cause, and we should carefully analyze the possible origins based on questions such as these: is it because of my code or is it the hardware? If it is a hardware problem, is there a way to accelerate the process implied through software improvements? And so on.
When determining bottlenecks in .NET, you can still use (besides all the tools we've already seen) Performance Counters, although the techniques described previously should ease the detection process considerably.
However, the official recommendations linked to some of these detection scenarios are still a valuable clue. So, the key here would be to look for the equivalent information in the newer tools.
There are several types depending on the feature to be measured, as MSDN suggests:
The key with these counters is this: if you observe an increase in Private Bytes while the # Bytes in all Heaps counter remains the same, there is some kind of unmanaged memory consumption. If you observe an increase in both counters, then the problem lies in managed memory consumption.
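If you prefer to query these counters from code instead of PerfMon, the System.Diagnostics.PerformanceCounter class can read them directly; a minimal sketch, assuming the counter instance name matches the process name, follows:

// Requires the System.Diagnostics namespace
string instance = Process.GetCurrentProcess().ProcessName;
using (var privateBytes =
    new PerformanceCounter("Process", "Private Bytes", instance))
using (var managedBytes =
    new PerformanceCounter(".NET CLR Memory", "# Bytes in all Heaps", instance))
{
    Console.WriteLine("Private Bytes: {0:N0}", privateBytes.NextValue());
    Console.WriteLine("# Bytes in all Heaps: {0:N0}", managedBytes.NextValue());
}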
Among the counters suggested for this kind of identification, we find the following:

- Resources that are never released (such as BinaryReaders), together with the % Time in GC counter.
- .NET CLR Exceptions | # of Exceptions Thrown/sec.
- Thread | Context Switches/sec; this can now also be checked with the previously seen Analysis Target feature.

The identification of thread contention is usually done by observing two performance counters:
- .NET CLR LocksAndThreads | Contention Rate/sec
- .NET CLR LocksAndThreads | Total # of Contentions
Your application is said to encounter thread contention when there is a meaningful increase in these two values. The responsible code should be identified and rewritten.
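Although it is not part of the chapter's sample code, a typical culprit is a tight parallel loop serialized by a single lock; the following sketch shows the problem and a lock-free alternative based on Interlocked:

long total = 0;
object gate = new object();

// High contention: every iteration competes for the same monitor
Parallel.For(0, 1000000, i =>
{
    lock (gate) { total++; }
});

// Much lower contention: an atomic increment, no monitor involved
total = 0;
Parallel.For(0, 1000000, i =>
{
    Interlocked.Increment(ref total);
});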
As mentioned earlier, besides the set of tools we've seen, it is possible to combine these techniques with software tools especially designed to facilitate our own performance measures.
The first and best known is the Stopwatch class, which belongs to the System.Diagnostics namespace and which we've already used in the first chapters, for example, to measure sorting algorithms.
The first thing to remember is that, depending on the system, the Stopwatch class will offer different values. These values can be queried up front if we want to know how accurate our measurements can be. Actually, this class holds two important properties: Frequency and IsHighResolution. Both properties are read-only.
Additionally, some other members complete a nice set of functionalities. Let's review what they mean:
- Frequency: This gets the frequency of the timer as a number of ticks per second. The higher the number, the more precise our Stopwatch class can be.
- IsHighResolution: This indicates whether the timer is based on a high-resolution performance counter.
- Elapsed: This gets the total elapsed time that is measured.
- ElapsedMilliseconds: This is the same as Elapsed, but it is measured in milliseconds.
- ElapsedTicks: This is the same as Elapsed, but it is measured in ticks.
- IsRunning: This is a Boolean value that indicates whether Stopwatch is still in operation.

The Stopwatch class also has some convenient methods to facilitate these tasks: Reset, Restart, Start, and Stop, whose functionality you can easily infer from their names.
So, let's use our file-reading method from the previous tests, together with a Stopwatch, to check these features with some basic code:
var resolution = Stopwatch.IsHighResolution;
var frequency = Stopwatch.Frequency;
Console.WriteLine("Stopwatch initial use showing basic properties");
Console.WriteLine("----------------------------------------------");
Console.WriteLine("High resolution: " + resolution);
Console.WriteLine("Frequency: " + frequency);
Stopwatch timer = new Stopwatch();
timer.Start();
ReadFiles(pathImages);
timer.Stop();
Console.WriteLine("Elapsed time: " + timer.Elapsed);
Using this basic approach, we have a simple indication of the total time elapsed in the process, as shown in the next screenshot:
We can get more precision using the other members provided by the class. For example, we can calculate the basic unit of time that Stopwatch uses, down to the nanosecond, thanks to the Frequency property.
Besides, the class also has a static StartNew() method, which we can use for simple cases like these; so, we can change the preceding code in this manner:
static void Main(string[] args)
{
    //BasicMeasure();
    for (int i = 1; i < 9; i++)
    {
        PreciseMeasure(i);
        Console.WriteLine(Environment.NewLine);
    }
    Console.ReadLine();
}

private static void PreciseMeasure(int step)
{
    Console.WriteLine("Stopwatch precise measuring (Step " + step + ")");
    Console.WriteLine("------------------------------------");
    Int64 nanoSecPerTick = (1000L * 1000L * 1000L) / Stopwatch.Frequency;
    Stopwatch timer = Stopwatch.StartNew();
    ReadFiles(pathImages);
    timer.Stop();
    var milliSec = timer.ElapsedMilliseconds;
    var nanoSec = timer.ElapsedTicks / nanoSecPerTick;
    Console.WriteLine("Elapsed time (standard): " + timer.Elapsed);
    Console.WriteLine("Elapsed time (milliseconds): " + milliSec + "ms");
    Console.WriteLine("Elapsed time (nanoseconds): " + nanoSec + "ns");
}
As you can see, we use a small loop to perform the measurement several times. This way, we can compare results and obtain a more accurate measure by calculating the average.
Also, we're using the static StartNew method of the class, since it's valid for this test (think of cases in which you might need several instances of the Stopwatch class to measure distinct aspects or blocks of the application).
Of course, the results won't be exactly the same in every step of the loop, as we see in the next screenshot showing the output of the program (keep in mind that depending on the task and the machine, these values will vary considerably):
Also, note that due to the system's caching and allocation of resources, every new loop iteration seems to take less time than the previous one. This was the case on my machine, although it depends on the system's state at that moment. If you need close evaluations, it is recommended that you execute these tests at least 15 or 20 times and calculate the average.
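A small helper along these lines (a hypothetical utility, not part of the chapter's sample project) can automate the repetition and return the average:

// Runs the given action several times and returns the average elapsed milliseconds
private static double AverageMilliseconds(Action action, int iterations = 20)
{
    var timer = new Stopwatch();
    double total = 0;
    for (int i = 0; i < iterations; i++)
    {
        timer.Restart();
        action();
        timer.Stop();
        total += timer.Elapsed.TotalMilliseconds;
    }
    return total / iterations;
}

With it, a call such as AverageMilliseconds(() => ReadFiles(pathImages)) gives a single figure that already smooths out caching effects.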
Optimizing web applications is, for many specialists, a sort of black art made up of so many factors that there are, in fact, entire books published on the subject.
We will focus on .NET, and, therefore, on ASP.NET applications, although some of the recommendations are extensible to any web application no matter how it is built.
Many studies have been carried out on the reasons that move a user to uninstall an application or avoid using it. Four factors have been identified:
So, battery considerations apart, the application should be fast, fluid and efficient. But what do these keywords really mean for us?
In any case, the overall performance is usually linked to the following areas:
So, let's quickly review some aspects to keep in mind at the time of optimizing these factors, along with some other tips generally accepted as useful when improving the page's performance.
There are a few techniques that are widely recognized to be useful when optimizing IIS, so I'm going to summarize some of these tips offered by Brian Posey in Top Ten Ways To Pump Up IIS Performance (https://technet.microsoft.com/es-es/magazine/2005.11.pumpupperformance.aspx) in a Microsoft TechNet article:
The <recycle> element in web.config allows you to tune this behavior.

There are many tips to optimize ASP.NET in recent versions that correspond to bug fixes, improvements, and suggestions made to the development team by developers all over, and you'll find abundant literature on the Web about it. For instance, Brij Bhushan Mishra wrote an interesting article on this subject (refer to http://www.infragistics.com/community/blogs/devtoolsguy/archive/2015/08/07/12-tips-to-increase-the-performance-of-asp-net-application-drastically-part-1.aspx), recommending some not-so-well-known aspects of the ASP.NET engine.
Generally speaking, we can divide optimization into several areas: general and configuration, caching, load balancing, data access, and client side.
Some general and configuration rules apply at the time of dealing with optimization of ASP.NET applications. Let's see some of them:
HttpApplication httpApps = HttpContext.ApplicationInstance;
// Loads a list with active modules in the ViewBag
HttpModuleCollection httpModuleCollections = httpApps.Modules;
ViewBag.modules = httpModuleCollections;
ViewBag.NumberOfLoadedModules = httpModuleCollections.Count;
You should see something like the following screenshot to help you decide which is in use and which is not:
Then, the unnecessary modules can be removed in the Web.config file:

<system.webServer>
  <modules>
    <remove name="FormsAuthentication" />
    <remove name="DefaultAuthentication" />
    <remove name="AnonymousIdentification" />
    <remove name="RoleManager" />
  </modules>
</system.webServer>
Also, enable buffering in the pages section (in web.config) and disable ViewState if you are not using it:

<pages buffer="true" enableViewState="false">
If you only use the Razor view engine, remove the rest in the Application_Start event:

// Removes view engines
ViewEngines.Engines.Clear();
// Add Razor Engine
ViewEngines.Engines.Add(new RazorViewEngine());
Another setting to check is runAllManagedModulesForAllRequests, which we can find in the Web.config or applicationHost.config files. It's similar to the previous one in a way, since it forces the ASP.NET engine to run for every request, including those that are not necessary, such as CSS, image files, JavaScript files, and so on. You can add a Web.config file to the folder where these resources are located and indicate it in the same modules section that we used earlier, setting this attribute to false:

<modules runAllManagedModulesForAllRequests="false">
To enable HTTP compression in Web.config, you can add the following:

<urlCompression doDynamicCompression="true"
  doStaticCompression="true"
  dynamicCompressionBeforeCache="true"/>
First of all, you should consider the Kernel Mode Cache. It's an optional feature that might not be activated by default.
You can also control how long clients cache static content with the clientCache element:

<system.webServer>
  <staticContent>
    <clientCache cacheControlMode="UseMaxAge"
      cacheControlMaxAge="1.00:00:00" />
  </staticContent>
</system.webServer>
Another option is the OutputCache attribute linked to an action method. In this case, caching can be more granular, using only the information linked to a given function:

[OutputCache(Duration = 10, VaryByParam = "none")]
public ActionResult Index()
{
    return View();
}
The attribute accepts the same parameters as the <OutputCache> page directive, except VaryByControl.
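As a hypothetical example (the Details action, its id parameter, and the GetProduct helper are only illustrative), caching a separate copy of the output per parameter value would look like this:

// A separate cached copy is kept for each distinct id value, for 60 seconds
[OutputCache(Duration = 60, VaryByParam = "id")]
public ActionResult Details(int id)
{
    // GetProduct stands for the data-access call whose cost we want to amortize
    return View(GetProduct(id));
}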
On the client side, we can also count on the localStorage and sessionStorage objects, which offer the same functionality but with a number of advantages in security and very fast access.

We've already mentioned some techniques for faster data access in this book, but in general, just remember that good practices almost always have a positive impact on access, such as some of the patterns we've seen in Chapter 10, Design Patterns. Also, consider using repository patterns.
Another good idea is the use of AsQueryable, which only creates a query that can be changed later on using Where clauses.
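A minimal sketch of that idea, assuming an Entity Framework context with a Products set (both names are illustrative), is shown next; the key point is that each Where call only refines the pending query, and no SQL is executed until the results are enumerated:

// Nothing is executed here; we only obtain a composable query
IQueryable<Product> query = context.Products.AsQueryable();

// Filters are added conditionally before execution
if (minPrice > 0)
{
    query = query.Where(p => p.Price >= minPrice);
}
if (!string.IsNullOrEmpty(category))
{
    query = query.Where(p => p.Category == category);
}

// The database is only hit when the results are enumerated
var results = query.ToList();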
Besides what we can obtain using web gardens and web farms, asynchronous controllers are recommended by MSDN all over the documentation, whenever an action depends on external resources.
Using the async/await structure that we've seen, we create non-blocking code that is always more responsive. Your code should then look like the sample provided by the ASP.NET site (http://www.asp.net/mvc/overview/performance/using-asynchronous-methods-in-aspnet-mvc-4):
public async Task<ActionResult> GizmosAsync()
{
    var gizmoService = new GizmoService();
    return View("Gizmos", await gizmoService.GetGizmosAsync());
}
As you can see, the big difference is that the action method returns Task<ActionResult> instead of ActionResult itself. I recommend that you read the previously mentioned article for more details.
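The GetGizmosAsync method itself is not shown in that fragment; a plausible sketch, assuming the service simply wraps a non-blocking HTTP call (the URL is a placeholder, Gizmo is a hypothetical model class, and deserialization uses the Newtonsoft.Json package), could be the following:

public class GizmoService
{
    public async Task<List<Gizmo>> GetGizmosAsync()
    {
        using (var client = new HttpClient())
        {
            // Placeholder endpoint; the real service URI would go here
            string json = await client.GetStringAsync("http://example.com/api/gizmos");
            return JsonConvert.DeserializeObject<List<Gizmo>>(json);
        }
    }
}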
Optimization in the client side can be a huge topic, and you'll find hundreds of references on the Internet. The following are some of the most used and accepted practices:
|  | Using B/M | Without B/M | Change |
|---|---|---|---|
| File requests | 9 | 34 | 256% |
| KB sent | 3.26 | 11.92 | 266% |
| KB received | 388.51 | 530 | 36% |
| Load time | 510 ms | 780 ms | 53% |
As the documentation explains: "The bytes sent had a significant reduction with bundling as browsers are fairly verbose with the HTTP headers they apply on requests. The received reduction in bytes is not as large because the largest files (Scripts\jquery-ui-1.8.11.min.js and Scripts\jquery-1.7.1.min.js) are already minified." Note that the timings on the sample program used the Fiddler tool to simulate a slow network. (From the Fiddler Rules menu, select Performance and then select Simulate Modem Speeds.)
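In ASP.NET MVC, that bundling and minification is usually configured in a BundleConfig class registered from Application_Start; a minimal sketch (bundle names and file paths are illustrative) follows:

// Requires the System.Web.Optimization namespace
// (Microsoft.AspNet.Web.Optimization NuGet package)
public class BundleConfig
{
    public static void RegisterBundles(BundleCollection bundles)
    {
        // Several script files are served as a single, minified request
        bundles.Add(new ScriptBundle("~/bundles/jquery").Include(
            "~/Scripts/jquery-{version}.js"));

        // The same applies to style sheets
        bundles.Add(new StyleBundle("~/Content/css").Include(
            "~/Content/site.css"));

        // Forces optimization even when compilation debug="true" in Web.config
        BundleTable.EnableOptimizations = true;
    }
}

It is then registered with a call such as BundleConfig.RegisterBundles(BundleTable.Bundles) in the Application_Start event.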