Rules to Better Application Performance

​​

  1. Do you know where your goal posts are?

    When starting on the path of improving application performance, it is important to know when you can stop. The goal posts depend on the type of application, the number of active users, and the budget. Some examples of performance goals are:

    • Every page loads in under 100 ms
    • Able to handle 1000 active concurrent users
    • Getting an A on Google Page Index
    • Search page returns results in under 100 ms

    With the goal posts firmly in sight, the developers can begin performance tuning the application.
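
    A goal such as "every page loads in under 100 ms" is easier to track once it is written as a check that can be run against measurements. The sketch below is one possible way to do that. Judging the goal at a percentile rather than on every single request is an assumption, and the function names and sample timings are hypothetical.

```python
import math

def percentile(samples, pct):
    """Return the pct-th percentile of a list of numbers (nearest-rank method)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def meets_goal(load_times_ms, goal_ms, pct=95):
    """True when the pct-th percentile of the measured load times is under goal_ms."""
    return percentile(load_times_ms, pct) < goal_ms

# Hypothetical load times for one page, in milliseconds.
home_page_ms = [72, 80, 85, 91, 64, 78, 88, 95, 70, 130]

print(meets_goal(home_page_ms, goal_ms=100))          # False: the slowest request took 130 ms
print(meets_goal(home_page_ms, goal_ms=100, pct=90))  # True: 90% of requests finished within 95 ms
```

    Whichever percentile is chosen, agreeing on it up front is what makes it possible to say when the goal has been reached.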

  2. Do you establish a baseline?

    The most important part of performance tuning is being able to quantify the process. This is why it is essential to measure the current performance before you touch any code or SQL. As a general rule, run the tests on hardware that is the same as or similar to production, and always run them on the *same infrastructure*; otherwise you would be comparing apples with oranges.
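
    A baseline can be as simple as a handful of recorded timings. The following is a minimal sketch, assuming the operation being tuned can be called from a script. `sample_operation` is a hypothetical stand-in for a real page request or query, and the summary fields are one reasonable choice rather than a prescribed set.

```python
import statistics
import time

def measure(operation, runs=20):
    """Time `operation` `runs` times and return the durations in milliseconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        durations.append((time.perf_counter() - start) * 1000)
    return durations

def summarize(durations_ms):
    """Reduce raw timings to the figures worth recording as a baseline."""
    return {
        "runs": len(durations_ms),
        "median_ms": statistics.median(durations_ms),
        "max_ms": max(durations_ms),
    }

def improvement_percent(baseline_ms, current_ms):
    """Positive when the current figure is faster than the baseline."""
    return (baseline_ms - current_ms) / baseline_ms * 100

def sample_operation():
    # Hypothetical stand-in for the page request or query being tuned.
    sum(i * i for i in range(10_000))

baseline = summarize(measure(sample_operation))
print(baseline)
print(improvement_percent(200.0, 100.0))  # 50.0
```

    Storing the summary alongside the commit or release it was taken from keeps later comparisons honest.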

    ​​Once you establish that baseline, you can then incrementally measure the performance impact for each change being made. This way you can measure effort vs reward as you could be working on a tweak for weeks that only improves performance by a few milliseconds whereas spending an hour to bundle and minify assets might yield a 50% improvement.

  3. Do you know the best load testing tools for web applications?

    The testing tool used will depend on the number of users you want to target. 

    For a small number of users (up to 100), you can use the built-in Visual Studio Web Tests to record the steps that a user would take on your site.

    Figure: Start the recording
    Figure: Record the steps a user would typically take on your website
    Figure: Customize parameters in your recording to dynamically draw data from a database
    Figure: Add a load test
    Figure: Configure settings
    Figure: Configure users
    Figure: Add Web tests
    Figure: Run tests

    Alternatively, for larger numbers of users (200+), use cloud-based offerings such as:

    • Azure Web Tests (you can use your existing web tests)
    • Loader.io
    • LoadStorm.net
    testingtools.jpg
    Figure: Load Storm results
  4. Do you stress test your infrastructure before testing your application?

    The infrastructure that your application is deployed to is rarely tested, yet it can be the culprit for performance issues due to misconfiguration or virtual machine resource contention. We recommend setting up a simple load test on the infrastructure, such as a web server that serves a single image, and having the load test simply fetch that image.

    This simple test will highlight:

    • The maximum performance you can expect (are your goals realistic for the infrastructure?)
    • Any network-related issues
      • Uplink bandwidth, DDoS protection, firewall issues
    infratests.jpg
    Figure: Work out the maximum performance of the infrastructure before starting
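
    The single-image test can be sketched in a few lines. This is an illustration only: it starts a throwaway local server so that the script is self-contained, whereas a real infrastructure test would point `run_load` at the deployed web server from a machine outside the network. The payload, handler, request counts and function names are all assumptions.

```python
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

IMAGE_BYTES = b"\x89PNG" + bytes(1020)  # placeholder payload standing in for one small image

class OneImageHandler(BaseHTTPRequestHandler):
    """Serves the same static payload for every GET request."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "image/png")
        self.send_header("Content-Length", str(len(IMAGE_BYTES)))
        self.end_headers()
        self.wfile.write(IMAGE_BYTES)

    def log_message(self, format, *args):
        pass  # keep the console quiet during the test

# Ignore any system proxy settings; the target in this sketch is local.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))

def fetch(url):
    with OPENER.open(url, timeout=10) as response:
        return len(response.read())

def run_load(url, total_requests=50, concurrent_users=5):
    """Fetch `url` repeatedly and return (completed_requests, requests_per_second)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        sizes = list(pool.map(fetch, [url] * total_requests))
    elapsed = time.perf_counter() - start
    return len(sizes), len(sizes) / elapsed

def demo():
    server = ThreadingHTTPServer(("127.0.0.1", 0), OneImageHandler)  # port 0 picks any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_address[1]}/image.png"
        return run_load(url)
    finally:
        server.shutdown()
        server.server_close()

if __name__ == "__main__":
    completed, rate = demo()
    print(f"{completed} requests, {rate:.0f} requests/second")
```

    The requests-per-second figure from a test like this is an upper bound: the real application, which does more work per request, cannot be expected to exceed it on the same infrastructure.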

    ​Note: if you have other servers in the mix, then you can make another simple test to pull records from the database to check the DB server as well.

  5. Do you know where bottlenecks can happen?

    For modern applications, there are many layers and moving parts that need to seamlessly work together to deliver our application to the end user. 

    bottleneck.png
    Figure: Bottlenecks can happen anywhere!

    The issues can be in:

    SQL Server

    • Slow queries 
    • Bad configuration 
    • Bad query plans 
    • Lack of resources 
    • Locking

    Business Logic

    • Inefficient code 
    • Chatty code 
    • Long running processes 
    • Not making use of multicore processors

    Front end

    • Too many requests to serve a page
    • Page size
    • Large images
    • No Caching

    Connection between SQL and Web

    • Lack of bandwidth
    • Too much chatter

    Connection between Web and Internet

    • Poor uplink (e.g. 1 Mbps uploads)
    • Too many hops

    Connection between Web and End users

    • Geographically too far (e.g. US servers, AU users)

    Infrastructure

    • Misconfiguration
    • Resource contention
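
    "Chatty code" and "too much chatter" in the lists above refer to making many small round trips where one larger one would do. The sketch below illustrates the difference with a hypothetical in-memory stand-in for a database that simply counts round trips; the class and function names are made up for the example.

```python
class FakeDatabase:
    """Hypothetical in-memory stand-in for a database that counts round trips."""

    def __init__(self, rows):
        self.rows = rows
        self.round_trips = 0

    def get_one(self, key):
        self.round_trips += 1
        return self.rows[key]

    def get_many(self, keys):
        self.round_trips += 1
        return [self.rows[key] for key in keys]

def load_chatty(db, keys):
    # One round trip per record: network latency is paid len(keys) times.
    return [db.get_one(key) for key in keys]

def load_batched(db, keys):
    # One round trip for the whole set: network latency is paid once.
    return db.get_many(keys)

rows = {i: f"customer-{i}" for i in range(100)}
chatty_db, batched_db = FakeDatabase(rows), FakeDatabase(rows)

# Both approaches return the same data; only the number of round trips differs.
assert load_chatty(chatty_db, list(rows)) == load_batched(batched_db, list(rows))
print(chatty_db.round_trips, batched_db.round_trips)  # 100 1
```

    With even a few milliseconds of latency between the web server and the database, the chatty version spends most of its time waiting on the network.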
  6. Do you know how to find performance problems with Application Insights?

    Once you have set up Application Insights as per the rule 'Do you know how to set up Application Insights' and have your daily failed requests down to zero, you can start looking for performance problems. You will discover that uncovering performance-related problems is relatively straightforward.

    The main focus of the first blade is the 'Overview timeline' chart, which gives you a bird's-eye view of the health of your application.

    performance-1.jpg
    Figure: There are 3 spikes to investigate (one on each graph), but which is the most important?

    Developers can see the following insights:

    • Number of requests to the server and how many have failed (First blue graph)
    • The breakdown of your page load times (Green Graph)
    • How the application is scaling under different load types over a given period
    • When your key usage peaks occur

    Always investigate the spikes first. Notice how the two blue ones line up? That should be investigated; however, notice that the green peak is actually at 4 hours. This is definitely the first thing we'll look at.

    performance 2.png
    Figure: The 'Average of Browser page load time by URL base' graph will highlight the slowest page.

    The 'Average of Browser page load time by URL base' graph shows that a single request took four hours, so it is important to examine this request.

    It would be nice to see the prior week for comparison; however, that is not possible in this section.

    performance-3.png
    Figure: In this case, the user agent string gives away the cause: Baidu (a Chinese search engine) got stuck and failed to index the page.

    At this point, we'll create a PBI (Product Backlog Item) to investigate the problem and fix it.

    (Suggestion to Microsoft: please allow annotating the graph to say we've investigated the spike.)

    The other spike that requires investigation is in the server response times. To investigate it, click on the blue spike. This will open the Server response blade, which allows you to compare the current server performance metrics with the previous week's.

    performance-4.jpg
    Figure: In this case, the most important detail to action is the Get Healthcheck issue

    In this view, we find performance-related issues when the usage graph looks similar to the previous week's but the response times are higher. When this occurs, click and drag on the timeline to select the spike and then click the magnifying glass to ‘zoom in’. This will reload the ‘Average of Server response time by Operation name’ graph with only the data for the selected period.
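
    The comparison described above (similar usage, slower responses) can also be expressed as a small check over exported metrics. This is a sketch of the reasoning only and does not use any Application Insights API. The data shape, the 20% load tolerance and the 1.5x slowdown factor are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class HourlySample:
    hour: int               # hour offset within the week being compared
    requests: int           # number of requests served in that hour
    avg_response_ms: float  # average server response time in that hour

def find_slowdowns(current, previous, load_tolerance=0.2, slowdown_factor=1.5):
    """Return the hours where load is similar to last week but responses are slower."""
    previous_by_hour = {sample.hour: sample for sample in previous}
    flagged = []
    for sample in current:
        last_week = previous_by_hour.get(sample.hour)
        if last_week is None or last_week.requests == 0:
            continue  # nothing to compare against
        load_change = abs(sample.requests - last_week.requests) / last_week.requests
        slower = sample.avg_response_ms >= last_week.avg_response_ms * slowdown_factor
        if load_change <= load_tolerance and slower:
            flagged.append(sample.hour)
    return flagged

# Hypothetical data: hour 10 is slower under the same load; hour 11 is slower under triple the load.
previous_week = [HourlySample(9, 1000, 120.0), HourlySample(10, 1200, 125.0), HourlySample(11, 900, 118.0)]
current_week = [HourlySample(9, 1050, 122.0), HourlySample(10, 1180, 410.0), HourlySample(11, 2700, 300.0)]

print(find_slowdowns(current_week, previous_week))  # [10]
```

    Hour 11 is deliberately not flagged: its slowdown coincides with a large increase in load, which points to a scaling question rather than a regression.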

  7. Do you know how to investigate performance problems in a .NET app?

    Working out why the performance of an application has suddenly degraded can be hard. This rule covers some investigation steps that can help determine the cause of performance problems.

    1. Use Application Insights to determine when the application last had acceptable performance

    Follow the rule 'Do you know how to find performance problems with Application Insights?' to determine when the decrease in performance began. It's important to determine whether the performance degraded gradually or dropped off dramatically.

    2. Look for changes that coincide with the performance issue

    There are three general cases that can cause performance issues:

    1. A change to software or hardware. Your deployment tool (such as Octopus) can tell you if there has been a software deployment, and you can work with your network admin to determine if there have been infrastructure changes.
    2. The load factor on the application can change. Application Insights can help you determine if the load factor on the application has increased.
    3. A hardware issue or network issue can occur that interferes with normal operation. The Windows Event Log and other sys admin monitoring tools can alert you to infrastructure issues like this.
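
    Once the onset of the slowdown is known, lining it up against a list of recent changes is mechanical. The sketch below assumes the deployment history and infrastructure changes have been collected by hand into one list; the entries, dates and the 24-hour window are hypothetical.

```python
from datetime import datetime, timedelta

def changes_before_onset(changes, onset, window=timedelta(hours=24)):
    """Return the changes that happened within `window` before the degradation began."""
    candidates = [
        (when, description)
        for when, description in changes
        if onset - window <= when <= onset
    ]
    return sorted(candidates, reverse=True)  # most recent change first

# Hypothetical change log assembled from the deployment tool and the network admin.
change_log = [
    (datetime(2024, 3, 1, 9, 0), "Release 1.4.0 deployed"),
    (datetime(2024, 3, 4, 22, 30), "Release 1.5.0 deployed"),
    (datetime(2024, 3, 5, 2, 0), "VM host migrated"),
    (datetime(2024, 3, 6, 8, 0), "Release 1.5.1 deployed"),
]
onset = datetime(2024, 3, 5, 6, 0)  # when the monitoring first showed slower responses

for when, description in changes_before_onset(change_log, onset):
    print(when.isoformat(), description)
```

    If nothing falls inside the window, that in itself is useful: it shifts suspicion towards a change in load or an infrastructure fault rather than a release.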

    3. Dealing with code-related issues

    If a software release has caused the performance problems, it is important to work out the code delta between the software release that worked well and the new release with the performance issues. Your software repository should have the necessary metadata to allow you to trace code deltas between release numbers. Inspect all the changes that have occurred for obvious performance issues like bad Entity Framework (EF) code, unnecessary loops and chatty network calls. See 'Do you know where bottlenecks can happen?' for more information on performance issues that can be introduced with code changes.

    4. Dealing with database-related issues

    Application Insights can help determine which tier of an application is performing poorly. If the performance issue is occurring in the database, Query Store, a feature introduced in SQL Server 2016, makes finding these issues much easier. Query Store is like having a lightweight version of SQL Profiler running all the time, and is enabled at the database level using the Database Properties dialog.


    Once Query Store has been enabled for a particular database, it needs to run for a number of days to collect performance data.  It is generally a good idea to enable Query Store for important production databases before performance problems occur.  Detailed information on regressed queries, overall resource consumption, the worst performing queries, and detailed information such as query plans for a specific SQL statement can then be retrieved using SQL Server Management Studio (SSMS).



    Once Query Store has been collecting performance information on a database for an extended period, a rich collection of information is available. It is possible to show regressed queries by comparing a Recent time interval (2 weeks in the diagram below) with a baseline History period (the Last Year in the diagram below) to see queries that have begun to perform poorly.

    In the diagram we can see the total duration for a query (top left), the execution plans that have been used on a particular query (top right) and the details of a selected execution plan in the bottom pane.  The actual SQL statement that was executed is also visible, allowing the query to be linked back to a particular EF code statement.
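
    The idea behind a regressed-query comparison can be shown with a few lines of code. This is an illustration of the comparison only, not how Query Store itself is implemented or queried. The data shape, the query ids and the 2x threshold are assumptions.

```python
from statistics import mean

def regressed_queries(history, recent, threshold=2.0):
    """Return (query_id, history_avg_ms, recent_avg_ms) for queries that got slower.

    `history` and `recent` map a query id to a list of durations in milliseconds.
    A query counts as regressed when its recent average is at least `threshold`
    times its historical average.
    """
    results = []
    for query_id, recent_durations in recent.items():
        past_durations = history.get(query_id)
        if not past_durations or not recent_durations:
            continue  # no baseline to compare against
        history_avg = mean(past_durations)
        recent_avg = mean(recent_durations)
        if history_avg <= 0:
            continue  # avoid dividing by zero when ranking
        if recent_avg >= history_avg * threshold:
            results.append((query_id, history_avg, recent_avg))
    # Worst regression (largest ratio) first.
    return sorted(results, key=lambda row: row[2] / row[1], reverse=True)

# Hypothetical durations: query 17 is unchanged, queries 23 and 31 have slowed down.
history = {17: [40, 42, 38], 23: [200, 210, 190], 31: [5, 5, 5]}
recent = {17: [41, 39], 23: [900, 1100], 31: [12, 18]}

for query_id, before, after in regressed_queries(history, recent):
    print(query_id, before, after)
```

    Ranking by ratio rather than by absolute duration surfaces small queries that have become several times slower, which matters when they are executed very frequently.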

    The Top Resource Consuming Queries tab is extremely valuable for performance tuning a database.  You can see the Top 25 Queries by:

    • Duration
    • CPU Time
    • Execution Count
    • Logical Reads
    • Logical Writes
    • Memory consumption
    • Physical Reads

    All of these readings can be broken down using the statistical measures of:

    • Total
    • Average
    • Min
    • Max
    • Std Deviation

    As with the Regressed Queries tab, the query plan history and details of a particular query plan are available for inspection.  This provides all the required information to track down the part of the application that is calling the poorly performing SQL, and also provides insight into how to fix the poor performance depending on which SQL step is taking the most time.
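
    To make the metric and statistic combinations above concrete, here is a small sketch that ranks queries by one metric reduced with one statistical measure. It works on a plain list of execution records rather than on Query Store itself. The record fields are hypothetical, and using the population standard deviation is an assumption.

```python
import statistics

AGGREGATES = {
    "total": sum,
    "average": statistics.mean,
    "min": min,
    "max": max,
    "std_deviation": statistics.pstdev,  # assumption: population standard deviation
}

def top_queries(executions, metric, aggregate="total", limit=25):
    """Rank query ids by one metric, reduced with one statistical measure.

    `executions` is a list of dicts such as
    {"query_id": 7, "duration_ms": 12, "cpu_ms": 3}.
    """
    reducer = AGGREGATES[aggregate]
    values_by_query = {}
    for execution in executions:
        values_by_query.setdefault(execution["query_id"], []).append(execution[metric])
    ranked = [(query_id, reducer(values)) for query_id, values in values_by_query.items()]
    return sorted(ranked, key=lambda row: row[1], reverse=True)[:limit]

# Hypothetical execution records.
executions = [
    {"query_id": 1, "duration_ms": 10, "cpu_ms": 4},
    {"query_id": 1, "duration_ms": 14, "cpu_ms": 6},
    {"query_id": 2, "duration_ms": 500, "cpu_ms": 20},
    {"query_id": 3, "duration_ms": 3, "cpu_ms": 1},
    {"query_id": 3, "duration_ms": 5, "cpu_ms": 1},
    {"query_id": 3, "duration_ms": 4, "cpu_ms": 1},
]

print(top_queries(executions, "duration_ms", "total", limit=2))  # [(2, 500), (1, 24)]
print(top_queries(executions, "cpu_ms", "average", limit=1))     # [(2, 20)]
```

    Switching between total and average changes what rises to the top: totals favour frequently executed queries, while averages favour individually expensive ones.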