Rules to Better Application Insights

​​​​​


 

  1. Do you know why you want to use Application Insights?

    Knowing the overall health of your application is important once it has been deployed to production. Getting feedback on availability, errors, performance and usage is an important part of DevOps.
    We recommend Application Insights because getting it set up and running is quick, simple and relatively painless.
     
    Application Insights will tell you if your application goes down or runs slowly under load. If there are any uncaught exceptions, you'll be able to drill into the code to pinpoint the problem. You can also find out what your users are doing with the application so that you can tune it to their needs in each development cycle.  
    Figure: When developing a public website, you wouldn't deploy without Google Analytics to track metrics about user activity.
     
    Figure: For similar reasons, you shouldn't deploy a web application without tracking metrics on performance and exceptions.
    a.  You need a portal for your app
    b.  You need to know spikes are dangerous
    c.  You need to monitor:
        i.   errors
        ii.  performance
        iii. usage
    Figure: Spikes on an Echidna are dangerous
    Figure: Spikes on a graph are dangerous

    To add Application Insights to your application, follow the rule "Do you know how to set up Application Insights?"

    Can't use Application Insights? Check out the rule "Do you use the best exception handling library?"

  2. Do you know how to set up Application Insights (in SharePoint)?

    Setting up Application Insights in SharePoint is a bit different from adding it to a normal web application.

    (Note: For the normal way of setting up Application Insights via Visual Studio, read "How to set up Application Insights".)

    With a web application you are developing, you have full control of web.config and access to it in your Visual Studio project, so you can follow "How to set up Application Insights" and let Visual Studio make all the modifications for you automatically.

    When you develop on SharePoint, however, you do not have a full copy of web.config in your Visual Studio project. The web.config is initialized on the SharePoint server when a new SharePoint site is created, so Visual Studio cannot update it for you. You can modify the SharePoint web.config through code, but that involves a lot of development and testing work against your SharePoint server.

    The best process for implementing Application Insights in SharePoint can be split into two parts:

    1. Add the Application Insights JavaScript to the master page (via Visual Studio) or to individual web pages via embedded code. There are two good articles that include the detailed steps.
    2. Use the Application Insights Status Monitor configuration tool to add the DLL references and update web.config (no coding work involved). There are two articles that include the detailed steps.
  3. Do you know how to set up Application Insights?

    The easiest way to get started with Application Insights is to follow the documentation on MSDN: https://azure.microsoft.com/en-us/documentation/articles/app-insights-get-started/

    Let's take a look at the overview and our tips to help you get the most out of Application Insights.

    An overview of the setup steps

    Application Insights requires that you make 2 general modifications to your application:

    1. On the client side, manually add a JavaScript tracker to your web page header (i.e. by placing it directly on each page or through a "master page" or "layout template"). This modification enables the "browser page loading time" monitor and can track client-side exceptions.
      Figure: Browser-side stats have been enabled with the JavaScript tracker
    2. On the server side, add the Application Insights DLL references and update web.config. These modifications enable the "server response time", "server request" and "failed requests" monitors. This step can be done within Visual Studio by right-clicking on a project in Solution Explorer, or with the server monitoring tool on ASP.NET applications you don't have control over (e.g. SharePoint).
      Figure: Server-side stats have been enabled now that Application Insights has been added to the ASP.NET pipeline

    Tip #1 – Add enhanced exception tracking to your application
    The default setup and configuration of Application Insights sends generic performance stats and exceptions. If you will be using Application Insights to look deeper into these exceptions, make sure the full stack trace is sent when they occur. You can do this by adding code that reports all unhandled exceptions. See this documentation page for more information: https://azure.microsoft.com/en-us/documentation/articles/app-insights-asp-net-exceptions/
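
    The idea of reporting every unhandled exception with its full stack trace can be sketched in a few lines. This is only an illustration in Python, not the Application Insights SDK: `captured_reports` is a hypothetical in-memory sink standing in for a real telemetry client, and the function names are made up for this example.

```python
import sys
import traceback

# Hypothetical in-memory sink standing in for a real telemetry client.
captured_reports = []


def build_exception_report(exc_type, exc_value, exc_tb):
    """Return a dict that carries the full stack trace, not just the message."""
    return {
        "type": exc_type.__name__,
        "message": str(exc_value),
        "stack_trace": "".join(
            traceback.format_exception(exc_type, exc_value, exc_tb)
        ),
    }


def report_unhandled_exception(exc_type, exc_value, exc_tb):
    """Global hook: record the report, then fall back to the default handler."""
    captured_reports.append(build_exception_report(exc_type, exc_value, exc_tb))
    sys.__excepthook__(exc_type, exc_value, exc_tb)


def install_exception_hook():
    # Route every exception that nobody catches through the reporter above.
    sys.excepthook = report_unhandled_exception
```

    Calling `install_exception_hook()` once at startup is enough in this sketch. In a real application you would replace the list with a call to whichever telemetry client you use.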

    Tip #2 – Add web tests to monitor performance metrics over time
    As soon as you have configured Application Insights, add a web test to track general performance trends over time. More information can be found in this rule: https://rules.ssw.com.au/do-you-add-web-tests-to-application-insights-to-montior-trends-over-time

    Tip #3 – What if you don't have the source code of your ASP.NET application?

    The rule on adding Application Insights to a SharePoint application shows how to use the Application Insights Status Monitor to add the DLLs and modify the web.config file of a deployed application: https://rules.ssw.com.au/application-insights-in-sharepoint

  4. Errors – Do you know the daily process to improve the health of your web application?

    Application Insights can surface an overwhelming number of errors in your web application, so use just-in-time bug processing to handle them.

    The goal is to check your web application's dashboard each morning and find zero errors. But what if there are multiple errors? Don't panic; follow this process to improve your application's health.

    Figure: Every morning developers check Application Insights for errors

    Once you have found an exception, you can drill down into it to discover more context about what was happening: the user's browser details, the page they tried to access, and the stack trace. (Tip: follow the rule "Do you know how to set up Application Insights?" to enhance the stack trace.)

    Figure: Drilling down into an exception to discover more.

    It's easy to be overwhelmed by all these issues, so don't create a bug for each issue or even the top 5 issues. Simply create one bug for the most critical issue. Reproduce, fix and close the bug, then move on to the next one and repeat. This is just-in-time bug processing, and it will move your application towards better health one step at a time.
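
    The "one bug at a time" selection can be sketched as follows. This is a minimal illustration that assumes "most critical" means "occurs most often", which is only one possible definition. The function name and the sample exception names are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Tuple


def pick_most_critical(exception_types: List[str]) -> Optional[Tuple[str, int]]:
    """Return the (exception type, count) seen most often, or None if healthy."""
    if not exception_types:
        return None
    return Counter(exception_types).most_common(1)[0]


# Example: this morning's exceptions, one entry per occurrence.
todays_errors = [
    "NullReferenceException",
    "TimeoutException",
    "NullReferenceException",
    "SqlException",
    "NullReferenceException",
]
next_bug = pick_most_critical(todays_errors)  # the only bug to create today
```

    Only the single result becomes a bug. The remaining exceptions stay in the dashboard until this one is fixed and closed.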

    Figure: Bad example - creating all the bugs
    Figure: Good example - create the first bug (unfortunately the bug has to be created manually)
  5. Do you add Web Tests to Application Insights to monitor trends over time?

    As soon as you have configured Application Insights, you should immediately add a Web Test to track general performance trends over time. You can configure test agents to access your application from different locations around the globe to give a general idea of what users will experience. ​

    Instructions on how to add Web Tests can be found on MSDN https://azure.microsoft.com/en-us/documentation/articles/app-insights-monitor-web-app-availability

    Setting up a Web Test lets you query how the performance of your application has changed over a period of time and helps you spot any anomalies. It can be useful to query over a long period (e.g. a year) to see whether performance has stayed the same or whether there have been any regressions in responsiveness.
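
    Checking for a regression over time comes down to comparing a recent window of response times with an older baseline. The sketch below assumes you have exported web test durations in milliseconds. The function name and the 20% tolerance are illustrative choices, not Application Insights settings.

```python
from statistics import mean
from typing import List


def has_regressed(baseline_ms: List[float], recent_ms: List[float],
                  tolerance: float = 0.20) -> bool:
    """True if the recent average is more than `tolerance` slower than baseline."""
    return mean(recent_ms) > mean(baseline_ms) * (1 + tolerance)
```

    For example, a baseline of around 410 ms against recent results of around 610 ms would be flagged, while small week-to-week noise would not.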

    Figure: Good example - You can clearly see the point where we deployed a fix to production to improve the initial page load.

    You have the ability to drill down into web test results, to get an overview of the response time of the resources on a page. This can help discover if certain resources are slowing the response time.

    Figure: Good example - Reviewing the web test results provides vital information.

  6. Do you create a Sprint Review/Retro email?

    After any Sprint Review and Retrospective, send an email to all stakeholders to update them on the outcome of the Sprint:
    • Subject: <Client Name> Sprint XX Review/Retro
    • This is a reply to the Sprint Forecast email
    • Screenshot of Burndown from TFS
    • Breakdown of work completed (including current code coverage value)
    • Link to test environment
    • Relevant notes from the retrospective

    Hi [Product Owner],

    Sprint in Review: [Sprint Number]
    Sprint Goal: [Goal]
    Sprint Duration: [Number of weeks]
    Project: [Project Name]
    Project Portal: [Link to project Portal]
    Test Environment: [Link to test environment]
    Product Owner: [Product Owner Name]

    Attendees: (Optional, as they may be in the To and CC)

     ​

    Sprint Review

     

    ID     Title                                State
    24124  UI Improvements                      Done
    24112  Integrate Business Logic to MVC app  Done
    24097  Styling                              New
    Figure: Sprint Backlog from [Link to Sprint Backlog in TFS]

     

    As per http://rules.ssw.com.au/Management/RulesToBetterScrumUsingTFS/Pages/RetrospectiveMeeting.aspx, we review:

    1. Sprint Burndown (a quick overview of the sprint)

    Figure: Sprint Burndown

    2. Code Coverage (hopefully tests are increasing each sprint)
    XXX

    3. Velocity (Optional)
    XXX

    4. Burnup (for the release - the whole project, how are we tracking for the big picture?)

    Figure: Release Burnup

    5. Production Deployments (How many times did we deploy to Production?)

    Figure: Deployments from Octopus Deploy

    6. Application Health Overview Timeline (for the entire Sprint)

    Figure: Application Health Overview Timeline

    Sprint Retrospective

    As part of our commitment to inspect and adapt as a team, we conduct a Sprint Retrospective at the end of every Sprint. Here are the results of our Sprint Retrospective:

    What went well?
    <insert what went well from retro>

    What didn’t go so well?
    <insert what did not go well from retro>

    What improvements will be made for the next Sprint?
    <insert what improvements will be made for the next Sprint>

    Definition of Ready - Optional

    <insert the definition of Ready. Normally that the PBIs are Sized with Acceptance criteria added>

    Definition of Done - Optional

    <insert Definition of Done. Normally that it compiles, meets the acceptance criteria, and a test please has been sent if relevant>​

    <This is as per the rule: http://rules.ssw.com.au/Management/RulesToBetterScrumUsingTFS/Pages/Do-you-create-a-Sprint-Review-email.aspx>

    Figure: Good Example - Template for Sprint Review/Retro Email. Subject: Sprint xxx Review/Retro
  7. Are you alerted when your site goes down?

    Nothing is worse than having your site down and being unaware of it for a long period of time.

    Application Insights can help you minimize downtime by sending you an email alert when your site becomes unavailable. Create an availability test and enable the alert option as soon as your site goes live.

    Figure: Bad example - Site was down over the weekend, unnoticed

    Figure: Good example - Availability tests are created for multiple locations

    Figure: Good example - Email alert is enabled to minimize the downtime
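
    The alert decision for an availability test that runs from multiple locations can be sketched as a small function. This is an illustration only: the location names, the function name and the "at least 2 failing locations" threshold are assumptions, not Application Insights defaults.

```python
from typing import Dict


def should_alert(results_by_location: Dict[str, bool],
                 min_failed_locations: int = 2) -> bool:
    """Alert only when enough locations fail, so one flaky probe is ignored."""
    failed = [loc for loc, ok in results_by_location.items() if not ok]
    return len(failed) >= min_failed_locations
```

    Requiring more than one failing location is a design choice that trades a slightly slower alert for fewer false alarms caused by a single network hiccup.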

    ​​


  8. Do you have a Preflight Checklist?

    Before starting any work, you should ensure developers take a look at your Application Insights data to make sure everything is performing correctly.


    Most developers check only this first item before starting their work:

    1. Check Unit Tests are Green

    unittests.png

    Figure: Tests are green. I'm ready to start work... or am I?


    More advanced teams check their Application Insights data as well. This includes:

    2. Look for any new unhandled exceptions

    See "Do you know the daily process to improve the health of your web application?"

    Figure: Unhandled exceptions - Is there anything you don't know about here?


    3. Look for any obvious performance issues (server first, then client).

    See "Do you know how to find performance problems with Application Insights?"

    Figure: Performance - The Server Responses tab shows the slowest running pages.

  9. Do you know how to analyse your web application usage with Application Insights?

    You've set up Application Insights as per the rule "Do you know how to set up Application Insights?"

    Your daily failed requests are down to zero and you've tightened up any major performance problems.

    Now you will discover that understanding your users' usage of your app is child's play.

    Application Insights provides developers with two levels of usage tracking. The first is provided out of the box and is made up of user, session and page view data. However, it is more useful to set up custom telemetry, which lets you track users effectively as they move through your app.


    Figure: The most frequent event is someone filling out their timesheet.
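
    Custom telemetry of this kind boils down to naming an event and recording it at the points in your code that matter. The sketch below is not the Application Insights API: `UsageTracker`, `track_event` and the event names are hypothetical, and the tracker only counts events in memory to show how a "most frequent event" view can be derived.

```python
from collections import Counter


class UsageTracker:
    """Hypothetical stand-in for a telemetry client; counts events in memory."""

    def __init__(self):
        self._events = Counter()

    def track_event(self, name: str) -> None:
        # Call this at a "hot point", e.g. right after a timesheet is saved.
        self._events[name] += 1

    def most_frequent(self):
        """Return (event name, count) for the most common event, or None."""
        top = self._events.most_common(1)
        return top[0] if top else None


tracker = UsageTracker()
for event in ["TimesheetSubmitted", "ReportViewed", "TimesheetSubmitted"]:
    tracker.track_event(event)
```

    Keeping event names short and consistent makes them much easier to group and chart later.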

    ​It is very straightforward to add these to an application by adding a few lines of code to the hot points of your app. Follow this link to read more (https://azure.microsoft.com/en-us/documentation/articles/app-insights-api-custom-events-metrics/).

     

    Feel constricted by the Application Insights custom events blade? You can export your data and display it in Power BI in a number of interesting ways.

    Figure: Power BI creates an easy-to-use and in-depth dashboard for viewing the health of the application.

    Previously we would have had to perform a complicated setup to allow Application Insights and Power BI to communicate (follow this link to learn more). Now it is as easy as adding the Application Insights content pack.

    Figure: Content packs make it simple to interact and pull data from third-party services.


  10. Do you know how to find performance problems with Application Insights?

    Once you have set up Application Insights as per the rule "Do you know how to set up Application Insights?" and have your daily failed requests down to zero, you can start looking for performance problems. You will discover that uncovering performance-related problems is relatively straightforward.

    The main focus of the first blade is the 'Overview timeline' chart, which gives you a bird's-eye view of the health of your application.

    Figure: There are 3 spikes to investigate (one on each graph), but which is the most important?


    Developers can see the following insights:

    • Number of requests to the server and how many have failed (first blue graph)
    • The breakdown of your page load times (green graph)
    • How the application is scaling under different load types over a given period
    • When your key usage peaks occur

     

    Always investigate the spikes first. Notice how the two blue ones line up? That should be investigated. However, notice that the green peak is actually at 4 hours, so that is definitely the first thing we'll look at.
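
    Spotting a spike by eye works on a chart, and the same idea can be sketched in code if you export the data. The example below flags points far above the median. The function name and the factor of 3 are arbitrary assumptions for illustration, and the sample values are invented (14400 seconds is 4 hours).

```python
from statistics import median
from typing import List


def find_spikes(values: List[float], factor: float = 3.0) -> List[int]:
    """Return indexes of points greater than `factor` times the median."""
    baseline = median(values)  # raises StatisticsError on an empty list
    return [i for i, v in enumerate(values) if v > baseline * factor]


# Page load times in seconds; one request took 4 hours.
page_load_seconds = [1.2, 1.1, 1.3, 14400.0, 1.2]
spikes = find_spikes(page_load_seconds)
```

    The median is used instead of the mean so that a single extreme value does not drag the baseline up and hide itself.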

    Figure: The 'Average of Browser page load time by URL base' graph will highlight the slowest page.


    The 'Average of Browser page load time by URL base' graph shows that a single request took four hours, so it is important to examine this request.

    It would be nice to see the prior week for comparison, however we're unable to in this section.


    Figure: In this case the user agent string gives away the cause: Baidu (a Chinese search engine) got stuck and failed to index the page.

     

    At this point we'll create a PBI to investigate the problem and fix it.

    (Suggestion to Microsoft, please allow annotating the graph to say we've investigated the spike)

     

    The other spike that requires investigation is in the server response times. To investigate it, click on the blue spike. This will open the Server response blade, which lets you compare the current server performance metrics to the previous week's.

    Figure: In this case, the most important detail to action is the Get Healthcheck issue

    In this view, we find performance-related issues when the usage graph looks similar to the previous week's but the response times are higher. When this occurs, click and drag on the timeline to select the spike and then click the magnifying glass to ‘zoom in’. This will reload the ‘Average of Server response time by Operation name’ graph with only data for the selected period.
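
    The week-over-week rule of thumb ("similar usage, slower responses") can be written down as a small check. This is a sketch under stated assumptions: the function name, the 10% usage tolerance and the 25% slowdown threshold are illustrative values, not anything Application Insights defines.

```python
def is_performance_issue(usage_now: float, usage_prev: float,
                         response_now_ms: float, response_prev_ms: float,
                         usage_tolerance: float = 0.10,
                         slowdown_threshold: float = 0.25) -> bool:
    """True when usage is roughly unchanged but responses are notably slower."""
    if usage_prev <= 0 or response_prev_ms <= 0:
        return False  # nothing sensible to compare against
    usage_change = abs(usage_now - usage_prev) / usage_prev
    slowdown = (response_now_ms - response_prev_ms) / response_prev_ms
    return usage_change <= usage_tolerance and slowdown >= slowdown_threshold
```

    If usage has also jumped, slower responses may simply reflect load, so the check returns False and leaves that case for a separate scaling investigation.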