APM tips blog

Blog about application monitoring.

Do Not Use Context Initializers

I’ve already written about telemetry initializers. Using them, you can add custom properties to the telemetry items reported by your application. For every telemetry item, the Initialize method of every configured telemetry initializer will be called.

A typical use of a telemetry initializer is to set properties like user name, operation ID or timestamp. If you look at the ApplicationInsights.config file, you’ll find a list of the default telemetry initializers used for web application monitoring.

You’ll also find a set of context initializers defined in the standard ApplicationInsights.config file.

The difference between a telemetry initializer and a context initializer is that a context initializer is only called once per TelemetryClient instance. So context initializers collect information that is not specific to a telemetry item and will be the same for all telemetry items. Examples are application version, process ID and computer name.

So why should you NOT use context initializers?

Reason 1. You never know when a context initializer will be called.

Context initializers are called no later than the first access to the Context property of a TelemetryClient instance. If the only TelemetryClient object is the one you constructed, you can control when context initializers are executed. However, if you are using features like request monitoring, dependency monitoring or performance counter collection, a number of telemetry clients will be created under the hood. So if you add a context initializer to the TelemetryConfiguration.Active.ContextInitializers collection programmatically, you can never guarantee that those TelemetryClient objects have not been initialized already.

In fact, the 0.16 SDK changed some of the initialization ordering: if you add a context initializer programmatically in Global.asax to the TelemetryConfiguration.Active.ContextInitializers collection, it will NOT be run for the web requests telemetry module. So request and exception items will not be stamped with the properties you need.

Reason 2. Once set it cannot be unset.

There are cases when you want to change globally set properties in a running application. For instance, you may need to change the environment name from pre-production to production after a VIP swap. You cannot achieve this using context initializers. You can remove a context initializer from the TelemetryConfiguration.Active.ContextInitializers collection, but the properties it set will still be attached to telemetry items reported from the already created instances of TelemetryClient.
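To make the first two reasons concrete, here is a small toy model written in JavaScript rather than against the real SDK. ToyClient, its lazily created context and the contextInitializers array are all hypothetical stand-ins; the only behavior taken from the text above is that context initializers run once per client, no later than the first access to its context.

```javascript
// Toy model only: none of these names exist in the Application Insights SDK.
// Stands in for TelemetryConfiguration.Active.ContextInitializers.
const contextInitializers = [];

class ToyClient {
  constructor() {
    this.cachedContext = null;
  }

  // Context initializers run once, on the first access to the context.
  get context() {
    if (this.cachedContext === null) {
      this.cachedContext = {};
      for (const init of contextInitializers) init(this.cachedContext);
    }
    return this.cachedContext;
  }

  // Every item carries whatever the client context holds.
  track(name) {
    return { name, ...this.context };
  }
}

// A client created "under the hood" touches its context during startup.
const early = new ToyClient();
early.context;

// The initializer is registered afterwards, too late for the early client.
const setEnvironment = (ctx) => { ctx.environment = "pre-production"; };
contextInitializers.push(setEnvironment);

const late = new ToyClient();
const earlyItem = early.track("request"); // has no environment property
const lateItem = late.track("request");   // environment is "pre-production"

// Removing the initializer does not clear values already set on a client.
contextInitializers.splice(contextInitializers.indexOf(setEnvironment), 1);
const afterRemoval = late.track("request"); // still "pre-production"
```

In this model the early client never gets the property, and the late client keeps it even after the initializer is removed, which is the pair of problems described in Reason 1 and Reason 2.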

Reason 3. You can do it with a telemetry initializer.

You can always use a telemetry initializer instead of a context initializer. The only thing you need to do is cache the property value you need to set, so it is not calculated every time.

So instead of a context initializer that sets the application version like this:

public class AppVersionContextInitializer : IContextInitializer
{
    public void Initialize(TelemetryContext context)
    {
        context.Component.Version = GetApplicationVersion();
    }
}

you can create a telemetry initializer that does the same:

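public class AppVersionTelemetryInitializer : ITelemetryInitializer
{
    // Computed once per initializer instance and reused for every telemetry item.
    // Assumes GetApplicationVersion() is a static helper: field initializers cannot call instance methods.
    private readonly string applicationVersion = GetApplicationVersion();

    public void Initialize(ITelemetry telemetry)
    {
        telemetry.Context.Component.Version = applicationVersion;
    }
}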

Reason 4. The name “context initializer” is confusing.

If you think about what a context initializer does, it initializes a TelemetryClient, not a telemetry context. If you construct an instance of TelemetryContext, the context initializer will not be called. So why is it called a context initializer and not a client initializer? My telemetry item also has a context. Will the context initializer be called when it is constructed? And why not?

The whole reason I’m writing this article is that we regularly get questions about it and have found that the concept of a context initializer is very confusing.

So even if you understand the concept, somebody else may be confused by it and hit a problem with initialization ordering. It is typically hard to troubleshoot why some properties were not added to your telemetry items in production, especially if the issue is timing-specific.

The biggest reason we still have the concept of a context initializer is to keep backward compatibility and not break existing context initializers you might already have. We are thinking of moving away from context initializers in future versions and marking this interface as deprecated.

DIY: Data Sampling

[Updated 3-Sept-2015]: updated as channel was renamed

Now that pricing for Application Insights has been announced, you might be wondering how to make these prices truly cloud-friendly and only pay for what you use. You may also wonder how to fit into the throttling limits even if you are willing to pay for all the telemetry data your application produces. Note that you won’t need any of this if your application doesn’t have a big load. There are great filtering and grouping capabilities in the Application Insights UI, so you will be better off having all the data on the server.

There are four techniques to minimize the amount of data your application reports: separate traffic, filter out uninteresting data, sample and aggregate. Today I’ll explain how to implement sampling.

Sampling only works for high-load applications. The idea is to send only every n-th request to the server. With a normal distribution of requests and high load, you’ll get statistically correct values for all sorts of aggregations (don’t forget to multiply the counts you see in the UI by n).
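As a rough illustration of the arithmetic, here is a minimal JavaScript sketch of every-n-th sampling over simulated requests. The helper createEveryNthSampler and the made-up durations are assumptions for the example only and are not part of any SDK.

```javascript
// Hypothetical helper: keeps every n-th item using a simple counter.
function createEveryNthSampler(sampleEvery) {
  let counter = 0;
  return function shouldSend() {
    counter += 1;
    return counter % sampleEvery === 0;
  };
}

const sampleEvery = 10;
const shouldSend = createEveryNthSampler(sampleEvery);

// Simulate 1000 requests with made-up durations (100 to 160 ms).
const sent = [];
for (let i = 0; i < 1000; i++) {
  const request = { durationMs: 100 + (i % 7) * 10 };
  if (shouldSend()) sent.push(request);
}

// Counts have to be scaled back up by n to estimate the real volume.
const estimatedCount = sent.length * sampleEvery;

// Averages are computed over the sample and are not multiplied.
const averageDurationMs =
  sent.reduce((sum, r) => sum + r.durationMs, 0) / sent.length;
```

Only one request in ten is kept, so the request count is recovered by multiplying by sampleEvery, while the average duration of the sample is already an estimate of the real average.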

There is no out-of-the-box support for sampling or filtering in Application Insights today. So the idea is to replace the standard channel with a custom one. For every telemetry item, this channel decides whether to send it to the portal or not.

First, define a new class. You’ll need your own instance of ServerTelemetryChannel from the NuGet package Microsoft.ApplicationInsights.WindowsServer.TelemetryChannel. I’ve also defined a public property SampleEvery so you can configure how much data to sample out using the configuration file:

public class RequestsSamplingChannel : ITelemetryChannel, ITelemetryModule
{
    private int counter = 0;

    private ServerTelemetryChannel channel;

    public int SampleEvery { get; set; }

    public RequestsSamplingChannel()
    {
        this.channel = new ServerTelemetryChannel();
    }

    public void Initialize(TelemetryConfiguration configuration)
    {
        this.channel.Initialize(configuration);
    }
}

Now implement the Send method. In this example I apply sampling only to requests, so performance counters, metrics, dependencies and traces will not be sampled. If the telemetry item is of type RequestTelemetry, I increment the counter and send only every SampleEvery-th item using the standard channel:

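public void Send(ITelemetry item)
{
    if (item is RequestTelemetry)
    {
        int value = Interlocked.Increment(ref this.counter);

        // SampleEvery of 0 or 1 (for example, when it is not configured) means "send everything".
        // The check also avoids a division by zero in the modulo below.
        if (this.SampleEvery <= 1 || value % this.SampleEvery == 0)
        {
            this.channel.Send(item);
        }
        // Otherwise the item was sampled out: do nothing.
    }
    else
    {
        // Only requests are sampled; everything else goes straight through.
        this.channel.Send(item);
    }
}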

For all other properties and methods - just proxy them to the standard channel:

public bool DeveloperMode
{
    get
    {
        return this.channel.DeveloperMode;
    }
    set
    {
        this.channel.DeveloperMode = value;
    }
}

public string EndpointAddress
{
    get
    {
        return this.channel.EndpointAddress;
    }
    set
    {
        this.channel.EndpointAddress = value;
    }
}

public void Flush()
{
    this.channel.Flush();
}

public void Dispose()
{
    this.channel.Dispose();
}

Now you can use this channel. In the ApplicationInsights.config file, replace the TelemetryChannel node with the following. You can read more on how the Application Insights SDK instantiates objects from configuration in my previous post:

<TelemetryChannel Type="ApmTips.RequestsSamplingChannel, ApmTips">
  <SampleEvery>10</SampleEvery>
</TelemetryChannel>

You can implement all sorts of interesting sampling algorithms using this approach. Instead of a counter you can use a randomly generated value or even the RequestTelemetry.Id property, which is fairly random.
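As one possible variation, here is a hedged JavaScript sketch of a sampling decision based on an identifier instead of a counter. The functions bucketOf and shouldSendById and the rolling hash are assumptions chosen for illustration; they are not taken from the SDK.

```javascript
// Hypothetical helper: maps a string id to a stable bucket in [0, 100).
function bucketOf(id) {
  let hash = 0;
  for (let i = 0; i < id.length; i++) {
    // Simple 32-bit rolling hash; any well-mixed hash would do here.
    hash = (hash * 31 + id.charCodeAt(i)) | 0;
  }
  return Math.abs(hash) % 100;
}

// Keeps an item when its bucket falls under the sampling percentage.
function shouldSendById(id, percentage) {
  return bucketOf(id) < percentage;
}

// The decision is a pure function of the id, so it is repeatable.
const first = shouldSendById("request-42", 10);
const second = shouldSendById("request-42", 10);
```

Unlike a counter, this decision needs no shared state, and the same identifier always gets the same answer, which can be handy if you want every item that shares an identifier to be kept or dropped together.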

Next time I’ll cover other ways to minimize the amount of data you are sending to Application Insights.

New SDK Version Published

Version 0.15 of the Application Insights SDK has been published on NuGet. Good news: this SDK reverts the API changes made in 0.14. Specifically, it brings back TelemetryContext.Properties mentioned here.

More changes:

  • A couple of NuGet packages were renamed to make their names more descriptive:
    • PerformanceCollector package renamed to PerfCounterCollector
    • RuntimeTelemetry renamed to DependencyCollector
  • A new property, Operation.SyntheticSource, is now available on TelemetryContext. You can now mark your telemetry items as “not real user traffic” and specify how this traffic was generated. For example, by setting this property you can distinguish traffic from your test automation from load test traffic.
  • The Application Insights Web package now detects traffic from Application Insights Availability monitoring and marks it with a specific SyntheticSource property.
  • Channel logic was moved to a separate NuGet package called Microsoft.ApplicationInsights.PersistenceChannel.

JavaScript SDK Is Now on GitHub

Please welcome the JavaScript SDK on GitHub. Recently I explained the snippet code you need to inject into your project to use the Application Insights JavaScript SDK. Now you can see the code of the main script. Please report issues on GitHub, and you are welcome to contribute!

Where Are the Telemetry Context Properties?

A change was made recently. When upgrading from Application Insights SDK version 0.13 to version 0.14, you may notice that some public interfaces have changed. Specifically, the public property Properties was removed from the TelemetryContext class. Yes, the one that is mentioned in the documentation and blog post. The one that is very important for enabling many scenarios.

It is not a change we plan to keep for a long time. The public interface will be reverted soon.

I thought a lot about how to explain this change and what led to it. Now I know what the motivation for this change was, I can tell you that semantic versioning is designed to allow experimenting with the API surface, and I know why it wasn’t immediately reverted. However, I’d better not go into details. I want to assure you that we understand that such a big API change should not happen again without notice.

Version 0.14 of the SDK brings great new features like the ability to monitor custom performance counters. If you want to use these features and need custom properties, I’d recommend waiting for the new version of the SDK. If you don’t want to wait, here is a workaround (it will only work in the 0.14 SDK).

If you are using a context initializer, convert it to a telemetry initializer. Then, in the telemetry initializer, use the ISupportProperties interface to set properties for the telemetry item:

public class WorkaroundTelemetryInitializer : ITelemetryInitializer
{
    public void Initialize(ITelemetry telemetry)
    {
        var propsTelemetry = telemetry as ISupportProperties;
        if (propsTelemetry != null)
        {
            propsTelemetry.Properties["environment"] = "development";
        }
    }
}

Again, this workaround will only be needed for the version 0.14 and will not work in the next version of SDK.

Javascript Snippet Explained

Big thanks to Scott Southwood who helped to prepare this post.

For end user monitoring, Application Insights requires you to add this JavaScript snippet to the page:

var appInsights=window.appInsights||function(config){
    function s(config){t[config]=function(){var i=arguments;t.queue.push(function(){t[config].apply(t,i)})}}var t={config:config},r=document,f=window,e="script",o=r.createElement(e),i,u;for(o.src=config.url||"//az416426.vo.msecnd.net/scripts/a/ai.0.js",r.getElementsByTagName(e)[0].parentNode.appendChild(o),t.cookie=r.cookie,t.queue=[],i=["Event","Exception","Metric","PageView","Trace"];i.length;)s("track"+i.pop());return config.disableExceptionTracking||(i="onerror",s("_"+i),u=f[i],f[i]=function(config,r,f,e,o){var s=u&&u(config,r,f,e,o);return s!==!0&&t["_"+i](config,r,f,e,o),s}),t
}({
    instrumentationKey:"d2cb4759-8e2c-453a-996c-e65c9d0e946a"
});

window.appInsights=appInsights;
appInsights.trackPageView();

This snippet performs the initial setup for end user tracking and then downloads the rest of the monitoring logic from the CDN.

There are a number of reasons why this script needs to be injected into the page HTML. First, placing the script in the page does not require an additional download at the early stage of page loading, so page loading time will not be affected. Second, it provides you with an API to track metrics and events, so you don’t need to check whether the full Application Insights script is already loaded. Third, this script runs in the application domain, so it can subscribe to the onerror callback and get the full error stack. Due to security restrictions, the browser will not give you the full error stack if you subscribe to the onerror callback from a script downloaded from a different domain. It also takes cookies from the application domain so they can be used for user/session tracking. Here is a more detailed explanation of what it does.

First, we check whether the Application Insights object has already been created. If not, we create it:

var appInsights = window.appInsights || function (config) {

Next comes a helper function that defines a new callback with the name passed as an argument. Methods like appInsights.trackEvent are created using this helper. The initial implementation of such a method puts the call into the queue for further processing. The “real” implementation will come later with the Application Insights JavaScript file downloaded from the CDN:

   function s(config) {
         t[config] = function () {
                var i = arguments;
                t.queue.push(function () {
                       t[config].apply(t, i)
                })
         }
   }
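To see why queueing closures works, here is a small standalone sketch of the same pattern. The defineStub helper mirrors the shape of s above, while the "full script" part (swapping in a real trackEvent and draining the queue) is only an assumption about how a later script could consume the queue, not the actual SDK code.

```javascript
// Stand-in for the object the snippet builds; only queueing is modeled.
const t = { queue: [] };

// Same shape as the snippet's helper: define a stub that queues the call.
function defineStub(name) {
  t[name] = function () {
    const args = arguments;
    t.queue.push(function () {
      // t[name] is looked up at replay time, so this reaches whatever
      // implementation is installed by then.
      t[name].apply(t, args);
    });
  };
}

defineStub("trackEvent");
t.trackEvent("clicked");   // full script not loaded yet: the call is queued
t.trackEvent("scrolled");

// Hypothetical "full script": install a real method, then drain the queue.
const delivered = [];
t.trackEvent = function (eventName) {
  delivered.push(eventName);
};
t.queue.forEach(function (replay) { replay(); });
t.queue.length = 0;
```

Because each queued closure resolves t[name] only when it runs, the early calls end up in the real implementation in their original order once it is in place.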

Now the real appInsights object is defined under the short name t. Initially it only contains the config field. More fields and methods will be created later:

   var t = { config: config },

A bunch of constants and variables:

                r = document,
                f = window,
                e = "script",
                o = r.createElement(e),
                i,
                u;

At the beginning of this for loop, the snippet creates a script element in the DOM so that the Application Insights JavaScript file will be downloaded later from the CDN:

   for (o.src = config.url || "//az416426.vo.msecnd.net/scripts/a/ai.0.js",
                r.getElementsByTagName(e)[0].parentNode.appendChild(o),

Store the domain cookies in the appInsights object:

                t.cookie = r.cookie,

Create an events queue:

                    t.queue = [],

And now comes the actual loop. Iterating through the collection, it creates methods like appInsights.trackEvent (remember the helper s at the very beginning of the snippet):

                i = ["Event", "Exception", "Metric", "PageView", "Trace"];
                i.length;
                )
                s("track" + i.pop());

Now, subscribe to the onerror callback to catch JavaScript errors during page initialization. You can disable this logic by setting the disableExceptionTracking property to true:

   return config.disableExceptionTracking ||

Using the same helper from the beginning of the script, define the appInsights._onerror method:

                (i = "onerror",
                s("_" + i),

Now save the existing window.onerror callback from the page (f is window, see the constants section above) and replace it with the new implementation. The new implementation chains the call to the original window.onerror implementation and the call to appInsights._onerror. If the original handler returns true, the error is treated as handled and appInsights._onerror is not called:

                u = f[i],
                f[i] = function (config, r, f, e, o) {
                       var s = u && u(config, r, f, e, o);
                       return s !== !0 && t["_" + i](config, r, f, e, o), s
                }),
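The chaining is easier to follow in unminified form. Below is a minimal sketch of the same logic against a fake window object; chainOnError, fakeWindow and tracker are hypothetical names used only for this illustration.

```javascript
const pageHandlerCalls = [];
const reported = [];

// Fake window with a handler the page installed earlier.
const fakeWindow = {
  onerror: function (message) {
    pageHandlerCalls.push(message);
    return false; // the page did not handle the error
  },
};

// Stand-in for appInsights._onerror.
const tracker = {
  _onerror: function (message) {
    reported.push(message);
  },
};

// Same chaining as in the snippet: call the previous handler first, then
// report the error unless that handler returned true.
function chainOnError(win, target) {
  const previous = win.onerror;
  win.onerror = function (message, url, line, column, error) {
    const handled = previous && previous(message, url, line, column, error);
    if (handled !== true) {
      target._onerror(message, url, line, column, error);
    }
    return handled;
  };
}

chainOnError(fakeWindow, tracker);
fakeWindow.onerror("boom", "page.js", 1, 1, new Error("boom"));
```

After the last call both the page handler and the tracker have seen the error. If the page handler had returned true, only the page handler would have been called.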

Return the freshly created appInsights object from the function:

                t
}

The constructor function is now created. Call it, providing the initial configuration. The only required piece of configuration is instrumentationKey:

({
       instrumentationKey: "@RequestTelelemtry.Context.InstrumentationKey"
});

Now we store appInsights as a global variable and put a pageView event into the queue:

window.appInsights = appInsights;
appInsights.trackPageView();

Read more on out-of-the-box usage analytics. Information on tracking custom events and metrics is here.

Status Monitor for Cloud Services

Cool article on how to install Status Monitor on your web role. Don’t forget to install Microsoft.ApplicationInsights.Web NuGet package for your web project.

Now, in order to track dependencies on worker roles, you need to do the same plus one additional step: set environment variables to tell the worker role where those new components are:

<ServiceDefinition name="MyService" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
   <WorkerRole name="<name>">
      <Runtime>
         <Environment>
            <Variable name="COR_ENABLE_PROFILING" value="1" />
            <Variable name="COR_PROFILER" value="{324F817A-7420-4E6D-B3C1-143FBED6D855}" />
            <Variable name="MicrosoftInstrumentationEngine_Host" value="{CA487940-57D2-10BF-11B2-A3AD5A13CBC0}" />
         </Environment>
      </Runtime>
   </WorkerRole>
</ServiceDefinition>

More on how to set environment variables for your worker role is here.

Don’t forget to install NuGet package Microsoft.ApplicationInsights.RuntimeTelemetry on your worker role and instantiate TelemetryClient at least once on worker role startup.

Bug With StatusMonitor 5.0 Uninstall

Once you’ve started using Application Insights, it is essential to install Status Monitor. Status Monitor enables dependency tracking, as mentioned in one of the previous posts. We use Status Monitor to track dependencies for our own internal services. As Brian Harry wrote about Visual Studio Online services, “smaller services are better”, so we have quite a few interconnected services. Knowing at a glance that a service you depend on has become slower or started failing is very important.

As we monitor our own services with Application Insights, some of our services have a startup task that installs Status Monitor. A month ago a small bug in Status Monitor was one of the causes of quite a serious outage.

Status Monitor in a nutshell is just an installer of Application Insights components and a UI to see the status of monitoring (as the name suggests). By itself it doesn’t collect any application telemetry or run any background services. So you may ask: why am I saying that Status Monitor caused the outage?

The answer is simple. We are committed to dogfooding. So we try to run the latest bits of the Application Insights SDK, and on every service restart we try to download the latest components. Unfortunately, Status Monitor 5.0 has an issue: when it is uninstalled, it leaves some registry settings in a bad state. Specifically, it makes the Environment string empty for these three services:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\IISADMIN
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\W3SVC
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\WAS

So after the uninstall, IIS will try to start and will fail, as it doesn’t expect the Environment string to be empty. Here is how it surfaces when you run the iisreset /start command:

The IIS Admin Service or the World Wide Web Publishing Service, or a service dependent on them failed to start.  The service, or dependent services, may had an error during its startup or may be disabled

And these are the messages you’ll see in the Event Log:

Log Name:      System
Source:        Microsoft-Windows-IIS-IISReset
Date:          2/25/2015 9:08:15 AM
Event ID:      3201
Task Category: None
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      sergey-surface
Description:
IIS start command received from user SERGEY-SURFACE\Sergey. The logged data is the status code.

Log Name:      System
Source:        Service Control Manager
Date:          2/25/2015 9:08:15 AM
Event ID:      7000
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      sergey-surface
Description:
The Windows Process Activation Service service failed to start due to the following error: 
The parameter is incorrect.

Log Name:      System
Source:        Service Control Manager
Date:          2/25/2015 9:08:15 AM
Event ID:      7001
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      sergey-surface
Description:
The World Wide Web Publishing Service service depends on the Windows Process Activation Service service which failed to start because of the following error: 
The parameter is incorrect.

The solution is simple: right after uninstalling Status Monitor 5.0, install the new version or delete the “Environment” string from the registry keys mentioned above.

There are some exciting features coming to Status Monitor in the future, and I hope you will never run into an issue upgrading it.

Request Name and Url

In my previous post on the web requests tracking HTTP module, I mentioned that the Application Insights HTTP module has some smart logic to collect the request name. This logic is needed to make aggregation on the UI side meaningful. In the screenshot below you can see that requests are grouped by request name. Aggregations like number of requests and average execution time were calculated for some pages. And those aggregations are completely unusable for requests to “__browserLink”:

Here is how request name calculation logic works today:

  1. ASP.NET MVC support. Request name is calculated as “VERB controller/action”.
  2. ASP.NET MVC Web API support. Following the logic above, both requests “/api/movies/” and “/api/movies/5” would result in “GET movies”. So to support Web API, the request name includes the list of names of all routing parameters if the “action” parameter wasn’t found. In the example above you’ll see requests “GET movies” and “GET movies[id]”.
  3. If the routing table is empty or doesn’t have “controller”, HttpRequest.Path will be used as the request name. This property doesn’t include the domain name or query string.
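The three rules above can be sketched as a small function. This is a hypothetical JavaScript re-implementation for illustration, not the SDK code: requestName and the shape of routeValues are assumptions, and so is the "/" separator used when more than one routing parameter is present.

```javascript
// Hypothetical helper. routeValues models the values of the matched route;
// pass null when no route matched.
function requestName(verb, path, routeValues) {
  // Rule 3: no route data or no controller, so fall back to the path.
  if (!routeValues || !routeValues.controller) {
    return verb + " " + path;
  }
  // Rule 1: MVC style "VERB controller/action".
  if (routeValues.action) {
    return verb + " " + routeValues.controller + "/" + routeValues.action;
  }
  // Rule 2: no action, so append the names of the other routing parameters.
  // Joining several names with "/" is an assumption of this sketch.
  const names = Object.keys(routeValues).filter(function (key) {
    return key !== "controller";
  });
  const suffix = names.length > 0 ? "[" + names.join("/") + "]" : "";
  return verb + " " + routeValues.controller + suffix;
}

requestName("GET", "/Home/Index", { controller: "Home", action: "Index" }); // "GET Home/Index"
requestName("GET", "/api/movies/", { controller: "movies" });               // "GET movies"
requestName("GET", "/api/movies/5", { controller: "movies", id: "5" });     // "GET movies[id]"
requestName("GET", "/__browserLink/x", null);                               // "GET /__browserLink/x"
```

Note that only the names of the routing parameters are used, never their values, which is what keeps “/api/movies/5” and “/api/movies/6” in the same group.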

The Application Insights web SDK sends the request name “as is” with regard to letter case. Grouping in the UI is case sensitive, so “GET Home/Index” will be counted separately from “GET home/INDEX”, even though in many cases they result in the same controller and action being executed. The reason is that URLs in general are case sensitive (http://www.w3.org/TR/WD-html40-970708/htmlweb.html) and you may want to see whether all the 404s happened when customers requested the page in a certain case.

Known issues:

  1. There is no smart request name calculation for attribute-based routing today.
  2. Custom routing implementations are not supported out of the box. You’ll need to write your own implementation of WebOperationNameTelemetryInitializer to override the standard behavior.