APM tips blog

Blog about application monitoring.

DIY: Data Sampling

[Updated 3-Sept-2015]: updated as channel was renamed

Now that pricing for Application Insights has been announced, you might be wondering how to make these prices truly cloud-friendly and pay only for what you use. You may also wonder how to fit within throttling limits even if you are willing to pay for all the telemetry data your application produces. Note that you won’t need any of this if your application isn’t under heavy load. The Application Insights UI has great filtering and grouping capabilities, so you are better off having all the data on the server.

There are four techniques to minimize the amount of data your application reports: separating traffic, filtering out uninteresting data, sampling, and aggregating. Today I’ll explain how to implement sampling.

Sampling only works for high-load applications. The idea is to send only every n-th request to the server. With a normal distribution of requests and high load you’ll get statistically correct values for all sorts of aggregations (don’t forget to multiply count-based values you see in the UI, such as the number of requests, by n; averages need no scaling).
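To make the arithmetic concrete, here is a minimal sketch in JavaScript. It is not part of any SDK: the helper name sampleEveryNth and the synthetic durations are assumptions made up for illustration. It keeps every n-th item, scales the count back up by n, and leaves the average unscaled.

```javascript
// Hypothetical helper: keep every n-th item and report scaled-up estimates.
function sampleEveryNth(durations, n) {
  const kept = durations.filter((_, index) => (index + 1) % n === 0);
  const sum = kept.reduce((total, d) => total + d, 0);
  return {
    sentCount: kept.length,                // items actually sent
    estimatedCount: kept.length * n,       // counts are multiplied by n
    averageDuration: kept.length ? sum / kept.length : 0 // averages are not
  };
}

// 1000 synthetic request durations spread over 100..199 ms.
const durations = Array.from({ length: 1000 }, (_, i) => 100 + ((i * 37) % 100));
const result = sampleEveryNth(durations, 10);
console.log(result);
```

The estimated count matches the real one exactly here because 1000 is a multiple of 10. The sampled average is only an approximation of the true average, and it gets better as the load grows.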

There is no out-of-the-box support for sampling or filtering in Application Insights today, so the idea is to replace the standard channel with a custom one. For every telemetry item, this channel decides whether or not to send it to the portal.

First, define a new class. It needs its own instance of ServerTelemetryChannel from the NuGet package Microsoft.ApplicationInsights.WindowsServer.TelemetryChannel. I’ve also defined a public property SampleEvery so you can configure how much data to sample out using the configuration file:

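using System.Threading;
using Microsoft.ApplicationInsights.Channel;
using Microsoft.ApplicationInsights.DataContracts;
using Microsoft.ApplicationInsights.Extensibility;
using Microsoft.ApplicationInsights.WindowsServer.TelemetryChannel;

public class RequestsSamplingChannel : ITelemetryChannel, ITelemetryModule
{
    // Number of requests seen so far; incremented atomically in Send.
    private int counter = 0;

    // The standard channel that actually transmits telemetry.
    private ServerTelemetryChannel channel;

    // Send every SampleEvery-th request; set from ApplicationInsights.config.
    public int SampleEvery { get; set; }

    public RequestsSamplingChannel()
    {
        this.channel = new ServerTelemetryChannel();
        // Default to sending every request so Send never divides by zero.
        this.SampleEvery = 1;
    }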

    public void Initialize(TelemetryConfiguration configuration)
    {
        this.channel.Initialize(configuration);
    }
}

Now implement the Send method. In this example I apply sampling only to requests, so performance counters, metrics, dependencies and traces are not sampled. If a telemetry item is of type RequestTelemetry, I increment the counter and send every SampleEvery-th item using the standard channel:

public void Send(ITelemetry item)
{
    if (item is RequestTelemetry)
    {
        int value = Interlocked.Increment(ref this.counter);
        if (value % this.SampleEvery == 0)
        {
            this.channel.Send(item);
        }
        else
        {
            //value was sampled out. Do nothing
        }
    }
    else
    {
        this.channel.Send(item);
    }
}

For all other properties and methods - just proxy them to the standard channel:

public bool DeveloperMode
{
    get
    {
        return this.channel.DeveloperMode;
    }
    set
    {
        this.channel.DeveloperMode = value;
    }
}

public string EndpointAddress
{
    get
    {
        return this.channel.EndpointAddress;
    }
    set
    {
        this.channel.EndpointAddress = value;
    }
}

public void Flush()
{
    this.channel.Flush();
}

public void Dispose()
{
    this.channel.Dispose();
}

Now you can use this channel. In the ApplicationInsights.config file, replace the TelemetryChannel node with the following. You can read more on how the Application Insights SDK instantiates objects from configuration in my previous post:

<TelemetryChannel Type="ApmTips.RequestsSamplingChannel, ApmTips">
  <SampleEvery>10</SampleEvery>
</TelemetryChannel>

You can implement all sorts of interesting sampling algorithms using this approach. Instead of a counter you can use a randomly generated value or even the RequestTelemetry.ID property, which is fairly random.
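As a rough illustration of the ID-based variant, here is a sketch in JavaScript. Everything in it is an assumption for illustration: the helper names hashToPercent and shouldSend, the djb2-style hash, and the sample IDs are not taken from the SDK. The point is that the decision depends only on the ID, so the same request always gets the same answer.

```javascript
// Hypothetical helper: map a string ID to a stable number in [0, 100).
function hashToPercent(id) {
  let hash = 5381;
  for (let i = 0; i < id.length; i++) {
    // djb2-style string hash, kept as an unsigned 32-bit integer.
    hash = ((hash * 33) ^ id.charCodeAt(i)) >>> 0;
  }
  return hash % 100;
}

// Send an item only when its ID falls into the first samplingPercentage buckets.
function shouldSend(requestId, samplingPercentage) {
  return hashToPercent(requestId) < samplingPercentage;
}

// Made-up IDs; real ones would come from the telemetry items.
const ids = ["req-1", "req-2", "req-3"];
const decisions = ids.map((id) => shouldSend(id, 10));
console.log(decisions);
```

A deterministic decision like this has one advantage over a counter: several processes looking at the same ID agree on whether to keep it without sharing any state.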

Next time I’ll cover other ways to minimize the amount of data you are sending to Application Insights.

New SDK Version Published

Version 0.15 of the Application Insights SDK has been published on NuGet. Good news: this SDK reverts the API changes made in 0.14. Specifically, it brings back TelemetryContext.Properties, mentioned here.

More changes:

  • A couple of NuGet packages were renamed to make their names more descriptive:
    • PerformanceCollector was renamed to PerfCounterCollector
    • RuntimeTelemetry was renamed to DependencyCollector
  • A new property, Operation.SyntheticSource, is now available on TelemetryContext. You can now mark your telemetry items as “not real user traffic” and specify how this traffic was generated. For example, by setting this property you can distinguish traffic from your test automation from load test traffic.
  • The Application Insights Web package now detects traffic from Application Insights availability monitoring and marks it with a specific SyntheticSource property.
  • Channel logic was moved to a separate NuGet package called Microsoft.ApplicationInsights.PersistenceChannel.

JavaScript SDK Is Now on GitHub

Please welcome the JavaScript SDK on GitHub. Recently I explained the snippet code you need to inject into your project to use the Application Insights JavaScript SDK. Now you can see the code of the main script. Please report issues on GitHub, and contributions are welcome!

Where Are the Telemetry Context Properties?

A change was made recently. Upgrading from Application Insights SDK version 0.13 to version 0.14, you may notice that some public interfaces have changed. Specifically, the public property Properties was removed from the TelemetryContext class. Yes, the one that is mentioned in the documentation and blog post. The one that is very important for enabling many scenarios.

It is not a change we plan to keep for long. The public interface will be reverted soon.

I thought a lot about how to explain this change and what led to it. Now I know what the motivation for this change was, I can tell that semantic versioning is designed to allow experimenting with the API surface, and I know why it wasn’t immediately reverted. However, I’d better not go into details. I want to assure you that we understand that such a big API change should not happen again without notice.

Version 0.14 of the SDK brings great new features like the ability to monitor custom performance counters. If you want to use these features and need custom properties, I’d recommend waiting for the new version of the SDK. If you don’t want to wait, here is a workaround (it will only work in the 0.14 SDK).

If you are using a context initializer, convert it to a telemetry initializer. Then, in the telemetry initializer, use the ISupportProperties interface to set properties on the telemetry item:

public class WorkaroundTelemetryInitializer : ITelemetryInitializer
{
    public void Initialize(ITelemetry telemetry)
    {
        var propsTelemetry = telemetry as ISupportProperties;
        if (propsTelemetry != null)
        {
            propsTelemetry.Properties["environment"] = "development";
        }
    }
}

Again, this workaround is only needed for version 0.14 and will not work in the next version of the SDK.

JavaScript Snippet Explained

Big thanks to Scott Southwood who helped to prepare this post.

For end user monitoring, Application Insights requires you to add this JavaScript snippet to the page:

var appInsights=window.appInsights||function(config){
    function s(config){t[config]=function(){var i=arguments;t.queue.push(function(){t[config].apply(t,i)})}}var t={config:config},r=document,f=window,e="script",o=r.createElement(e),i,u;for(o.src=config.url||"//az416426.vo.msecnd.net/scripts/a/ai.0.js",r.getElementsByTagName(e)[0].parentNode.appendChild(o),t.cookie=r.cookie,t.queue=[],i=["Event","Exception","Metric","PageView","Trace"];i.length;)s("track"+i.pop());return config.disableExceptionTracking||(i="onerror",s("_"+i),u=f[i],f[i]=function(config,r,f,e,o){var s=u&&u(config,r,f,e,o);return s!==!0&&t["_"+i](config,r,f,e,o),s}),t
}({
    instrumentationKey:"d2cb4759-8e2c-453a-996c-e65c9d0e946a"
});

window.appInsights=appInsights;
appInsights.trackPageView();

This snippet does the initial setup for end user tracking and then downloads the rest of the monitoring logic from the CDN.

There are a number of reasons why this script needs to be injected into the page HTML. First, placing the script in the page doesn’t require an additional download at the early stage of page loading, so page load time is not affected. Second, it gives you an API to track metrics and events, so you don’t need to check whether the full Application Insights script is already loaded. Third, this script runs in the application domain, so it can subscribe to the onerror callback and get the full error stack. Due to security restrictions, the browser will not give you the full error stack if you subscribe to the onerror callback from a script downloaded from a different domain. It also takes cookies from the application domain so they can be used for user/session tracking. Here is a more detailed explanation of what it does.

First, we check whether the Application Insights object has already been created. If it hasn’t, we create it:

var appInsights = window.appInsights || function (config) {

Next comes a helper function that defines a new callback with the name passed as an argument. Methods like appInsights.trackEvent are created using this helper. The initial implementation of such a method puts the call into the queue for further processing. The “real” implementation will come later with the Application Insights JavaScript file downloaded from the CDN:

   function s(config) {
         t[config] = function () {
                var i = arguments;
                t.queue.push(function () {
                       t[config].apply(t, i)
                })
         }
   }

Now the real appInsights object is defined under the pseudonym t. Initially it only contains the config field. More fields and methods will be created later:

   var t = { config: config },

A bunch of constants and variables:

                r = document,
                f = window,
                e = "script",
                o = r.createElement(e),
                i,
                u;

At the beginning of this for loop the snippet creates a script element in the DOM so that the Application Insights JavaScript file will be downloaded later from the CDN:

   for (o.src = config.url || "//az416426.vo.msecnd.net/scripts/a/ai.0.js",
                r.getElementsByTagName(e)[0].parentNode.appendChild(o),

Store domain cookies in the appInsights object:

                t.cookie = r.cookie,

Create an events queue:

                    t.queue = [],

And now comes the actual loop. Iterating through the collection, it creates methods like appInsights.trackEvent (remember the helper s at the very beginning of the snippet):

                i = ["Event", "Exception", "Metric", "PageView", "Trace"];
                i.length;
                )
                s("track" + i.pop());

Now, subscribe to the onerror callback to catch JavaScript errors during page initialization. You can disable this logic by setting the disableExceptionTracking property to true:

   return config.disableExceptionTracking ||

Using the same helper from the beginning of the script, define the appInsights._onerror method:

                (i = "onerror",
                s("_" + i),

Now save the existing window.onerror callback from the page (f is window, see the constants section above) and replace it with a new implementation. The new implementation first calls the original window.onerror handler, if there was one, and then calls appInsights._onerror unless the original handler returned true:

                u = f[i],
                f[i] = function (config, r, f, e, o) {
                       var s = u && u(config, r, f, e, o);
                       return s !== !0 && t["_" + i](config, r, f, e, o), s
                }),
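The same chaining can be written in a more readable form. This is only a sketch: fakeWindow stands in for the browser’s window object so the code runs anywhere, and chainOnError and reported are names made up for illustration.

```javascript
// Hypothetical stand-in for the browser's window object.
const fakeWindow = {};
const reported = [];

// The page's own handler, installed before the monitoring code runs.
fakeWindow.onerror = function (message) {
  // Returning true means "this error was handled".
  return message === "handled by page";
};

// Wrap the existing handler: call it first, then report unless it returned true.
function chainOnError(target, report) {
  const previous = target.onerror;
  target.onerror = function (...args) {
    const handled = previous && previous.apply(target, args);
    if (handled !== true) {
      report(...args);
    }
    return handled;
  };
}

chainOnError(fakeWindow, (message) => reported.push(message));
fakeWindow.onerror("handled by page");
fakeWindow.onerror("unexpected failure");
console.log(reported);
```

Only the second error ends up in reported, because the page’s own handler returned true for the first one.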

Return the freshly created appInsights object from the method:

                t
}

The constructor function is created. Now call it, providing the initial configuration. The only required piece of configuration is instrumentationKey:

({
       instrumentationKey: "@RequestTelemetry.Context.InstrumentationKey"
});

Now we store appInsights as a global variable and put a pageView event into the queue:

window.appInsights = appInsights;
appInsights.trackPageView();
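To see the whole stub-and-queue pattern in one place, here is a readable sketch. It is not the SDK code. In particular, installRealMethods is only my assumption about what the full script does once it is downloaded (replace the stub methods, then replay the queue), and the names createStub, sent and client are made up for illustration.

```javascript
// Readable version of the stub pattern used by the snippet.
function createStub(methodNames) {
  const stub = { queue: [] };
  methodNames.forEach((name) => {
    stub[name] = function (...args) {
      // Defer the call: by replay time stub[name] points at the real method.
      stub.queue.push(() => stub[name](...args));
    };
  });
  return stub;
}

// Assumed behavior of the full script: install real methods, then drain the queue.
function installRealMethods(stub, realMethods) {
  Object.assign(stub, realMethods);
  const pending = stub.queue;
  stub.queue = [];
  pending.forEach((replay) => replay());
}

const sent = [];
const client = createStub(["trackEvent", "trackPageView"]);

// These calls happen before the "full script" is available and are queued.
client.trackPageView();
client.trackEvent("checkout", { items: 3 });

installRealMethods(client, {
  trackEvent: (name, props) => sent.push({ type: "event", name, props }),
  trackPageView: () => sent.push({ type: "pageView" })
});

// This call goes straight to the real implementation.
client.trackEvent("after load");
console.log(sent.map((item) => item.type));
```

Calls made before and after the real methods arrive look identical to the caller, which is why page code never needs to check whether the full script has loaded.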

Read more on out-of-the-box usage analytics. Information on tracking custom events and metrics is here.

Status Monitor for Cloud Services

Here is a cool article on how to install Status Monitor on your web role. Don’t forget to install the Microsoft.ApplicationInsights.Web NuGet package for your web project.

To track dependencies on worker roles you need to do the same plus one additional step: set environment variables to tell the worker role where those new components are:

<ServiceDefinition name="MyService" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
   <WorkerRole name="<name>">
      <Runtime>
         <Environment>
            <Variable name="COR_ENABLE_PROFILING" value="1" />
            <Variable name="COR_PROFILER" value="{324F817A-7420-4E6D-B3C1-143FBED6D855}" />
            <Variable name="MicrosoftInstrumentationEngine_Host" value="{CA487940-57D2-10BF-11B2-A3AD5A13CBC0}" />
         </Environment>
      </Runtime>
   </WorkerRole>
</ServiceDefinition>

More on how to set environment variables for your worker role is here.

Don’t forget to install the NuGet package Microsoft.ApplicationInsights.RuntimeTelemetry on your worker role and instantiate TelemetryClient at least once on worker role startup.

Bug With StatusMonitor 5.0 Uninstall

Once you’ve started using Application Insights, it is essential to install Status Monitor. Status Monitor enables dependency tracking, as mentioned in one of the previous posts. We use Status Monitor to track dependencies for our own internal services. As Brian Harry wrote about Visual Studio Online services, “smaller services are better”, so we have quite a few interconnected services. Knowing at a glance that a service you depend on has become slower or started failing is very important.

As we monitor our own services with Application Insights, some of our services have a startup task that installs Status Monitor. A month ago, a small bug in Status Monitor was one of the causes of quite a serious outage.

Status Monitor in a nutshell is just an installer of Application Insights components and a UI to see the status of monitoring (as the name suggests). By itself it doesn’t collect any application telemetry or run any background services. So you may ask: why am I saying that Status Monitor caused the outage?

The answer is simple. We are committed to dogfooding, so we try to run the latest bits of the Application Insights SDK, and on every service restart we try to download the latest components. Unfortunately, Status Monitor 5.0 has an issue: when it is uninstalled, it leaves some registry settings in a bad state. Specifically, it makes the Environment string empty for these three services:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\IISADMIN
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\W3SVC
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\WAS

So after the uninstall, IIS will try to start and fail, as it doesn’t expect the Environment string to be empty. Here is how it surfaces when you run the iisreset /start command:

The IIS Admin Service or the World Wide Web Publishing Service, or a service dependent on them failed to start.  The service, or dependent services, may had an error during its startup or may be disabled

And these are the messages you’ll see in the Event Log:

Log Name:      System
Source:        Microsoft-Windows-IIS-IISReset
Date:          2/25/2015 9:08:15 AM
Event ID:      3201
Task Category: None
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      sergey-surface
Description:
IIS start command received from user SERGEY-SURFACE\Sergey. The logged data is the status code.

Log Name:      System
Source:        Service Control Manager
Date:          2/25/2015 9:08:15 AM
Event ID:      7000
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      sergey-surface
Description:
The Windows Process Activation Service service failed to start due to the following error: 
The parameter is incorrect.

Log Name:      System
Source:        Service Control Manager
Date:          2/25/2015 9:08:15 AM
Event ID:      7001
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      sergey-surface
Description:
The World Wide Web Publishing Service service depends on the Windows Process Activation Service service which failed to start because of the following error: 
The parameter is incorrect.

The solution is simple: right after uninstalling Status Monitor 5.0, install the new version or delete the “Environment” string from the registry keys mentioned above.

There will be some exciting features coming to Status Monitor in the future, and I hope you will never run into issues upgrading it.

Request Name and Url

In my previous post on the web requests tracking HTTP module I mentioned that the Application Insights HTTP module has some smart logic to collect the request name. This logic is needed to make aggregation meaningful on the UI side. In the screenshot below you see that requests are grouped by request name. Aggregations like number of requests and average execution time were calculated for some pages. And those aggregations are completely unusable for requests to “__browserLink”:

Here is how request name calculation logic works today:

  1. ASP.NET MVC support. Request name is calculated as “VERB controller/action”.
  2. ASP.NET MVC Web API support. Following the logic above, both requests “/api/movies/” and “/api/movies/5” would result in “GET movies”. So, to support Web API, the request name includes the list of names of all routing parameters if the “action” parameter wasn’t found. In the example above you’ll see requests “GET movies” and “GET movies[id]”.
  3. If the routing table is empty or doesn’t have “controller”, HttpRequest.Path is used as the request name. This property doesn’t include the domain name or query string.
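The three rules above can be sketched as a small function. This is an illustration in JavaScript, not the SDK’s implementation: getRequestName and the shape of routeValues are hypothetical, and the “/” separator between multiple parameter names is my assumption, since the post only shows the single-parameter case “GET movies[id]”.

```javascript
// Hypothetical helper mirroring the three rules above.
function getRequestName(verb, routeValues, path) {
  const controller = routeValues && routeValues.controller;
  if (!controller) {
    // Rule 3: no usable routing data, fall back to the request path.
    return `${verb} ${path}`;
  }
  if (routeValues.action) {
    // Rule 1: "VERB controller/action".
    return `${verb} ${controller}/${routeValues.action}`;
  }
  // Rule 2: no action, so append the names of the other routing parameters.
  const paramNames = Object.keys(routeValues).filter((key) => key !== "controller");
  // Assumption: multiple names are joined with "/".
  const suffix = paramNames.length ? `[${paramNames.join("/")}]` : "";
  return `${verb} ${controller}${suffix}`;
}

console.log(getRequestName("GET", { controller: "Home", action: "Index" }, "/"));
console.log(getRequestName("GET", { controller: "movies" }, "/api/movies/"));
console.log(getRequestName("GET", { controller: "movies", id: "5" }, "/api/movies/5"));
console.log(getRequestName("GET", {}, "/static/site.css"));
```

Note that the parameter value (“5”) never appears in the name, only the parameter name, which is what keeps the number of distinct request names small enough to aggregate.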

The Application Insights web SDK sends the request name “as is” with regard to letter case. Grouping in the UI is case sensitive, so “GET Home/Index” is counted separately from “GET home/INDEX” even though in many cases they result in the same controller and action being executed. The reason is that URLs in general are case sensitive (http://www.w3.org/TR/WD-html40-970708/htmlweb.html) and you may want to see whether all 404s happened when customers requested the page in a certain case.

Known issues:

  1. There is no smart request name calculation for attribute-based routing today.
  2. Custom routing implementations are not supported out of the box. You’ll need to write your own WebOperationNameTelemetryInitializer implementation to override the standard behavior.

More on ApplicationInsights.config

If you were using my instructions on proxying Application Insights data, please note that the format of the configuration file has changed and you should not use the “InProcess” tag when specifying an endpoint. I updated that post and want to explain how ApplicationInsights.config instantiates objects. This applies to every object you can configure in this configuration file, be it a TelemetryInitializer, ContextInitializer, TelemetryModule or Channel.

The main idea behind the ApplicationInsights.config file is that it should not be required. In an ideal world all aspects of monitoring should be coded into your application. That’s why we try to avoid any dependencies on file format or schema for SDK objects.

Every object you can configure in ApplicationInsights.config can define a “Type”. It may also have any number of child XML nodes, which are used to initialize the corresponding properties of the constructed object. For instance, the following configuration snippet will construct an object of type “ApmTips.Tools.PropertiesContextInitializer, ApmTips.Tools” and assign the value “Bar” to the property “Foo”. Since this object is defined in the ContextInitializers section, the Application Insights SDK will ensure that the class implements the “IContextInitializer” interface.

<ContextInitializers>
  <Add Type="ApmTips.Tools.PropertiesContextInitializer, ApmTips.Tools">
    <Foo>Bar</Foo>
  </Add>
</ContextInitializers>

Corresponding class should look like this:

namespace ApmTips.Tools
{
    using Microsoft.ApplicationInsights.DataContracts;
    using Microsoft.ApplicationInsights.Extensibility;

    public class PropertiesContextInitializer : IContextInitializer
    {
        public string Foo { get; set; }

        public void Initialize(TelemetryContext context)
        {
            context.Properties["Foo"] = this.Foo;
        }
    }
}

And when you run the program it will add an additional property “Foo” with the value “Bar” to every telemetry data item:

{
  "name":"Microsoft.ApplicationInsights.Request",
  "time":"2015-02-06T17:13:21.1222232+00:00",
  "iKey":"key",
  "tags":{
    ...
    "ai.device.model":"Surface Pro 3",
    "ai.device.machineName":"sergey-surface",
    "ai.operation.name":"GET Home/Index",
    "ai.operation.id":"3518850146076059859"
  },
  "data":{
    "baseType":"RequestData",
    "baseData":{
      "name":"GET Home/Index",
      "startTime":"2015-02-06T17:13:21.1222232+00:00",
      "duration":"00:00:02.5762099",
      "responseCode":"200",
      ...
      "properties":{
        "Foo":"Bar"
      }
    }
  }
}

This applies to the TelemetryChannel node as well. You can override the Type attribute of this node to specify your own channel. “DeveloperMode” and “EndpointAddress” are just public properties of the InProcessTelemetryChannel class, which is assumed when Type is not specified.
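The “Type plus child nodes” mechanism described in this post can be sketched in a few lines. This is a JavaScript illustration of the idea only, not how the SDK is written: typeRegistry stands in for .NET type lookup by name, the config node is assumed to be already parsed into a plain object, and rejecting unknown property names is my choice for the sketch, so the real SDK may behave differently.

```javascript
// Stand-in for the C# class from the post.
class PropertiesContextInitializer {
  constructor() {
    this.Foo = null;
  }
  initialize(context) {
    context.properties["Foo"] = this.Foo;
  }
}

// Hypothetical registry standing in for .NET type lookup by name.
const typeRegistry = {
  "ApmTips.Tools.PropertiesContextInitializer, ApmTips.Tools": PropertiesContextInitializer
};

// Construct the object named by "type", then copy child node values to its properties.
function createFromConfigNode(node) {
  const Type = typeRegistry[node.type];
  if (!Type) {
    throw new Error(`Unknown type: ${node.type}`);
  }
  const instance = new Type();
  for (const [name, value] of Object.entries(node.children)) {
    if (!(name in instance)) {
      // Assumption of this sketch: unknown names are treated as errors.
      throw new Error(`Type has no property named ${name}`);
    }
    instance[name] = value;
  }
  return instance;
}

// Equivalent of the <Add Type="..."><Foo>Bar</Foo></Add> node, already parsed.
const initializer = createFromConfigNode({
  type: "ApmTips.Tools.PropertiesContextInitializer, ApmTips.Tools",
  children: { Foo: "Bar" }
});

const context = { properties: {} };
initializer.initialize(context);
console.log(context.properties);
```

Because the loader only needs a type name and public properties, the same code path can serve initializers, modules and channels alike, which matches the point that the config file carries no schema of its own.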