APM tips blog

Blog about application monitoring.

Send Correlation Identifiers


A small post on how to send correlation identifiers to an application monitored by Application Insights. It came out of an investigation into what Application Insights Availability tests need to send to an application so that the test execution identifier correlates natively with the telemetry produced by that application.

Here is the small ASP.NET Core test application I used. It writes the request telemetry properties into the response, so it can easily be exercised with curl.

var r = context.Features.Get<RequestTelemetry>();
await context.Response.WriteAsync(
    "RequestTelemetry: " +
        " operation_id=" + r?.Context.Operation.Id +
        " parentId=" + r?.Context.Operation.ParentId +
        " id=" + r?.Id +
        " source=" + (r?.Source ?? "") + "\n");

Basic correlation

The HTTP correlation protocol used by Application Insights is published on GitHub. The good thing about this protocol is that it is flexible and works with most identity schemes you may have. If you want to correlate an entire distributed transaction by a given identifier, the only thing you need to do is send it as a Request-Id header.

curl -H "Request-ID:{78505740-5180-4809-968e-39284bde1a4e}" http://localhost:5000

RequestTelemetry:
    operation_id={78505740-5180-4809-968e-39284bde1a4e}
    parentId={78505740-5180-4809-968e-39284bde1a4e}
    id=|{78505740-5180-4809-968e-39284bde1a4e}.989973aa_

In the example above I formatted the identifier as a GUID. For a real-life implementation, I'd suggest formatting this GUID as a 16-byte array in hex, like 4bf92f3577b34da6a3ce929d0e0e4736. It is consistent with the future direction of the correlation protocol.
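As a hedged sketch (an illustration, not the Availability test implementation), Python's uuid module gives a convenient way to produce such a 16-byte hex identifier:

```python
import uuid

def new_trace_id() -> str:
    # uuid4().hex is 32 hex characters = 16 random bytes,
    # with no dashes or braces, e.g. "4bf92f3577b34da6a3ce929d0e0e4736"
    return uuid.uuid4().hex

print(new_trace_id())
```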

Sequencing

If one test sends multiple requests to one or many applications, you'd want to send a different ID with every request. It's easy to do: just append a random seed and the sequence number of the request to the test execution identifier. In the example below the seed is sd and sequencing starts at 1.
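A minimal Python sketch of this composition (the helper name is my own; the format mirrors the curl examples that follow):

```python
def request_id(test_execution_id: str, seed: str, seq: int) -> str:
    # hierarchical Request-Id: leading "|", the root (test execution) id,
    # then a ".<seed>_<sequence>" child segment per request
    return f"|{{{test_execution_id}}}.{seed}_{seq}"

for seq in (1, 2, 3):
    print(request_id("78505740-5180-4809-968e-39284bde1a4e", "sd", seq))
```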

curl -H "Request-ID:|{78505740-5180-4809-968e-39284bde1a4e}.sd_1" http://localhost:5000
RequestTelemetry:
    operation_id= {78505740-5180-4809-968e-39284bde1a4e}
    parentId=|{78505740-5180-4809-968e-39284bde1a4e}.sd_1
    id=|{78505740-5180-4809-968e-39284bde1a4e}.sd_1.989973ad_

curl -H "Request-ID:|{78505740-5180-4809-968e-39284bde1a4e}.sd_2" http://localhost:5000
RequestTelemetry:
    operation_id= {78505740-5180-4809-968e-39284bde1a4e}
    parentId=|{78505740-5180-4809-968e-39284bde1a4e}.sd_2
    id=|{78505740-5180-4809-968e-39284bde1a4e}.sd_2.989973ae_

curl -H "Request-ID:|{78505740-5180-4809-968e-39284bde1a4e}.sd_3" http://localhost:5000
RequestTelemetry:
    operation_id= {78505740-5180-4809-968e-39284bde1a4e}
    parentId=|{78505740-5180-4809-968e-39284bde1a4e}.sd_3
    id=|{78505740-5180-4809-968e-39284bde1a4e}.sd_3.989973af_

Application identifier

As I mentioned before, for a better application map you'd need to propagate the app-id of the calling component. First, given an instrumentation key, you can get the app-id:

curl  https://dc.services.visualstudio.com/api/profiles/074608ec-29c0-41f1-a7c6-54f30d520629/appId
cbf775c7-b52e-4533-8673-bd6fbd7ab04a

Then you can send app-id as a Request-Context header:

curl
    -H "Request-ID:|{78505740-5180-4809-968e-39284bde1a4e}.sd_3"
    -H "Request-Context: appId=cid-v1:cbf775c7-b52e-4533-8673-bd6fbd7ab04a"
    http://localhost:5000

RequestTelemetry:
    operation_id={78505740-5180-4809-968e-39284bde1a4e}
    parentId=|{78505740-5180-4809-968e-39284bde1a4e}.sd_3
    id=|{78505740-5180-4809-968e-39284bde1a4e}.sd_3.ca349ca3_
    source=cid-v1:cbf775c7-b52e-4533-8673-bd6fbd7ab04a

This way you can identify the two components and correlate their telemetry. RequestTelemetry's source field points to the component that sent the original request.

Summary

You can use this correlation technique when you run synthetic traffic against your application or call it from a mobile application.

Use Network Information API With Application Insights


Recently Chrome enabled support for the Network Information API. The main idea of the API is to give you control over site behavior on weak connections. The first step in deciding how much to invest in network-specific optimizations is to collect statistics. Here is how you can do it with the Application Insights JavaScript SDK.

There are two things you may want to do. First, extend all events with network properties. You can then analyze how long pages take to load on different connections, or how AJAX call behavior changes on weak networks. Second, track network switches. For long-living pages like SPA applications, this statistic may be interesting.

Network properties

This example shows how to add network properties to all telemetry items sent from the page.

appInsights.queue.push(() => {
  appInsights.context.addTelemetryInitializer((envelope) => {
    var telemetryItem = envelope.data.baseData;

    telemetryItem.properties = telemetryItem.properties || {};
    try {
      telemetryItem.properties["navigator.connection.type"] = navigator.connection.type;
      telemetryItem.properties["navigator.connection.downlink"] = navigator.connection.downlink;
      telemetryItem.properties["navigator.connection.rtt"] = navigator.connection.rtt;
      telemetryItem.properties["navigator.connection.downlinkMax"] = navigator.connection.downlinkMax;
      telemetryItem.properties["navigator.connection.effectiveType"] = navigator.connection.effectiveType;
    } catch (e) {
      telemetryItem.properties["navigator.connection.type"] = "Network Information API not supported";
    }
  });
});

I'm traveling today. This picture shows what I saw at the airport:

Once I connected to the internet on the plane, I saw different values:

You can now analyze pageViews using a simple query:

pageViews
  | summarize sum(itemCount) by tostring(customDimensions["navigator.connection.effectiveType"])

Network change event

To catch network switches, subscribe to the change event. When the handler is called, track an event with the new network properties.

navigator.connection.addEventListener('change', logNetworkInfo);

function logNetworkInfo() {
  appInsights.trackEvent("NetworkChanged", {
    "navigator.connection.type": navigator.connection.type,
    "navigator.connection.downlink": navigator.connection.downlink,
    "navigator.connection.rtt": navigator.connection.rtt,
    "navigator.connection.downlinkMax": navigator.connection.downlinkMax,
    "navigator.connection.effectiveType": navigator.connection.effectiveType
  });
}

Summary

As you can see, it's straightforward to enrich your telemetry and better understand your customers. Based on this telemetry, you can prioritize optimizing your site for faster or slower networks.

Send Metric to Application Insights


I already posted how to send telemetry to the Application Insights REST endpoint using a PowerShell one-liner. This post shows how to send a metric using curl.

Here is a minimal JSON document that represents a metric. Set iKey to the target resource, time to the timestamp the metric is reported for, and fill the metrics collection. Note that baseType should be set to MetricData. The name field in the envelope is redundant in the context of this API.

Here is an example of JSON:

{
    "iKey": "f4731d25-188b-4ec1-ac44-9fcf35c05812",
    "time": "2017-10-27T00:01:52.9586379Z",
    "name": "MetricData",
    "data": {
        "baseType": "MetricData",
        "baseData": {
            "metrics": [
                {
                    "name": "Custom metric",
                    "value": 1,
                    "count": 1
                }
            ]
        }
    }
}

The Bond definition for the MetricData document is located in the ApplicationInsights-Home repository.

Now you can send this JSON to the Application Insights endpoint:

curl -d '{"name": "MetricData", "time":"2017-10-27T00:01:52.9586379Z",
    "iKey":"f4731d25-188b-4ec1-ac44-9fcf35c05812",
    "data":{"baseType":"MetricData","baseData":
    {"metrics":[{"name":"Custom metric","value":1,"count":1}]}}}'
    https://dc.services.visualstudio.com/v2/track

StdOut:
{"itemsReceived":1,"itemsAccepted":1,"errors":[]}

You can send multiple newline-delimited metric documents in one HTTP POST.

curl -d $'{"name": "MetricData", "time":"2017-10-27T00:01:52.9586379Z",
    "iKey":"f4731d25-188b-4ec1-ac44-9fcf35c05812",
    "data":{"baseType":"MetricData","baseData":
    {"metrics":[{"name":"Custom metric on line 1","value":1,"count":1}]}}}\n

    {"name": "MetricData", "time":"2017-10-27T00:01:52.9586379Z",
    "iKey":"f4731d25-188b-4ec1-ac44-9fcf35c05812",
    "data":{"baseType":"MetricData","baseData":
    {"metrics":[{"name":"Custom metric on line 2","value":1,"count":1}]}}}'
    https://dc.services.visualstudio.com/v2/track

StdOut:
{"itemsReceived":2,"itemsAccepted":2,"errors":[]}

If you want to add a few dimensions to your metric, you can use the properties collection.

{
    "iKey": "f4731d25-188b-4ec1-ac44-9fcf35c05812",
    "time": "2017-10-27T00:01:52.9586379Z",
    "name": "MetricData",
    "data": {
        "baseType": "MetricData",
        "baseData": {
            "metrics": [
                {
                    "name": "Custom metric",
                    "value": 1,
                    "count": 1
                }
            ],
            "properties": {
                "dimension1": "value1",
                "dimension2": "value2"
            }
        }
    }
}

Now you can create a stacked area chart using an Analytics query:

customMetrics
    | summarize avg(value) by tostring(customDimensions.dimension1), bin(timestamp, 1m)
    | render areachart kind=stacked

There are no limits on the number of dimensions or their cardinality. You can even summarize by a derived field. For example, you can aggregate a metric by a substring of a dimension value.

customMetrics
    | extend firstChar = substring(tostring(customDimensions.dimension1), 0, 2)
    | summarize avg(value) by firstChar, bin(timestamp, 1m)
    | render areachart kind=stacked

You can also specify standard dimensions using tags. This way you can associate your metric with a specific application role or role instance, or mark it with a user and account. Using standard dimensions enables better integration with the rest of your telemetry. This example lists a few standard dimensions you can specify:

{
    "iKey": "f4731d25-188b-4ec1-ac44-9fcf35c05812",
    "time": "2017-10-27T00:01:52.9586379Z",
    "name": "MetricData",
    "tags": {
        "ai.application.ver": "v1.2",
        "ai.operation.name": "CheckOut",
        "ai.operation.syntheticSource": "TestInProduction: Validate CheckOut",
        "ai.user.accountId": "Example.com",
        "ai.user.authUserId": "sergey@example.com",
        "ai.user.id": "qwoijcas",
        "ai.cloud.role": "Cart",
        "ai.cloud.roleInstance": "instance_0"
    },
    "data": {
        "baseType": "MetricData",
        "baseData": {
            "metrics": [
                {
                    "name": "Custom metric",
                    "value": 1,
                    "count": 1
                }
            ]
        }
    }
}

You can also set more aggregates for the metric. Besides value (which is treated as a sum) and count, you can specify min, max, and the standard deviation stdDev:

{
    "iKey": "f4731d25-188b-4ec1-ac44-9fcf35c05812",
    "time": "2017-10-27T00:01:52.9586379Z",
    "name": "MetricData",
    "data": {
        "baseType": "MetricData",
        "baseData": {
            "metrics": [
                {
                    "name": "Custom metric",
                    "value": 5,
                    "min": 0,
                    "max": 3,
                    "stdDev": 1.52753,
                    "count": 3
                }
            ]
        }
    }
}
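These aggregates can be reproduced from raw samples. The sample set [0, 2, 3] below is an assumption chosen to match the numbers above: sum 5, min 0, max 3, count 3, and a sample standard deviation of about 1.52753.

```python
import statistics

samples = [0, 2, 3]  # hypothetical raw values behind the aggregate above

aggregate = {
    "name": "Custom metric",
    "value": sum(samples),                          # 5 -- value is treated as the sum
    "min": min(samples),                            # 0
    "max": max(samples),                            # 3
    "stdDev": round(statistics.stdev(samples), 5),  # 1.52753 (sample standard deviation)
    "count": len(samples),                          # 3
}
print(aggregate)
```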

Price of a single metric

Application Insights charges $2.30 per GB of telemetry. Let's say the metric document you send is 500 bytes; a metric with just a name and a value has a size of about 200 bytes, so 500 bytes includes a few dimensions. If you send one metric per minute, you are paying for 24 * 60 * 500 * 30 bytes per month, or about 0.02 GB per month. If you send this metric from 5 different instances of your application, it is 0.1 GB, or 23 cents. I'm not taking into account the first free GB you get every month.
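The arithmetic can be double-checked with a few lines of Python; the price and document size are the assumptions stated above:

```python
PRICE_PER_GB = 2.30      # USD per GB of ingested telemetry
DOC_SIZE_BYTES = 500     # metric document with a few dimensions
INSTANCES = 5
GB = 1024 ** 3

docs_per_month = 24 * 60 * 30                     # one document per minute for 30 days
bytes_per_month = docs_per_month * DOC_SIZE_BYTES
print(bytes_per_month)                            # 21600000, about 0.02 GB

cost = INSTANCES * bytes_per_month / GB * PRICE_PER_GB
print(round(cost, 2))                             # about 0.23 USD per month
```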

With Application Insights today, you cannot pay a flat rate for a metric. On the other hand, you get a rich analytics language over every metric document you send, not just access to metric aggregates.

Metrics REST API shortcomings

Multiple metrics in a single document

The metric document schema defines an array of metrics. However, Application Insights only supports one element in this array.

When this limitation is removed, every metric in the collection will support its own set of dimensions. Today, dimensions are set at the document level to align with all the other telemetry types that Application Insights supports.

Aggregation Interval

Application Insights assumes 1-minute aggregation for all reported metrics. You can easily work around this assumption using Analytics queries.

In the standard metrics aggregator, the custom property MS.AggregationIntervalMs is used to indicate the aggregation interval. This property is used primarily to smooth out metrics when Flush is called before the aggregation period has ended.

APIs of other metrics solutions:

AWS CloudWatch

SignalFX

Google StackDriver

Two Types of Correlation


Application Map in Application Insights supports two types of correlation. One shows the nodes in the map as instrumentation keys, and the other as roles inside a single instrumentation key. The plan is to combine them. In this post, I explain why there are two separate maps today and how the Application Map is built from telemetry events.

Single instrumentation key Application Map

The Application Insights data model defines incoming requests and outgoing dependency calls. When the SDK collects these events, it populates the request's source field and the dependency's target. Now it's easy to draw an Application Map in the form of a star. The application is in the center of the star, and every node it is connected to is either the source of an incoming request or the target of an outgoing dependency call. These two queries show how to do it:

dependencies | summarize sum(itemCount) by target
requests | summarize sum(itemCount) by source

Once you have an Application Map in the form of a star, you may notice that some HTTP dependency calls are actually calls to another component of your application. If both components send data to the same instrumentation key, you can easily follow such a call by joining telemetry: the dependency call of component A has an id matching the parentId of the incoming request of component B. Typically those components are deployed separately and have different cloud_RoleName fields. So by running this query you get the list of all components (defined by cloud_RoleName) talking to each other:

dependencies 
  | join requests on $left.id == $right.operation_ParentId 
  | summarize sum(itemCount) by from = cloud_RoleName1, to = cloud_RoleName

This query joins the dependency call outgoing from component A to the request incoming to component B. Now you can see a map where every node is a separate cloud_RoleName. Note that some dependency calls are made to external components. To draw those, you'd still need to use the target field from before, with a slight modification: group it by cloud_RoleName:

dependencies | summarize sum(itemCount) by from = cloud_RoleName, to = target

This example shows how to build an application map from code. First, define some constants:

string SINGLE_INSTRUMENTATION_KEY = "3b162b68-47d7-4a8c-b031-c246206696e3";
var TRACE_ID = Guid.NewGuid().ToString();

The first component, let's call it Frontend, reports a RequestTelemetry and a related DependencyTelemetry. Both define the .Cloud.RoleName context property to identify the component.

var r = new RequestTelemetry(
    name: "PostFeedback",
    startTime: DateTimeOffset.Now,
    duration: TimeSpan.FromSeconds(1),
    responseCode: "200",
    success: true)
{
    Source = "" //no source specified
};
r.Context.Operation.Id = TRACE_ID; // initiate the logical operation ID (trace id)
r.Context.Operation.ParentId = null; // this is the first span in a trace
r.Context.Cloud.RoleName = "Frontend"; // this is the name of the node on app map

new TelemetryClient() { InstrumentationKey = SINGLE_INSTRUMENTATION_KEY }.TrackRequest(r);

var d = new DependencyTelemetry(
    dependencyTypeName: "Http",
    target: $"myapi.com", //host name
    dependencyName: "POST /feedback",
    data: "https://myapi.com/feedback/text='feedback text'",
    startTime: DateTimeOffset.Now,
    duration: TimeSpan.FromSeconds(1),
    resultCode: "200",
    success: true);
d.Context.Operation.ParentId = r.Id;
d.Context.Operation.Id = TRACE_ID;
d.Context.Cloud.RoleName = "Frontend"; // this is the name of the node on app map

new TelemetryClient() { InstrumentationKey = SINGLE_INSTRUMENTATION_KEY }.TrackDependency(d);

The Frontend component needs to pass the global trace ID Context.Operation.Id and the dependency call ID d.Id to the next component. Typically those identities are sent via HTTP headers.
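A hypothetical Python sketch (not SDK code) of the two sides of this hand-off: the caller sends its dependency call ID as the Request-Id header, and the callee recovers the trace ID from the value it receives:

```python
def outgoing_headers(dependency_id: str) -> dict:
    # the caller propagates its dependency call id;
    # the callee uses this value as Operation.ParentId
    return {"Request-Id": dependency_id}

def extract_trace_id(request_id: str) -> str:
    # the root (trace) id sits between the leading "|" and the first "."
    return request_id.lstrip("|").split(".", 1)[0]

headers = outgoing_headers("|4bf92f3577b34da6a3ce929d0e0e4736.1_")
print(extract_trace_id(headers["Request-Id"]))  # 4bf92f3577b34da6a3ce929d0e0e4736
```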

The component called API Service tracks the incoming request and initializes the context properties .Context.Operation.ParentId and .Context.Operation.Id from the incoming request headers. These context properties make it possible to join the dependency call from Frontend to the request in API Service.

In this example, API Service in turn calls an external API.

r = new RequestTelemetry(
    name: "POST /feedback",
    startTime: DateTimeOffset.Now,
    duration: TimeSpan.FromSeconds(1),
    responseCode: "200",
    success: true)
{
    Url = new Uri("https://myapi.com/feedback/text='feedback text'")
};
r.Context.Operation.Id = TRACE_ID; // received from http header
r.Context.Operation.ParentId = d.Id; // received from http header 
r.Context.Cloud.RoleName = "API Service"; // this is the name of the node on app map

new TelemetryClient() { InstrumentationKey = SINGLE_INSTRUMENTATION_KEY }.TrackRequest(r);

d = new DependencyTelemetry(
    dependencyTypeName: "http",
    target: $"api.twitter.com",
    dependencyName: "POST /twit",
    data: "https://api.twitter.com/twit",
    startTime: DateTimeOffset.Now,
    duration: TimeSpan.FromSeconds(1),
    resultCode: "200",
    success: true);
d.Context.Operation.ParentId = r.Id;
d.Context.Operation.Id = TRACE_ID;
d.Context.Cloud.RoleName = "API Service"; // this is the name of the node on app map

new TelemetryClient() { InstrumentationKey = SINGLE_INSTRUMENTATION_KEY }.Track(d);

The multi-role Application Map is in preview now, so in order to see it you need to enable it as shown in the picture:

The result of the code execution looks something like this:

You can see that every component of your application is represented as a separate node on the Application Map. However, an important limitation of this approach is that it only works when every component uses the same instrumentation key. The main reason for this limitation is that Application Insights did not support cross-application joins for a long time.

Application Insights supports cross-instrumentation-key queries now, but joins across components are still expensive, and the join-based approach may still fall apart. First, you never know in advance which instrumentation keys you need to join across. Second, rare calls may easily be sampled out when the telemetry volume is high.

Cross instrumentation key Application Map

In order to solve the problems of the join-based Application Map, components need to exchange identity information: component A shares its identity when it sends a request to component B, and component B replies with its own identity.

This approach is used for the cross-instrumentation-key Application Map. This diagram shows how:

  • Component A sends its Application Insights identity with the request…
  • … and expects the target component B to send its identity back

Knowing the identity of the component makes it possible to pre-aggregate metrics and ensure that even rare calls to a certain dependent component are not sampled out.

This example shows how it works in code. First, define two separate instrumentation keys:

string FRONTEND_INSTRUMENTATION_KEY = "fe782703-16ea-46a8-933d-1769817c038a";
string API_SERVICE_INSTRUMENTATION_KEY = "2a42641e-2019-423a-a2b5-ecab34d5477d";

The next step is to get the app-id for each instrumentation key. Exposing an instrumentation key to dependent services is not a good practice, as it can be used to spoof telemetry. The app-id identifies a component, but cannot be used to submit telemetry to Application Insights.

// Obtaining APP ID for these instrumentation keys.
// We are using app ID for correlation as propagating it via HTTP boundaries do not expose the instrumentation key, but still 
// uniquely identifies the Application Insights resource
var task = new HttpClient().GetStringAsync($"https://dc.services.visualstudio.com/api/profiles/{FRONTEND_INSTRUMENTATION_KEY}/appId");
task.Wait();
var FRONTEND_APP_ID = task.Result;

task = new HttpClient().GetStringAsync($"https://dc.services.visualstudio.com/api/profiles/{API_SERVICE_INSTRUMENTATION_KEY}/appId");
task.Wait();
var API_SERVICE_APP_ID = task.Result;

Now send the request and dependency telemetry from the first component. Note that the dependency now initializes the Target field with additional information: the API Service component returned its app-id in the HTTP response, so the Frontend component can associate it with the dependency telemetry.

var TRACE_ID = Guid.NewGuid().ToString();


//Frontend initiates a logical operation
var r = new RequestTelemetry(
    name: "PostFeedback", //this is the name of the operation that initiated the entire distributed trace
    startTime: DateTimeOffset.Now,
    duration: TimeSpan.FromSeconds(1),
    responseCode: "200",
    success: true)
{
    Source = "", //no source specified
    Url = null, // you can omit it if you do not use it
};
r.Context.Operation.Id = TRACE_ID; // initiate the logical operation ID (trace id)
r.Context.Operation.ParentId = null; // this is the first span in a trace

new TelemetryClient() { InstrumentationKey = FRONTEND_INSTRUMENTATION_KEY }.TrackRequest(r);

// Frontend calls into API service. For http communication we expect that the response will have a header:
// Request-Context: appId=cid-v1:{API_SERVICE_APP_ID}
var d = new DependencyTelemetry(
    dependencyTypeName: "Http (tracked component)", //(tracked component) indicates that we received the Request-Context response header
    target: $"myapi.com | cid-v1:{API_SERVICE_APP_ID}", //host name, | char and app ID from the response headers if available
    dependencyName: "POST /feedback",
    data: "https://myapi.com/feedback/text='feedback text'",
    startTime: DateTimeOffset.Now,
    duration: TimeSpan.FromSeconds(1),
    resultCode: "200",
    success: true);
d.Context.Operation.ParentId = r.Id;
d.Context.Operation.Id = TRACE_ID;


new TelemetryClient() { InstrumentationKey = FRONTEND_INSTRUMENTATION_KEY }.TrackDependency(d);

The API Service component also reports request and dependency telemetry. Note that the request telemetry has the identity of Frontend in its Source field.

// The following headers were propagated with the http request:
//  Request-Id: <d.Id>
//  Request-Context: appId=cid-v1:{FRONTEND_APP_ID}
//

//Request got received by API service:

r = new RequestTelemetry(
    name: "POST /feedback", //this is the name of the operation that initiated the entire distributed trace
    startTime: DateTimeOffset.Now,
    duration: TimeSpan.FromSeconds(1),
    responseCode: "200",
    success: true)
{
    Source = $" | cid-v1:{FRONTEND_APP_ID}", // this is the value from the request headers
    Url = new Uri("https://myapi.com/feedback/text='feedback text'"), // you can omit it if you do not use it
};
r.Context.Operation.Id = TRACE_ID; // received from the http header
r.Context.Operation.ParentId = d.Id; // the id of the Frontend dependency call, received from the http header

new TelemetryClient() { InstrumentationKey = API_SERVICE_INSTRUMENTATION_KEY }.TrackRequest(r);

d = new DependencyTelemetry(
    dependencyTypeName: "http",
    target: $"api.twitter.com",
    dependencyName: "POST /twit",
    data: "https://api.twitter.com/twit",
    startTime: DateTimeOffset.Now,
    duration: TimeSpan.FromSeconds(1),
    resultCode: "200",
    success: true);
d.Context.Operation.ParentId = r.Id;
d.Context.Operation.Id = TRACE_ID;

new TelemetryClient() { InstrumentationKey = API_SERVICE_INSTRUMENTATION_KEY }.Track(d);

This picture shows what the application map looks like:

Future directions

There are many improvements coming to the Application Insights distributed application monitoring story. Specifically for Application Map, we are working on optimizing join queries and speeding up map rendering. Metric pre-aggregation will make the Application Map more reliable. There will also be advanced filtering and grouping capabilities to slice and dice the map.

Query Multiple Applications


Sometimes you need to execute a query against all of your Application Insights resources. It can be useful when you are searching for some data and are not sure which application has it, or when you want to create a report across all your apps. This script shows all the SDK versions your applications are instrumented with. You can use it to make sure the SDKs are up to date.

You can run this script from your computer or from Azure CLI bash console.

  1. First, get the AAD token: az account get-access-token --query accessToken.
  2. Then get the list of Application Insights resources: az resource list --namespace microsoft.insights --resource-type components. In this script, I only get apps from the current subscription. The previous post shows how to iterate over subscriptions.
  3. For every resource, the script runs the query against https://management.azure.com$ID/api/query/ with the access token from the first step.
  4. Finally, the script uses Python to parse the JSON and extract the interesting information.
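Step 4 can be tried on its own with a canned response body; the Tables/Rows path matches the JSON_PATH used in the script, while the SDK version values here are made up:

```python
import json

# hypothetical response body from the draft /api/query endpoint
body = '{"Tables": [{"Rows": [["web:2.3.0-979"], ["javascript:1.0.11"]]}]}'

obj = json.loads(body)
rows = obj["Tables"][0]["Rows"]        # same path as JSON_PATH in the script
print([r[0] for r in rows])            # ['web:2.3.0-979', 'javascript:1.0.11']
```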

Here is the whole script:

#!/bin/bash
ACCESS_TOKEN=$(az account get-access-token --query accessToken | sed -e 's/^"//' -e 's/"$//')
QUERY="requests | union pageViews | where timestamp>ago(1d)|summarize by sdkVersion"
JSON_PATH='["Tables"][0]["Rows"]'

az resource list --namespace microsoft.insights --resource-type components --query [*].[id] --out tsv \
  | while read ID;
    do
      HTTP_RESPONSE=$(curl --silent --write-out "HTTPSTATUS:%{http_code}" --get \
        -H "Authorization: Bearer $ACCESS_TOKEN" \
        --data-urlencode "api-version=2014-12-01-preview" \
        --data-urlencode "query=$QUERY" \
        "https://management.azure.com$ID/api/query/")

      printf "$ID "
      HTTP_BODY=$(echo $HTTP_RESPONSE | sed -e 's/HTTPSTATUS\:.*//g')
      HTTP_STATUS=$(echo $HTTP_RESPONSE | tr -d '\n' | sed -e 's/.*HTTPSTATUS://')

      if [ $HTTP_STATUS -eq 200  ]; then
        echo "$HTTP_BODY" | python -c "import json,sys;obj=json.load(sys.stdin);print(obj$JSON_PATH)"
      else
        echo "$HTTP_STATUS"
      fi
    done

You get a result like this:

#sergey@Azure:~$ ./runquery.sh
#/subscriptions/<GUID>/resourceGroups/RG1/providers/microsoft.insights/components/webteststools [['web:2.3.0-979']]
#/subscriptions/<GUID>/resourceGroups/RG2/providers/microsoft.insights/components/myapp [['a_web:2.3.0-1223']]
#/subscriptions/<GUID>/resourceGroups/RG3/providers/microsoft.insights/components/apmtips [['javascript:1.0.11']]

Find Application by Its Instrumentation Key


This post was meant to show how to use the new Azure Cloud Shell. Unfortunately, the two scenarios I wanted to use it for are not that easy to implement. If you have time, go comment on and upvote these two issues: azure-cli#3457 and azure-cli#3641.

Here is how you can find the name of an application given its instrumentation key. This situation is not that rare, especially if you have access to quite a few subscriptions and monitor many services deployed to different environments and regions. You have an instrumentation key in a configuration file, but you are not sure where to search for its telemetry.

First, go to Azure Cloud Shell. It gives you bash and allows you to access all your Azure resources.

Second, create a file findApplicationByIkey.sh with the following content:

#!/bin/bash

if [ -z "$1" ]; then
    echo "specify the instrumentation key"
    exit 1
fi
ikeyToFind=$1
echo "search for instrumentation key $ikeyToFind"

# this function search for the instrumentation key in a given subscription
function findIKeyInSubscription {
  echo "Switch to subscription $1"
  az account set --subscription $1

  # list all the Application Insights resources,
  # take the instrumentation key of each of them,
  # and compare it with the one you are looking for
  az resource list \
    --namespace microsoft.insights --resource-type components --query [*].[id] --out tsv \
      | while \
          read ID; \
          do  printf "$ID " && \
              az resource show --id "$ID" --query properties.InstrumentationKey -o tsv; \
        done \
      | grep "$ikeyToFind"
}

# run the search in every subscription...
az account list --query [*].[id] --out tsv \
    | while read OUT; do findIKeyInSubscription $OUT; done

Finally, run it: ./findApplicationByIkey.sh ce85cf15-de20-49bb-83d7-234b5116623b

sergey@Azure:~/Sergey$ ./findApplicationByIkey.sh ce85cf15-de20-49bb-83d7-234b5116623b
search for instrumentation key ce85cf15-de20-49bb-83d7-234b5116623b
A few accounts are skipped as they don't have 'Enabled' state. Use '--all' to display them.
Switch to subscription 5fb94e1c-7bbf-4ab8-9c51-5dda40adc12e
Switch to subscription 52f57f24-51d5-479f-a532-facd9ee907a6
Switch to subscription eec57090-02b8-48f2-b78e-a38b7a53e1ab
/subscriptions/c3becfa8-419b-4b30-b08b-a2865ace64bf/resourceGroups/MY-RG/providers/
microsoft.insights/components/test-ai-app ce85cf15-de20-49bb-83d7-234b5116623b
Switch to subscription a8308a0b-9ee1-4548-9bbf-2b1d670e0767
The client 'Sergey@' with object id '03aa4cb5-650f-45bf-8d45-474664262685' does not have 
authorization to perform action 'Microsoft.Resources/subscriptions/resources/read' over 
scope '/subscriptions/edfd8475-8c5f-45c3-b533-a5132e8f9ada'.
Switch to subscription d6043348-75b2-41cd-ba7e-e1d317619002
...
...

The answer is: /subscriptions/c3becfa8-419b-4b30-b08b-a2865ace64bf/resourceGroups/MY-RG/providers/microsoft.insights/components/test-ai-app. Better than guessing.

Page View and Telemetry Correlation


For any monitoring and diagnostics solution, it is important to provide visibility into transaction execution across multiple components. The Application Insights data model supports telemetry correlation, so you can express the interconnections of every telemetry item. A significant subset of these interconnections is collected by default by the Application Insights SDK. Let's talk about page view correlation and its auto-collection.

Today you can enable telemetry correlation in the JavaScript SDK by setting the flag disableCorrelationHeaders to false.

// Default true. If false, the SDK will add correlation headers 
// to all dependency requests (within the same domain) 
// to correlate them with corresponding requests on the server side. 
disableCorrelationHeaders: boolean;

You get the page view correlated to ajax calls and the corresponding server requests, something like shown in the picture below:

As you may see, this assumes that the page view initiated the transaction, which is not always true. I explain the scenarios later in the post.

The Application Insights JavaScript SDK intercepts ajax calls and inserts correlation headers into them. However, there is no easy way to correlate page views to other resources (scripts or images) without a specialized browser extension or “hacky heuristics.” You can use the referrer value or set short-lived cookies, but neither gives you a generic and reliable solution.

A SPA, or single page application, may introduce multiple page views correlated to each other. React components may call or contain each other:

SPA is one of the reasons telemetry correlation is not enabled by default. A SPA has only one page that initiates all communication to the server. Suddenly all application telemetry may become correlated to a single page view, which is not useful information.

BTW, the ability to correlate page views is the primary reason for the github issue PageView should have own ID for proper correlation. As you see, PageViews may create their own execution hierarchy in a SPA, and the Application Insights data model should support it.

You may also want to correlate page view with the originating server request:

It is easy to implement with a few lines of code. If you are using Application Insights Web SDK 2.4-beta1 or higher, you can write something like this:

var appInsights = window.appInsights || function (config) {
    function i(config) { t[config] = function () { var i = arguments; t.queue.push(function () { t[config]......
    instrumentationKey:"a8cdcad4-2bcb-4ed4-820f-9b2296821ef8",
    disableCorrelationHeaders: false
});

window.appInsights = appInsights;
window.appInsights.queue.push(function () {
    var serverId = "@this.Context.GetRequestTelemetry().Context.Operation.Id";
    appInsights.context.operation.id = serverId;
});

appInsights.trackPageView();

If you are using a lower version of the Application Insights SDK (like 2.3), the snippet is a bit more complicated as the RequestTelemetry object needs to be initialized first. But it is still easy:

var serverId = "@{
    var r = HttpContext.Current.GetRequestTelemetry();
    new Microsoft.ApplicationInsights.TelemetryClient().Initialize(r);
    @Html.Raw(r.Context.Operation.Id);
}";

This snippet renders the server request ID as a JavaScript variable serverId and sets it as the context’s operation ID, so all telemetry from this page shares it with the originating server request.

This approach, however, may cause some trouble with cached pages. A page can be cached at different layers and even shared between users, and correlating telemetry from different users is often not a desired behavior.

Also, make sure you are not taking it to the extreme. You may want to correlate the server request with the page view that initiated the request:

As a result, all the pages the user visited are correlated, and the operation ID plays the role of a session ID. For this kind of analysis I’d suggest employing other mechanisms rather than the telemetry correlation fields.

Import Datasets in Application Insights Analytics

| Comments

I was thinking about how to improve the querying experience in Application Insights Analytics. In the previous post, I demonstrated how to use datasets in a query. In particular, I needed timezones for countries, and I used the datatable operator to create a dictionary of country timezones. In this post, I show how to use the Application Insights data import feature to work around the user voice request “Return ISO 2/3 letter country code in the REST API”.

I downloaded the country codes from the UN website and saved them as a blob in Azure. Then I defined the Application Insights Analytics open schema by uploading this file as an example, named the columns, and chose to use ingestion time as the required time column.

Then I used the code example from the documentation for the data import feature. It gets a reference to the blob, creates a security token, and notifies Application Insights about this blob storage.

var storageAccount
= CloudStorageAccount.Parse(ConfigurationManager.AppSettings.Get("StorageConnectionString"));
var blobClient = storageAccount.CreateCloudBlobClient();
var container = blobClient.GetContainerReference("testopenschema");
var blob = container.GetBlobReferenceFromServer("countrycodes.csv");

var sasConstraints = new SharedAccessBlobPolicy();
sasConstraints.SharedAccessExpiryTime = DateTimeOffset.MaxValue;
sasConstraints.Permissions = SharedAccessBlobPermissions.Read;
string uri = blob.Uri + blob.GetSharedAccessSignature(sasConstraints);

AnalyticsDataSourceClient client = new AnalyticsDataSourceClient();
var ingestionRequest = new AnalyticsDataSourceIngestionRequest(
    ikey: "074608ec-29c0-41f1-a7c6-54f30d520629",
    schemaId: "440f9d45-9b1f-4760-9aa5-3d1bc828cedc",
    blobSasUri: uri);

await client.RequestBlobIngestion(ingestionRequest);

Originally I had a bug in the application and received the error below. It shows that the security token is verified right away; however, the actual data upload happens after some delay, so set the expiration time to some time in the future.

Ingestion request failed with status code: Forbidden.
    Error: Blob does not exist or not accessible.
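
If you generate the SAS yourself instead of using the storage SDK helpers, the same lesson applies: give the token a lifetime well past “now”, because the blob is read after the ingestion request is accepted. A minimal Python sketch producing an expiry timestamp in the format used by the se= parameter of a SAS URI (the number of days is an arbitrary choice):

```python
from datetime import datetime, timedelta, timezone

def sas_expiry(days: int = 7) -> str:
    # The ingestion request is validated right away, but the blob is read
    # later, so the token must stay valid well past the request time.
    expiry = datetime.now(timezone.utc) + timedelta(days=days)
    return expiry.strftime("%Y-%m-%dT%H:%M:%SZ")
```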

Here is what a successful request and response look like:

POST https://dc.services.visualstudio.com/v2/track HTTP/1.1
Content-Type: application/json; charset=UTF-8
Accept: application/json
Host: dc.services.visualstudio.com
Content-Length: 472

{
    "data": {
        "baseType":"OpenSchemaData",
        "baseData": {
            "ver":"2",
            "blobSasUri":"https://apmtips.blob.core.windows.net/testopenschema/countrycodes.csv?sv=2016-05-31&sr=b&sig=y3oWWTWvAefer7N%2FN%2B49sy4j%2BpR2NA%2F7797EvXQAQEk%3D&se=2017-05-12T00%3A09%3A12Z&sp=rl",
            "sourceName":"440f9d45-9b1f-4760-9aa5-3d1bc828cedc",
            "sourceVersion":"1"
        }
    },
    "ver":1,
    "name":"Microsoft.ApplicationInsights.OpenSchema",
    "time":"2017-05-11T00:09:14.6255207Z",
    "iKey":"074608ec-29c0-41f1-a7c6-54f30d520629"
}
HTTP/1.1 200 OK
Content-Length: 49
Content-Type: application/json; charset=utf-8
Server: Microsoft-IIS/8.5
x-ms-session-id: 0C2E28FE-6085-4DD7-BFB9-8A6195C73A2A
Strict-Transport-Security: max-age=31536000
Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Name, Content-Type, Accept
Access-Control-Allow-Origin: *
Access-Control-Max-Age: 3600
X-Content-Type-Options: nosniff
X-Powered-By: ASP.NET
Date: Thu, 11 May 2017 00:09:15 GMT

{"itemsReceived":1,"itemsAccepted":1,"errors":[]}
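
If you prefer to skip the helper classes, the request body captured above is easy to assemble directly. A Python sketch mirroring the trace (field names are copied from it; the arguments are placeholders for your own ikey, schema id, and SAS URI):

```python
import json
from datetime import datetime, timezone

def ingestion_envelope(ikey: str, schema_id: str, blob_sas_uri: str) -> str:
    # Same shape as the captured request body: an OpenSchemaData item telling
    # Application Insights which blob to ingest and under which schema.
    envelope = {
        "data": {
            "baseType": "OpenSchemaData",
            "baseData": {
                "ver": "2",
                "blobSasUri": blob_sas_uri,
                "sourceName": schema_id,  # the open schema id
                "sourceVersion": "1",
            },
        },
        "ver": 1,
        "name": "Microsoft.ApplicationInsights.OpenSchema",
        "time": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ"),
        "iKey": ikey,
    }
    return json.dumps(envelope)
```

POST this body to https://dc.services.visualstudio.com/v2/track with Content-Type: application/json, exactly as in the trace above.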

Once the data is uploaded, you can query it by joining standard tables with the imported data:

pageViews
  | join kind= innerunique (CountryCodes_CL)
      on $left.client_CountryOrRegion == $right.CountryOrRegion
  | project name, ISOalpha3code
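
To double-check what the join produces, the same lookup can be sketched in plain Python over exported rows. This is an approximation of innerunique (the first match per key wins); the column names match the query above:

```python
def join_country_codes(page_views, country_codes):
    # Build a lookup from country name to ISO code; the first occurrence
    # wins, approximating Kusto's innerunique join semantics.
    codes = {}
    for row in country_codes:
        codes.setdefault(row["CountryOrRegion"], row["ISOalpha3code"])
    # Keep only page views whose country has a code, projecting two columns.
    return [{"name": pv["name"], "ISOalpha3code": codes[pv["client_CountryOrRegion"]]}
            for pv in page_views if pv["client_CountryOrRegion"] in codes]
```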

Refresh data in this table periodically as Application Insights keeps data only for 90 days. You can set up an Azure Function to run every 90 days.

By the way, imported logs are also billed by size. You see them as a separate table in the bill blade. You can see how many times I ran the application trying things =)…

Ip Lookup

| Comments

Application Insights makes automatic ip lookup for your telemetry. Geo information can be quite useful for monitoring, troubleshooting and usage scenarios.

I already wrote about IP address collection. Application Insights collects the IP address of the monitored service’s visitor, so you can group telemetry by the country of origin. This allows you to filter out long-executing AJAX calls made from countries with high latency, or to group usage metrics by “nighttime” vs. “daytime” visitors.

First, a word of caution. Application Insights uses a snapshot of the MaxMind geo IP database (Credits) from some time ago. So it may give wrong results at times and is not in sync with the demo.

For instance, this query demonstrates that not all availability test locations are geo-mapped correctly by Application Insights.

availabilityResults 
  | where timestamp > ago(10m) 
  | join (requests 
    | where timestamp > ago(10m)) on $left.id == $right.session_Id
  | extend 
    originatingLocation = location, 
    receivedLocation = strcat(client_CountryOrRegion, " ", client_StateOrProvince, " ", client_City)
  | summarize count() 
    by originatingLocation, receivedLocation, client_IP 

This is the resulting view. Note that some locations were not mapped correctly and some do not have a city associated with them:

originatingLocation receivedLocation client_IP
US : CA-San Jose United States California San Jose 207.46.98.0
US : FL-Miami United States Florida Miami 65.54.78.0
US : TX-San Antonio United States Texas San Antonio 65.55.82.0
NL : Amsterdam Netherlands North Holland Amsterdam 213.199.178.0
US : IL-Chicago United States Illinois Chicago 207.46.14.0
IE : Dublin Ireland Leinster Dublin 157.55.14.0
JP : Kawaguchi Japan Tokyo Tokyo 202.89.228.0
RU : Moscow United Kingdom 94.245.82.0
CH : Zurich United Kingdom 94.245.66.0
HK : Hong Kong Hong Kong Long Keng 207.46.71.0
AU : Sydney United States Washington Redmond 70.37.147.0
BR : Sao Paulo Brazil Sao Paulo São Paulo 65.54.66.0
SE : Stockholm United Kingdom 94.245.78.0
SG : Singapore United States Delaware Wilmington 52.187.30.0
US : VA-Ashburn United States 13.106.106.0
FR : Paris United Kingdom 94.245.72.0

Try this query yourself for up-to-date information.

I authored a simple query to check whether my blog is read during the day or at night. This demo is not production ready and I might have messed up the timezones. However, for ad hoc analysis it was OK. It also demonstrates the use of the datatable operator and the power of join:

let timezones = datatable (timezone_location:string, shift:time)
    [
        "United States", time(-6h),
        "Canada", time(-6h),
        "Japan", time(9h),
        "Brazil", time(-3h),
        "United Kingdom", time(0),
        "Hong Kong", time(8h),
        "Ireland", time(0),
        "Switzerland", time(2h),
        "Slovenia", time(1h),
        "South Africa", time(2h),
        "Sweden", time(1h),
        "Poland", time(1h),
        "Ukraine", time(2h),
        "Netherlands", time(2h),
    ];
pageViews
 | extend timezone_location = client_CountryOrRegion
 | where timestamp > ago(10h) and timestamp < ago(5h)
 | join kind= leftouter (
     timezones
 ) on timezone_location
 | extend localtimehour = datepart("Hour", timestamp + shift)
 | project name, timezone_location, timestamp, localtimehour, isDay = iff(localtimehour > 5 and localtimehour < 20, "day", "night")
 | summarize count() by isDay
 | render piechart
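
The same day/night classification is easy to sanity-check outside Analytics. A minimal Python mirror of the query’s logic, using a subset of the same shift table and the same 5-to-20 local-hour window:

```python
from datetime import datetime, timedelta

# A subset of the timezone shift table from the query above.
SHIFTS = {
    "United States": timedelta(hours=-6),
    "Japan": timedelta(hours=9),
    "United Kingdom": timedelta(hours=0),
}

def day_or_night(timestamp_utc: datetime, country: str) -> str:
    # Mirror of the query: shift to local time, then local hours strictly
    # between 5 and 20 count as "day", everything else as "night".
    local_hour = (timestamp_utc + SHIFTS.get(country, timedelta(0))).hour
    return "day" if 5 < local_hour < 20 else "night"
```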

Here is the result view:

Oneliner to Send Event to Application Insights

| Comments

Sometimes you need to send an event to Application Insights from the command line and you cannot download ApplicationInsights.dll and use a powershell script like described here. You may need it for a startup task or deployment script. It’s a good thing Application Insights has an easy to use REST API. Here is a single command line that runs powershell and passes the script as a parameter. I split it into multiple lines for readability; you will need to remove all newlines before running. Just replace the event name and add custom properties if needed:

powershell "$body = (New-Object PSObject 
    | Add-Member -PassThru NoteProperty name 'Microsoft.ApplicationInsights.Event' 
    | Add-Member -PassThru NoteProperty time $([System.dateTime]::UtcNow.ToString('o')) 
    | Add-Member -PassThru NoteProperty iKey '1aadbaf5-1497-ae49-8e89-cd0324aafe6b' 
    | Add-Member -PassThru NoteProperty tags (New-Object PSObject 
    | Add-Member -PassThru NoteProperty 'ai.cloud.roleInstance' $env:computername 
    | Add-Member -PassThru NoteProperty 'ai.internal.sdkVersion' 'one-line-ps:1.0.0') 
    | Add-Member -PassThru NoteProperty data (New-Object PSObject 
        | Add-Member -PassThru NoteProperty baseType 'EventData' 
        | Add-Member -PassThru NoteProperty baseData (New-Object PSObject 
            | Add-Member -PassThru NoteProperty ver 2 
            | Add-Member -PassThru NoteProperty name 'Event from one line script' 
            | Add-Member -PassThru NoteProperty properties (New-Object PSObject 
                | Add-Member -PassThru NoteProperty propName 'propValue')))) 
    | ConvertTo-JSON -depth 5; 
    Invoke-WebRequest -Uri 'https://dc.services.visualstudio.com/v2/track' -Method 'POST' -UseBasicParsing -body $body" 

Running it will return the status:

StatusCode        : 200
StatusDescription : OK
Content           : {"itemsReceived":1,"itemsAccepted":1,"errors":[]}
RawContent        : HTTP/1.1 200 OK
                    x-ms-session-id: 960F3184-51B6-4E74-B113-88ACD106B7F3
                    Strict-Transport-Security: max-age=31536000
                    Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Name, Content-Type,...
Forms             :
Headers           : {[x-ms-session-id, 960F3184-51B6-4E74-B113-88ACD106B7F3], [Strict-Transport-Security,
                    max-age=31536000], [Access-Control-Allow-Headers, Origin, X-Requested-With, Content-Name,
                    Content-Type, Accept], [Access-Control-Allow-Origin, *]...}
Images            : {}
InputFields       : {}
Links             : {}
ParsedHtml        :
RawContentLength  : 49

And the event will look like this in Application Insights Analytics.

name value
timestamp 2017-03-27T15:25:11.788Z
name Event from one line script
customDimensions {“propName”:“propValue”}
client_Type PC
client_Model Other
client_OS Windows 10
client_IP 167.220.1.0
client_City Redmond
client_StateOrProvince Washington
client_CountryOrRegion United States
client_Browser Other
cloud_RoleInstance SERGKANZ-VM
appId d4cbb70f-f58f-ac6d-8457-c2e326fcc587
appName test-application
iKey 1aadbaf5-1497-ae49-8e89-cd0324aafe6b
sdkVersion one-line-ps:1.0.0
itemId 927362e0-1301-11e7-88a4-211449da9ad2
itemType customEvent
itemCount 1

Note that the sender’s IP address and location were added to the event. Also, powershell sets the User-Agent to something like User-Agent: Mozilla/5.0 (Windows NT; Windows NT 10.0; en-US) WindowsPowerShell/5.1.15063.0, so Application Insights detected that the event was sent from a Windows 10 machine.

It is much easier to use Application Insights with one of the numerous SDKs, but when you need to, you can send data directly.
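
If powershell is not available either, the same envelope can be assembled in any language. A Python sketch mirroring the one-liner’s fields (the event name, property, and the one-line-py SDK version tag are placeholders of my own):

```python
import json
import urllib.request
from datetime import datetime, timezone

TRACK_URL = "https://dc.services.visualstudio.com/v2/track"

def event_body(ikey: str, name: str, properties: dict) -> bytes:
    # Same envelope the powershell one-liner builds with Add-Member.
    return json.dumps({
        "name": "Microsoft.ApplicationInsights.Event",
        "time": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ"),
        "iKey": ikey,
        "tags": {"ai.internal.sdkVersion": "one-line-py:1.0.0"},
        "data": {
            "baseType": "EventData",
            "baseData": {"ver": 2, "name": name, "properties": properties},
        },
    }).encode("utf-8")

def track_event(ikey: str, name: str, properties: dict) -> bytes:
    # POST to the same endpoint; a successful response carries the usual
    # {"itemsReceived":1,"itemsAccepted":1,"errors":[]} status body.
    req = urllib.request.Request(
        TRACK_URL, data=event_body(ikey, name, properties),
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```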