New Probe! Monitoring a DNS Server using CopperEgg

Are you using a DNS Server inside your organization? Use CopperEgg to monitor your DNS Servers for availability, performance and accuracy. Monitoring and alerting can be configured to respond when the server is not available, has a long response time, or returns a response that does not match the expected value.

  • Availability: An alert is triggered if the DNS Server being monitored is not available.
  • Performance: By default, an alert is triggered if the response time for a DNS request is more than 5000 ms. You can configure the response time threshold for alerting as needed.
  • Accuracy: An alert is raised if the specified DNS request cannot be successfully resolved by the DNS Server, or if the resolved address does not match the expected value.

Prerequisite

Log in to your CopperEgg account.

If you don’t already have an account, sign up for a 14-day FREE TRIAL of CopperEgg (no credit card required).

Step 1: Add a new DNS Probe

Go to the Probes tab and click the ‘Add a Probe’ button. On the page that opens, specify the values as shown below:

 

Figure 1: Probe for monitoring a DNS Server being created

  1. Enter a label for the probe.
  2. Select Protocol as DNS.
  3. Enter the IP Address of the DNS Name Server to be monitored.
  4. Enter the Domain Name to be resolved. A DNS resolution request with this domain name will be sent by the Probe Station to the DNS Name Server. For example, if you want to monitor copperegg.com, write copperegg.com in this input box.
  5. Enter a ‘response match’ value to compare against the probe response [Optional]. This is used to verify the accuracy of the response obtained from the DNS Server. For example, if you have added copperegg.com as the domain to be monitored, then the expected response would be the IP address that copperegg.com points to, i.e. 50.116.59.244.
  6. Specify the frequency at which the Probe Stations should send a DNS request to the DNS Server. For example, if you select 15 seconds as the monitoring frequency, your probe will sample uptime and response time every 15 seconds.
  7. Select multiple stations to get response from different locations (minimum three).
  8. Add CopperEgg tags [Optional]. While configuring alerts (see Step 4), you can specify an alert condition and a tag value. The alert condition will then be checked periodically only for the set of probes that carry that tag.
  9. Save the probe.

After a successful save, a dashboard opens with a new widget for the newly created probe, which monitors the specified DNS server.
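To make the probe's behaviour more concrete, here is a minimal Python sketch of what a single DNS check could look like: it builds an A-record query for the domain, sends it over UDP to the name server, and measures the response time. This is an illustration only and not CopperEgg's implementation. The function names (build_query, parse_response, probe_once) are hypothetical, the 5-second timeout mirrors the limit mentioned in this post, and details such as TCP fallback and transaction ID checks are left out.

```python
import socket
import struct
import time


def build_query(domain: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for the A record of `domain`."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed by its length; a zero byte ends it.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in domain.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (Internet).
    return header + qname + struct.pack("!HH", 1, 1)


def skip_name(message: bytes, offset: int) -> int:
    """Return the offset just past a (possibly compressed) domain name."""
    while True:
        length = message[offset]
        if length == 0:
            return offset + 1
        if length & 0xC0 == 0xC0:  # a compression pointer is two bytes long
            return offset + 2
        offset += 1 + length


def parse_response(message: bytes) -> tuple[int, list[str]]:
    """Return (rcode, IPv4 addresses found in the answer section)."""
    _id, flags, qdcount, ancount, _ns, _ar = struct.unpack("!HHHHHH", message[:12])
    rcode = flags & 0x000F  # 0 means the query was answered without error
    offset = 12
    for _ in range(qdcount):
        offset = skip_name(message, offset) + 4  # skip QTYPE and QCLASS
    addresses = []
    for _ in range(ancount):
        offset = skip_name(message, offset)
        rtype, _rclass, _ttl, rdlength = struct.unpack(
            "!HHIH", message[offset:offset + 10]
        )
        offset += 10
        if rtype == 1 and rdlength == 4:  # A record holding an IPv4 address
            addresses.append(socket.inet_ntoa(message[offset:offset + 4]))
        offset += rdlength
    return rcode, addresses


def probe_once(nameserver_ip: str, domain: str, timeout: float = 5.0):
    """Send one query and return (rcode, addresses, response time in ms)."""
    query = build_query(domain)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        sock.sendto(query, (nameserver_ip, 53))
        data, _ = sock.recvfrom(512)  # raises socket.timeout if no reply in time
        elapsed_ms = (time.monotonic() - start) * 1000
    rcode, addresses = parse_response(data)
    return rcode, addresses, elapsed_ms
```

Calling probe_once with the IP address of your own name server and the domain copperegg.com would return the response code, the resolved addresses (to compare against the ‘response match’ value), and the elapsed time in milliseconds.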

Step 2: Verify that the DNS Probe Widget appears in the Dashboard

Once the DNS Probe has been added, a new widget for that probe appears in the Dashboard, in the left menu of the Probes tab.

Widget colour may vary according to probe’s health and status:

  1. A healthy probe that is correct in all aspects will be green and will look like this:

 

Figure 2: Dashboard widget for active Probe with accurate response match

Note that the Last Status is “Success”, indicating that a valid response with a response match was received. A status of Success is also shown when no response match check is configured.

The black dot indicates the peak value of response time during the past 30 minutes.

  2. A healthy probe whose response doesn’t match looks like the one shown below. The green header shows that the probe is active and the DNS Server is responding to DNS requests, but its Last Status is MISMATCH, which shows that the DNS response does not match the expected response specified during Probe setup.

 

Figure 3: Dashboard widget for active Probe with response mismatch

  3. An unhealthy probe with an invalid domain name will look like the one shown below. In this case, Last Status: REFUSED shows that either the probe domain is invalid or the probe domain cannot be resolved successfully by the DNS Server.

 

Figure 4: Dashboard widget for Probe with an invalid domain name

  4. An unhealthy probe where the DNS Server does not respond to DNS requests in time will look like the one shown below. In this case, Last Status: TIMEOUT shows that the DNS Server is available but is not responding to DNS requests within 5 seconds.

 

Figure 5: Dashboard widget for a DNS Probe which times out and does not respond to DNS requests in time
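The four widget states described above can be summarised as a small decision function. The sketch below is only an illustration under stated assumptions: the ProbeSample type and the last_status function are hypothetical names, and the order in which the checks are applied is a guess, since this post does not describe how CopperEgg derives the status internally.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ProbeSample:
    """One probe result (hypothetical structure, for illustration only)."""
    timed_out: bool                 # no reply within the time limit
    resolved: bool                  # the DNS Server resolved the domain
    answers: Tuple[str, ...] = ()   # IP addresses returned by the DNS Server


def last_status(sample: ProbeSample, response_match: Optional[str] = None) -> str:
    """Map a probe result to the Last Status values shown on the widget."""
    if sample.timed_out:
        return "TIMEOUT"    # the server did not answer in time
    if not sample.resolved:
        return "REFUSED"    # invalid domain, or domain could not be resolved
    if response_match is not None and response_match not in sample.answers:
        return "MISMATCH"   # answered, but not with the expected value
    # A valid response; also the result when no response match is configured.
    return "Success"
```

For example, a sample that resolved to 50.116.59.244 with the same value configured as the response match yields "Success", while any other configured value yields "MISMATCH".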

Step 3: Verify that the DNS Probe details are populated

Click on the Details button in the Probe Widget. In the Details page, you will see the following timeline charts:

  1. Response Time: The time taken by the DNS Server to respond to the DNS requests originating from the Probe Stations. If your domain doesn’t respond within 5 seconds, or if the name server is invalid, the probe will show the response as a timeout.

 

Figure 6: Response times for a DNS Probe

  2. Uptime: The metric that determines whether the DNS Server is active or not. Uptime is either 100% for a valid probe, or 0% for an invalid probe (an invalid probe means that either the DNS name server or the DNS request is invalid).

 

Figure 7: Uptime values for a DNS Probe

  3. Health of the Probe: An overall health measure for the DNS Server calculated by CopperEgg. Health is either 100% for a valid probe, or 0% for an invalid probe (an invalid probe means that either the DNS name server or the DNS request is invalid).

 

Step 4: Configure alerts to be notified about availability and performance issues in the DNS Server

You can configure new alerts that will be triggered when there are availability or performance issues in the DNS Server. You can also configure the notification mechanisms for a triggered alert.

Before we proceed, let’s clarify the terminology:

  1. Alert Configuration: A configuration that states the conditions for raising an alert issue and the specific entities (probes, servers, etc.) that are checked periodically to see whether those conditions are true. You can see these in Alerts Tab > Configure Alerts.
  2. Alert Issues: Whenever an alert condition is true for any entity, an alert issue is created. You can see these in Alerts Tab > Current Issues.

 

Figure 8: Configuring an alert that will be triggered when the Probe Response does not contain the keyword “Success”

  1. Go to Alerts Tab > Configure Alerts and click the “New Alerts” button at the top right.
  2. Provide values for these fields in the New Alert page:
  • Description: A description of the alert that will be easily recognized by you and your team if the alert is triggered.
  • Alert me when: Select the metric of interest and the condition upon which the alert is triggered. You can configure alerts for various conditions associated with the DNS Probes as shown below.

(Note: You can select the alert condition by typing ‘Probe’ in the Alert me when textbox and then choosing the right option from the autofill values available in the drop-down. The following options are valid for a DNS Server.)

-Invalid Response: You can create alerts that trigger if the DNS response is invalid.

 

Figure 9: Configuring an alert that will be triggered when the DNS response is invalid

In the illustration above, the alerting metric is the Probe Message and the triggering condition is when the response does not contain the keyword “Success”.

-Response time is very high: You can create alerts that trigger if the Probe Response time is too high.

 

Figure 10: Configuring an alert that will be triggered when the DNS response time is more than 2 secs

In the illustration above, the alerting metric is the Probe Response Total time and the triggering condition is when the response time value is greater than 2 seconds.

-Mismatch in the Response: You can create alerts that trigger if the response to the DNS request does not match the expected response specified in the Probe configuration. The keyword entered in the alert (e.g. ‘mismatch’) must be in lowercase.

 

Figure 11: Configuring an alert that will be triggered when the DNS response does not match the expected value

In the illustration above, the alerting metric is the Probe Message and the triggering condition is when the response contains the keyword “mismatch”.

-DNS Server timeout: You can create alerts that trigger if the DNS Server times out.

 

Figure 12: Configuring an alert that will be triggered when the DNS Server times out

In the illustration above, the alerting metric is the Probe Message and the triggering condition is when the message contains the keyword “timeout”. Here too, the keyword must be in lowercase.

  • For at least: The duration for which the alert condition must hold before the alert is triggered.
  • Matching tags: By default, (match everything) is chosen. When an alert is configured, you can specify which probes that alert condition should operate upon using tags. You can add tags to probes and configure alerts only for probes with that specific tag. If no matching tags are specified, that alert configuration will act upon all probes in your account.
  • Excluding tags: By default, (exclude nothing) is chosen. If needed, you can disable alerting for specific probes using tags. You can add tags to probes and configure alerts to exclude all probes that have that particular tag.
  • Annotate: When enabled, an annotation is automatically created when the alert is triggered. Annotations will be visible in the custom metrics dashboard where the data stream is displayed.
  • Automatic Clear: When enabled, the alert issue is automatically cleared if the triggering condition is no longer true. If 10 probes have the necessary tags to match an alert configuration, they are examined periodically to check whether the alert condition is true. If the alert condition becomes true for, say, 2 of those 10 probes, then 2 alert issues are triggered, one for each probe where the condition is true. If the Automatic Clear checkbox is enabled for an alert configuration, each alert issue raised by that configuration is rechecked periodically and cleared if the underlying alert condition is no longer true. So, if there are 2 alert issues from the same alert configuration (which has Automatic Clear enabled), corresponding to 2 probes, and one of those probes becomes functional again, 1 alert issue is cleared while the other remains an active issue.
  • Notify on clear: When enabled, notifications are also sent when the alert issue is cleared. Note that notifications are always sent when the alert issue is triggered.
  • Send Notifications To: Here you can configure the notification mechanisms by which the alert is communicated to you and others in your team.

CopperEgg supports different alerting types, so that different sets of users can be notified through different notification mechanisms.

Notification mechanisms include:

  • Email
  • SMS
  • PagerDuty
  • Twitter
  • HipChat
  • Campfire

Webhooks are also exposed for clients to configure custom notification mechanisms. More about setting up website probe alerts can be found here.
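The tag matching and Automatic Clear behaviour described in Step 4 can be sketched in a few lines of Python. This is a hedged illustration rather than CopperEgg's actual logic: the Probe and AlertConfig classes are hypothetical, a probe is assumed to match when it carries any one of the matching tags, and the “For at least” duration is ignored to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Set


@dataclass
class Probe:
    """A probe with its tags and latest values (hypothetical structure)."""
    name: str
    tags: Set[str]
    message: str = "Success"
    response_time_ms: float = 0.0


@dataclass
class AlertConfig:
    """An alert configuration with one open issue per affected probe."""
    description: str
    condition: Callable[[Probe], bool]
    matching_tags: Set[str] = field(default_factory=set)   # empty: match everything
    excluding_tags: Set[str] = field(default_factory=set)  # empty: exclude nothing
    automatic_clear: bool = True
    open_issues: Set[str] = field(default_factory=set)     # probe names with an issue

    def applies_to(self, probe: Probe) -> bool:
        """Decide whether this configuration should examine the probe."""
        if self.excluding_tags & probe.tags:
            return False
        # Assumption: sharing any one matching tag is enough.
        return not self.matching_tags or bool(self.matching_tags & probe.tags)

    def evaluate(self, probes: Iterable[Probe]) -> Set[str]:
        """Run one periodic check and return the probes with open issues."""
        for probe in probes:
            if not self.applies_to(probe):
                continue
            if self.condition(probe):
                self.open_issues.add(probe.name)       # raise (or keep) the issue
            elif self.automatic_clear:
                self.open_issues.discard(probe.name)   # condition no longer true
        return self.open_issues
```

With 10 probes tagged "dns" and a condition such as lambda p: "mismatch" in p.message, two mismatching probes produce two open issues; when one of them recovers, the next evaluation clears only that probe's issue. A response time alert could use lambda p: p.response_time_ms > 2000 as the condition instead.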