OCI : INFRASTRUCTURE OPERATIONS : oracle-cloud-infrastructure-operations :
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Monitoring / Notification / Logging
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
In this lesson we will discuss Monitoring, Notifications and Events. From an operations point of view, it is important to keep track of the resources and services you have provisioned in Oracle Cloud and to take the necessary action, either manually or in an automated manner.
We have multiple things to discuss: the Monitoring service with its Metrics and Alarms, the Notification service, and how you integrate with Events from the OCI services.
1. Monitoring Service :
As a cloud provider, Oracle not only offers various services but also gives you the ability to monitor your resources with the help of Metrics that are captured automatically and Alarms that you can configure to notify you whenever something unexpected happens or a behavior needs attention.
Metrics : Metrics are captured continuously at regular intervals, and the interval depends on the type of metric. They relay the health, the capacity utilization or the performance of the various cloud resources you have. When you have resources provisioned in Oracle Cloud, the services keep emitting the metrics that are being captured. You can also capture metrics for your own custom applications and have them published to OCI.
With Metrics being captured, you can view them in an aggregated manner either from the console or with custom monitoring tools that you integrate, or you can use the CLI, the API or REST integration to do the same.
Once you have identified the Metrics you want to capture and have them aggregated across different dimensions, which we will talk about later, you can also configure Alarms. With Alarms you specify thresholds for these metrics. Whenever an Alarm detects that a threshold has been reached, it can trigger an action in the form of notifying you in various ways. The entire Monitoring service can be accessed from the console, REST APIs, SDKs, Terraform and other supported tools. This is an overview of the Monitoring service.
In the rest of the videos you will learn how to access these metrics and use them in your operations activities.
You choose the compartment for which you want to see the service metrics and the resource-specific namespace that you want. Then you can go ahead and add dimensions.
A Metric is a piece of information about a resource. As an example, I have taken the CPU utilization of a compute instance.
A Metric is a combination of : Namespace, Dimension & Metadata
When we look at a Metric at the service level, you choose the Namespace from which the data is taken. So if you take the compute agent namespace, CPU utilization is one of the metrics that you see.
Dimension :
A Dimension is basically a filter category that you apply to the metric aggregation. As a filtering criterion, you might be interested in the metric for all compute instances running in a particular availability domain, or you can add your own dimension for the specific filter you want to apply. There is also metadata provided for every metric, such as the unit in which the measurement is given. The metadata is specific to the metric that is captured and provides additional meaningful information about it.
So if I go back to the OCI browser console, you can add dimensions over here, which are specific to the namespace you have chosen.
Dimensions are just a means to filter the data coming from a particular service within the namespace you have chosen. The data that is shown will be based on the dimensions that you choose.
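To make the idea of dimensions as filters concrete, here is a minimal sketch in plain Python. It does not call OCI at all: the `DataPoint` class, the `filter_by_dimensions` helper, the instance names and the sample values are all made up for illustration. Only `oci_computeagent`, `CpuUtilization` and `availabilityDomain` mirror names you would see in the console.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class DataPoint:
    """One sample of a metric (a simplified, hypothetical model)."""
    namespace: str                                   # where the metric comes from
    name: str                                        # the metric itself
    value: float
    dimensions: dict = field(default_factory=dict)   # filter categories
    metadata: dict = field(default_factory=dict)     # e.g. the unit


def filter_by_dimensions(points, **wanted):
    """Keep only the points whose dimensions match every wanted key/value."""
    return [
        p for p in points
        if all(p.dimensions.get(k) == v for k, v in wanted.items())
    ]


# Made-up samples for three instances in two availability domains.
samples = [
    DataPoint("oci_computeagent", "CpuUtilization", 40.0,
              {"availabilityDomain": "AD-1", "resourceDisplayName": "web-1"},
              {"unit": "percent"}),
    DataPoint("oci_computeagent", "CpuUtilization", 60.0,
              {"availabilityDomain": "AD-1", "resourceDisplayName": "web-2"},
              {"unit": "percent"}),
    DataPoint("oci_computeagent", "CpuUtilization", 90.0,
              {"availabilityDomain": "AD-2", "resourceDisplayName": "db-1"},
              {"unit": "percent"}),
]

# Filter on one dimension, then aggregate what is left.
ad1 = filter_by_dimensions(samples, availabilityDomain="AD-1")
ad1_mean = mean(p.value for p in ad1)
print(len(ad1), ad1_mean)  # 2 50.0
```

The same samples could be filtered on `resourceDisplayName` instead; the aggregation step stays the same and only the filter changes.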
What you get back is the information for the specific metric you selected.
You can go ahead and use the Metrics Explorer to choose and visualize your aggregated metric data.
In the Metrics screen that I showed you, you can also customize what you want to see by writing a query in the Monitoring Query Language. You can refer to the OCI documentation for the Monitoring service to see how to pick the dimension and metric you are interested in and how to write metric queries.
Syntax
- Metric -- The name of the metric you want to query, for example CPU utilization.
- Interval -- The frequency at which you want the aggregation to happen. For example, you may want the metric to be aggregated at a 5 minute interval for a given dimension.
- Group function -- The grouping and statistic that you want to apply. For example, if you are interested in the maximum CPU utilization at one minute intervals, you give CPU utilization as the metric, one minute as the interval and max as the statistic.
For example, if you are interested in one particular instance, you can give its "ocid" as the dimension value, or if you want to aggregate across all instances in an availability domain, you can add your filter criteria or dimension accordingly. If I go back to the browser console, I can choose what information I want to see over here. That information can be written in the form of queries from the browser console by choosing a Metric Namespace, oci_computeagent in this example.
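The pieces of the syntax above can be put together with a small helper. This is only a sketch: `build_mql` is a hypothetical function, the OCID is a placeholder, and the query shape `metric[interval]{dimension = "value"}.statistic()` follows my understanding of the Monitoring Query Language, so check the OCI documentation before relying on it.

```python
def build_mql(metric, interval, statistic, dimensions=None, group_by=None):
    """Assemble a query of the form metric[interval]{dims}.groupBy(x).statistic()."""
    query = f"{metric}[{interval}]"
    if dimensions:
        # Each dimension becomes a name = "value" filter inside the braces.
        filters = ", ".join(f'{k} = "{v}"' for k, v in dimensions.items())
        query += "{" + filters + "}"
    if group_by:
        query += f".groupBy({group_by})"
    return f"{query}.{statistic}()"


# Maximum CPU utilization at one minute intervals, across everything.
print(build_mql("CpuUtilization", "1m", "max"))
# CpuUtilization[1m].max()

# The same, for one particular instance (placeholder OCID, not a real one).
print(build_mql("CpuUtilization", "1m", "max",
                dimensions={"resourceId": "ocid1.instance.oc1..example"}))
# CpuUtilization[1m]{resourceId = "ocid1.instance.oc1..example"}.max()

# Mean at five minute intervals, one series per availability domain.
print(build_mql("CpuUtilization", "5m", "mean", group_by="availabilityDomain"))
# CpuUtilization[5m].groupBy(availabilityDomain).mean()
```

The resulting string is what you would paste into the query editor of the console after choosing the namespace.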
I then get the information on what is happening in the chart. Once you have made your changes, you can click on "Update Chart" and get the data listed individually.
Metrics Done .
Alarms :
In this session you will learn about the Alarms feature of Monitoring and where it is available in the console.
If I go back to my monitoring console, within the Monitoring service we have "Alarm Statuses".
Alarm Definition --> Create Alarm
The idea behind using Alarms needs to be understood first. Based on the metrics you have, you can set thresholds, and whenever a threshold is breached the alarm is triggered.
The Monitoring Query Language can be used to define the thresholds for your Alarms: which metric of a service you want to monitor, how it is aggregated, and based on what criteria you want the alarm to fire.
When you create an alarm, you can choose the severity of your choice.
Then choose the Metric Namespace and the metric that is of interest. For example, the condition could be that in a one minute interval the average CPU utilization is less than or equal to, let's say, 20%, and that this has to hold for a certain amount of time before the alarm triggers. You can customize your alarms to fire for a specific dimension in the same way as you define a Monitoring Query Language query.
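The threshold and the waiting time before the alarm triggers can be sketched in a few lines of Python. This is an illustration of the idea only, not how OCI implements it: `alarm_states` is a hypothetical helper, the readings are made up, and the waiting time is modelled as a number of consecutive one minute intervals.

```python
def alarm_states(values, threshold, trigger_delay):
    """Return "FIRING" or "OK" for each interval.

    The alarm fires once value <= threshold has held for `trigger_delay`
    consecutive intervals, and goes back to OK as soon as it stops holding.
    """
    states = []
    breaching = 0  # how many intervals in a row the condition has held
    for v in values:
        breaching = breaching + 1 if v <= threshold else 0
        states.append("FIRING" if breaching >= trigger_delay else "OK")
    return states


# Made-up one minute averages of CPU utilization, in percent.
cpu_means = [35.0, 18.0, 15.0, 12.0, 40.0]
print(alarm_states(cpu_means, threshold=20.0, trigger_delay=3))
# ['OK', 'OK', 'OK', 'FIRING', 'OK']
```

With a delay of 3, the two low readings at minutes two and three are not enough on their own; the alarm fires only when the fourth reading is also at or below 20%.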
Then you can set the notification destination through which the alarm notification should be sent. You have not yet seen the Notification service; we will see that later. The Notification service is basically a messaging system to which you can publish messages and to which whoever is interested can subscribe.
If you want to write your own custom query, you can switch to the advanced mode and give the details there. Look into the documentation to see how to write it, or come back to the basic mode of the Alarm configuration.
In this way you let OCI monitor in the background and raise alarms for you. When an alarm is triggered and you get a notification, you can go ahead and take the appropriate action. That's about the Alarms feature.
Notification Overview :
In this session you will get an overview of the Notification service. It is a built-in publish-subscribe mechanism that Oracle provides, like a typical messaging system. Notifications enables you to broadcast messages to various subscribers using a publish-subscribe pattern, with secure, reliable, low-latency delivery for application messages that may come from OCI or even from external sources.
How does this work?
You can use the Notification service to set up subscriptions for messages that are published when event-based rules match or when Alarms are triggered, so that you get notified. This brings us to a new component that has to be understood, called "Events".
We already know about Alarms from Monitoring. Whenever an alarm is triggered, you need a mechanism to notify you, which is where the Notification service comes into the picture.
With Notifications you create Topics of interest and enable messages to be published to those topics. You can then have subscribers that get notified, through email or through other mechanisms, whenever a message is published to that particular topic.
The Notification service enables you to set up communication channels for publishing messages using topics and subscriptions. When a message is published to a topic, whoever is subscribed gets the notification sent to them based on their subscription.
So let us see how this is done.
Go to the Notifications section under --> Application Integration > Notifications
You can create topics of interest, and once you have a topic in place you can go ahead and create subscriptions for it. There are various subscription methods.
You choose a topic for your notification, and when a message is published to that topic you can decide to get an email notification, have an OCI function called to do some automation, call a particular URL to take action, or integrate with PagerDuty, Slack or SMS notifications.
All of these are subscribers to a particular topic of interest. You create a means through which messages are published to the topic, and whenever a message is published, each subscriber has its respective call made.
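The topic and subscriber relationship described above can be sketched as a tiny in-memory model. It is not the OCI Notifications API: the `Topic` class, the topic name and the two handlers standing in for an email subscription and a function subscription are hypothetical.

```python
class Topic:
    """A minimal in-memory stand-in for a notification topic."""

    def __init__(self, name):
        self.name = name
        self._subscribers = []

    def subscribe(self, handler):
        """Register a callable that receives every published message."""
        self._subscribers.append(handler)

    def publish(self, message):
        """Deliver the message to every subscriber; return the delivery count."""
        for handler in self._subscribers:
            handler(message)
        return len(self._subscribers)


inbox = []   # stands in for an email subscription
calls = []   # stands in for a function subscription

topic = Topic("ops-alerts")
topic.subscribe(lambda m: inbox.append(f"email: {m}"))
topic.subscribe(lambda m: calls.append(f"function: {m}"))

delivered = topic.publish("High CPU on web-1")
print(delivered, inbox, calls)
# 2 ['email: High CPU on web-1'] ['function: High CPU on web-1']
```

The publisher only knows the topic. Adding a third subscriber, for example a webhook, does not change the code that publishes the message.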
Before we get to the scenarios where these could be used, we should also understand the Events service that we mentioned earlier.
EVENTS
In this session we will learn about the Events service that is built into OCI. To access Events, go to the Menu
Application Integration > Event Service
What are events in OCI? Anything that happens as part of your OCI services: creating a compute instance, attaching a Block Volume to your compute instance, creating a user in IAM. Anything that happens in your Tenancy as part of the cloud infrastructure offering can be considered an event, which means Oracle emits an event whenever such a change occurs.
It is up to you to create rules that capture those events and take action accordingly.
You create an event rule by giving it a display name.
You can use tag-based or type-based filter criteria.
For instance, if we take compute as the service, there are various events that are emitted by Oracle from time to time, for example when you move an image from one compartment to another. All of these are examples of Events that are captured automatically.
Whenever such an event happens, you can take action.
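How a rule picks out the events it cares about can be sketched as follows. This is a simplified model and not the actual Events service matcher: `matches` is a hypothetical helper, the events are hand-written dictionaries, and the event type strings are written from memory, so treat them as assumptions and look up the exact values in the Events documentation.

```python
def matches(rule, event):
    """Return True if every field named in the rule matches the event.

    A rule value may be a single value, a list of allowed values,
    or a nested dict that is matched against a nested dict in the event.
    """
    for key, allowed in rule.items():
        actual = event.get(key)
        if isinstance(allowed, dict):
            if not isinstance(actual, dict) or not matches(allowed, actual):
                return False
        else:
            allowed_values = allowed if isinstance(allowed, list) else [allowed]
            if actual not in allowed_values:
                return False
    return True


# Assumed event type strings, for illustration only.
IMAGE_MOVED = "com.oraclecloud.computeapi.changeimagecompartment"
INSTANCE_LAUNCH = "com.oraclecloud.computeapi.launchinstance.begin"

rule = {"eventType": [IMAGE_MOVED], "data": {"compartmentName": "dev"}}

image_moved = {"eventType": IMAGE_MOVED, "data": {"compartmentName": "dev"}}
instance_launched = {"eventType": INSTANCE_LAUNCH, "data": {"compartmentName": "dev"}}

print(matches(rule, image_moved))        # True
print(matches(rule, instance_launched))  # False
```

When a rule matches, the action you attach to it could be publishing to a notification topic, which ties the Events service back to the Notification service covered earlier.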