FLUENTD

 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 https://www.youtube.com/watch?v=Jl-F9azcyow

Fluent Bit is a lightweight alternative to Fluentd

Source code is available on GitHub : github.com/fluent

Installing Fluentd: https://docs.fluentd.org/v/0.12/articles/install-by-rpm

FAQ : https://www.fluentd.org/faqs

 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Wikipedia definition :

Fluentd is a cross-platform, open-source data collection software project originally developed at Treasure Data. It is written primarily in the Ruby programming language.


There are many benefits to using Fluentd as your logging aggregator and log analysis system.

Firstly, you have the ability to collect all types of logging information from multiple sources, such as database, application, and network servers. The primary feature is a uniform logging layer for the data collected from these input sources; after collection and processing, the data can be routed to multiple destinations such as a cloud service, a database, or another archival system. Fluentd is fully compatible with Kubernetes and Docker for deploying and managing logging events within a cloud platform solution. Another feature of Fluentd is that there are over 500 plugins available to connect to all kinds of software.


There are many reasons to use Fluentd as your logging aggregator and log analysis/archival system. Today Fluentd integrates logging with hundreds of systems because of the number of plugins available. For unified log data, Fluentd uses JSON, a popular machine-readable format that is widely used. Whether collecting or re-routing, Fluentd is able to scale to literally thousands of servers. Fluentd can handle and manage various log types, from web servers and databases to applications. According to the Fluentd website, Fluentd is able to aggregate log files from 50,000 servers, which illustrates its use as an enterprise logging solution.



Fluentd has a lifecycle for each log event.

Each lifecycle comprises the five different stages shown above.

When setting up Fluentd, a main configuration file is used to connect all of its components. Within the main configuration file, inputs are defined; these are also called listeners. Listeners are able to match specific input data as it is collected from input sources. For example, the first step in creating a match is to define a data source for the data, as on the slide:

  • The source is a web server, and the listener is listening on port 8888.
  • The second step is the use of a Match element, which matches any input tagged test.example.

If a match occurs, then the input data is written to standard output.
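A minimal configuration for the example above might look like this (a sketch; the http input plugin and the stdout output plugin are standard Fluentd plugins, while the port and tag are taken from the slide):

```
# Listen for HTTP requests on port 8888; the request path becomes the tag
<source>
  @type http
  port 8888
  bind 0.0.0.0
</source>

# Any event tagged test.example is written to standard output
<match test.example>
  @type stdout
</match>
```

With this running, a request such as `curl -X POST -d 'json={"hello":"world"}' http://localhost:8888/test.example` would emit the event to Fluentd's standard output.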

Three components are involved in each Fluentd event: Tag, Time, and Record.


  • Tag: a tag represents the origin of an event.
  • Time: the time represents the actual time of occurrence of the event.
  • Record: the record represents the content of the event log.
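When an event is printed by the stdout output plugin, all three parts are visible in the form `time tag: record` (the values below are made up for illustration):

```
2023-01-01 12:00:00 +0000 test.example: {"message":"hello world"}
```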


The Match element is key for matching specific data from the data input within Fluentd. A match defines a method for sending output to other systems when input data matches.


The sixth component within the Fluentd lifecycle of an event is the use of labels. Labels allow for grouping of filter and output directives.
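A sketch of how a label can group filters and outputs (the @STAGING label name and the record_transformer filter choice are illustrative assumptions; @label and the label directive are standard Fluentd features):

```
<source>
  @type http
  port 8888
  @label @STAGING
</source>

# Events carrying the @STAGING label skip top-level routing
# and are processed only by this group of filters and matches.
<label @STAGING>
  <filter test.example>
    @type record_transformer
    <record>
      env "staging"
    </record>
  </filter>
  <match test.example>
    @type stdout
  </match>
</label>
```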


+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Fluentd : https://www.youtube.com/watch?v=aeGADcC-hUA

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Fluentd: Best kept secret



How can we run Fluentd on a host machine outside of a Docker container? Containers are ephemeral, and configurations will also disappear once the containers are destroyed. However, there can be scenarios where you want to run Fluentd within your container, namely if you are only concerned with what is happening within a specific container.

Docker released version 1.11 in April, a couple of months ago; since Docker version 1.8, a fluentd logging driver has shipped with Docker. Nonetheless, there is a guide available for configuring Fluentd with Docker.
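Using the fluentd logging driver that ships with Docker looks roughly like this (the address, tag, and image name are illustrative assumptions; the `--log-driver` and `--log-opt` flags are standard Docker options):

```
# Send the container's stdout/stderr to a Fluentd instance
docker run --log-driver=fluentd \
  --log-opt fluentd-address=localhost:24224 \
  --log-opt tag=docker.myapp \
  myapp:latest
```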

The first problem that we are trying to solve using Fluentd is  Logging. Logging is really a mess. 

What you see on the screen is very typical for many data engineering teams.



What we typically experience is a gaggle of information sources .

The other problem is that we are not logging enough . 


Fluentd works on a simple mechanism of input plugins and output plugins that can listen to all of our data sources and route data to the right places without introducing unnecessary complexity.

Fluentd is an extensible and reliable data collection tool; it has a core that is augmented by various plugins.


The core differentiator is the way it is designed.

It has this core and a thriving ecosystem of third-party plugins.

The core focuses on the common concerns of collecting and managing event data, while the plugins focus on specific use cases by augmenting the functionality of the core. In other words, the design divides and conquers.

Lets dig into the plugin architecture of it. 


There are six types of plugins: Input, Parser, Filter, Output, Buffer, and Formatter.

Input and Parser plugins take care of taking in data from various sources such as syslog, mobile apps, social networks, web servers, and so on. Once the data is in Fluentd, it goes through an optional Filter plugin, which can modify the data or filter it out of the stream; for example, you might want to mask certain data, or you might want to send data to different places based on the value of certain fields. At this stage the data is sent out to different systems, and buffering is built in, so if output to a certain system fails, Fluentd retries based on the buffered data.
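Masking a field with a filter might be sketched like this (the tag pattern and field name are assumptions; record_transformer and its enable_ruby option are part of the standard Fluentd filter plugins):

```
# Replace every digit of card_number with "*" before the event moves on
<filter app.**>
  @type record_transformer
  enable_ruby true
  <record>
    card_number ${record["card_number"].to_s.gsub(/\d/, "*")}
  </record>
</filter>
```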

Formatting is going to be your last step. Certain systems and stakeholders require their data to be formatted in a certain way; all other activities are the domain of the other plugins.

Lets take a deep dive a little bit more in the life time of an event. 


As you can see in the diagram, a Fluentd event consists of three parts:

  • Tag 
  • Timestamp
  • Body (or Record)

The record is essentially the payload, or the message, that you want to move between systems.

The tag is a unique ID in Fluentd. Essentially, it is a piece of text that tells Fluentd where to send the data: a tag prefix such as s3, mongodb, or treasuredata will let Fluentd know where to route the data. We are going to see how tags are implemented in our configuration files. You can also configure tags such as production or development so that data routes to the right server and the right environment.
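Tag-based routing in a configuration file might be sketched like this (the tag names and bucket are assumptions, and the s3 output plugin's parameters are abbreviated):

```
# Events tagged production.* are archived to S3;
# development.* events just go to standard output.
<match production.**>
  @type s3
  s3_bucket my-prod-logs
  path logs/
</match>

<match development.**>
  @type stdout
</match>
```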

Time: time is very important, as all log data comes with a timestamp. Knowing exactly when the event occurred is key to the backend understanding the meaning and the context of the data. Once the data is input into Fluentd, it goes into the routing mechanism, and based on the value of the tag, the routing happens; if the tag matches a certain pattern, the event might go to one system.

Each event, once routed, goes to a different output and is buffered in its own way. Once buffering is done and the buffer is flushed, the data goes into a queue; that is the output plugin's write logic. You might be writing to an external system over the internet, to your own file system, or to a local or remote database. One key point is that this logic is abstracted away from the user: when you use Fluentd, you don't have to worry about these mechanisms; you simply write a declarative configuration file, and that manages the data flow.

So let's go into some of those use cases.

This is the most common: you are basically collecting data from many different sources, routing it through Fluentd, and then outputting it to an external system such as MongoDB, PostgreSQL, and so on.


Today Fluentd supports more than 200 output systems, making it a very versatile choice from which to start structured logging within your organization. The configuration will look like this:

Look, we have our source; sources are where we are getting our data from.


My source plugin says we are using the tail input plugin.
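A simple collect-and-store configuration using the tail input plugin might look like this (the file paths, tag, and the mongo output plugin's connection details are illustrative assumptions):

```
# Follow an Apache access log, parsing each new line as it is appended
<source>
  @type tail
  path /var/log/httpd/access.log
  pos_file /var/log/fluentd/access.log.pos
  tag apache.access
  <parse>
    @type apache2
  </parse>
</source>

# Route the parsed events into a MongoDB collection
<match apache.access>
  @type mongo
  host mongo.example.com
  port 27017
  database logs
  collection access
</match>
```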

Less simple forwarding is what happens, for example, when your system starts to scale.


Once you have simple data forwarding covered, you will go on to do more advanced forwarding. In this setup you are using Fluentd to load-balance multiple Fluentd nodes, collecting data from many different servers and many different data sources such as mobile apps and local file systems.

Those leaf Fluentd instances forward the data to a big aggregator node in the middle. You can also handle overflow: when your aggregator node is full, you can fail over gracefully to another one. You can also do load balancing, so that you are not just sending data to one aggregator node but to several. Once the aggregator instance gets all the data, it periodically sends the data to all of its backend systems.
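A leaf node's forwarding with failover might be sketched like this (the hostnames are assumptions; the forward output plugin and its standby option are standard Fluentd features):

```
# Forward every event to an aggregator, with a standby for failover
<match **>
  @type forward
  <server>
    host aggregator1.example.com
    port 24224
  </server>
  <server>
    host aggregator2.example.com
    port 24224
    standby   # only used if the primary is unreachable
  </server>
</match>
```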

Fluentd supports more than 200 output plugins, making it even more appealing.

One of the more advanced use cases is starting to use Lambda.







