In this blog post, I will explain how to monitor VMware Horizon DaaS version 9.0. The same approach also works for versions 8.* and 6.*, as no CIM counters have changed.
I have been working with this product since 2015, when I started at a service provider that hosted DaaS desktops for a partner network. Back then the product was named Desktone; it was later acquired by VMware, which renamed it VMware Horizon DaaS to make it part of the Horizon family.
In this blog post I will focus only on the Python scripts used to monitor the WBEM CIM counters on the SP, RM and tenant appliances, and on the Grafana dashboard that shows the status of all DaaS tenants in one cool green dashboard!
My DaaS Lab Environment
I have set up a full Horizon DaaS v9.0 deployment in my lab. I use one vCenter server and have deployed one tenant that I use to deploy desktops with an NVIDIA T4 GPU. All for testing purposes and for playing C&C Remastered with my son (he actually only has a thin client in his room :)).
So, I have deployed a small Ubuntu 18.04 monitoring server that functions as the center of all monitoring: it collects data through Telegraf, stores it in InfluxDB and shows the dashboard in Grafana. I will not go into setting up this server, but you can use the excellent how-to guide on this link.
One important thing to note: you need access to the link-local network from within this monitoring server. This can be achieved by adding a second network card in the 169.254.*.* link-local network (without a gateway), or by creating the necessary network routing and firewall policies so that the monitoring server can reach the appliances on their link-local interface.
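Before wiring up Telegraf, it can be worth confirming from the monitoring server that an appliance's link-local address actually answers. The following is a minimal sketch and not part of the horizondaas scripts. The function names are my own, and the port you test against is an assumption you should verify for your own appliances (5989 is the registered WBEM-over-HTTPS port, which is a reasonable first guess).

```python
import ipaddress
import socket


def is_link_local(address: str) -> bool:
    """Return True if the address is link-local (169.254.0.0/16 for IPv4)."""
    return ipaddress.ip_address(address).is_link_local


def can_connect(address: str, port: int, timeout: float = 3.0) -> bool:
    """Try a plain TCP connect; True means something is listening and reachable."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and missing routes alike.
        return False
```

For example, `is_link_local("169.254.10.20")` confirms that a (made-up) appliance address is in the right range, and `can_connect("169.254.10.20", 5989)` tells you whether routing and firewalling let the monitoring server through.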
My amazing colleagues at Basefarm have rewritten an old script, which we used in the old days to monitor DaaS with Zabbix, so that it works with Telegraf and InfluxDB.
The monitoring scripts are located on my GitHub page at github.com/mbroeken/horizondaas
Please note that in our own environment we have our own repo system and build packages with a Python 3 virtualenv included, so a few extra steps are needed to get it working in your environment 🙂
In this case we have placed the script in /opt/horizondaas:
sudo git clone https://github.com/mbroeken/horizondaas /opt/horizondaas
# Step 1: Update your repositories
sudo apt-get update
# Step 2: Install pip for Python 3
sudo apt-get install build-essential libssl-dev libffi-dev python3-dev
sudo apt install python3-pip
# Step 3: Use pip to install virtualenv
sudo pip3 install virtualenv
# Step 4: Create your Python 3 virtual environment. Run this from /opt/horizondaas; the virtual environment is named venv here
virtualenv -p python3 venv
# Step 5: Activate your new Python 3 environment. There are two ways to do this:
. venv/bin/activate
# or, which does exactly the same thing:
source venv/bin/activate
# You can make sure you are now working with Python 3.
# This command shows what is going on: the python executable you are using is now located inside your virtualenv directory.
which python
# The version can be checked as well, in this case Python 3.6.9.
python --version
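# Now we are ready to install the required pip modules.
# Run pip without sudo here: on a default Ubuntu setup sudo resets the PATH, so the modules would end up outside the virtualenv.
pip install pywbem
pip install PyYAML
pip install influxdb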
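To double-check that the three modules really landed in the interpreter you intend to use, a small check like the one below can help. It is only a sketch: the file name check_modules.py is hypothetical, and the only assumption is the import names of the packages (PyYAML is imported as yaml).

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the module names that the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names of the three packages installed with pip
# (note: the PyYAML package is imported as "yaml").
REQUIRED = ["pywbem", "yaml", "influxdb"]

if __name__ == "__main__":
    print("interpreter:", sys.executable)
    missing = missing_modules(REQUIRED)
    print("missing:", ", ".join(missing) if missing else "none")
```

Saved as check_modules.py, it would be run with /opt/horizondaas/venv/bin/python3 check_modules.py, so that the check uses the same interpreter Telegraf will call later.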
If all went well, we have now installed everything required to start monitoring Horizon DaaS from /opt/horizondaas.
First, you need to create the configuration files:
Copy /opt/horizondaas/etc/horizondaas-example.yml and create a configuration file per tenant you want to monitor.
The script is then run with the configuration file passed via -c:
/opt/horizondaas/venv/bin/python3 /opt/horizondaas/bin/horizondaas -c /opt/horizondaas/etc/hostname.domainname.local.yml
Copy and edit all files to fit your needs. SP appliances have different monitoring items than RM appliances and tenant appliances.
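If you have more than a handful of appliances, the copy step can be scripted. The sketch below only copies the example file to one file per hostname and never overwrites a file that already exists. The function name and the hostnames in the comment are hypothetical, and you still have to edit every copy by hand.

```python
import shutil
from pathlib import Path


def create_configs(example: Path, etc_dir: Path, hostnames):
    """Copy the example config once per appliance; never overwrite existing files.

    Returns the list of files that were newly created.
    """
    created = []
    for hostname in hostnames:
        target = etc_dir / f"{hostname}.yml"
        if target.exists():
            continue  # keep configs that were already edited by hand
        shutil.copyfile(example, target)
        created.append(target)
    return created


# Example call (hostnames are made up):
# create_configs(
#     Path("/opt/horizondaas/etc/horizondaas-example.yml"),
#     Path("/opt/horizondaas/etc"),
#     ["sp1.domainname.local", "rm1.domainname.local"],
# )
```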
I found that the Horizon_DaaS_Platform_6_0_0_Monitoring PDF contains all the information you need: which counters to monitor per appliance type, and which alerts you might want to configure. Basically nothing has changed between versions 6, 7, 8 and 9.
We have set some defaults in the [agent] section of /etc/telegraf/telegraf.conf. I'm not sure if they fit your environment, but they might be useful.
interval = "120s"
round_interval = false
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "9s"
flush_interval = "30s"
flush_jitter = "10s"
precision = ""
debug = false
quiet = false
logfile = "/var/log/telegraf/telegraf.log"
We have added all hostnames with their link-local IP addresses to the hosts file on the monitoring server, just to make sure the right IP address is being monitored
and so on…
Create the horizondaas directory to store config files:
sudo mkdir /etc/telegraf/telegraf.d/horizondaas/
Every appliance needs its own .conf file located in /etc/telegraf/telegraf.d/horizondaas/
We have created the file inputs_hostname.conf, which uses Telegraf's exec input plugin:
[[inputs.exec]]
commands = ["/opt/horizondaas/venv/bin/python3 /opt/horizondaas/bin/horizondaas -c /opt/horizondaas/etc/hostname.domainname.local.yml"]
timeout = "180s"
data_format = "influx"
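Since every appliance gets a nearly identical file, the contents can also be generated. This is a sketch under a few assumptions: the function names are my own, the paths are the ones used in this post, and whether you pass the short hostname or the full hostname.domainname.local is up to your own naming convention.

```python
def render_exec_input(hostname: str,
                      python_bin: str = "/opt/horizondaas/venv/bin/python3",
                      script: str = "/opt/horizondaas/bin/horizondaas",
                      config_dir: str = "/opt/horizondaas/etc",
                      timeout: str = "180s") -> str:
    """Return the contents of a Telegraf exec input for one appliance."""
    command = f"{python_bin} {script} -c {config_dir}/{hostname}.yml"
    return (
        "[[inputs.exec]]\n"
        f'  commands = ["{command}"]\n'
        f'  timeout = "{timeout}"\n'
        '  data_format = "influx"\n'
    )


def conf_filename(hostname: str) -> str:
    """File name convention used in this post: inputs_<hostname>.conf."""
    return f"inputs_{hostname}.conf"
```

Writing `render_exec_input(name)` to `/etc/telegraf/telegraf.d/horizondaas/` under the name `conf_filename(name)` for each appliance gives you one input per appliance without copy and paste mistakes.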
Create the logging directory:
sudo mkdir /var/log/horizondaas/
Change ownership of both folders and the config files to telegraf:
sudo chown -R telegraf:telegraf /var/log/horizondaas/ /etc/telegraf/telegraf.d/horizondaas/
When finished, stop and start Telegraf:
sudo systemctl stop telegraf
sudo systemctl start telegraf
To see the status, check:
sudo systemctl status telegraf
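If Telegraf runs but no data shows up in InfluxDB, it helps to know what the exec input expects: with data_format = "influx" the command has to print InfluxDB line protocol on standard output. The sketch below builds such a line so you can compare it with what the script prints when you run it by hand. The measurement, tag and field names in the example are invented for illustration and are not the names the horizondaas script emits.

```python
def _escape(text: str, chars: str) -> str:
    """Backslash-escape each character of `chars` that occurs in `text`."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value) -> str:
    """Render one field value according to its Python type."""
    if isinstance(value, bool):      # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"           # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    # Everything else is sent as a quoted string.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line(measurement: str, tags: dict, fields: dict) -> str:
    """Build one line: measurement,tag=value field=value (timestamp omitted)."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    head = _escape(measurement, ", ")
    for key in sorted(tags):
        head += f",{_escape(key, ',= ')}={_escape(str(tags[key]), ',= ')}"
    body = ",".join(
        f"{_escape(key, ',= ')}={_format_field(value)}"
        for key, value in fields.items()
    )
    return f"{head} {body}"
```

For example, `to_line("daas_appliance", {"host": "sp1.example.local"}, {"sessions": 3})` gives `daas_appliance,host=sp1.example.local sessions=3i`. Because the timestamp is left out, Telegraf stamps the metric with the collection time.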
I have exported our Horizon DaaS Grafana dashboard for you to use.
You can download Horizon DaaS-1591722956207.json and import the JSON file into your Grafana. You will probably need to change some data source fields so that the dashboard works with your own input.
If you found this useful, please leave a comment or share the post on Twitter!