This originated from a need to determine how much RAM Open Cloud consumes when running a large number of workflows at the same time on the same node.
We distinguished between different components:
- The "oc-stack", which is the minimum set of services needed to create and schedule a workflow execution: oc-auth, oc-datacenter, oc-scheduler, oc-front, oc-schedulerd, oc-workflow, oc-catalog, oc-peer, oc-workspace, loki, mongo, traefik and nats
- oc-monitord, the daemon instantiated by the scheduling daemon (oc-schedulerd); it generates the Argo YAML and creates the necessary Kubernetes resources.
We monitor both parts to see how much RAM the oc-stack uses before, during and after the execution, the RAM consumed by the monitord containers, and the combined total of the stack and monitors.
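As a minimal sketch of how such sampling can be done, assuming the oc-stack services run as Docker containers (the sampling interval and log file name are illustrative, not taken from the actual test harness):

```sh
# One-shot snapshot of current memory usage for all running containers.
docker stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}"

# Record a sample every 5 seconds into a log for later analysis.
while true; do
  docker stats --no-stream --format "{{.Name}},{{.MemUsage}}" >> ram_samples.log
  sleep 5
done
```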
# Setup
To get optimal performance we used a Proxmox server with ample resources (>370 GiB of RAM and 128 cores) to host the two VMs composing our Kubernetes cluster: one control-plane node where the oc-stack runs, and one worker node running only k3s.
## VMs
We instantiated a two-node Kubernetes cluster (with k3s) on the superg PVE (https://superg-pve.irtse-pf.ext:8006/).
### VM Control
This VM runs the oc-stack and the monitord containers, so it carries the largest part of the load. It must have k3s and Argo installed. We allocated **62 GiB of RAM** and **31 cores**.
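As a sketch, the server-side install can follow the k3s quick-start, and Argo Workflows can be installed from its release manifest (the Argo version below is an example, not necessarily the one pinned by the project):

```sh
# Install k3s in server (control-plane) mode.
curl -sfL https://get.k3s.io | sh -

# Install Argo Workflows into its own namespace (example version).
kubectl create namespace argo
kubectl apply -n argo -f \
  https://github.com/argoproj/argo-workflows/releases/download/v3.5.5/install.yaml
```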
### VM Worker
This VM holds the workload for all the pods created, acting as a worker node for the k3s cluster. We deploy k3s as an agent node, as explained in the k3s quick-start guide:
`curl -sfL https://get.k3s.io | K3S_URL=https://myserver:6443 K3S_TOKEN=mynodetoken sh -`
The value to use for K3S_TOKEN is stored at `/var/lib/rancher/k3s/server/node-token` on the server node.
Verify that the worker has been added as a node to the cluster by running `kubectl get nodes` on the control plane and looking for the worker VM's hostname in the list of nodes.
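Putting the join steps together (hostnames and the token are placeholders):

```sh
# On the control-plane VM: print the join token to use as K3S_TOKEN.
sudo cat /var/lib/rancher/k3s/server/node-token

# On the worker VM: join the cluster (replace the server hostname and token).
curl -sfL https://get.k3s.io | K3S_URL=https://<control-plane-host>:6443 \
  K3S_TOKEN=<node-token> sh -

# Back on the control plane: the worker's hostname should appear as Ready.
kubectl get nodes
```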
### Delegate pods to the worker node
For the pods to be executed on another node, we need to modify how we construct the Argo YAML to add a label in the metadata. We have added the needed attributes to the `Spec` struct in `oc-monitord` on the `test-ram` branch.
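The exact fields added to `Spec` live on the `test-ram` branch; purely as an illustration, the generated manifest could pin pods to the worker with a `nodeSelector` matching a label set beforehand with `kubectl label node <worker-hostname> oc-role=worker` (the `oc-role=worker` label name is a hypothetical choice, not necessarily the one used in the branch):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: ram-test-
spec:
  entrypoint: main
  # Restrict every pod of this workflow to nodes carrying the label below
  # (hypothetical label; use whatever the test-ram branch actually emits).
  nodeSelector:
    oc-role: worker
  templates:
    - name: main
      container:
        image: alpine:3.20
        command: [sh, -c, "sleep 30"]
```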
We will use a script to insert into the DB the executions that will create the monitord containers.
Two pieces of information are needed to run the scripted insertion:
- The **workflow id** of the workflow we want to instantiate, which can be found in the DB (a lookup sketch is shown below)
- A **token** to authenticate against the API; connect to oc-front and retrieve the token from your browser's network analyzer tool
Add these to the `insert_exex.sh` script.
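As a hedged example of the workflow-id lookup, assuming the stack's mongo container is named `mongo` and workflows live in a `workflow` collection of an `oc` database (all three names are assumptions about the schema):

```sh
# List workflow ids and names from the mongo container; adjust the database
# and collection names to match the actual oc-workflow schema.
docker exec -it mongo mongosh --quiet --eval \
  'db.getSiblingDB("oc").workflow.find({}, { _id: 1, name: 1 })'
```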
The script takes two arguments:
- **$1**: the number of executions to create; they are inserted in chunks of 10, using a CRON expression that creates 10 execution**s** per execution/namespace
- **$2**: the number of minutes from now until the scheduled execution time.
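For example, to insert 50 executions scheduled to start 5 minutes from now (after filling the workflow id and token into the script):

```sh
./insert_exex.sh 50 5
```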