Compare commits
5 Commits
572ab5d0c4 ... master

| Author | SHA1 | Date |
|---|---|---|
|  | 8e74e2b399 |  |
|  | 6722c365fd |  |
|  | 3da3ada710 |  |
|  | a9b5f6dcad |  |
|  | a10021fb98 |  |
BIN  docs/S3/img/argo-watch-executing.gif  Normal file  (Size: 3.5 MiB)
BIN  docs/S3/img/ns-creation-after-booking.gif  Normal file  (Size: 2.5 MiB)
BIN  docs/S3/img/secrets-created-in-s3.gif  Normal file  (Size: 1.9 MiB)
BIN  docs/S3/img/workflow.png  Normal file  (Size: 124 KiB)
44  docs/S3/reparted-S3-readme.md  Normal file
@@ -0,0 +1,44 @@

# Allowing reparted Pods to use S3 storage

As a first way to transfer data from one processing node to another, we have implemented the mechanics that allow a pod to access a bucket on an S3-compatible server which is not on the same Kubernetes cluster.

For this we will use an example Workflow run with Argo and Admiralty on the *Control* node, with the **curl** and **mosquitto** processing executing on the control node and the other processing on the *Target01* node.

To transfer data we will use the **S3** and **output/input** annotations handled by Argo, using two *Minio* servers on Control and Target01.
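
As an illustration only (not taken from the example workflow), here is a minimal sketch of an Argo template declaring an S3 output artifact backed by a Minio server; the image, endpoint, bucket and secret names are placeholders:

```yaml
# Hypothetical excerpt of an Argo Workflow template: the step writes its result
# to /tmp/result.txt and Argo uploads it to a Minio bucket, using credentials
# stored in a Kubernetes secret in the execution's namespace.
templates:
  - name: curl-step
    container:
      image: curlimages/curl:8.7.1          # placeholder processing image
      command: [sh, -c]
      args: ["curl -s https://example.org/data > /tmp/result.txt"]
    outputs:
      artifacts:
        - name: result
          path: /tmp/result.txt
          s3:
            endpoint: "minio-control.example.org:9000"   # placeholder Minio endpoint
            bucket: oc-transfer                          # placeholder bucket
            key: curl/result.txt
            insecure: true
            accessKeySecret:
              name: s3-credentials                       # placeholder secret name
              key: access-key
            secretKeySecret:
              name: s3-credentials
              key: secret-key
```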



When the user launches a booking on the UI, a request is sent to **oc-scheduler**, which:

- Checks whether another booking is scheduled at the requested time
- Creates the booking and workflow executions in the DB
- Creates the namespace, service accounts and rights needed for Argo to execute (see the sketch below)
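
A minimal sketch of the kind of per-execution resources this step sets up, assuming hypothetical names (the actual names, roles and bindings are decided by oc-scheduler and oc-datacenter):

```yaml
# Hypothetical per-execution namespace with a service account allowed to
# manage Argo workflow pods inside it.
apiVersion: v1
kind: Namespace
metadata:
  name: execution-abc123            # placeholder execution namespace
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: argo-executor               # placeholder service account
  namespace: execution-abc123
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: argo-executor-admin
  namespace: execution-abc123
subjects:
  - kind: ServiceAccount
    name: argo-executor
    namespace: execution-abc123
roleRef:
  kind: ClusterRole
  name: admin                       # placeholder role granting Argo the needed rights
  apiGroup: rbac.authorization.k8s.io
```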



We added another action to the existing calls that were made to **oc-datacenter**.

**oc-scheduler** retrieves all the storage resources in the workflow and, for each of them, retrieves the *computing* resources that host a processing resource using that storage resource. Here we have:

- Minio Control:
    - Control (via the first cURL)
    - Target01 (via imagemagic)
- Minio Target01:
    - Control (via alpine)
    - Target01 (via cURL, openalpr and mosquitto)

If the computing and storage resources are on the same node, **oc-scheduler** sends an empty POST request to the route, and **oc-datacenter** creates the credentials on the S3 server and stores them in a Kubernetes secret in the execution's namespace, along the lines of the sketch below.
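
A minimal sketch of what such a secret could look like, with hypothetical names and keys (the actual layout is defined by oc-datacenter):

```yaml
# Hypothetical secret holding the generated S3 credentials, created in the
# execution's namespace so the deported pod can reference it.
apiVersion: v1
kind: Secret
metadata:
  name: s3-credentials            # placeholder name
  namespace: execution-abc123     # placeholder execution namespace
type: Opaque
stringData:
  access-key: <generated-access-key>
  secret-key: <generated-secret-key>
```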



If the two resources are on different nodes, **oc-scheduler** sends a POST request stating that it needs to retrieve the credentials, reads the response and calls the appropriate **oc-datacenter** to create a Kubernetes secret. This means that with three nodes:

- A, from which the workflow is scheduled
- B, where the storage is
- C, where the computing is

A can contact B to retrieve the credentials, post them to C for storage and then run an Argo Workflow, from which a pod will be deported to C and will be able to access the S3 server on B.



# Final

We can see that the different processing steps are able to access the required data on different storage resources, and that our ALPR analysis is sent to the mosquitto server and to the HTTP endpoint we set in the last cURL.


BIN  docs/admiralty/Capture d’écran du 2025-05-20 16-03-39.png  Normal file  (Size: 31 KiB)
BIN  docs/admiralty/Capture d’écran du 2025-05-20 16-04-21.png  Normal file  (Size: 31 KiB)
@@ -3,6 +3,85 @@

We have written the following playbooks, available on a private [GitHub repo](https://github.com/pi-B/ansible-oc/tree/384a5acc0713a0fa013a82f71fbe2338bf6c80c1/Admiralty):

- `deploy_admiralty.yml` installs Helm and the necessary charts in order to run Admiralty on the cluster
- `setup_admiralty_target.yml` creates the environment necessary to use a cluster as a target in an Admiralty federation running Argo Workflows: it creates the necessary serviceAccount, target resource and token to authenticate the source (see the sketch after this list)
- `add_admiralty_target.yml` creates the environment needed to use a cluster as a source, providing the data necessary to use a given cluster as a target.
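
As an illustration only (not extracted from the playbooks), the Admiralty objects involved in such a federation typically look like the following; names, namespaces and secret names are placeholders:

```yaml
# On the source cluster: a Target pointing at the workload cluster, using a
# kubeconfig/token stored in a secret.
apiVersion: multicluster.admiralty.io/v1alpha1
kind: Target
metadata:
  name: target01                # placeholder
  namespace: argo               # placeholder namespace used for Argo workflows
spec:
  kubeconfigSecret:
    name: target01-kubeconfig   # placeholder secret name
---
# On the target cluster: a Source authorizing the source cluster through a
# dedicated service account.
apiVersion: multicluster.admiralty.io/v1alpha1
kind: Source
metadata:
  name: control                 # placeholder
  namespace: argo
spec:
  serviceAccountName: control-source   # placeholder serviceAccount
```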

# Ansible playbook

Run the deployment playbook with:

`ansible-playbook deploy_admiralty.yml -i <REMOTE_HOST_IP>, --extra-vars "user_prompt=<YOUR_USER>" --ask-pass`

```yaml
- name: Install Helm
  hosts: all:!localhost
  user: "{{ user_prompt }}"
  become: true
  # become_method: su
  vars:
    arch_mapping:  # Map ansible architecture {{ ansible_architecture }} names to Docker's architecture names
      x86_64: amd64
      aarch64: arm64

  tasks:
    - name: Check if Helm does exist
      ansible.builtin.command:
        cmd: which helm
      register: result_which
      failed_when: result_which.rc not in [ 0, 1 ]

    - name: Install helm
      when: result_which.rc == 1
      block:
        - name: download helm from source
          ansible.builtin.get_url:
            url: https://get.helm.sh/helm-v3.15.0-linux-amd64.tar.gz
            dest: ./

        - name: unpack helm
          ansible.builtin.unarchive:
            remote_src: true
            src: helm-v3.15.0-linux-amd64.tar.gz
            dest: ./

        - name: copy helm to path
          ansible.builtin.command:
            cmd: mv linux-amd64/helm /usr/local/bin/helm

- name: Install admiralty
  hosts: all:!localhost
  user: "{{ user_prompt }}"

  tasks:
    - name: Install required python libraries
      become: true
      # become_method: su
      package:
        name:
          - python3
          - python3-yaml
        state: present

    - name: Add jetstack repo
      ansible.builtin.shell:
        cmd: |
          helm repo add jetstack https://charts.jetstack.io && \
          helm repo update

    - name: Install cert-manager
      kubernetes.core.helm:
        chart_ref: jetstack/cert-manager
        release_name: cert-manager
        context: default
        namespace: cert-manager
        create_namespace: true
        wait: true
        set_values:
          - value: installCRDs=true

    - name: Install admiralty
      kubernetes.core.helm:
        name: admiralty
        chart_ref: oci://public.ecr.aws/admiralty/admiralty
        namespace: admiralty
        create_namespace: true
        chart_version: 0.16.0
        wait: true
```

BIN  docs/performance_test/100_monitors.png  Normal file  (Size: 34 KiB)
BIN  docs/performance_test/10_monitors.png  Normal file  (Size: 31 KiB)
BIN  docs/performance_test/150_monitors.png  Normal file  (Size: 30 KiB)
151  docs/performance_test/README.md  Normal file
@@ -0,0 +1,151 @@

# Goals

This originated from a demand to know how much RAM is consumed by Open Cloud when running a large number of workflows at the same time on the same node.

We differentiated between two components:

- The "oc-stack", which is the minimum set of services needed to create and schedule a workflow execution: oc-auth, oc-datacenter, oc-scheduler, oc-front, oc-schedulerd, oc-workflow, oc-catalog, oc-peer, oc-workspace, loki, mongo, traefik and nats

- oc-monitord, the daemon instantiated by the scheduling daemon (oc-schedulerd) that creates the YAML for Argo and creates the necessary Kubernetes resources.

We monitor both parts to view how much RAM the oc-stack uses before / during / after the execution, the RAM consumed by the monitord containers, and the total for the stack and monitors combined.

# Setup

In order to have optimal performance we used a Proxmox server with large resources (>370 GiB RAM and 128 cores) to host the two VMs composing our Kubernetes cluster, with one control plane node where the oc-stack is running and a worker node with only k3s running.

## VMs

We instantiated a 2-node Kubernetes (k3s) cluster on the superg PVE (https://superg-pve.irtse-pf.ext:8006/).

### VM Control

This VM runs the oc-stack and the monitord containers, so it carries the biggest part of the load. It must have k3s and Argo installed. We allocated **62 GiB of RAM** and **31 cores**.

### VM Worker

This VM holds the workload for all the pods created, acting as a worker node for the k3s cluster. We deploy k3s as an agent node as explained in the K3s quick start guide:

`curl -sfL https://get.k3s.io | K3S_URL=https://myserver:6443 K3S_TOKEN=mynodetoken sh -`

The value to use for K3S_TOKEN is stored at `/var/lib/rancher/k3s/server/node-token` on the server node.

Verify that the server has been added as a node to the cluster by running `kubectl get nodes` on the control plane and looking for the hostname of the worker VM in the list of nodes.

### Delegate pods to the worker node

In order for the pods to be executed on another node we need to modify how we construct the Argo YAML, to add a node selector in the spec. We have added the needed attributes to the `Spec` struct in `oc-monitord` on the `test-ram` branch.

```go
type Spec struct {
	ServiceAccountName string                `yaml:"serviceAccountName"`
	Entrypoint         string                `yaml:"entrypoint"`
	Arguments          []Parameter           `yaml:"arguments,omitempty"`
	Volumes            []VolumeClaimTemplate `yaml:"volumeClaimTemplates,omitempty"`
	Templates          []Template            `yaml:"templates"`
	Timeout            int                   `yaml:"activeDeadlineSeconds,omitempty"`
	NodeSelector       struct {
		// Serialised as nodeSelector.node-role in the generated Argo YAML
		NodeRole string `yaml:"node-role"`
	} `yaml:"nodeSelector"`
}
```

and added the tag in the `CreateDAG()` method:

```go
b.Workflow.Spec.NodeSelector.NodeRole = "worker"
```
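
For reference, the generated workflow spec should then contain a selector along these lines (a sketch with placeholder values around the selector, assuming the worker node carries a matching `node-role=worker` label):

```yaml
# Sketch of the relevant part of the generated Argo Workflow spec: pods are
# only scheduled on nodes labelled node-role=worker, i.e. the worker VM here.
spec:
  serviceAccountName: argo        # placeholder
  entrypoint: dag                 # placeholder
  nodeSelector:
    node-role: worker
```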

## Container monitoring

Docker compose to instantiate the monitoring stack:

- Prometheus: stores the data
- cAdvisor: monitors the containers

```yml
version: '3.2'
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    ports:
      - 9090:9090
    command:
      - --config.file=/etc/prometheus/prometheus.yml
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
    depends_on:
      - cadvisor
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    ports:
      - 9999:8080
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
```

Prometheus scraping configuration:

```yml
scrape_configs:
  - job_name: cadvisor
    scrape_interval: 5s
    static_configs:
      - targets:
          - cadvisor:8080
```

## Dashboards

In order to monitor the resource consumption during our tests we need to create dashboards in Grafana.

We create 4 different queries using Prometheus as the data source. For each query we can use the `code` mode to define it from a PromQL query.

### OC stack consumption

```
sum(container_memory_usage_bytes{name=~"oc-auth|oc-datacenter|oc-scheduler|oc-front|oc-schedulerd|oc-workflow|oc-catalog|oc-peer|oc-workspace|loki|mongo|traefik|nats"})
```

### Monitord consumption

```
sum(container_memory_usage_bytes{image="oc-monitord"})
```

### Total RAM consumption

```
sum(
  container_memory_usage_bytes{name=~"oc-auth|oc-datacenter|oc-scheduler|oc-front|oc-schedulerd|oc-workflow|oc-catalog|oc-peer|oc-workspace|loki|mongo|traefik|nats"}
  or
  container_memory_usage_bytes{image="oc-monitord"}
)
```

### Number of monitord containers

```
count(container_memory_usage_bytes{image="oc-monitord"} > 0)
```

# Launch executions

We will use a script to insert into the DB the executions that will create the monitord containers.

We need two pieces of information to run the scripted insertion:

- The **workflow id** of the workflow we want to instantiate, which can be found in the DB
- A **token** to authenticate against the API: connect to oc-front and retrieve the token with your browser's network analyzer tool.

Add these to the `insert_exec.sh` script.

The script takes two arguments:

- **$1**: the number of executions, which are created in chunks of 10 using a CRON expression that creates 10 executions per execution/namespace

- **$2**: the number of minutes between now and the execution time for the executions.
72  docs/performance_test/insert_exec.sh  Executable file
@@ -0,0 +1,72 @@

#!/bin/bash

TOKEN=""
WORKFLOW=""

NB_EXEC=$1
TIME=$2

if [ -z "$NB_EXEC" ]; then
    NB_EXEC=1
fi

# if (( NB_EXEC % 10 != 0 )); then
#     echo "Please use a round number"
#     exit 0
# fi

if [ -z "$TIME" ]; then
    TIME=1
fi

# Each request schedules a chunk of 10 executions via its CRON expression
EXECS=$(((NB_EXEC + 9) / 10))
echo EXECS=$EXECS

DAY=$(date +%d -u)
MONTH=$(date +%m -u)
HOUR=$(date +%H -u)
MINUTE=$(date -d "$TIME min" +"%M" -u)
SECOND=$(date +%s -u)

start_loop=$(date +%s)

for ((i = 1; i <= EXECS; i++)); do
    (
        start_req=$(date +%s)

        echo "Exec $i"
        CRON="0-10 $MINUTE $HOUR $DAY $MONTH *"
        echo "$CRON"

        START="2025-$MONTH-$DAY"T"$HOUR:$MINUTE:00.012Z"

        # force base 10 so months like "09" are not parsed as octal
        END_MONTH=$(printf "%02d" $((10#$MONTH + 1)))
        END="2025-$END_MONTH-$DAY"T"$HOUR:$MINUTE:00.012Z"

        # PAYLOAD=$(printf '{"id":null,"name":null,"cron":"","mode":1,"start":"%s","end":"%s"}' "$START" "$END")
        PAYLOAD=$(printf '{"id":null,"name":null,"cron":"%s","mode":1,"start":"%s","end":"%s"}' "$CRON" "$START" "$END")

        # echo $PAYLOAD

        curl -X 'POST' "http://localhost:8000/scheduler/$WORKFLOW" \
            -H 'accept: application/json' \
            -H 'Content-Type: application/json' \
            -d "$PAYLOAD" \
            -H "Authorization: Bearer $TOKEN" -w '\n'

        end=$(date +%s)
        duration=$((end - start_req))

        echo "Start $start_req"
        echo "End $end"
        echo "Execution time for request $i: $duration seconds"
    ) &

done

wait

end_loop=$(date +%s)
total_time=$((end_loop - start_loop))
echo "Total execution time: $total_time seconds"
43  docs/performance_test/performance_report.md  Normal file
@@ -0,0 +1,43 @@

We used a very simple mono-node workflow which executes a simple sleep command within an alpine container.



# 10 monitors



# 100 monitors



# 150 monitors



# Observations

We see an increase in the memory usage of the OC stack, which initially sits around 600-700 MiB:

```
CONTAINER ID   NAME            CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O         PIDS
7ce889dd97cc   oc-auth         0.00%     21.82MiB / 11.41GiB   0.19%     125MB / 61.9MB    23.3MB / 5.18MB   9
93be30148a12   oc-catalog      0.14%     17.52MiB / 11.41GiB   0.15%     300MB / 110MB     35.1MB / 242kB    9
611de96ee37e   oc-datacenter   0.32%     21.85MiB / 11.41GiB   0.19%     38.7MB / 18.8MB   14.8MB / 0B       9
dafb3027cfc6   oc-front        0.00%     5.887MiB / 11.41GiB   0.05%     162kB / 3.48MB    1.65MB / 12.3kB   7
d7601fd64205   oc-peer         0.23%     16.46MiB / 11.41GiB   0.14%     201MB / 74.2MB    27.6MB / 606kB    9
a78eb053f0c8   oc-scheduler    0.00%     17.24MiB / 11.41GiB   0.15%     125MB / 61.1MB    17.3MB / 1.13MB   10
bfbc3c7c2c14   oc-schedulerd   0.07%     15.05MiB / 11.41GiB   0.13%     303MB / 293MB     7.58MB / 176kB    9
304bb6a65897   oc-workflow     0.44%     107.6MiB / 11.41GiB   0.92%     2.54GB / 2.65GB   50.9MB / 11.2MB   10
62e243c1c28f   oc-workspace    0.13%     17.1MiB / 11.41GiB    0.15%     193MB / 95.6MB    34.4MB / 2.14MB   10
3c9311c8b963   loki            1.57%     147.4MiB / 11.41GiB   1.26%     37.4MB / 16.4MB   148MB / 459MB     13
01284abc3c8e   mongo           1.48%     86.78MiB / 11.41GiB   0.74%     564MB / 1.48GB    35.6MB / 5.35GB   94
14fc9ac33688   traefik         2.61%     49.53MiB / 11.41GiB   0.42%     72.1MB / 72.1MB   127MB / 2.2MB     13
4f1b7890c622   nats            0.70%     78.14MiB / 11.41GiB   0.67%     2.64GB / 2.36GB   17.3MB / 2.2MB    14

Total 631.2 Mb
```

However, over time, with the repetition of a large number of scheduling runs, the stack uses a larger amount of RAM.

In particular it seems that **loki**, **nats**, **mongo**, **oc-datacenter** and **oc-workflow** grow over 150 MiB. This can be explained by the cache growing in these containers, which seems to be reduced every time the containers are restarted.

BIN  docs/performance_test/wf_test_ram_1node.png  Normal file  (Size: 16 KiB)
BIN  performance_test  Normal file  (Size: 16 KiB)