Compare commits
21 Commits
c0f8822eb3...master
| SHA1 |
|---|
| 8e74e2b399 |
| 6722c365fd |
| 3da3ada710 |
| a9b5f6dcad |
| a10021fb98 |
| 572ab5d0c4 |
| 4dca4b3a51 |
| 0cff31b32f |
| 91c272d58f |
| 55794832ad |
| a0340e41b0 |
| f93d5a662b |
| faa21b5da9 |
| 6ae9655ca0 |
| b31134c6cd |
| 22e22b98b4 |
| fba603c9a6 |
| 91f5f44cea |
| 33bfe79f66 |
| 134889b247 |
| e846c38719 |
BIN
docs/S3/img/argo-watch-executing.gif
Normal file
After Width: | Height: | Size: 3.5 MiB |
BIN
docs/S3/img/ns-creation-after-booking.gif
Normal file
After Width: | Height: | Size: 2.5 MiB |
BIN
docs/S3/img/secrets-created-in-s3.gif
Normal file
After Width: | Height: | Size: 1.9 MiB |
BIN
docs/S3/img/workflow.png
Normal file
After Width: | Height: | Size: 124 KiB |
44
docs/S3/reparted-S3-readme.md
Normal file
@@ -0,0 +1,44 @@
# Allowing reparted Pods to use S3 storage

As a first way to transfer data from one processing node to another, we have implemented a mechanism that allows a pod to access a bucket on an S3-compatible server that is not hosted on the same Kubernetes cluster.

For this we use an example Workflow run with Argo and Admiralty from the *Control* node, with the **curl** and **mosquitto** processings executing on the Control node and the other processings on the *Target01* node.

To transfer data we use the **S3** and **output/input** annotations handled by Argo, with two *Minio* servers running on Control and Target01.

![Diagram of the workflow](./img/workflow.png)

When the user launches a booking on the UI, a request is sent to **oc-scheduler**, which:

- Checks whether another booking is scheduled at the requested time
- Creates the booking and workflow executions in the DB
- Creates the namespace, service accounts and rights that Argo needs to execute

![Gif showing the creation of the namespace after booking](./img/ns-creation-after-booking.gif)

We added another action to the existing calls made to **oc-datacenter**.

**oc-scheduler** retrieves all the storage resources in the workflow and, for each one, retrieves the *computing* resources that host a processing resource using that storage resource. Here we have:

- Minio Control:
  - Control (via the first cURL)
  - Target01 (via imagemagick)

- Minio Target01:
  - Control (via alpine)
  - Target01 (via cURL, openalpr and mosquitto)

If the computing and storage resources are on the same node, **oc-scheduler** sends an empty POST request to the route, and **oc-datacenter** creates the credentials on the S3 server and stores them in a Kubernetes secret in the execution's namespace.

If the two resources are on different nodes, **oc-scheduler** sends a POST request stating that it needs to retrieve the credentials, reads the response and calls the appropriate **oc-datacenter** to create a Kubernetes secret. This means that if we add three nodes:

- A, from which the workflow is scheduled
- B, where the storage is
- C, where the computing is

A can contact B to retrieve the credentials, post them to C for storage and then run an Argo Workflow, from which a pod will be deported to C and will be able to access the S3 server on B. A minimal sketch of this exchange is given below.
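The following Go sketch illustrates this credential exchange between nodes B and C. The route names and payload fields are assumptions made for illustration; they do not reflect the actual oc-scheduler / oc-datacenter API.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// S3Credentials is a hypothetical payload: the real field names used by
// oc-datacenter may differ.
type S3Credentials struct {
	AccessKey string `json:"access_key"`
	SecretKey string `json:"secret_key"`
}

// transferCredentials asks the oc-datacenter of node B for credentials and
// posts them to the oc-datacenter of node C, which stores them as a
// Kubernetes secret in the execution's namespace.
func transferCredentials(nodeB, nodeC, executionID string) error {
	// Ask B to create and return the credentials (hypothetical route).
	resp, err := http.Post(nodeB+"/oc-datacenter/s3/credentials?retrieve=true&execution="+executionID,
		"application/json", nil)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	var creds S3Credentials
	if err := json.NewDecoder(resp.Body).Decode(&creds); err != nil {
		return err
	}

	// Post them to C, which stores them in a Kubernetes secret.
	body, _ := json.Marshal(creds)
	resp2, err := http.Post(nodeC+"/oc-datacenter/s3/secret?execution="+executionID,
		"application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp2.Body.Close()
	if resp2.StatusCode != http.StatusOK {
		return fmt.Errorf("secret creation failed: %s", resp2.Status)
	}
	return nil
}
```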
![Gif showing the secrets created in S3](./img/secrets-created-in-s3.gif)

# Final

We can see that the different processings are able to access the required data on the different storage resources, and that our ALPR analysis is sent to the mosquitto server and to the HTTP endpoint we set in the last cURL.

![Gif showing the workflow executing in Argo](./img/argo-watch-executing.gif)
33
docs/WP/authentication_access_control.md
Normal file
@@ -0,0 +1,33 @@
## General architecture

Each OpenCloud instance will provide an OpenID interface. This interface may be connected to an existing LDAP server or a dedicated one.
The main advantage of this distributed solution is that each partner manages its own users and profiles. It simplifies access control management, as each peer does not have to be aware of other peers' users, but only defines access rules globally for the peer.

## Users / roles / groups

Users in OpenCloud belong to a peer (company); they may be part of groups within the company (organisational unit, project, ...).
Within those groups, or globally for the peer, they may have different roles (project manager, workflow designer, accountant, ...).
Roles define the list of permissions granted to that role.

## User permissions definition

Each OpenCloud instance will manage its users and their permissions through the user/group/role scheme defined in the previous chapter.
On a local instance, basic permissions are:

* a user has permission to start a distributed workflow using remote peers
* a user has permission to view financial information on the instance
* a user has permission to change the service exchange rates

On a remote instance, basic permissions are:

* execute workflow (quota + peers subset?)
* store data (quota + peers subset?)

## Authentication process

Each OpenCloud peer will accept a company/group as a whole.
Upon user connection, it will receive the user rights from the originating OpenID Connect server and apply them, e.g. specific pricing for a group (company agreement, project agreement, ...).
A collaborative workspace

## Resources don't have a static url

They will map to an internal URL of the service.
Once a workflow is initialized and ready for launch, temporary URLs proxying to the real service will be provided to the workflow at booking time. A minimal sketch of such a proxy is given below.
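As an illustration of the temporary-URL idea, here is a minimal Go sketch of a reverse proxy that exposes a service under a random, short-lived path. Everything in it (names, token scheme, expiry handling) is an assumption for illustration, not the actual OpenCloud implementation.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"net/http"
	"net/http/httputil"
	"net/url"
	"time"
)

// newTemporaryRoute mounts a reverse proxy to the service's internal URL
// under a random path, and refuses traffic once the booking window ends.
// mux, internalURL and ttl are illustrative parameters.
func newTemporaryRoute(mux *http.ServeMux, internalURL string, ttl time.Duration) (string, error) {
	target, err := url.Parse(internalURL)
	if err != nil {
		return "", err
	}
	// Random token: the temporary, non-guessable part of the URL.
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	path := "/tmp/" + hex.EncodeToString(buf) + "/"

	proxy := httputil.NewSingleHostReverseProxy(target)
	expiry := time.Now().Add(ttl)
	mux.HandleFunc(path, func(w http.ResponseWriter, r *http.Request) {
		if time.Now().After(expiry) {
			http.Error(w, "booking expired", http.StatusGone)
			return
		}
		proxy.ServeHTTP(w, r)
	})
	return path, nil
}
```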
0
docs/WP/distributed_execution.md
Normal file
8
docs/WP/oc-accounting.md
Normal file
@@ -0,0 +1,8 @@
# Description

The oc-accounting service will aggregate billing information for each peer on a daily (TBC) basis.
Payment will b

# Requirements

*
4
docs/WP/oc-currencies.md
Normal file
@@ -0,0 +1,4 @@
# Description

The oc-currencies service is able to convert the oc-coin's current value to or from the main currencies (€/$).
It allows displaying the total cost in real currency in all user interfaces, and keeping products that have a fixed real-currency price up to date against the fluctuating oc-coin value. A conversion sketch is given below.
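A minimal sketch of such a conversion, assuming a rate expressed as euros per oc-coin (the rate value and function names are assumptions, not the actual service interface):

```go
package main

import "fmt"

// Rate is an assumed exchange rate, in euros per oc-coin.
// In the real service this value would fluctuate.
const Rate = 0.25

// CoinsToEuros converts an oc-coin amount to euros.
func CoinsToEuros(coins float64) float64 { return coins * Rate }

// EurosToCoins converts a euro amount to oc-coins.
func EurosToCoins(euros float64) float64 { return euros / Rate }

func main() {
	fmt.Printf("100 oc-coins = %.2f €\n", CoinsToEuros(100)) // 25.00 €
	fmt.Printf("10 € = %.2f oc-coins\n", EurosToCoins(10))   // 40.00 oc-coins
}
```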
0
docs/WP/oc-deploy.md
Normal file
11
docs/WP/oc-own_usage.md
Normal file
@@ -0,0 +1,11 @@
# Description

The oc-own_usage service will monitor and store the consumption data for all the workflows initiated from our own OpenCloud instance.
The collected data will be accessible both in real time and for past workflows, for the user that sent them and for the allowed profiles in the current OpenCloud instance.
Collected data will also be used to prevent abusive peer billing after a workflow execution.

# Requirements

* A user sending a workflow in a distributed environment shall be able to monitor its resource consumption
* The resource consumption shall be available both as technical data (storage/time, RAM/time, CPU/time) and in monetary form (coins / currency)
* The consumption information may be filtered by peer, giving the full consumption data for each peer involved in the current workflow. This information may be used by the user to analyze/optimize future workflows. It will also be used by the accounting system to check consistency between peer billing and monitored consumption.
14
docs/WP/oc-peer.md
Normal file
@@ -0,0 +1,14 @@
# Description

This component holds a database of all known peers.
It also performs the required operations when receiving a new peer/group request (sketched below):

* Shows the peer identity/certificates
* Accepts or rejects a peer/group as partner
* Defines the allowed services
* Defines visibility
* Creates a dedicated namespace if the peer is allowed to use our compute and quotas
* Defines storage quotas
* Generates access keys for the services
* Returns the answer and interfacing data to the requester
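A minimal Go sketch of the data such a request/answer exchange could carry; every type and field name here is hypothetical and only illustrates the operations listed above.

```go
package peer

// PeerRequest is a hypothetical incoming partnership request.
type PeerRequest struct {
	PeerID      string // identity of the requesting peer
	Certificate []byte // presented certificate, shown to the operator
	Group       string // optional group (e.g. a project) the request is for
}

// PeerAnswer is a hypothetical answer returned to the requester.
type PeerAnswer struct {
	Accepted        bool
	AllowedServices []string          // services the peer may use
	Visibility      string            // e.g. "public" or "private"
	Namespace       string            // dedicated namespace, if compute is allowed
	StorageQuotaGB  int               // storage quota granted
	AccessKeys      map[string]string // service name -> generated access key
}
```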
13
docs/WP/oc-rates.md
Normal file
@@ -0,0 +1,13 @@
# Description

The oc-rates service defines the applicable rates for the services in our own OpenCloud instance
(data storage, RAM usage, CPU time, GPU time, HPC cluster execution, ...).
A default rate shall be defined for all public peers.
Peers/groups (projects) having a specific agreement may benefit from custom rates.

# Requirements

* An authorized user (specific permission) will be able to define default rates and specific peer rates.
* The default rates shall be accessible to every internal and external user.
* The custom rates shall only be accessible to users belonging to the relevant peer
*
11
docs/WP/oc-resource-usage.md
Normal file
@@ -0,0 +1,11 @@
# Description

The oc-peers_usage service will monitor and store the consumption data of all the peer workflows involving our own OpenCloud instance.
The collected data will be accessible both in real time and for monitoring the current OpenCloud instance's workflows, in order to perform peer billing.

# Requirements

* The resource consumption shall be available both as technical data (storage/time, RAM/time, CPU/time) and in monetary form (coins / currency)
* The resource consumption shall be available to the user that started a workflow / downloaded data from our instance, for the related items (related workflow(s) and data)
* The complete resource consumption for a peer/group (project) shall be available to users granted a specific permission
*
1
docs/WP/oc-sync.md
Normal file
@@ -0,0 +1 @@
This service offers realtime shared data synchronization between OpenCloud instances.
63
docs/WP/rbac.md
Normal file
@@ -0,0 +1,63 @@
# Actions for people from my DC

## Search

- Allow internal
- Allow distributed

## Workspace

- Allow share

## Workflow editor

- Allow edit
- Allow book
- Allow send
- Allow share

# Resources

- Allow view/read/write

# Peer

- Allow requesting partnership
- Allow accepting unknown

# User

- Allow adding
- Allow editing
- Allow editing myself

# Actions for people from other DC

## Search

- Allow search

## Workspace

- Allow share with me

## Workflow

- Allow book
- Allow send
- Allow share with me (implied by Workspace)

# Resources

- Allow view
- Price depending on Peer/User/(project => Collaborative Area)?

# Peer

- Allow requesting partnership

# User

- Allow checking credentials
- Allow getting profile
71
docs/WP/workflow_design.md
Normal file
@@ -0,0 +1,71 @@
## Workflow design rules

1. A data resource may be directly linked to a processing
1. A processing resource is always linked to the next processing resource.
1. If two processing resources need to exchange file data, they need to be connected to the same file storage(s)
1. A processing shall be linked to a computing resource

### Data - Processing link

A data resource may be linked to a processing resource.
The processing resource shall be compatible with the data resource's format and API.

#### Example 1 : Trivial download

For a simple example:
* the data resource provides an http url to download a file.
* the processing resource is the simple curl command that downloads the file in the current computing resource

#### Example 2 : Advanced download processing resource

For a more specific example:
* the data resource is a complex data archive
* the processing resource is a complex download component that can be configured to retrieve specific data or datasets from the archive

### Processing - Processing link

Dependent processings must be graphically linked; these links allow building the workflow's acyclic graph.

### Processing - Storage link

A processing may be linked to one or several storage resources.

#### Basic storage resource types

Storage resource types generally require a list of source - destination information that describes read/write operations.
This information is associated to the link between the processing and the storage resources.

*In the case of a write to storage operation:*
* the source information specifies the local path/filename in the container where the file is created by the processing.
* the destination information contains the url/path/filename where the file shall be stored

*In the case of a read from storage operation:*
* the source information specifies the url/path/filename where the file is stored
* the destination information contains the local path/filename in the container where the file shall be created for the processing to use it.

A sketch of this link information is given below.
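A minimal Go sketch of what the information attached to a processing - storage link could look like; the type and field names are assumptions for illustration, not the actual OpenCloud model.

```go
package design

// Direction of a storage operation on the link.
type Direction int

const (
	Write Direction = iota // container file -> storage
	Read                   // storage -> container file
)

// StorageOp is one source - destination entry attached to a
// processing - storage link (hypothetical model).
type StorageOp struct {
	Dir         Direction
	Source      string // write: local path in the container; read: url/path on the storage
	Destination string // write: url/path on the storage; read: local path in the container
}

// StorageLink carries all operations between one processing and one storage.
type StorageLink struct {
	ProcessingID string
	StorageID    string
	Ops          []StorageOp
}
```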
##### Local cluster storage

The generated Argo workflow defines a local storage available to all containers in the current cluster.
This storage is available from every container under the path defined in the $LOCALSTORAGE environment variable.
On this special storage, as it is mounted in all containers, the source - destination information is implicit:
any data can be read or written directly, as in the sketch below.
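For example, a processing written in Go could use it as follows (a sketch; only the $LOCALSTORAGE variable comes from the text above, the file name is illustrative):

```go
package main

import (
	"os"
	"path/filepath"
)

func main() {
	// Resolve the shared storage path injected by the generated workflow.
	base := os.Getenv("LOCALSTORAGE")

	// Write a result directly; no source - destination mapping is needed.
	out := filepath.Join(base, "result.txt")
	if err := os.WriteFile(out, []byte("done\n"), 0o644); err != nil {
		panic(err)
	}
}
```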
##### S3 type storages

Several S3-compatible storages may be used in a workflow:
* OCS3 : The global MinIO deployed in the local OpenCloud instance
* Generic S3 : Any externally accessible S3-compatible service
* WFS3 : An internal MinIO instance deployed for the workflow duration; that instance might be exposed outside the current cluster

##### Custom storage types

### Processing - Computing link

A processing shall be connected to a computing resource.

Argo volcano ?
BIN
docs/admiralty/Capture d’écran du 2025-05-20 16-03-39.png
Normal file
After Width: | Height: | Size: 31 KiB |
BIN
docs/admiralty/Capture d’écran du 2025-05-20 16-04-21.png
Normal file
After Width: | Height: | Size: 31 KiB |
BIN
docs/admiralty/auth_schema.jpg
Normal file
After Width: | Height: | Size: 91 KiB |
90
docs/admiralty/authentication.md
Normal file
@@ -0,0 +1,90 @@
# Current authentication process

We are currently able to authenticate against a remote `Admiralty Target` to execute pods from the `Source` cluster in a remote cluster, in the context of an `Argo Workflow`. The resulting artifacts or data can then be retrieved in the source cluster.

In this document we present the steps needed for this authentication process, its flaws and the improvements we could make.

![Authentication schema](./auth_schema.jpg)

## Requirements

### Namespace

In each cluster the same `namespace` needs to exist. Both namespaces need to have the same resources available, meaning here that Argo must be deployed in the same way.

> We haven't tested it yet, but maybe the `version` of the Argo Workflow should be the same in order to prevent mismatches between functionalities.

### ServiceAccount

A `serviceAccount` with the same name must be created on each side of the cluster federation.

In the case of Argo Workflows it will be used to submit the workflow with the `Argo CLI`, or should be specified in the `spec.serviceAccountName` field of the Workflow.

#### Roles

Given that the `serviceAccount` will be the same in both clusters, it must be bound to the appropriate `role` in order to execute both the Argo Workflow and the Admiralty actions.

So far we have only seen the need to add the `patch` verb on `pods` for the `apiGroup` "" in `argo-role`.

Once the role is patched, the `serviceAccount` that will be used must be added to the rolebinding `argo-binding`.

### Token

In order to authenticate against the Kubernetes API we need to provide the Admiralty `Source` with a token stored in a secret. This token is created on the `Target` for the `serviceAccount` that we will use in the Admiralty communication. After copying it, we replace the IP in the `kubeconfig` with the IP that will be targeted by the source to reach the k8s API. The token generated for the serviceAccount is added in the "user" part of the kubeconfig.

This **edited kubeconfig** is then passed to the source cluster and converted into a secret, bound to the Admiralty `Target` resource. It is presented to the k8s API on the target cluster, first as part of the TLS handshake and then to authenticate the serviceAccount that performs the pod delegation.

### Source/Target

Each cluster in the Admiralty federation needs to declare **all of the other clusters**:

- that it will delegate pods to, with the `Target` resource

```yaml
apiVersion: multicluster.admiralty.io/v1alpha1
kind: Target
metadata:
  name: some-name
  namespace: your-namespace
spec:
  kubeconfigSecret:
    name: secret-holding-kubeconfig-info
```

- that it will accept pods from, with the `Source` resource

```yaml
apiVersion: multicluster.admiralty.io/v1alpha1
kind: Source
metadata:
  name: some-name
  namespace: your-namespace
spec:
  serviceAccountName: service-account-used-by-source
```

## Caveats

### Token

By default, a token created by the Kubernetes API is only valid for **1 hour**, which can pose a problem for:

- workflows taking more than 1 hour to execute, with pods requesting creation on a remote cluster once the token has expired

- retransferring the modified `kubeconfig`: we need a way that allows a secure communication of the data between two clusters running Open Cloud.

It is possible to create tokens with an **infinite duration** (in reality 10 years) but the Admiralty documentation **advises against** this for security reasons.

### Resources' names

When coupling Argo Workflows with a MinIO server to store the artifacts produced by a pod, we need to access, for example but not only, a secret containing the authentication data. If we launch a workflow on clusters A and B, the secret resource containing the auth. data can't be the same in clusters A and B.

At the moment the only case where we have faced this issue is the MinIO S3 storage access. Since it is a service that we could deploy ourselves, we would have the possibility to use a naming scheme containing a UUID linked to the OC instance.

## Possible improvements

- Pod-bound tokens: can they be issued to the remote cluster via an HTTP API call? [doc](https://kubernetes.io/docs/reference/kubernetes-api/authentication-resources/token-request-v1/) (a client-go sketch follows after this list)

- Using a service that contacts its counterpart in the target cluster, to ask for a token with a validity set by the user in the workflow workspace. Communication over HTTPS, but how do we generate secure certificates on both ends?
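As a sketch of the first idea, here is how a bound token could be requested with client-go's TokenRequest API. The kubeconfig path, namespace, serviceAccount name and expiration are placeholders; wiring this into an HTTP call between OC instances is not shown.

```go
package main

import (
	"context"
	"fmt"

	authenticationv1 "k8s.io/api/authentication/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load the (edited) kubeconfig pointing at the target cluster.
	config, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Request a token for the shared serviceAccount, with a caller-chosen
	// validity instead of the default 1 hour.
	expiration := int64(4 * 3600) // seconds
	req := &authenticationv1.TokenRequest{
		Spec: authenticationv1.TokenRequestSpec{
			ExpirationSeconds: &expiration,
		},
	}
	tok, err := clientset.CoreV1().ServiceAccounts("your-namespace").
		CreateToken(context.TODO(), "service-account-used-by-source", req, metav1.CreateOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Println(tok.Status.Token)
}
```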
87
docs/admiralty/deployment.md
Normal file
@@ -0,0 +1,87 @@
# Deploying Admiralty on an Open Cloud cluster

We have written two playbooks, available on a private [GitHub repo](https://github.com/pi-B/ansible-oc/tree/384a5acc0713a0fa013a82f71fbe2338bf6c80c1/Admiralty)

- `deploy_admiralty.yml` installs Helm and the necessary charts in order to run Admiralty on the cluster

# Ansible playbook

`ansible-playbook deploy_admiralty.yml -i <REMOTE_HOST_IP>, --extra-vars "user_prompt=<YOUR_USER>" --ask-pass`

```yaml
- name: Install Helm
  hosts: all:!localhost
  user: "{{ user_prompt }}"
  become: true
  # become_method: su
  vars:
    arch_mapping: # Map ansible architecture {{ ansible_architecture }} names to Docker's architecture names
      x86_64: amd64
      aarch64: arm64

  tasks:
    - name: Check if Helm does exist
      ansible.builtin.command:
        cmd: which helm
      register: result_which
      failed_when: result_which.rc not in [ 0, 1 ]

    - name: Install helm
      when: result_which.rc == 1
      block:
        - name: download helm from source
          ansible.builtin.get_url:
            url: https://get.helm.sh/helm-v3.15.0-linux-amd64.tar.gz
            dest: ./

        - name: unpack helm
          ansible.builtin.unarchive:
            remote_src: true
            src: helm-v3.15.0-linux-amd64.tar.gz
            dest: ./

        - name: copy helm to path
          ansible.builtin.command:
            cmd: mv linux-amd64/helm /usr/local/bin/helm

- name: Install admiralty
  hosts: all:!localhost
  user: "{{ user_prompt }}"

  tasks:
    - name: Install required python libraries
      become: true
      # become_method: su
      package:
        name:
          - python3
          - python3-yaml
        state: present

    - name: Add jetstack repo
      ansible.builtin.shell:
        cmd: |
          helm repo add jetstack https://charts.jetstack.io && \
          helm repo update

    - name: Install cert-manager
      kubernetes.core.helm:
        chart_ref: jetstack/cert-manager
        release_name: cert-manager
        context: default
        namespace: cert-manager
        create_namespace: true
        wait: true
        set_values:
          - value: installCRDs=true

    - name: Install admiralty
      kubernetes.core.helm:
        name: admiralty
        chart_ref: oci://public.ecr.aws/admiralty/admiralty
        namespace: admiralty
        create_namespace: true
        chart_version: 0.16.0
        wait: true
```
69
docs/catalog_metadata.md
Normal file
@@ -0,0 +1,69 @@
# Metadata

``` json
{
  "id" : "string",
  "name" : "string",
  "short description" : "string",
  "description" : "string",
  "logo" : "string",
  "creator" : "peer_id",
  "owner(s)" : {
    "name" : "string",
    "logo" : "string"
  },

  "instances" :
  [
    {
      "location" : "geo coord",
      "country" : "string",
      "url" : "string",
      "allowed_groups" : "peers_group",
      <specific data> see below
    }
  ]
}
```

## Common
* locations: location list

## Data

* personal_data: bool
* anonymized_personal_data: bool
* type: string
* license
* open_data: bool
* quality: string

## ComputeUnit
* infrastructure : Kubernetes, Docker, HW, Slurm, Condor
* architecture : X86_64
* access_protocol : enum(KubeAPI(https) [over SSH], DirectSSH, Slurm [over SSH], Docker [over SSH], over Opencloud, over VPN) or string
* security_level: Secnumcloud, HDS, ... Gaia1/2/3...
* countries
* investors:
* power_source:
* usage_restrictions: string

## Processing

* url: string
* type: container, exe
* license: string
* open_source: bool
* maturity: (stable, alpha, ...)

## Storage

* type: File, S3, Database
* url: string
* access_protocol : enum(KubeAPI(https) [over SSH], DirectSSH, Slurm [over SSH], Docker [over SSH], over Opencloud, over VPN) or string
* security_level: Secnumcloud, HDS, ... Gaia1/2/3...
* country:
* investors:
* usage_restrictions: string
260
docs/catalog_metadata.puml
Normal file
@@ -0,0 +1,260 @@
@startuml

class Resource
{
  "id" : "string"
  "name" : "string"
  "short description" : "string"
  "description" : "string"
  "logo" : "string"
  "creator" : "peer_id"
  "usage_restrictions" : "string"
}

class StoragePartnerResource
{
  "namespace" : "string"
  "sizing_indicator" : "string"
  "peer_group"="string"
}

class ProcessingPartnerResource
{
  "namespace" : "string"
  "sizing_indicator" : "string"
  "peer_group"="string"
}

class ComputeUnitPartnerResource
{
  "namespace" : "string"
  "sizing_indicator" : "string"
  "peer_group"="string"
}

class DataPartnerResource
{
  "namespace" : "string"
  "sizing_indicator" : "string"
  "peer_group"="string"
}

class ResourceInstance
{
  "location" : "geo coord"
  "country" : "string"
  "url" : "string"
}

class StorageInstance
{
}

class DataInstance
{
}

class ComputeUnitInstance
{
  "cpus":
  "gpus":
  "ram":
  "security_level" : "string"
  "power_source" : "string"
}

class cpu
{
  "model" : "string"
  "cores" : "int"
  "frequency" : "float"
  "architecture" : "string"
}

class gpu
{
  "model" : "string"
  "memory" : "float"
}

class ram
{
  "size" : "int"
}

class bandwidth
{
  "up" : "float"
  "down" : "float"
}

class ProcessingInstance
{
}

class Owner
{
  "name" : "string"
  "logo" : "string"
}

class DataPricingStrategy {
  "unlimited"
  "subscription"
  "pay per use"
}

class DataPricing
{
  "price" : "float"
  "price_per_gb" : "float"
  "price_per_request" : "float"
  "price_per_api_call" : "float"
  "price_per_data_transfer" : "float"
  "price_per_data_download" : "float"
}

class ProcessingPricing
{
  "price" : "float"
  "price_per_request" : "float"
  "price_per_api_call" : "float"
  "price_per_data_transfer" : "float"
  "price_per_data_processing" : "float"
  "price_per_data_storage" : "float"
  "price_per_data_download" : "float"
}

class ComputeUnitPricingStrategy {
  "overflow" : "booked, allowed, guaranteed"
}

class ComputeUnitPricing
{
  "cpu_price" : "float"
  "gpu_price" : "float"
  "ram_price" : "float"
  "refund" : "bool"
}

class StoragePricing
{
  "price" : "float"
  "price_per_request" : "float"
  "price_per_api_call" : "float"
  "price_per_data_transfer" : "float"
  "price_per_data_processing" : "float"
  "price_per_data_storage" : "float"
  "price_per_data_download" : "float"
}

class Data
{
  "personal_data" : "bool"
  "anonymized_personal_data" : "bool"
  "type" : "string"
  "license" : "string"
  "open_data" : "bool"
  "quality" : "string"
  "static" : bool
  "update_period" : "string"
}

class ComputeUnit
{
  "type" : "string"
  "infrastructure" : "string"
  "architecture" : "string"
  "investors" : "string"
}

class Processing {
  "type" : "string"
  "license" : "string"
  "open_source" : "bool"
  "maturity" : "string"
  "service" : "bool"
}

class Container
{
  "image" : "string"
  "command" : "string"
  "args" : "string"
  "env" : "string"
  "volumes" : "string"
}

Processing "0" *-- "1" Container
Container "0" *-- "*" Expose

class Expose
{
  Port:int
  Reverse: "string"
  PAT:int
}

class ProcessingUsage
{
  "hypothesis" : "string"
  "cpu":
  "gpu":
  "ram":
  "storage": "float"
  "scalingmodel": "string"
}

class Storage {
  "type" : "string"
  "security_level" : "string"
  "investors" : "string"
  "support" : "string"
}

Resource -- Owner

Resource <|- Data
Resource <|- ComputeUnit
Resource <|- Processing
Resource <|- Storage
Resource <|- Workflow

StoragePartnerResource -- StoragePricingStrategy
ProcessingPartnerResource -- ProcessingPricingStrategy
ComputeUnitPartnerResource -- ComputeUnitPricingStrategy
DataPartnerResource -- DataPricingStrategy

StorageInstance "1" *-- "*" StoragePartnerResource
DataInstance "1" *-- "*" DataPartnerResource
ProcessingInstance "1" *-- "*" ProcessingPartnerResource
ComputeUnitInstance "1" *-- "*" ComputeUnitPartnerResource

DataPricingStrategy <|-- DataPricing
ProcessingPricingStrategy <|-- ProcessingPricing
ComputeUnitPricingStrategy <|-- ComputeUnitPricing
StoragePricingStrategy <|-- StoragePricing

ResourceInstance <|-- StorageInstance
ResourceInstance <|-- DataInstance
ResourceInstance <|-- ComputeUnitInstance
ResourceInstance <|-- ProcessingInstance

Storage "1" *-- "*" StorageInstance
Processing "1" *-- "*" ProcessingInstance
ComputeUnit "1" *-- "*" ComputeUnitInstance
Data "1" *-- "*" DataInstance

@enduml
13
docs/collaborative_area.md
Normal file
@@ -0,0 +1,13 @@
Rulebook integrated to OpenCloud
Dynamic rulebook update

Ekitia integration by default for possible oversight
OK from all members to start a workflow?
Possibility for any member to revoke a processing?

Block the workflow if the workspace criteria are modified
Attributes on the workflow?
65
docs/development.md
Normal file
@@ -0,0 +1,65 @@
# OpenCloud base stack

OpenCloud relies on a micro-services architecture.
Each component could be developed using specific technologies.
However, in order to preserve product consistency and ease maintenance activities, we strongly encourage using the following technological stacks.

## Web services

Web services are developed in Go using the Beego stack.

### Environment setup

When using private repositories like the OpenCloud git forge, you should define it as a private repository:

`export GOPRIVATE=cloud.o-forge.io`

The Beego stack provides the bee CLI tool to ease the building process:

`go get github.com/beego/bee/v2@latest`

### Project initialization

New component creation:

`go mod init oc-mycomponent`

Refer to other services' component main.go file to write a consistent initialisation process (a minimal sketch is given below).
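A minimal sketch of what such a main.go could look like with Beego v2; the controller and route are placeholders for illustration, not the actual structure of the existing services.

```go
package main

import (
	"github.com/beego/beego/v2/server/web"
)

// HealthController is a placeholder controller for illustration.
type HealthController struct {
	web.Controller
}

// Get answers GET /health with a static payload.
func (c *HealthController) Get() {
	c.Data["json"] = map[string]string{"status": "ok"}
	c.ServeJSON()
}

func main() {
	// Register routes, then start the Beego HTTP server.
	web.Router("/health", &HealthController{})
	web.Run()
}
```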
### Project build

In order to build the software:

`bee run -downdoc=true -gendoc=true`

The -downdoc=true -gendoc=true flags will automatically generate the swagger documentation under the /swagger path.

If the default Swagger page is displayed instead of your API, change the url in the swagger/index.html file to:

`url: "swagger.json"`

If annotations are modified without any code change, a rebuild might not reflect the changes.
To force the routing information update:

`bee generate routers`

## GUI components

The GUIs are developed using the Flutter framework.

### Environment setup

* Install the Flutter framework
* Install Android Studio
* In "Tools" -> "SDK Manager" -> "Appearance & Behaviour / System Settings / Android SDK", go to "SDK tools" and tick "Android SDK command line tools"
* Run the <code>flutter doctor</code> command and follow the instructions to accept the SDK licenses
* Add the VSCode Flutter plugin and use the VSCode command palette to create a Flutter project
* Also set the target device using the command palette

### Project build

Depending on your target platform:

`flutter build web`
`flutter build linux`
`flutter build windows`
0
docs/discovery.md
Normal file
37
docs/glossary.md
Normal file
@@ -0,0 +1,37 @@
# Glossary

## Resource

An OpenCloud resource is an item that is shareable by any OpenCloud partner.
It may be:
* A data item
* An algorithm
* A compute unit
* A storage facility
* A workflow referring to any of the previous items

## Catalog

The OpenCloud catalog contains a resource metadata list.

## Workspace

A workspace is a user-selected set of resources.

## Workflow

A workflow is the processing of multiple resources.

## Service

A service is a deployment of permanent resources.

## Collaborative area

A collaborative area is an environment for sharing workspaces / workflows / services between selected partners.

## Rule book

A list of rules that a shared workspace shall conform to.
69
docs/minio.md
Normal file
@@ -0,0 +1,69 @@
# Setting up

Minio can be deployed using the Argo Workflows [documentation](https://argo-workflows.readthedocs.io/en/latest/configure-artifact-repository/#configuring-minio) or the ansible playbook written by `pierre.bayle[at]irt-saintexupery.com` available [here](https://raw.githubusercontent.com/pi-B/ansible-oc/refs/heads/main/deploy_minio.yml?token=GHSAT0AAAAAAC5OBWUCGHWPA4OUAKHBKB4GZ4YTPGQ).

Launch the playbook with `ansible-playbook -i [your host ip or url], deploy_minio.yml --extra-vars "user_prompt=[your user]" [--ask-become-pass]`

- If your user doesn't have the `NOPASSWD` rights on the host, use `--ask-become-pass` to allow ansible to use `sudo`
- Fill in the values for `memory_req`, `storage_req` and `replicas` in the playbook's vars. The pods won't necessarily use them fully, but if the total memory or storage request of your pod pool exceeds your host's capabilities the deployment might fail.

## Flaws of the default install

- Requests 16Gi of memory per pod
- Requests 500Gi of storage
- Creates 16 replicas
- Doesn't expose the MinIO GUI outside the cluster

# Allow API access

Visit the MinIO GUI (on port 9001) and create the bucket(s) you will use (here `oc-bucket`) and access keys, encode them with base64 and create a secret in the argo namespace:

```
kubectl create secret -n [name of your argo namespace] generic argo-artifact-secret \
  --from-literal=access-key=[your access key] \
  --from-literal=secret-key=[your secret key]
```

- Create a ConfigMap, which will be used by argo to create the S3 artifact; the content can match the one from the previously created secret

```
apiVersion: v1
kind: ConfigMap
metadata:
  # If you want to use this config map by default, name it "artifact-repositories". Otherwise, you can provide a reference to a
  # different config map in `artifactRepositoryRef.configMap`.
  name: artifact-repositories
  # annotations:
  #   # v3.0 and after - if you want to use a specific key, put that key into this annotation.
  #   workflows.argoproj.io/default-artifact-repository: oc-s3-artifact-repository
data:
  oc-s3-artifact-repository: |
    s3:
      bucket: oc-bucket
      endpoint: [ retrieve the cluster IP with kubectl get service argo-artifacts -o jsonpath="{.spec.clusterIP}" ]:9000
      insecure: true
      accessKeySecret:
        name: argo-artifact-secret
        key: access-key
      secretKeySecret:
        name: argo-artifact-secret
        key: secret-key
```

# Store Argo Workflow objects in MinIO S3 bucket

Here is an example of how to store some file/dir from an argo pod to an existing s3 bucket:

```
outputs:
  parameters:
    - name: outfile [or OUTDIR]
      value: [NAME OF THE FILE OR DIR TO STORE]
  artifacts:
    - name: outputs
      path: [PATH OF THE FILE OR DIR IN THE CONTAINER]
      s3:
        key: [PATH OF THE FILE IN THE BUCKET].tgz
```
123
docs/opencloud_intro.md
Normal file
@@ -0,0 +1,123 @@
# Introduction

OpenCloud is an open-source, distributed cloud solution that enables you to selectively share, sell, or rent your infrastructure resources (such as data, algorithms, compute power, and storage) with other OpenCloud peers. It facilitates distributed workflow execution between partners, allowing seamless collaboration across decentralized networks.

Distributed execution within this peer-to-peer network can be optimized according to your own priorities:

* **Maximal sovereignty**
* **Accelerated computation**
* **Cost minimization**
* **Optimized infrastructure investments**

Each OpenCloud instance includes an OpenID-based distributed authentication system.
OpenCloud is entirely decentralized, with no central authority or single point of failure (SPOF). Additionally, OpenCloud provides transaction tracking, allowing all partners to be aware of their distributed resource consumption and ensuring transparent peer-to-peer billing.

---

## Features

Each OpenCloud instance runs a collection of services that allow users to interact with both their own deployment and other OpenCloud participants.

### Resource Catalog

The **Resource Catalog** service indexes all the resources provided by the current instance, including **Data**, **Algorithms**, **Compute Units**, **Storages**, and pre-built **Processing Workflows**.
All resources are described by metadata, as defined in the `catalog_metadata` document. Catalog resources can be either **public**, visible to all OpenCloud peers, or **private**, accessible only to selected partners or groups (e.g., projects, entities, etc.).
Access to specific resources may require credentials, payment, or other access agreements.

---

### Workspace Management

Each OpenCloud user can create **workspaces** to organize resources of interest.
Resources within a workspace can later be used to build processing workflows or set up new services.
Users can define as many workspaces as needed to manage their projects efficiently.

---

### Workflow Editor

Using elements selected in a workspace, a user can build a **distributed processing workflow** or establish a **permanent service**.
Workflows are constructed with OpenCloud's integrated workflow editor, offering a user-friendly interface for defining distributed processes.

---

### Collaborative Areas

OpenCloud enables the sharing of **workspaces** and **workflows** with selected partners, enhancing collaborative projects.
A **Collaborative Area** can include multiple management and operation rules that are enforced automatically or verified manually. Examples include:

* Enforcing the use of only open-source components
* Restricting the inclusion of personal data
* Defining result visibility constraints
* Imposing legal limitations

---

### Peer Management

OpenCloud allows you to define relationships with other peers, enabling the creation of private communities.
Access rights related to peers can be managed at a **global peer scope** or for **specific groups** within the peer community.

---

## Benefits

### Complete Control Over Data Location

OpenCloud encourages users to host their own data.
When external storage is necessary, OpenCloud enables users to carefully select partners and locations to ensure privacy, compliance, and performance.

---

### Cooperation Framework

OpenCloud provides a structured framework for sharing data, managing common workspaces, and defining usage regulations.
This framework covers both **technical** and **legal aspects** for distributed projects.

---

### Data Redundancy

Like traditional public cloud architectures, OpenCloud supports **data redundancy**, but with finer-grained control.
You can distribute your data across multiple OpenCloud instances, ensuring availability and resilience.

---

### Compatibility with Public Cloud Infrastructure

When your workloads require massive storage or computational capabilities beyond what your OpenCloud peers can provide, you can seamlessly deploy an OpenCloud instance on any public cloud provider.
This hybrid approach allows you to scale effortlessly for workloads that are not sensitive to international competition.

---

### Fine-Grained Access Control

OpenCloud provides **fine-grained access control**, enabling you to precisely define access policies for partners and communities.

---

### Lightweight for Datacenter and Edge Deployments

The OpenCloud stack is developed in **Go**, generating **native code** and minimal **scratch containers**. All selected COTS (Commercial Off-The-Shelf) components used by OpenCloud services are chosen with these design principles in mind.

The objective is to enable OpenCloud to run on almost any platform:

* In **datacenters**, supporting large-scale processing workflows
* On **ARM-based single-board computers**, handling concurrent payloads for diverse applications like **sensor preprocessing**, **image recognition**, or **data filtering**

GUIs are built with **Flutter** and rendered as plain **HTML/JS** for lightweight deployment.

---

### Fully Distributed Architecture

OpenCloud is fully decentralized, eliminating any **single point of failure**.
There is no central administrator, and no central registration is required. This makes OpenCloud highly **resilient**, allowing partners to join or leave the network without impacting the broader OpenCloud community.

---

### Open Source and Transparent

To foster trust, OpenCloud is released as **open-source software**.
Its code is publicly available for audit. The project is licensed under **AGPL V3** to prevent the emergence of closed, private forks that could compromise the OpenCloud community's transparency and trust.
104
docs/openid/glossary.md
Normal file
@@ -0,0 +1,104 @@
# Glossary

# Oauth

## Resource owner
The user that will allow the app to read the resources that he/she grants access to
ex: the person that has a mail account

## Client
The application that is requesting the resources to use them on behalf of the user
ex: a mass mailing list service to all your contacts

## Authorization server

The application that knows the resource owner, because the resource owner has an account there
ex: the mail server authentication service

## Resource server

The API that the client will use on behalf of the user
ex: the contact list API

## Redirect uri
Url that will be used by the authorization server to send the resource owner back to the client app after consenting to resource access
ex: mass mailing list "contact retrieve success/failure" page

## Response type
Response type expected by the client, usually "code" for an authorization code

## Scope
Granular permission that the client wants
ex: read contacts, read profile

## Consent
The authorization server takes the scopes the client requests and lets the resource owner choose whether to accept them or not
ex: access to your contacts?

## Client Id
To identify the client with the authorization server

## Client secret
Shared between the authorization server and the client

## Authorization code
Temporary code sent by the authorization server to the client
The client then privately sends the authorization code along with the client secret to the authorization server, in exchange for an access token

## Access token
Key the client will use to communicate with the resource server

## Refresh token
Token to get a new access token

# OIDC

## Oauth vs Oidc
Oauth provides only a token for application access, without any info on the user. OpenId adds information on the user.
* Oauth enables an app to access resources
* Oidc enables an app to establish a login session and to access info about the user

## End user
Oauth Resource Owner

## Relying party
Oauth client

## Identity provider
OIDC-enabled Oauth authorization server

## IdToken
JWT token added to the access token by OIDC, with your identity info.

## Claims
Attributes of the Id Token
* Subject : uid for the user
* Issuing Authority : url of the identity provider
* Audience : identifies the relying party that can use this token
* Issue Date
* Expiration Date
* [Authentication Time]
* [Nonce] : prevents replay attacks
* [Name]
* [Email]

## Scopes
openid is a mandatory scope
There are 4 openid predefined scopes:
* profile : access to the default profile claims
* email
* address
* phone

## Identity provider Endpoints
Several predefined endpoints exist on the Identity provider
* Authorization endpoint
* Token endpoint
* UserInfo endpoint

## Recommended authorization flows
* Authorization code
* Authorization code with PKCE (Proof Key for Code Exchange) : for devices

## PKCE
19
docs/openid/oauth-app-requests-contacts-example.puml
Normal file
@@ -0,0 +1,19 @@
@startuml

"User(resource owner)"->"RequestingApp(client)": Select mail provider
"RequestingApp(client)"->"User(resource owner)": Redirect to mail provider with clientid,redirect_uri,response_type,scope
"User(resource owner)"->"MailProvider(authorization provider)": clientid,redirect_uri,response_type,scope
"MailProvider(authorization provider)"->"MailProvider(authorization provider)": Active session ?
"MailProvider(authorization provider)"-->"User(resource owner)" : Login if no active session
"User(resource owner)"-->"MailProvider(authorization provider)" : Logs in
"MailProvider(authorization provider)"->"User(resource owner)": Asks for consent for each scope
"User(resource owner)"->"MailProvider(authorization provider)" : Grant or deny permission for each scope
"MailProvider(authorization provider)"->"User(resource owner)": Redirect to redirect_uri with authorization code
"User(resource owner)"->"RequestingApp(client)": Redirect to redirect_uri with authorization code
"RequestingApp(client)"->"MailProvider(authorization provider)": Send authorization code, clientid, client_secret
"MailProvider(authorization provider)"->"RequestingApp(client)": Send access token
"RequestingApp(client)"->"MailProvider(resource server)": asks for contacts with access token
"MailProvider(resource server)"->"RequestingApp(client)": Return contacts
"RequestingApp(client)"->"User(resource owner)": Display contacts

@enduml
@@ -0,0 +1,19 @@
@startuml

"User(resource owner)"->"RequestingApp(client)": Select mail provider
"RequestingApp(client)"->"User(resource owner)": Redirect to mail provider with clientid,redirect_uri,response_type,scope<font color=red>+"openid"
"User(resource owner)"->"MailProvider(authorization provider)": clientid,redirect_uri,response_type,scope
"MailProvider(authorization provider)"->"MailProvider(authorization provider)": Active session ?
"MailProvider(authorization provider)"-->"User(resource owner)" : Login if no active session
"User(resource owner)"-->"MailProvider(authorization provider)" : Logs in
"MailProvider(authorization provider)"->"User(resource owner)": Asks for consent for each scope
"User(resource owner)"->"MailProvider(authorization provider)" : Grant or deny permission for each scope
"MailProvider(authorization provider)"->"User(resource owner)": Redirect to redirect_uri with authorization code
"User(resource owner)"->"RequestingApp(client)": Redirect to redirect_uri with authorization code
"RequestingApp(client)"->"MailProvider(authorization provider)": Send authorization code, clientid, client_secret
"MailProvider(authorization provider)"->"RequestingApp(client)": Send access token<font color=red>+"idtoken"
"RequestingApp(client)"->"MailProvider(resource server)": asks for contacts with access token
"MailProvider(resource server)"->"RequestingApp(client)": Return contacts
"RequestingApp(client)"->"User(resource owner)": Display contacts

@enduml
25
docs/openid/oidc_authcode-app-requests-contacts-example.puml
Normal file
@@ -0,0 +1,25 @@
@startuml
title "OpenID Connect Authorization Code Flow"
actor "End User"
boundary "Browser"
"Relying party"->"Browser": Identity providers list
"End User"->"Browser": Select identity provider
"Browser"->"Relying party": Identity provider clicked
"Relying party"->"Browser": Redirect to identity provider with clientid, state,redirect_uri,response_type,scope<font color=red>+"openid"
"Browser"->"Authorization endpoint": clientid,state,redirect_uri,response_type,scope
"Authorization endpoint"->"Authorization endpoint": Active session ?
"Authorization endpoint"-->"Browser" : Login if no active session
"End User"-->"Browser" : Fills credentials
"Browser"-->"Authorization endpoint" : Logs in
"Authorization endpoint"->"Browser": Form for consent for each scope
"End User"->"Browser": Grant or deny permission for each scope
"Browser"->"Authorization endpoint" : Selected scopes
"Authorization endpoint"->"Browser": Redirect to redirect_uri with authorization code+state provided earlier
"Browser"->"Relying party": Redirect to redirect_uri with authorization code
"Relying party"->"Token endpoint": Send authorization code, clientid, client_secret, redirect uri (for validation)
"Token endpoint"->"Relying party": Send access token<font color=red>+"idtoken"
"Relying party"->"UserInfo endpoint": Asks for profile with access token
"UserInfo endpoint"->"Relying party": Return profile
"Relying party"->"Browser": Display profile

@enduml
25
docs/openid/oidc_pkce-app-requests-contacts-example.puml
Normal file
@@ -0,0 +1,25 @@
@startuml
title "OpenID Connect Authorization Code Flow with PKCE"
actor "End User"
boundary "App"
"App"->"App": Identity providers list
"End User"->"App": Select identity provider
"App"->"App": Identity provider clicked
"App"->"App": Generate code verifier and challenge
"App"->"Authorization endpoint": clientid,state,redirect_uri,response_type,scope
"Authorization endpoint"->"Authorization endpoint": Active session ?
"Authorization endpoint"-->"App" : Login if no active session
"End User"-->"App" : Fills credentials
"App"-->"Authorization endpoint" : Logs in
"Authorization endpoint"->"App": Form for consent for each scope
"End User"->"App": Grant or deny permission for each scope
"App"->"Authorization endpoint" : Selected scopes
"Authorization endpoint"->"App": Redirect to redirect_uri with authorization code+state provided earlier
"App"->"App": Redirect to redirect_uri with authorization code
"App"->"Token endpoint": Send authorization code, clientid, --client_secret--,<font color=blue>+"code verifier"</font> , redirect uri (for validation)
"Token endpoint"->"App": Send access token<font color=red>+"idtoken"
"App"->"UserInfo endpoint": Asks for profile with access token
"UserInfo endpoint"->"App": Return profile
"App"->"App": Display profile

@enduml
46
docs/openid/opencloud_openid.puml
Normal file
@@ -0,0 +1,46 @@
@startuml

Actor User
Node "OpenCloud 1" as OC1 {
  Agent Traefik as tfk1
  Agent Catalog as cat1
  Agent Scheduler as shed1
  Collections "OC Services" as svcs1
  Component "Auth Service" as auth1
  Component OIDC as OIDC1
  Component "Keto?" as keto1
  Component "LDAP" as ldap1
}
User -> tfk1:sessionId
tfk1 ---> cat1:IdToken+AccessToken
tfk1 ---> shed1:IdToken+AccessToken
tfk1 ---> svcs1:IdToken+AccessToken
tfk1 ---> auth1
auth1 -down-> OIDC1
auth1 -down-> keto1
OIDC1 -down-> ldap1

Node "OpenCloud 2" as OC2 {
  Agent Traefik as tfk2
  Agent Catalog as cat2
  Agent Scheduler as shed2
  Collections "OC Services" as svcs2
  Component "Auth Service" as auth2
  Component OIDC as OIDC2
  Component "Keto?" as keto2
  Component "LDAP" as ldap2
}
cat1 --> tfk2:IdToken+AccessToken
tfk2 ---> cat2:IdToken+AccessToken
tfk2 ---> shed2:IdToken+AccessToken
tfk2 ---> svcs2:IdToken+AccessToken
tfk2 -down-> auth2
auth2 -down-> OIDC2
auth2 -down-> keto2
OIDC2 -down-> ldap2

auth2 -> auth1: validate id & access user groups
auth2 -> tfk2: moderated scopes

@enduml
BIN
docs/performance_test/100_monitors.png
Normal file
After Width: | Height: | Size: 34 KiB |
BIN
docs/performance_test/10_monitors.png
Normal file
After Width: | Height: | Size: 31 KiB |
BIN
docs/performance_test/150_monitors.png
Normal file
After Width: | Height: | Size: 30 KiB |
151
docs/performance_test/README.md
Normal file
@@ -0,0 +1,151 @@
|
||||
# Goals

This originated from a request to know how much RAM is consumed by Open Cloud when running a large number of workflows at the same time on the same node.

We differentiate between two components :

- The "oc-stack", which is the minimum set of services needed to create and schedule a workflow execution : oc-auth, oc-datacenter, oc-scheduler, oc-front, oc-schedulerd, oc-workflow, oc-catalog, oc-peer, oc-workspace, loki, mongo, traefik and nats

- oc-monitord, which is the daemon instantiated by the scheduling daemon (oc-schedulerd); it creates the YAML for Argo and the necessary Kubernetes resources.

We monitor both parts to see how much RAM the oc-stack uses before / during / after the executions, the RAM consumed by the monitord containers, and the total for the stack and monitors combined.

# Setup

To get optimal performance we used a Proxmox server with large resources (>370 GiB RAM and 128 cores) to host the two VMs composing our Kubernetes cluster : one control plane node where the oc stack runs and one worker node running only k3s.

## VMs

We instantiated a 2-node Kubernetes cluster (with k3s) on the superg PVE (https://superg-pve.irtse-pf.ext:8006/)

### VM Control

This VM runs the oc stack and the monitord containers; it carries the biggest part of the load. It must have k3s and Argo installed. We allocated **62 GiB of RAM** and **31 cores**.

### VM Worker

This VM holds the workload for all the pods created, acting as a worker node for the k3s cluster. We deploy k3s as an agent as explained in the K3s quick start guide :

`curl -sfL https://get.k3s.io | K3S_URL=https://myserver:6443 K3S_TOKEN=mynodetoken sh -`

The value to use for K3S_TOKEN is stored at `/var/lib/rancher/k3s/server/node-token` on the server node.

Verify that the worker has been added to the cluster by running `kubectl get nodes` on the control plane and looking for the hostname of the worker VM in the list of nodes.
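
Putting the join steps together (hostnames and token are placeholders) :

```bash
# On the control plane node : read the join token
sudo cat /var/lib/rancher/k3s/server/node-token

# On the worker VM : join the cluster as an agent
curl -sfL https://get.k3s.io | K3S_URL=https://<control-plane-host>:6443 K3S_TOKEN=<node-token> sh -

# Back on the control plane : the worker hostname should now appear
kubectl get nodes
```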

### Delegate pods to the worker node

For the pods to be executed on another node we need to modify how we construct the Argo YAML and add a `nodeSelector` to the workflow spec. We have added the needed attributes to the `Spec` struct in `oc-monitord` on the `test-ram` branch.

```go
type Spec struct {
	ServiceAccountName string                `yaml:"serviceAccountName"`
	Entrypoint         string                `yaml:"entrypoint"`
	Arguments          []Parameter           `yaml:"arguments,omitempty"`
	Volumes            []VolumeClaimTemplate `yaml:"volumeClaimTemplates,omitempty"`
	Templates          []Template            `yaml:"templates"`
	Timeout            int                   `yaml:"activeDeadlineSeconds,omitempty"`
	NodeSelector       struct {
		NodeRole string `yaml:"node-role"`
	} `yaml:"nodeSelector"`
}
```

and set the selector in the `CreateDAG()` method :

```go
b.Workflow.Spec.NodeSelector.NodeRole = "worker"
```
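
The generated manifest then carries the selector at the spec level, where Argo applies it to every pod of the workflow. Below is a minimal sketch of the relevant fields (the metadata name is illustrative); it assumes the worker node has been labeled beforehand, e.g. with `kubectl label node <worker-hostname> node-role=worker` :

```yml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: oc-monitord-test-   # illustrative name
spec:
  entrypoint: dag
  nodeSelector:
    node-role: worker   # matches the label set on the worker node
```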

## Container monitoring

We use Docker Compose to instantiate the monitoring stack :

- Prometheus : stores the data
- cAdvisor : monitors the containers

```yml
version: '3.2'
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    ports:
      - 9090:9090
    command:
      - --config.file=/etc/prometheus/prometheus.yml
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
    depends_on:
      - cadvisor

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    ports:
      - 9999:8080
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
```

Prometheus scraping configuration :

```yml
scrape_configs:
  - job_name: cadvisor
    scrape_interval: 5s
    static_configs:
      - targets:
          - cadvisor:8080
```
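
Note that this compose file only runs Prometheus and cAdvisor; the Grafana instance used for the dashboards below has to be started separately. A minimal service entry, assuming the stock `grafana/grafana` image and its default port, could look like :

```yml
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - 3000:3000
    depends_on:
      - prometheus
```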

## Dashboards

To monitor the resource consumption during our tests we create dashboards in Grafana.

We create 4 different queries using Prometheus as the data source. For each query we can use the `code` mode to write it directly in PromQL.

### OC stack consumption

```
sum(container_memory_usage_bytes{name=~"oc-auth|oc-datacenter|oc-scheduler|oc-front|oc-schedulerd|oc-workflow|oc-catalog|oc-peer|oc-workspace|loki|mongo|traefik|nats"})
```

### Monitord consumption

```
sum(container_memory_usage_bytes{image="oc-monitord"})
```

### Total RAM consumption

```
sum(
  container_memory_usage_bytes{name=~"oc-auth|oc-datacenter|oc-scheduler|oc-front|oc-schedulerd|oc-workflow|oc-catalog|oc-peer|oc-workspace|loki|mongo|traefik|nats"}
  or
  container_memory_usage_bytes{image="oc-monitord"}
)
```

### Number of monitord containers

```
count(container_memory_usage_bytes{image="oc-monitord"} > 0)
```

# Launch executions

We use a script to insert into the DB the executions that will spawn the monitord containers.

We need two pieces of information to run the scripted insertion :

- The **workflow id** of the workflow we want to instantiate, which can be found in the DB
- A **token** to authenticate against the API : connect to oc-front and retrieve the token with your browser's network analyzer.

Add these to the `insert_exec.sh` script.

The script takes two arguments (a usage example follows the list) :
- **$1** : the number of executions; they are created in chunks of 10, using a CRON expression that spawns 10 executions for each execution/namespace

- **$2** : the number of minutes between now and the execution time for the executions.
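
For instance, a hypothetical run scheduling 50 executions (5 chunks of 10) starting 5 minutes from now :

```bash
./insert_exec.sh 50 5
```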
72
docs/performance_test/insert_exec.sh
Executable file
@@ -0,0 +1,72 @@
#!/bin/bash

TOKEN=""
WORKFLOW=""

NB_EXEC=$1
TIME=$2

if [ -z "$NB_EXEC" ]; then
    NB_EXEC=1
fi

# if (( NB_EXEC % 10 != 0 )); then
#     echo "Please use a round number"
#     exit 0
# fi

if [ -z "$TIME" ]; then
    TIME=1
fi

# Each request schedules a chunk of 10 executions via the second range
# of the CRON expression, so round up to the number of requests needed
EXECS=$(((NB_EXEC+9) / 10))
echo EXECS=$EXECS

DAY=$(date +%d -u)
MONTH=$(date +%m -u)
HOUR=$(date +%H -u)
MINUTE=$(date -d "$TIME min" +"%M" -u)
SECOND=$(date +%s -u)

start_loop=$(date +%s)

for ((i = 1; i <= $EXECS; i++)); do
    (
        start_req=$(date +%s)

        echo "Exec $i"
        CRON="0-10 $MINUTE $HOUR $DAY $MONTH *"
        echo "$CRON"

        START="2025-$MONTH-$DAY"T"$HOUR:$MINUTE:00.012Z"

        # 10# forces base-10 so months with a leading zero are not parsed as octal
        END_MONTH=$(printf "%02d" $((10#$MONTH + 1)))
        END="2025-$END_MONTH-$DAY"T"$HOUR:$MINUTE:00.012Z"

        # PAYLOAD=$(printf '{"id":null,"name":null,"cron":"","mode":1,"start":"%s","end":"%s"}' "$START" "$END")
        PAYLOAD=$(printf '{"id":null,"name":null,"cron":"%s","mode":1,"start":"%s","end":"%s"}' "$CRON" "$START" "$END")

        # echo $PAYLOAD

        curl -X 'POST' "http://localhost:8000/scheduler/$WORKFLOW" \
            -H 'accept: application/json' \
            -H 'Content-Type: application/json' \
            -d "$PAYLOAD" \
            -H "Authorization: Bearer $TOKEN" -w '\n'

        end=$(date +%s)
        duration=$((end - start_req))

        echo "Start $start_req"
        echo "End $end"
        echo "Execution time $i : $duration seconds"
    ) &

done

wait

end_loop=$(date +%s)
total_time=$((end_loop - start_loop))
echo "Total execution time : $total_time seconds"
43
docs/performance_test/performance_report.md
Normal file
@@ -0,0 +1,43 @@
We used a very simple mono-node workflow which executes a simple sleep command within an alpine container

![](./wf_test_ram_1node.png)

# 10 monitors

![](./10_monitors.png)

# 100 monitors

![](./100_monitors.png)

# 150 monitors

![](./150_monitors.png)

# Observations

We see an increase in the memory usage of the OC stack, which initially sits around 600/700 MiB :

```
CONTAINER ID   NAME            CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O         PIDS
7ce889dd97cc   oc-auth         0.00%     21.82MiB / 11.41GiB   0.19%     125MB / 61.9MB    23.3MB / 5.18MB   9
93be30148a12   oc-catalog      0.14%     17.52MiB / 11.41GiB   0.15%     300MB / 110MB     35.1MB / 242kB    9
611de96ee37e   oc-datacenter   0.32%     21.85MiB / 11.41GiB   0.19%     38.7MB / 18.8MB   14.8MB / 0B       9
dafb3027cfc6   oc-front        0.00%     5.887MiB / 11.41GiB   0.05%     162kB / 3.48MB    1.65MB / 12.3kB   7
d7601fd64205   oc-peer         0.23%     16.46MiB / 11.41GiB   0.14%     201MB / 74.2MB    27.6MB / 606kB    9
a78eb053f0c8   oc-scheduler    0.00%     17.24MiB / 11.41GiB   0.15%     125MB / 61.1MB    17.3MB / 1.13MB   10
bfbc3c7c2c14   oc-schedulerd   0.07%     15.05MiB / 11.41GiB   0.13%     303MB / 293MB     7.58MB / 176kB    9
304bb6a65897   oc-workflow     0.44%     107.6MiB / 11.41GiB   0.92%     2.54GB / 2.65GB   50.9MB / 11.2MB   10
62e243c1c28f   oc-workspace    0.13%     17.1MiB / 11.41GiB    0.15%     193MB / 95.6MB    34.4MB / 2.14MB   10
3c9311c8b963   loki            1.57%     147.4MiB / 11.41GiB   1.26%     37.4MB / 16.4MB   148MB / 459MB     13
01284abc3c8e   mongo           1.48%     86.78MiB / 11.41GiB   0.74%     564MB / 1.48GB    35.6MB / 5.35GB   94
14fc9ac33688   traefik         2.61%     49.53MiB / 11.41GiB   0.42%     72.1MB / 72.1MB   127MB / 2.2MB     13
4f1b7890c622   nats            0.70%     78.14MiB / 11.41GiB   0.67%     2.64GB / 2.36GB   17.3MB / 2.2MB    14

Total 631.2 MiB
```

However, over time, with the repetition of a large number of scheduling runs, the stack uses a larger amount of RAM.

In particular, **loki**, **nats**, **mongo**, **oc-datacenter** and **oc-workflow** seem to grow over 150 MiB. This can be explained by the caches growing in these containers, which seem to shrink every time the containers are restarted.
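
The snapshot above comes from `docker stats`; a minimal sampling loop (the interval is arbitrary) can track this growth over time :

```bash
# Print one memory snapshot per minute for all running containers
while true; do
    docker stats --no-stream --format 'table {{.Name}}\t{{.MemUsage}}'
    sleep 60
done
```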
BIN
docs/performance_test/wf_test_ram_1node.png
Normal file
After Width: | Height: | Size: 16 KiB |
39
mft.svg
Normal file
@@ -0,0 +1,39 @@
<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"
width="780mm" height="390mm" viewBox="0 0 780 390" displayInline="False">
<defs>
</defs>
<circle cx="48.0" cy="48.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="48.0" cy="144.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="48.0" cy="240.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="48.0" cy="336.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="144.0" cy="48.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="144.0" cy="144.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="144.0" cy="240.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="144.0" cy="336.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="240.0" cy="48.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="240.0" cy="144.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="240.0" cy="240.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="240.0" cy="336.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="336.0" cy="48.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="336.0" cy="144.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="336.0" cy="240.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="336.0" cy="336.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="432.0" cy="48.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="432.0" cy="144.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="432.0" cy="240.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="432.0" cy="336.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="528.0" cy="48.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="528.0" cy="144.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="528.0" cy="240.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="528.0" cy="336.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="624.0" cy="48.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="624.0" cy="144.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="624.0" cy="240.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="624.0" cy="336.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="720.0" cy="48.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="720.0" cy="144.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="720.0" cy="240.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<circle cx="720.0" cy="336.0" r="15" fill="none" stroke="red" stroke-width="0.0275" />
<rect x="0" y="0" width="780" height="390" fill="none" stroke="red" stroke-width="0.0275" />
</svg>
After Width: | Height: | Size: 3.0 KiB |
BIN
performance_test
Normal file
After Width: | Height: | Size: 16 KiB |
166
stack.sh
Executable file
@@ -0,0 +1,166 @@
#!/bin/bash

# This script will help you run the different services needed
# in the open cloud stack :
# - checks if directories containing the core services are present
# - allows you to build or not each service image
# - launches the containerized services

# TODO :
# - Provide a list of all directories which contain a Dockerfile and choose which to build
# - Parallelize building
# - Flag building errors without stopping the others and flooding the stdout
# - Provide a list of directories with a docker-compose and choose which to launch
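# Usage sketch (hypothetical layout, assuming all oc-* repositories are
# checked out side by side under a single root) :
#   export OC_ROOT="$HOME/opencloud"   # optional, the script prompts for a path otherwise
#   ./stack.sh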
# Define the required directories
REQUIRED_DIRECTORIES=("oc-auth" "oc-catalog" "oc-schedulerd")
oc_root=""

check_directory() {
    for dir in "${REQUIRED_DIRECTORIES[@]}"; do
        if [ ! -d "$dir" ]; then
            return 1 # Return failure if any required directory is missing
        fi
    done
    return 0
}

check_path() {
    local path="$1"

    if [ -e "$PWD/$path" ]; then
        cd "$PWD/$path"
        echo "$PWD"
    elif [ -e "$path" ]; then
        echo "$path"
    else
        echo "$path does not exist"
        return 1 # Return a non-zero exit status to indicate failure
    fi
}

create_oc_root_env() {
    if [ -z "${OC_ROOT}" ]; then
        echo "export OC_ROOT='$1'" >> ~/.bashrc
        echo "OC_ROOT has been added to your ~/.bashrc file for next executions."
        echo "Please run 'source ~/.bashrc' or restart your terminal to apply the changes."
    fi
}

# Main script
echo "Verifying the script is being run in the correct directory..."
sleep 1.1
echo

if [ ! -z "${OC_ROOT}" ]; then
    echo "The OC_ROOT env variable exists : $OC_ROOT"
    cd "$OC_ROOT"
    oc_root=${OC_ROOT}
fi

if ! check_directory; then
    echo "The current directory ($(pwd)) does not contain all required directories:"
    for dir in "${REQUIRED_DIRECTORIES[@]}"; do
        echo " - $dir"
    done
    echo

    echo "Please ensure the script is run from the correct root directory."
    read -p "Would you like to specify the path to the correct directory? (y/n): " choice

    if [[ "$choice" =~ ^[Yy]$ ]]; then
        read -p "Enter the relative or absolute path to the correct directory: " target_path
        target_path=$(eval echo "$target_path")
        target_path=$(check_path "$target_path")
        echo
        echo "updated path : $target_path"
        echo

        cd "$target_path" || { echo "Failed to change directory. Exiting."; exit 1; }
        if check_directory; then
            oc_root="$(pwd)"
            echo "Directory verified successfully. All required directories are present."
            echo
            create_oc_root_env "$oc_root"
            sleep 2.5
        else
            echo "The specified directory does not contain all required directories. Exiting."
            exit 1
        fi

    else
        echo "Please rerun the script from the correct directory. Exiting."
        exit 1
    fi
else
    echo "Directory verification passed. All required directories are present."
    create_oc_root_env "$(pwd)"
fi

arr=("oc-catalog" "oc-datacenter" "oc-peer" "oc-scheduler" "oc-shared" "oc-workflow" "oc-workspace" "oc-auth")
oc_directories=($(find . -maxdepth 1 -type d -name "oc-*" -exec basename {} \;))

# Check for directories in 'arr' that are missing from 'oc_directories'
missing_directories=()
for dir in "${arr[@]}"; do
    if [[ ! " ${oc_directories[@]} " =~ " $dir " ]]; then
        missing_directories+=("$dir")
    fi
done

if [ ${#missing_directories[@]} -gt 0 ]; then
    echo "Warning: The following directories are missing and won't be built:"
    for missing in "${missing_directories[@]}"; do
        echo "- $missing"
    done

    read -p "Do you want to proceed with the deployment without these missing components? [y/n]: " choice
    if [[ ! "$choice" =~ ^[Yy]$ ]]; then
        echo "Exiting the Open Cloud deployment process"
        exit 0
    fi
fi

# Continue with the rest of your script
echo "Executing the main script..."

# Create the shared network and clear previous tooling containers
docker network create catalog || true

docker kill mongo || true
docker rm mongo || true

docker kill nats || true
docker rm nats || true

docker kill loki || true
docker rm loki || true

docker kill grafana || true
docker rm grafana || true

cd ./oc-auth/ldap-hydra && docker compose up -d
cd ../keto && docker compose up -d
cd ../../oc-catalog && docker compose -f docker-compose.base.yml up -d
cd ../oc-schedulerd && docker compose -f docker-compose.tools.yml up -d

for i in "${oc_directories[@]}"
do
    cd ../$i
    if [ -e "$PWD/Dockerfile" ]; then
        read -p "Do you want to build the image for $i [y/n] : " build
        if [[ "$build" =~ ^[Yy]$ ]]; then
            docker build . -t $i
        fi

        if [ -e "$PWD/docker-compose.yml" ]; then
            docker compose up -d
        fi
    fi

done

cd ../oc-schedulerd && go build . && ./oc-schedulerd
86
wbs/wbs.puml
@@ -1,46 +1,46 @@
@startmindmap
* OC for DTF
** colors
***[#yellow] iteration 1 in progress
***[#lightyellow] (OK) iteration 1 task finished
*** planned to be developed, might be (OK) if schedule allows it
***[#lightblue] not in DTF scope yet
***[#orange] iteration 2
***[#lightgreen] Thales proposed scopes
** OC-Catalog
***[#orange] authentication => RBAC
***[#orange] algo metadata (ingress, res min max)
*** (OK) new resource type : workflow
***[#lightyellow] (OK) split catalog - workspace - workflow
***[#lightblue] algo metadata input output description
***[#lightblue] algo input/output rules
*** admin interface for catalog admin, roles definition
***[#lightgreen] catalog indexing and search
** OC-Scheduler / OC-Monitor ?
***[#lightyellow] (OK) automatically starting workflows
*** (OK) monitoring workflows
***[#orange] workflow to service generation (deployment yaml)
*** workflow to other targets (slurm)
** OC-Search => Front
***[#lightblue] algo input/output description
***[#lightblue] algo input/output rules check
***[#lightyellow] (OK) refactor ui in flutter
*** (OK) New resource type : workflow
*** Algo metadata (ingress, res min max)
*** (OK) workflows monitoring
*** (OK) Schedule view
*** Datacenter view
**[#lightblue] OC-Identity : Distributed OpenID+ server
***[#yellow] Evaluate OpenId codebases
*** Implement OpenCloud extension
**[#lightgreen] OC-Deploy
***[#lightyellow] (OK) repo
***[#yellow] deploy OC services
***[#orange] deploy demo instance
*** manage local cluster
*** partner sandboxing
***[#lightblue] network sandboxing
***[#lightblue] network output checks
**[#lightgreen] OC-Datacenter
- OC for DTF
-- colors
---[#yellow] iteration 1 in progress
---[#lightyellow] (OK) iteration 1 task finished
--- planned to be developed, might be (OK) if schedule allows it
---[#lightblue] not in DTF scope yet
---[#orange] iteration 2
---[#lightgreen] Thales proposed scopes
-- OC-Catalog
---[#orange] (60%) authentication => RBAC
---[#orange] (50%) algo metadata (ingress, res min max)
--- (OK) new resource type : workflow
---[#lightyellow] (OK) split catalog - workspace - workflow
---[#lightblue] algo metadata input output description
---[#lightblue] algo input/output rules
--- admin interface for catalog admin, roles definition
---[#lightgreen] catalog indexing and search
-- OC-Scheduler / OC-Monitor ?
---[#lightyellow] (OK) automatically starting workflows
--- (OK) monitoring workflows
---[#orange] (60%) workflow to service generation (deployment yaml)
--- workflow to other targets (slurm)
++ OC-Search => Front
+++[#lightblue] algo input/output description
+++[#lightblue] algo input/output rules check
+++[#lightyellow] (OK) refactor ui in flutter
+++ (OK) New resource type : workflow
+++ Algo metadata (ingress, res min max)
+++ (OK) workflows monitoring
+++ (OK) Schedule view
+++ Datacenter view
++[#lightblue] OC-Identity : Distributed OpenID+ server
+++[#yellow] Evaluate OpenId codebases
+++ Implement OpenCloud extension
++[#lightgreen] OC-Deploy
+++[#lightyellow] (OK) repo
+++[#yellow] deploy OC services
+++[#orange] (docker 80% / native 40%) deploy demo instance
+++ manage local cluster
+++ partner sandboxing
+++[#lightblue] network sandboxing
+++[#lightblue] network output checks
++[#lightgreen] OC-Datacenter

@endmindmap