diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
new file mode 100644
index 0000000..5bd67ad
--- /dev/null
+++ b/ARCHITECTURE.md
@@ -0,0 +1,495 @@
# oc-discovery — Architecture and technical analysis

> **Reading convention**
> Items marked ✅ have been fixed in the code. Items marked ⚠️ remain open.

## Table of contents

1. [Overview](#1-overview)
2. [Role hierarchy](#2-role-hierarchy)
3. [Main mechanisms](#3-main-mechanisms)
   - 3.1 Long-lived heartbeat (node → indexer)
   - 3.2 Trust scoring
   - 3.3 Registration with natives (indexer → native)
   - 3.4 Indexer pool: fetch + consensus
   - 3.5 Self-delegation and offload loop
   - 3.6 Native mesh resilience
   - 3.7 Shared DHT
   - 3.8 PubSub gossip (indexer registry)
   - 3.9 Application streams (node ↔ node)
4. [Summary table](#4-summary-table)
5. [Global risks and limitations](#5-global-risks-and-limitations)
6. [Improvement ideas](#6-improvement-ideas)

---

## 1. Overview

`oc-discovery` is a P2P discovery service for the OpenCloud network. It is built on
**libp2p** (TCP transport + private-network PSK) and a **Kademlia DHT** (prefix `oc`)
used to index peers. The architecture is intentionally hierarchical: stable _natives_
act as authoritative hubs with which _indexers_ register, and ordinary _nodes_
discover indexers through those natives.

```
 ┌──────────────┐      heartbeat       ┌──────────────────┐
 │     Node     │ ───────────────────► │     Indexer      │
 │   (libp2p)   │ ◄─────────────────── │   (DHT server)   │
 └──────────────┘  application stream  └────────┬─────────┘
                                                │ subscribe / heartbeat
                                                ▼
                                       ┌──────────────────┐
                                       │  Native Indexer  │◄──► other natives
                                       │ (authoritative   │      (mesh)
                                       │  hub)            │
                                       └──────────────────┘
```

All participants share a **pre-shared key (PSK)** that isolates the network from
unauthorized external libp2p connections.

---

## 2. Role hierarchy

| Role | Binary | Responsibility |
|---|---|---|
| **Node** | `node_mode=node` | Gets indexed; publishes and looks up DHT records |
| **Indexer** | `node_mode=indexer` | Receives heartbeats, writes to the DHT, registers with natives |
| **Native Indexer** | `node_mode=native` | Hub: keeps the registry of live indexers, evaluates consensus, serves as fallback |

A single process may combine the node+indexer or indexer+native roles.

---

## 3. Main mechanisms

### 3.1 Long-lived heartbeat (node → indexer)

**How it works**

A **persistent** libp2p stream (`/opencloud/heartbeat/1.0`) is opened from the node
to each indexer in its pool (`StaticIndexers`). Every 20 seconds the node sends a
JSON `Heartbeat` on this stream. The indexer responds by recording the peer in
`StreamRecords[ProtocolHeartbeat]` with a 2-minute expiry.

If `sendHeartbeat` fails (stream reset, EOF, timeout), the peer is removed from
`StaticIndexers` and `replenishIndexersFromNative` is triggered.

**Advantages**
- Fast disconnection detection (error on the next encode).
- A single stream per peer reduces pressure on TCP connections.
- The nudge channel (`indexerHeartbeatNudge`) allows an immediate reconnect
  without waiting for the 20 s ticker.

**Limitations / risks**
- ⚠️ A single persistent stream: if the TCP layer stays open but "frozen" (middlebox,
  silent NAT), the error may not surface for several minutes.
- ⚠️ `StaticIndexers` is a shared global map: if two goroutines call
  `replenishIndexersFromNative` simultaneously (multiple-loss case), unprotected
  concurrent writes can occur outside the critical sections.
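The ticker-plus-nudge loop described above can be sketched with plain channels. This is a dependency-free sketch, not the real implementation: `send` stands in for the actual libp2p heartbeat write, and all names besides `Nudge`'s semantics are illustrative.

```go
package main

import (
	"fmt"
	"time"
)

// nudge mirrors indexerHeartbeatNudge: buffered(1) so senders never block,
// and at most one tick can be pending at any time.
var nudge = make(chan struct{}, 1)

// Nudge asks the heartbeat loop to fire now instead of waiting for the ticker.
func Nudge() {
	select {
	case nudge <- struct{}{}:
	default: // a nudge is already pending: coalesce
	}
}

// heartbeatLoop fires on every ticker tick OR nudge, whichever comes first.
// In the real code, a send failure evicts the indexer from StaticIndexers
// and triggers replenishIndexersFromNative.
func heartbeatLoop(interval time.Duration, send func() error, stop <-chan struct{}) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-t.C:
		case <-nudge:
		case <-stop:
			return
		}
		if err := send(); err != nil {
			return // evict + replenish in the real implementation
		}
	}
}

func main() {
	sent := 0
	stop := make(chan struct{})
	done := make(chan struct{})
	go func() {
		heartbeatLoop(time.Hour, func() error { sent++; return nil }, stop)
		close(done)
	}()
	Nudge() // fires immediately: no one-hour wait for the first tick
	time.Sleep(50 * time.Millisecond)
	close(stop)
	<-done
	fmt.Println("heartbeats sent:", sent) // prints "heartbeats sent: 1"
}
```

The buffered channel plus non-blocking send is what makes the nudge safe to call from any goroutine, including while the loop is mid-send.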
+ +--- + +### 3.2 Scoring de confiance + +**Fonctionnement** + +Avant d'enregistrer un heartbeat dans `StreamRecords`, l'indexeur vérifie un **score +minimum** calculé par `CheckHeartbeat` : + +``` +Score = (0.4 × uptime_ratio + 0.4 × bpms + 0.2 × diversity) × 100 +``` + +- `uptime_ratio` : durée de présence du peer / durée depuis le démarrage de l'indexeur. +- `bpms` : débit mesuré via un stream dédié (`/opencloud/probe/1.0`) normalisé par 50 Mbps. +- `diversity` : ratio d'IP /24 distincts parmi les indexeurs que le peer déclare. + +Deux seuils sont appliqués selon l'état du peer : +- **Premier heartbeat** (peer absent de `StreamRecords`, uptime = 0) : seuil à **40**. +- **Heartbeats suivants** (uptime accumulé) : seuil à **75**. + +**Avantages** +- Décourage les peers éphémères ou lents d'encombrer le registre. +- La diversité réseau réduit le risque de concentration sur un seul sous-réseau. +- Le stream de probe dédié évite de polluer le stream JSON heartbeat avec des données binaires. +- Le double seuil permet aux nouveaux peers d'être admis dès leur première connexion. + +**Limites / risques** +- ✅ **Deadlock logique de démarrage corrigé** : avec uptime = 0 le score maximal était 60, + en-dessous du seuil de 75. Les nouveaux peers étaient silencieusement rejetés à jamais. + → Seuil abaissé à **40** pour le premier heartbeat (`isFirstHeartbeat`), 75 ensuite. +- ⚠️ Les seuils (40 / 75) restent câblés en dur, sans possibilité de configuration. +- ⚠️ La mesure de bande passante envoie entre 512 et 2048 octets par heartbeat : à 20 s + d'intervalle et 500 nœuds max, cela représente ~50 KB/s de trafic probe en continu. +- ⚠️ `diversity` est calculé sur les adresses que le nœud *déclare* avoir — ce champ est + auto-rapporté et non vérifié, facilement falsifiable. 
+ +--- + +### 3.3 Enregistrement auprès des natifs (indexer → native) + +**Fonctionnement** + +Chaque indexeur (non-natif) envoie périodiquement (toutes les 60 s) une +`IndexerRegistration` JSON sur un stream one-shot (`/opencloud/native/subscribe/1.0`) +vers chaque natif configuré. Le natif : + +1. Stocke l'entrée en cache local avec un TTL de **90 s** (`IndexerTTL`). +2. Gossipe le `PeerID` sur le topic PubSub `oc-indexer-registry` aux autres natifs. +3. Persiste l'entrée en DHT de manière asynchrone (retry jusqu'à succès). + +**Avantages** +- Stream jetable : pas de ressource longue durée côté natif pour les enregistrements. +- Le cache local est immédiatement disponible pour `handleNativeGetIndexers` sans + attendre la DHT. +- La dissémination PubSub permet à d'autres natifs de connaître l'indexeur sans + qu'il ait besoin de s'y enregistrer directement. + +**Limites / risques** +- ✅ **TTL trop serré corrigé** : le TTL de 66 s n'était que 10 % au-dessus de l'intervalle + de 60 s — un léger retard réseau pouvait expirer un indexeur sain entre deux renewals. + → `IndexerTTL` porté à **90 s** (+50 %). +- ⚠️ Si le `PutValue` DHT échoue définitivement (réseau partitionné), le natif possède + l'entrée mais les autres natifs qui n'ont pas reçu le message PubSub ne la connaissent + jamais — incohérence silencieuse. +- ⚠️ `RegisterWithNative` ignore les adresses en `127.0.0.1`, mais ne gère pas + les adresses privées (RFC1918) qui seraient non routables depuis d'autres hôtes. + +--- + +### 3.4 Pool d'indexeurs : fetch + consensus + +**Fonctionnement** + +Lors de `ConnectToNatives` (démarrage ou replenish), le nœud/indexeur : + +1. **Fetch** : envoie `GetIndexersRequest` au premier natif répondant + (`/opencloud/native/indexers/1.0`), reçoit une liste de candidats. +2. **Consensus (round 1)** : interroge **tous** les natifs configurés en parallèle + (`/opencloud/native/consensus/1.0`, timeout 3 s, collecte sur 4 s). 
   An indexer is confirmed if **strictly more than 50 %** of the responding
   natives consider it alive.
3. **Consensus (round 2)**: if the pool is still insufficient, the natives'
   suggestions (indexers they know about that were not among the initial
   candidates) go through a second round.

**Advantages**
- The absolute-majority rule prevents a compromised or desynchronized native from
  injecting phantom indexers.
- The double round fills out the pool with alternatives known to the natives
  without sacrificing verification.
- If the fetch returns a **fallback** (a native acting as indexer), consensus is
  skipped — consistent, since there is only one source.

**Limitations / risks**
- ⚠️ With **a single configured native** (very common in dev/test), consensus is
  trivial (100 % of one vote) — the majority rule protects nothing in that case.
- ⚠️ `fetchIndexersFromNative` stops at the **first responding native**
  (sequentially): if that native has a stale or partial cache, the node gets a
  sub-optimal pool without ever consulting the others.
- ⚠️ The overall collection timeout (4 s) is fixed: on a slow or geographically
  distributed network, valid natives can be dropped for missing the deadline.
- ⚠️ `replaceStaticIndexers` **adds** entries but never removes expired indexers:
  the pool can accumulate dead entries that only the heartbeat later purges.

---

### 3.5 Self-delegation and offload loop

**How it works**

If a native has no live indexer when handling `handleNativeGetIndexers`, it
designates itself as a temporary indexer (`selfDelegate`): it returns its own
multiaddr and adds the requester to `responsiblePeers`, up to
`maxFallbackPeers` (50). Beyond that, delegation is refused and an empty
response is returned so the node tries another native.

Every 30 s, `runOffloadLoop` checks whether real indexers are available again.
If so, for each responsible peer:
- **Stream present**: `Reset()` the heartbeat stream — the peer gets an error,
  triggers `replenishIndexersFromNative`, and migrates to real indexers.
- **Stream absent** (peer never admitted by scoring): `ClosePeer()` on the
  network connection — the peer reconnects and asks a native for indexers again.

**Advantages**
- Service continuity: a node is never stuck during a temporary lack of indexers.
- Migration is automatic and transparent for the node.
- `Reset()` (vs `Close()`) aborts both directions of the stream, guaranteeing
  that the peer actually receives an error.
- The limit of 50 keeps the native from being overwhelmed during prolonged
  shortages.

**Limitations / risks**
- ✅ **Offload without a stream fixed**: if the heartbeat had never been recorded
  in `StreamRecords` (score below threshold — a case amplified by the scoring
  bug), the offload failed silently and the peer stayed in `responsiblePeers`
  indefinitely.
  → `else` branch: `ClosePeer()` + removal from `responsiblePeers`.
- ✅ **Unbounded `responsiblePeers` fixed**: the native accepted an arbitrary
  number of self-delegated peers, itself becoming an overloaded indexer.
  → `selfDelegate` checks `len(responsiblePeers) >= maxFallbackPeers` and
  returns `false` when saturated.
- ⚠️ Delegation remains uncoordinated between natives: an overloaded native
  refuses (returns empty) but does not explicitly redirect to a neighboring
  native with spare capacity.

---

### 3.6 Native mesh resilience

**How it works**

When the heartbeat to a native fails, `replenishNativesFromPeers` tries to find
a replacement, in this order:

1. `fetchNativeFromNatives`: asks each live native (`/opencloud/native/peers/1.0`)
   for the address of a native it does not know yet.
2. `fetchNativeFromIndexers`: asks each known indexer
   (`/opencloud/indexer/natives/1.0`) for its configured natives.
3. If no replacement is found and `remaining ≤ 1`: `retryLostNative` starts a
   30 s ticker that keeps retrying a direct connection to the lost native.

`EnsureNativePeers` maintains native-to-native heartbeats via
`ProtocolHeartbeat`, with a **single goroutine** covering the whole
`StaticNatives` map.

**Advantages**
- Multi-hop gossip through indexers makes it possible to rediscover a native
  even when no direct peer knows it.
- `retryLostNative` handles the single-native case (minimal deployment).
- Automatic reconnection (`retryLostNative`) triggers
  `replenishIndexersIfNeeded` to restore the indexer pool as well.

**Limitations / risks**
- ✅ **Multiple heartbeat goroutines fixed**: `EnsureNativePeers` started one
  `SendHeartbeat` goroutine per native address (N natives → N goroutines → N²
  heartbeats per tick). → `nativeMeshHeartbeatOnce` now ensures a single
  goroutine iterates over `StaticNatives`.
- ⚠️ `retryLostNative` runs forever with no stop condition tied to process
  lifetime (no `context.Context`). On graceful shutdown this goroutine can
  block.
- ⚠️ Transitive discovery (native → indexer → native) is one-way: an indexer
  only knows the natives from its own config, not natives that joined after it
  started.

---

### 3.7 Shared DHT

**How it works**

All indexers and natives participate in a Kademlia DHT (prefix `oc`, mode
`ModeServer`). Two namespaces are used:

- `/node/` → signed JSON `PeerRecord` (published by indexers on node heartbeat).
- `/indexer/` → JSON `liveIndexerEntry` with a TTL (published by natives).

Each native runs `refreshIndexersFromDHT` (every 30 s), which rehydrates its
local cache from the DHT for known PeerIDs (`knownPeerIDs`) whose local entry
has expired.

**Advantages**
- Decentralized persistence: a record survives the loss of a single native or
  indexer.
- Entry validation: `PeerRecordValidator` and `IndexerRecordValidator` reject
  malformed or expired records at `PutValue` time.
- The secondary `/name/` index enables resolution by human-readable name.

**Limitations / risks**
- ⚠️ The Kademlia DHT over a private (PSK) network works, but bootstrap nodes
  are not configured explicitly: discovery depends on already-established
  connections, which can slow convergence at startup.
- ⚠️ `PutValue` is retried in an infinite loop on `"failed to find any peer in
  table"` — a prolonged network outage piles up blocked goroutines.
- ⚠️ If the PSK is compromised, an attacker can write to the DHT; indexer
  `liveIndexerEntry` records are not signed, unlike `PeerRecord`s.
- ⚠️ `refreshIndexersFromDHT` prunes `knownPeerIDs` when the DHT has no fresh
  entry, but does not prune `liveIndexers` — an expired entry stays in memory
  until the GC or the next refresh.

---

### 3.8 PubSub gossip (indexer registry)

**How it works**

When an indexer registers with a native, the native publishes the address on
the `oc-indexer-registry` GossipSub topic. The other subscribed natives update
their `knownPeerIDs` without waiting for the DHT.

The `TopicValidator` rejects any message whose content is not a parseable,
valid multiaddr before it reaches the processing loop.

**Advantages**
- Near-instant dissemination between connected natives.
- A useful complement to the DHT for recent registrations that have not yet
  been persisted.
- The syntactic filter blocks malformed messages before they propagate through
  the mesh.
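The validator's role can be sketched without libp2p. The real code parses with `pp.AddrInfoFromString`; this dependency-free stand-in only approximates "parseable multiaddr carrying a peer ID" with a shape check, so treat it as an illustration of where the gate sits, not of how multiaddrs are actually parsed:

```go
package main

import (
	"fmt"
	"strings"
)

// validateRegistryMsg approximates the syntactic gate applied to
// oc-indexer-registry messages: a dialable registration looks like
// /ip4/<ip>/tcp/<port>/p2p/<peerID>. The real validator delegates to
// pp.AddrInfoFromString instead of string checks.
func validateRegistryMsg(data []byte) bool {
	s := string(data)
	return strings.HasPrefix(s, "/") && strings.Contains(s, "/p2p/")
}

func main() {
	fmt.Println(validateRegistryMsg([]byte("/ip4/10.0.0.1/tcp/4001/p2p/12D3KooWExample"))) // prints "true"
	fmt.Println(validateRegistryMsg([]byte("not a multiaddr")))                            // prints "false"
}
```

As the limitations below note, this kind of check is purely syntactic: it says nothing about whether the sender is a legitimate native or whether the address belongs to the peer that gossiped it.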

**Limitations / risks**
- ✅ **No-op `TopicValidator` fixed**: the validator accepted every message
  unconditionally (`return true`), letting a compromised native gossip
  arbitrary data.
  → The validator now checks that the message is a parseable multiaddr
  (`pp.AddrInfoFromString`).
- ⚠️ Validation remains purely syntactic: the origin of the message (is the
  sender a legitimate native?) is not verified.
- ⚠️ If a native restarts, it loses its subscription and misses messages
  published while it was away. DHT rehydration compensates, but with a delay of
  up to 30 s.
- ⚠️ The gossip carries only the indexer's `Addr` — no TTL and no signature.

---

### 3.9 Application streams (node ↔ node)

**How it works**

`StreamService` manages streams between partner nodes (`PARTNER` relations
stored in the database) via dedicated protocols (`/opencloud/resource/*`). A
partner heartbeat (`ProtocolHeartbeatPartner`) keeps the connections alive.
Events are routed through `handleEvent` and, in parallel, the NATS system.

**Advantages**
- Per-protocol TTL (`PersistantStream`, `WaitResponse`) adapts the behavior to
  the type of exchange (long-lived for the planner, short for CRUD operations).
- GC (`gc()` every 8 s, started exactly once in `InitStream`) quickly frees
  expired streams.

**Limitations / risks**
- ✅ **GC goroutine leak fixed**: `HandlePartnerHeartbeat` called
  `go s.StartGC(30s)` on every received heartbeat (~20 s), spawning a new
  infinite ticker goroutine each time.
  → Call removed; the GC started by `InitStream` is sufficient.
- ✅ **Infinite loop on EOF fixed**: after a decode error, `readLoop` did
  `s.Stream.Close(); continue`, retrying forever on a closed stream.
  → Replaced with `return`; the defers (`Close`, `delete`) clean up correctly.
- ⚠️ Partner retrieval from `conf.PeerIDS` is marked `TO REMOVE`: provisional
  code is present in production.

---

## 4. Summary table

| Mechanism | Protocol | Main advantage | Risk status |
|---|---|---|---|
| Node→indexer heartbeat | `/opencloud/heartbeat/1.0` | Fast loss detection | ⚠️ Frozen TCP stream undetected |
| Trust scoring | (inline in heartbeat) | Filters unstable peers | ✅ Deadlock fixed (40/75 thresholds) |
| Native registration | `/opencloud/native/subscribe/1.0` | Ample TTL, immediate cache | ✅ TTL raised to 90 s |
| Indexer pool fetch | `/opencloud/native/indexers/1.0` | Takes first responding native | ⚠️ Possible stale native cache |
| Consensus | `/opencloud/native/consensus/1.0` | Absolute majority | ⚠️ Trivial with a single native |
| Self-delegation + offload | (in-memory) | Availability without indexers | ✅ 50-peer cap + ClosePeer |
| Native mesh | `/opencloud/native/peers/1.0` | Multi-hop gossip | ✅ Goroutines deduplicated |
| DHT | `/oc/kad/1.0.0` | Decentralized persistence | ⚠️ Infinite retry, no bootstrap |
| PubSub registry | `oc-indexer-registry` | Fast dissemination | ✅ Multiaddr validation |
| Application streams | `/opencloud/resource/*` | Per-protocol TTL | ✅ GC leak + EOF fixed |

---

## 5. Global risks and limitations

### Security

- ⚠️ **Unverified self-reported addresses**: the `IndexersBinded` field in the
  heartbeat is self-declared by the node and feeds the diversity score. A
  malicious peer can inflate its score by declaring fake addresses.
- ⚠️ **PSK as the only entry barrier**: if the PSK is compromised (it is static
  and file-based), all network isolation collapses. There is no key rotation
  and no additional per-peer authentication.
- ⚠️ **DHT without ACLs on indexer entries**: `PeerRecord` signatures are
  verified on read, but `liveIndexerEntry` records are not signed. PubSub
  validation blocks invalid multiaddrs but not spoofed addresses of legitimate
  indexers.

### Availability

- ⚠️ **Native single point of failure**: with one native, losing it stops all
  indexer assignment. `retryLostNative` mitigates this, but without indexers
  the nodes cannot publish.
- ⚠️ **DHT bootstrap**: without explicit bootstrap nodes, the DHT converges
  slowly when initial connections are few.

### Consistency

- ⚠️ **`replaceStaticIndexers` never evicts**: dead indexers stay in
  `StaticIndexers` until their heartbeat fails. A node's pool can be
  overstated, containing unreachable entries.
- ⚠️ **Global `TimeWatcher`**: set once when `ConnectToIndexers` starts. If the
  indexer has been running for a long time, new nodes will have a durably low
  `uptime_ratio`. The threshold lowered to 40 for the first heartbeat softens
  the initial impact, but subsequent heartbeats still have to accumulate enough
  uptime.

---

## 6. Improvement ideas

Ideas already implemented are marked ✅. Open ideas remain to be addressed.

### ✅ Scoring: double threshold for new peers
~~Replace the single binary threshold~~ — **Implemented**: threshold of 40 for
the first heartbeat (peer absent from `StreamRecords`), 75 afterwards. A peer
can now be admitted on its first connection without being blocked by zero
uptime.
_File: `common/common_stream.go`, `CheckHeartbeat`_

### ✅ Indexer TTL aligned with the renewal interval
~~66 s TTL too close to 60 s~~ — **Implemented**: `IndexerTTL` raised to **90 s**.
_File: `indexer/native.go`_

### ✅ Self-delegation cap
~~Unbounded `responsiblePeers`~~ — **Implemented**: `selfDelegate` returns
`false` when `len(responsiblePeers) >= maxFallbackPeers` (50). The call site
returns an empty response and logs a warning.
_File: `indexer/native.go`_

### ✅ PubSub validation of gossiped addresses
~~`TopicValidator` accepts everything~~ — **Implemented**: the validator checks
that the message is a parseable multiaddr via `pp.AddrInfoFromString`.
_File: `indexer/native.go`, `subscribeIndexerRegistry`_

### ✅ Heartbeat goroutines deduplicated in `EnsureNativePeers`
~~One goroutine per native address~~ — **Implemented**:
`nativeMeshHeartbeatOnce` guarantees that a single `SendHeartbeat` goroutine
covers the whole `StaticNatives` map.
_File: `common/native_stream.go`_

### ✅ GC goroutine leak in `HandlePartnerHeartbeat`
~~`go s.StartGC(30s)` on every heartbeat~~ — **Implemented**: call removed; the
GC from `InitStream` is sufficient.
_File: `stream/service.go`_

### ✅ Infinite loop on EOF in `readLoop`
~~`continue` after `Stream.Close()`~~ — **Implemented**: replaced with `return`
so the defers clean up properly.
_File: `stream/service.go`_

---

### ⚠️ Pool fetch: query all natives in parallel

`fetchIndexersFromNative` stops at the first responding native. Querying all
natives in parallel and merging the lists (similarly to `clientSideConsensus`)
would prevent a native with a stale cache from supplying a sub-optimal pool.

### ⚠️ Consensus with a configurable quorum

The confirmation threshold (`count*2 > total`) is hard-coded. Making it
configurable (e.g. `consensus_quorum: 0.67`) would allow stricter rules on
deployments with 3+ natives without code changes.

### ⚠️ Explicit deregistration

Add a `/opencloud/native/unsubscribe/1.0` protocol: when an indexer shuts down
cleanly, it notifies the natives so its TTL is invalidated immediately instead
of waiting up to 90 s.

### ⚠️ Explicit DHT bootstrap

Configure the natives as DHT bootstrap nodes via `dht.BootstrapPeers` to speed
up Kademlia convergence at startup.
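The hard-coded rule and the suggested configurable quorum compare as follows (the `consensus_quorum` knob is this document's proposal, not existing code; function names are illustrative):

```go
package main

import "fmt"

// confirmed implements the current hard-coded absolute-majority rule:
// strictly more than 50 % of responding natives.
func confirmed(count, total int) bool { return count*2 > total }

// confirmedQuorum is the proposed configurable variant, e.g. quorum = 0.67.
func confirmedQuorum(count, total int, quorum float64) bool {
	return float64(count) > quorum*float64(total)
}

func main() {
	// With a single native, the majority rule is trivially satisfied by one vote,
	// which is the ⚠️ raised in §3.4.
	fmt.Println(confirmed(1, 1)) // prints "true"
	// A 0.67 quorum over 3 natives requires all 3 confirmations.
	fmt.Println(confirmedQuorum(2, 3, 0.67)) // prints "false"
	fmt.Println(confirmedQuorum(3, 3, 0.67)) // prints "true"
}
```

Note that a quorum still cannot fix the single-native case; it only hardens deployments that already have 3+ natives.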

### ⚠️ Context propagation in long-lived goroutines

`retryLostNative`, `refreshIndexersFromDHT`, and `runOffloadLoop` receive no
`context.Context`. Passing one down from `InitNative` would allow a clean stop
when the process shuts down.

### ⚠️ Explicit redirection when self-delegation is refused

When a native refuses self-delegation (saturated pool), returning an empty
response forces the node to retry without telling it where to turn. A list of
alternative natives in the response (`AlternativeNatives []string`) would let
the node go straight to a less-loaded native.
diff --git a/conf/config.go b/conf/config.go
index 490d66d..55f813f 100644
--- a/conf/config.go
+++ b/conf/config.go
@@ -3,18 +3,21 @@ package conf
 import "sync"
 
 type Config struct {
-	Name             string
-	Hostname         string
-	PSKPath          string
-	PublicKeyPath    string
-	PrivateKeyPath   string
-	NodeEndpointPort int64
+	Name                   string
+	Hostname               string
+	PSKPath                string
+	PublicKeyPath          string
+	PrivateKeyPath         string
+	NodeEndpointPort       int64
 	IndexerAddresses       string
 	NativeIndexerAddresses string // multiaddrs of native indexers, comma-separated; bypasses IndexerAddresses when set
 	PeerIDS                string // TO REMOVE
 	NodeMode               string
+
+	MinIndexer int
+	MaxIndexer int
 }
 
 var instance *Config
diff --git a/daemons/node/common/common_stream.go b/daemons/node/common/common_stream.go
index 22ba3a3..e7d3e53 100644
--- a/daemons/node/common/common_stream.go
+++ b/daemons/node/common/common_stream.go
@@ -1,7 +1,6 @@
 package common
 
 import (
-	"bytes"
 	"context"
 	cr "crypto/rand"
 	"encoding/json"
@@ -28,6 +27,12 @@ type LongLivedStreamRecordedService[T interface{}] struct {
 	StreamRecords map[protocol.ID]map[pp.ID]*StreamRecord[T]
 	StreamMU      sync.RWMutex
 	maxNodesConn  int
+	// AfterHeartbeat is an optional hook called after each successful heartbeat update.
+	// The indexer sets it to republish the embedded signed record to the DHT.
+	AfterHeartbeat func(pid pp.ID)
+	// AfterDelete is called after gc() evicts an expired peer, outside the lock.
+	// name and did may be empty if the HeartbeatStream had no metadata.
+	AfterDelete func(pid pp.ID, name string, did string)
 }
 
 func NewStreamRecordedService[T interface{}](h host.Host, maxNodesConn int) *LongLivedStreamRecordedService[T] {
@@ -54,16 +59,29 @@ func (ix *LongLivedStreamRecordedService[T]) StartGC(interval time.Duration) {
 
 func (ix *LongLivedStreamRecordedService[T]) gc() {
 	ix.StreamMU.Lock()
-	defer ix.StreamMU.Unlock()
 	now := time.Now().UTC()
 	if ix.StreamRecords[ProtocolHeartbeat] == nil {
 		ix.StreamRecords[ProtocolHeartbeat] = map[pp.ID]*StreamRecord[T]{}
+		ix.StreamMU.Unlock()
 		return
 	}
 	streams := ix.StreamRecords[ProtocolHeartbeat]
+	fmt.Println(StaticNatives, StaticIndexers, streams)
+	type gcEntry struct {
+		pid  pp.ID
+		name string
+		did  string
+	}
+	var evicted []gcEntry
 	for pid, rec := range streams {
 		if now.After(rec.HeartbeatStream.Expiry) || now.Sub(rec.HeartbeatStream.UptimeTracker.LastSeen) > 2*rec.HeartbeatStream.Expiry.Sub(now) {
+			name, did := "", ""
+			if rec.HeartbeatStream != nil {
+				name = rec.HeartbeatStream.Name
+				did = rec.HeartbeatStream.DID
+			}
+			evicted = append(evicted, gcEntry{pid, name, did})
 			for _, sstreams := range ix.StreamRecords {
 				if sstreams[pid] != nil {
 					delete(sstreams, pid)
@@ -71,6 +89,13 @@ func (ix *LongLivedStreamRecordedService[T]) gc() {
 			}
 		}
 	}
+	ix.StreamMU.Unlock()
+
+	if ix.AfterDelete != nil {
+		for _, e := range evicted {
+			ix.AfterDelete(e.pid, e.name, e.did)
+		}
+	}
 }
 
 func (ix *LongLivedStreamRecordedService[T]) Snapshot(interval time.Duration) {
@@ -101,8 +126,10 @@ func (ix *LongLivedStreamRecordedService[T]) snapshot() []*StreamRecord[T] {
 	return out
 }
 
-func (ix *LongLivedStreamRecordedService[T]) HandleNodeHeartbeat(s network.Stream) {
+func (ix *LongLivedStreamRecordedService[T]) HandleHeartbeat(s network.Stream) {
+	logger := oclib.GetLogger()
 	defer s.Close()
+	dec := json.NewDecoder(s)
 	for {
 		ix.StreamMU.Lock()
 		if ix.StreamRecords[ProtocolHeartbeat] == nil {
@@ -114,17 +141,37 @@ func (ix *LongLivedStreamRecordedService[T]) HandleHeartbeat(s network.Stream) {
 			streamsAnonym[k] = v
 		}
 		ix.StreamMU.Unlock()
-
-		pid, hb, err := CheckHeartbeat(ix.Host, s, streamsAnonym, &ix.StreamMU, ix.maxNodesConn)
+		pid, hb, err := CheckHeartbeat(ix.Host, s, dec, streamsAnonym, &ix.StreamMU, ix.maxNodesConn)
 		if err != nil {
+			// Stream-level errors (EOF, reset, closed) mean the connection is gone
+			// — exit so the goroutine doesn't spin forever on a dead stream.
+			// "too many connections" is also stream-terminal, since the stream
+			// carries a single session. Metric/policy errors (e.g. score too low)
+			// are transient — keep the stream and retry on the next heartbeat.
+			if errors.Is(err, io.EOF) || errors.Is(err, io.ErrUnexpectedEOF) ||
+				strings.Contains(err.Error(), "reset") ||
+				strings.Contains(err.Error(), "closed") ||
+				strings.Contains(err.Error(), "too many connections") {
+				logger.Info().Err(err).Msg("heartbeat stream terminated, closing handler")
+				return
+			}
+			logger.Warn().Err(err).Msg("heartbeat check failed, retrying on same stream")
 			continue
 		}
 		ix.StreamMU.Lock()
 		// if record already seen update last seen
 		if rec, ok := streams[*pid]; ok {
 			rec.DID = hb.DID
 			rec.HeartbeatStream = hb.Stream
-			rec.HeartbeatStream.UptimeTracker.LastSeen = time.Now().UTC()
+			if rec.HeartbeatStream.UptimeTracker == nil {
+				rec.HeartbeatStream.UptimeTracker = &UptimeTracker{
+					FirstSeen: time.Now().UTC(),
+					LastSeen:  time.Now().UTC(),
+				}
+			}
+			rec.HeartbeatStream.UptimeTracker.LastSeen = time.Now().UTC()
+			logger.Info().Msg("A new node is updated : " + pid.String())
 		} else {
 			hb.Stream.UptimeTracker = &UptimeTracker{
 				FirstSeen: time.Now().UTC(),
@@ -134,37 +181,51 @@ func (ix *LongLivedStreamRecordedService[T]) HandleHeartbeat(s network.Stream) {
 				DID:             hb.DID,
 				HeartbeatStream: hb.Stream,
 			}
+			logger.Info().Msg("A new node is subscribed : " + pid.String())
 		}
 		ix.StreamMU.Unlock()
+		// Let the indexer republish the embedded signed record to the DHT.
+		if ix.AfterHeartbeat != nil {
+			ix.AfterHeartbeat(*pid)
+		}
 	}
 }
 
-func CheckHeartbeat(h host.Host, s network.Stream, streams map[pp.ID]HeartBeatStreamed, lock *sync.RWMutex, maxNodes int) (*pp.ID, *Heartbeat, error) {
+func CheckHeartbeat(h host.Host, s network.Stream, dec *json.Decoder, streams map[pp.ID]HeartBeatStreamed, lock *sync.RWMutex, maxNodes int) (*pp.ID, *Heartbeat, error) {
 	if len(h.Network().Peers()) >= maxNodes {
 		return nil, nil, fmt.Errorf("too many connections, try another indexer")
 	}
 	var hb Heartbeat
-	if err := json.NewDecoder(s).Decode(&hb); err != nil {
+	if err := dec.Decode(&hb); err != nil {
 		return nil, nil, err
 	}
-	if ok, bpms, err := getBandwidthChallengeRate(MinPayloadChallenge+int(rand.Float64()*(MaxPayloadChallenge-MinPayloadChallenge)), s); err != nil {
-		return nil, nil, err
-	} else if !ok {
-		return nil, nil, fmt.Errorf("Not a proper peer")
-	} else {
+	_, bpms, _ := getBandwidthChallengeRate(h, s.Conn().RemotePeer(), MinPayloadChallenge+int(rand.Float64()*(MaxPayloadChallenge-MinPayloadChallenge)))
+	{
 		pid, err := pp.Decode(hb.PeerID)
 		if err != nil {
 			return nil, nil, err
 		}
 		upTime := float64(0)
+		isFirstHeartbeat := true
 		lock.Lock()
 		if rec, ok := streams[pid]; ok && rec.GetUptimeTracker() != nil {
 			upTime = rec.GetUptimeTracker().Uptime().Hours() / float64(time.Since(TimeWatcher).Hours())
+			isFirstHeartbeat = false
 		}
 		lock.Unlock()
 		diversity := getDiversityRate(h, hb.IndexersBinded)
+		fmt.Println(upTime, bpms, diversity)
 		hb.ComputeIndexerScore(upTime, bpms, diversity)
-		if hb.Score < 75 {
+		// First heartbeat: uptime is always 0 so the score ceiling is 60, below the
+		// steady-state threshold of 75. Use a lower admission threshold so new peers
+		// can enter and start accumulating uptime. Subsequent heartbeats must meet
+		// the full threshold once uptime is tracked.
+		minScore := float64(75)
+		if isFirstHeartbeat {
+			minScore = 40
+		}
+		fmt.Println(hb.Score, minScore)
+		if hb.Score < minScore {
 			return nil, nil, errors.New("not enough trusting value")
 		}
 		hb.Stream = &Stream{
@@ -178,11 +239,13 @@ func CheckHeartbeat(h host.Host, s network.Stream, dec *json.Decoder, streams map[pp.ID]HeartBeatStreamed, lock *sync.RWMutex, maxNodes int) (*pp.ID, *Heartbeat, error) {
 }
 
 func getDiversityRate(h host.Host, peers []string) float64 {
+	peers, _ = checkPeers(h, peers)
 	diverse := []string{}
 	for _, p := range peers {
 		ip, err := ExtractIP(p)
 		if err != nil {
+			fmt.Println("NO IP", p, err)
 			continue
 		}
 		div := ip.Mask(net.CIDRMask(24, 32)).String()
@@ -190,6 +253,9 @@ func getDiversityRate(h host.Host, peers []string) float64 {
 			diverse = append(diverse, div)
 		}
 	}
+	if len(diverse) == 0 || len(peers) == 0 {
+		return 1
+	}
-	return float64(len(diverse) / len(peers))
+	return float64(len(diverse)) / float64(len(peers))
 }
 
@@ -211,35 +277,42 @@ func checkPeers(h host.Host, peers []string) ([]string, []string) {
 	return concretePeer, ips
 }
 
-const MaxExpectedMbps = 50.0
+const MaxExpectedMbps = 100.0
 const MinPayloadChallenge = 512
 const MaxPayloadChallenge = 2048
 const BaseRoundTrip = 400 * time.Millisecond
 
-func getBandwidthChallengeRate(payloadSize int, s network.Stream) (bool, float64, error) {
-	// Génération payload aléatoire
+// getBandwidthChallengeRate opens a dedicated ProtocolBandwidthProbe stream to
+// remotePeer, sends a random payload, reads the echo, and computes throughput.
+// Using a separate stream avoids mixing binary data on the JSON heartbeat stream
+// and ensures the echo handler is actually running on the remote side.
+func getBandwidthChallengeRate(h host.Host, remotePeer pp.ID, payloadSize int) (bool, float64, error) { payload := make([]byte, payloadSize) - _, err := cr.Read(payload) + if _, err := cr.Read(payload); err != nil { + return false, 0, err + } + + ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) + defer cancel() + s, err := h.NewStream(ctx, remotePeer, ProtocolBandwidthProbe) if err != nil { return false, 0, err } + defer s.Reset() + s.SetDeadline(time.Now().Add(10 * time.Second)) start := time.Now() - // send on heartbeat stream the challenge if _, err = s.Write(payload); err != nil { return false, 0, err } - // read back + s.CloseWrite() + // Half-close the write side so the handler's io.Copy sees EOF and stops. + // Read the echo. response := make([]byte, payloadSize) - _, err = io.ReadFull(s, response) - if err != nil { + if _, err = io.ReadFull(s, response); err != nil { return false, 0, err } duration := time.Since(start) - // Verify content - if !bytes.Equal(payload, response) { - return false, 0, nil // pb or a sadge peer. - } maxRoundTrip := BaseRoundTrip + (time.Duration(payloadSize) * (100 * time.Millisecond)) mbps := float64(payloadSize*8) / duration.Seconds() / 1e6 if duration > maxRoundTrip || mbps < 5.0 { @@ -345,13 +418,36 @@ var StaticIndexers map[string]*pp.AddrInfo = map[string]*pp.AddrInfo{} var StreamMuIndexes sync.RWMutex var StreamIndexers ProtocolStream = ProtocolStream{} -func ConnectToIndexers(h host.Host, minIndexer int, maxIndexer int, myPID pp.ID) error { +// indexerHeartbeatNudge allows replenishIndexersFromNative to trigger an immediate +// heartbeat tick after adding new entries to StaticIndexers, without waiting up +// to 20s for the regular ticker. Buffered(1) so the sender never blocks. +var indexerHeartbeatNudge = make(chan struct{}, 1) + +// NudgeIndexerHeartbeat signals the indexer heartbeat goroutine to fire immediately. 
+func NudgeIndexerHeartbeat() { + select { + case indexerHeartbeatNudge <- struct{}{}: + default: // nudge already pending, skip + } +} + +func ConnectToIndexers(h host.Host, minIndexer int, maxIndexer int, myPID pp.ID, recordFn ...func() json.RawMessage) error { TimeWatcher = time.Now().UTC() logger := oclib.GetLogger() - // If native addresses are configured, bypass static indexer addresses + // If native addresses are configured, get the indexer pool from the native mesh, + // then start the long-lived heartbeat goroutine toward those indexers. if conf.GetConfig().NativeIndexerAddresses != "" { - return ConnectToNatives(h, minIndexer, maxIndexer, myPID) + if err := ConnectToNatives(h, minIndexer, maxIndexer, myPID); err != nil { + return err + } + // Step 2: start the long-lived heartbeat goroutine toward the indexer pool. + // replaceStaticIndexers/replenishIndexersFromNative update the map in-place + // so this single goroutine follows all pool changes automatically. + logger.Info().Msg("[native] step 2 — starting long-lived heartbeat to indexer pool") + SendHeartbeat(context.Background(), ProtocolHeartbeat, conf.GetConfig().Name, + h, StreamIndexers, StaticIndexers, &StreamMuIndexes, 20*time.Second, recordFn...) 
+ return nil } addresses := strings.Split(conf.GetConfig().IndexerAddresses, ",") @@ -360,8 +456,8 @@ func ConnectToIndexers(h host.Host, minIndexer int, maxIndexer int, myPID pp.ID) addresses = addresses[0:maxIndexer] } + StreamMuIndexes.Lock() for _, indexerAddr := range addresses { - fmt.Println("GENERATE ADDR", indexerAddr) ad, err := pp.AddrInfoFromString(indexerAddr) if err != nil { logger.Err(err) @@ -369,15 +465,18 @@ func ConnectToIndexers(h host.Host, minIndexer int, maxIndexer int, myPID pp.ID) } StaticIndexers[indexerAddr] = ad } + indexerCount := len(StaticIndexers) + StreamMuIndexes.Unlock() - SendHeartbeat(context.Background(), ProtocolHeartbeat, conf.GetConfig().Name, h, StreamIndexers, StaticIndexers, 20*time.Second) // your indexer is just like a node for the next indexer. - if len(StaticIndexers) < minIndexer { + SendHeartbeat(context.Background(), ProtocolHeartbeat, conf.GetConfig().Name, h, StreamIndexers, StaticIndexers, &StreamMuIndexes, 20*time.Second, recordFn...) // your indexer is just like a node for the next indexer. + if indexerCount < minIndexer { return errors.New("you run a node without indexers... 
you're going to be isolated.")
 	}
 	return nil
 }
 
 func AddStreamProtocol(ctx *context.Context, protoS ProtocolStream, h host.Host, proto protocol.ID, id pp.ID, mypid pp.ID, force bool, onStreamCreated *func(network.Stream)) ProtocolStream {
+	logger := oclib.GetLogger()
 	if onStreamCreated == nil {
 		f := func(s network.Stream) {
 			protoS[proto][id] = &Stream{
@@ -400,7 +499,7 @@ func AddStreamProtocol(ctx *context.Context, protoS ProtocolStream, h host.Host,
 	if protoS[proto][id] != nil {
 		protoS[proto][id].Expiry = time.Now().Add(2 * time.Minute)
 	} else {
-		fmt.Println("NEW STREAM", proto, id)
+		logger.Info().Str("proto", string(proto)).Str("peer", id.String()).Msg("new stream created")
 		s, err := h.NewStream(*ctx, id, proto)
 		if err != nil {
 			panic(err.Error())
@@ -419,12 +518,16 @@ type Heartbeat struct {
 	Timestamp      int64    `json:"timestamp"`
 	IndexersBinded []string `json:"indexers_binded"`
 	Score          float64
+	// Record carries a fresh signed PeerRecord (JSON) so the receiving indexer
+	// can republish it to the DHT without an extra round-trip.
+	// Only set by nodes (not indexers heartbeating other indexers).
+	Record json.RawMessage `json:"record,omitempty"`
 }
 
 func (hb *Heartbeat) ComputeIndexerScore(uptimeHours float64, bpms float64, diversity float64) {
-	hb.Score = (0.4 * uptimeHours) +
-		(0.4 * bpms) +
-		(0.2 * diversity)
+	hb.Score = ((0.3 * uptimeHours) +
+		(0.3 * bpms) +
+		(0.4 * diversity)) * 100
 }
 
 type HeartbeatInfo []struct {
@@ -433,34 +536,213 @@ type HeartbeatInfo []struct {
 
 const ProtocolHeartbeat = "/opencloud/heartbeat/1.0"
 
-func SendHeartbeat(ctx context.Context, proto protocol.ID, name string, h host.Host, ps ProtocolStream, peers map[string]*pp.AddrInfo, interval time.Duration) {
-	peerID, err := oclib.GenerateNodeID()
-	if err == nil {
-		panic("can't heartbeat daemon failed to start")
+// ProtocolBandwidthProbe is a dedicated short-lived stream used exclusively
+// for bandwidth/latency measurement. The handler echoes any bytes it receives.
+// All nodes and indexers register this handler so peers can measure them. +const ProtocolBandwidthProbe = "/opencloud/probe/1.0" + +// HandleBandwidthProbe echoes back everything written on the stream, then closes. +// It is registered by all participants so the measuring side (the heartbeat receiver) +// can open a dedicated probe stream and read the round-trip latency + throughput. +func HandleBandwidthProbe(s network.Stream) { + defer s.Close() + s.SetDeadline(time.Now().Add(10 * time.Second)) + io.Copy(s, s) // echo every byte back to the sender +} + +// SendHeartbeat starts a goroutine that sends periodic heartbeats to peers. +// recordFn, when provided, is called on each tick and its output is embedded in +// the heartbeat as a fresh signed PeerRecord so the receiving indexer can +// republish it to the DHT without an extra round-trip. +// Pass no recordFn (or nil) for indexer→indexer / native heartbeats. +func SendHeartbeat(ctx context.Context, proto protocol.ID, name string, h host.Host, ps ProtocolStream, peers map[string]*pp.AddrInfo, mu *sync.RWMutex, interval time.Duration, recordFn ...func() json.RawMessage) { + logger := oclib.GetLogger() + // isIndexerHB is true when this goroutine drives the indexer heartbeat. + // isNativeHB is true when it drives the native heartbeat. + isIndexerHB := mu == &StreamMuIndexes + isNativeHB := mu == &StreamNativeMu + var recFn func() json.RawMessage + if len(recordFn) > 0 { + recFn = recordFn[0] } go func() { + logger.Info().Str("proto", string(proto)).Int("peers", len(peers)).Msg("heartbeat started") t := time.NewTicker(interval) defer t.Stop() + + // doTick sends one round of heartbeats to the current peer snapshot. + doTick := func() { + // Build the heartbeat payload — snapshot current indexer addresses. 
+			StreamMuIndexes.RLock()
+			addrs := make([]string, 0, len(StaticIndexers))
+			for addr := range StaticIndexers {
+				addrs = append(addrs, addr)
+			}
+			StreamMuIndexes.RUnlock()
+			hb := Heartbeat{
+				Name:           name,
+				PeerID:         h.ID().String(),
+				Timestamp:      time.Now().UTC().Unix(),
+				IndexersBinded: addrs,
+			}
+			if recFn != nil {
+				hb.Record = recFn()
+			}
+
+			// Snapshot the peer list under a read lock so we don't hold the
+			// write lock during network I/O.
+			if mu != nil {
+				mu.RLock()
+			}
+			snapshot := make([]*pp.AddrInfo, 0, len(peers))
+			for _, ix := range peers {
+				snapshot = append(snapshot, ix)
+			}
+			if mu != nil {
+				mu.RUnlock()
+			}
+
+			for _, ix := range snapshot {
+				wasConnected := h.Network().Connectedness(ix.ID) == network.Connected
+				// interval is already a time.Duration; sendHeartbeat applies its
+				// own 3*interval timeout, and multiplying by time.Second again
+				// would overflow time.Duration.
+				if err := sendHeartbeat(ctx, h, proto, ix, hb, ps, interval); err != nil {
+					// Step 3: heartbeat failed — remove from pool and trigger replenish.
+					logger.Info().Str("peer", ix.ID.String()).Str("proto", string(proto)).Msg("[native] step 3 — heartbeat failed, removing peer from pool")
+
+					// Remove the dead peer and clean up its stream.
+					// mu already covers ps when isIndexerHB (same mutex), so one
+					// lock acquisition is sufficient — no re-entrant double-lock.
+					if mu != nil {
+						mu.Lock()
+					}
+					if ps[proto] != nil {
+						if s, ok := ps[proto][ix.ID]; ok {
+							if s.Stream != nil {
+								s.Stream.Close()
+							}
+							delete(ps[proto], ix.ID)
+						}
+					}
+					lostAddr := ""
+					for addr, ad := range peers {
+						if ad.ID == ix.ID {
+							lostAddr = addr
+							delete(peers, addr)
+							break
+						}
+					}
+					need := conf.GetConfig().MinIndexer - len(peers)
+					remaining := len(peers)
+					if mu != nil {
+						mu.Unlock()
+					}
+					logger.Info().Int("remaining", remaining).Int("min", conf.GetConfig().MinIndexer).Int("need", need).Msg("[native] step 3 — pool state after removal")
+
+					// Step 4: ask the native for the missing indexer count.
+ if isIndexerHB && conf.GetConfig().NativeIndexerAddresses != "" { + if need < 1 { + need = 1 + } + logger.Info().Int("need", need).Msg("[native] step 3→4 — triggering replenish") + go replenishIndexersFromNative(h, need) + } + + // Native heartbeat failed — find a replacement native. + // Case 1: if the dead native was also serving as an indexer, evict it + // from StaticIndexers immediately without waiting for the indexer HB tick. + if isNativeHB { + logger.Info().Str("addr", lostAddr).Msg("[native] step 3 — native heartbeat failed, triggering native replenish") + if lostAddr != "" && conf.GetConfig().NativeIndexerAddresses != "" { + StreamMuIndexes.Lock() + if _, wasIndexer := StaticIndexers[lostAddr]; wasIndexer { + delete(StaticIndexers, lostAddr) + if s := StreamIndexers[ProtocolHeartbeat]; s != nil { + if stream, ok := s[ix.ID]; ok { + if stream.Stream != nil { + stream.Stream.Close() + } + delete(s, ix.ID) + } + } + idxNeed := conf.GetConfig().MinIndexer - len(StaticIndexers) + StreamMuIndexes.Unlock() + if idxNeed < 1 { + idxNeed = 1 + } + logger.Info().Str("addr", lostAddr).Msg("[native] dead native evicted from indexer pool, triggering replenish") + go replenishIndexersFromNative(h, idxNeed) + } else { + StreamMuIndexes.Unlock() + } + } + go replenishNativesFromPeers(h, lostAddr, proto) + } + } else { + // Case 2: native-as-indexer reconnected after a restart. + // If the peer was disconnected before this tick and the heartbeat just + // succeeded (transparent reconnect), the native may have restarted with + // blank state (responsiblePeers empty). Evict it from StaticIndexers and + // re-request an assignment so the native re-tracks us properly and + // runOffloadLoop can eventually migrate us to real indexers. 
+ if !wasConnected && isIndexerHB && conf.GetConfig().NativeIndexerAddresses != "" { + StreamNativeMu.RLock() + isNativeIndexer := false + for _, ad := range StaticNatives { + if ad.ID == ix.ID { + isNativeIndexer = true + break + } + } + StreamNativeMu.RUnlock() + if isNativeIndexer { + if mu != nil { + mu.Lock() + } + if ps[proto] != nil { + if s, ok := ps[proto][ix.ID]; ok { + if s.Stream != nil { + s.Stream.Close() + } + delete(ps[proto], ix.ID) + } + } + reconnectedAddr := "" + for addr, ad := range peers { + if ad.ID == ix.ID { + reconnectedAddr = addr + delete(peers, addr) + break + } + } + idxNeed := conf.GetConfig().MinIndexer - len(peers) + if mu != nil { + mu.Unlock() + } + if idxNeed < 1 { + idxNeed = 1 + } + logger.Info().Str("addr", reconnectedAddr).Str("peer", ix.ID.String()).Msg( + "[native] native-as-indexer reconnected after restart — evicting and re-requesting assignment") + go replenishIndexersFromNative(h, idxNeed) + } + } + logger.Debug().Str("peer", ix.ID.String()).Str("proto", string(proto)).Msg("[native] step 2 — heartbeat sent ok") + } + } + } + for { select { case <-t.C: - addrs := []string{} - for addr := range StaticIndexers { - addrs = append(addrs, addr) + doTick() + case <-indexerHeartbeatNudge: + if isIndexerHB { + logger.Info().Msg("[native] step 2 — nudge received, heartbeating new indexers immediately") + doTick() } - hb := Heartbeat{ - Name: name, - DID: peerID, - PeerID: h.ID().String(), - Timestamp: time.Now().UTC().Unix(), - IndexersBinded: addrs, - } - for _, ix := range peers { - if err = sendHeartbeat(ctx, h, proto, ix, hb, ps, interval*time.Second); err != nil { - StreamMuIndexes.Lock() - delete(StreamIndexers[proto], ix.ID) - StreamMuIndexes.Unlock() - } + case <-nativeHeartbeatNudge: + if isNativeHB { + logger.Info().Msg("[native] native nudge received, heartbeating replacement native immediately") + doTick() } case <-ctx.Done(): return @@ -480,58 +762,62 @@ func TempStream(h host.Host, ad pp.AddrInfo, proto 
protocol.ID, did string, stre
 	if pts[proto] != nil {
 		expiry = pts[proto].TTL
 	}
-	if ctxTTL, err := context.WithTimeout(context.Background(), expiry); err == nil {
-		if h.Network().Connectedness(ad.ID) != network.Connected {
-			if err := h.Connect(ctxTTL, ad); err != nil {
-				return streams, err
-			}
-		}
-		if streams[proto] != nil && streams[proto][ad.ID] != nil {
-			return streams, nil
-		} else if s, err := h.NewStream(ctxTTL, ad.ID, proto); err == nil {
-			mu.Lock()
-			if streams[proto] == nil {
-				streams[proto] = map[pp.ID]*Stream{}
-			}
-			mu.Unlock()
-			time.AfterFunc(expiry, func() {
-				mu.Lock()
-				defer mu.Unlock()
-				delete(streams[proto], ad.ID)
-			})
-			streams[ProtocolPublish][ad.ID] = &Stream{
-				DID:    did,
-				Stream: s,
-				Expiry: time.Now().UTC().Add(expiry),
-			}
-			mu.Unlock()
-			return streams, nil
-		} else {
+	// Keep cancel and defer it: discarding it leaks the context's timer
+	// (go vet's lostcancel check).
+	ctxTTL, cancel := context.WithTimeout(context.Background(), expiry)
+	defer cancel()
+	if h.Network().Connectedness(ad.ID) != network.Connected {
+		if err := h.Connect(ctxTTL, ad); err != nil {
 			return streams, err
 		}
 	}
-	return streams, errors.New("can't create a context")
+	if streams[proto] != nil && streams[proto][ad.ID] != nil {
+		return streams, nil
+	} else if s, err := h.NewStream(ctxTTL, ad.ID, proto); err == nil {
+		mu.Lock()
+		if streams[proto] == nil {
+			streams[proto] = map[pp.ID]*Stream{}
+		}
+		mu.Unlock()
+		time.AfterFunc(expiry, func() {
+			mu.Lock()
+			delete(streams[proto], ad.ID)
+			mu.Unlock()
+		})
+		mu.Lock()
+		streams[proto][ad.ID] = &Stream{
+			DID:    did,
+			Stream: s,
+			Expiry: time.Now().UTC().Add(expiry),
+		}
+		mu.Unlock()
+		return streams, nil
+	} else {
+		return streams, err
+	}
 }
 
 func sendHeartbeat(ctx context.Context, h host.Host, proto protocol.ID, p *pp.AddrInfo, hb Heartbeat, ps ProtocolStream, interval time.Duration) error {
-	streams := ps.Get(proto)
-	if len(streams) == 0 {
-		return errors.New("no stream for protocol heartbeat founded")
+	logger := oclib.GetLogger()
+	if ps[proto] == nil {
+		ps[proto] = map[pp.ID]*Stream{}
 	}
+	streams := ps[proto]
 	pss, 
exists := streams[p.ID]
-	ctxTTL, _ := context.WithTimeout(ctx, 3*interval)
+	ctxTTL, cancel := context.WithTimeout(ctx, 3*interval)
+	defer cancel()
 
 	// Connect if necessary
 	if h.Network().Connectedness(p.ID) != network.Connected {
 		if err := h.Connect(ctxTTL, *p); err != nil {
+			logger.Err(err)
 			return err
 		}
 		exists = false // the stream will need to be recreated
 	}
 
 	// Create the stream if missing or closed
 	if !exists || pss.Stream == nil {
+		logger.Info().Str("proto", string(proto)).Str("peer", p.ID.String()).Msg("new heartbeat stream opened")
 		s, err := h.NewStream(ctx, p.ID, proto)
 		if err != nil {
+			logger.Err(err)
 			return err
 		}
 		pss = &Stream{
diff --git a/daemons/node/common/native_stream.go b/daemons/node/common/native_stream.go
index be8c94f..a76c291 100644
--- a/daemons/node/common/native_stream.go
+++ b/daemons/node/common/native_stream.go
@@ -13,6 +13,7 @@ import (
 	oclib "cloud.o-forge.io/core/oc-lib"
 	"github.com/libp2p/go-libp2p/core/host"
 	pp "github.com/libp2p/go-libp2p/core/peer"
+	"github.com/libp2p/go-libp2p/core/protocol"
 )
 
 const (
@@ -56,7 +57,8 @@ type IndexerRegistration struct {
 
 // GetIndexersRequest asks a native for a pool of live indexers.
 type GetIndexersRequest struct {
-	Count int `json:"count"`
+	Count int    `json:"count"`
+	From  string `json:"from"`
 }
 
 // GetIndexersResponse is returned by the native with live indexer multiaddrs.
@@ -69,17 +71,26 @@ var StaticNatives = map[string]*pp.AddrInfo{}
 var StreamNativeMu sync.RWMutex
 var StreamNatives ProtocolStream = ProtocolStream{}
 
-// ConnectToNatives is the client-side entry point for nodes/indexers that have
-// NativeIndexerAddresses configured. It:
-// 1. Connects (long-lived heartbeat) to all configured natives.
-// 2. Fetches an initial indexer pool from the FIRST responsive native.
-// 3. Challenges that pool to ALL natives (consensus round 1).
-// 4. If the confirmed list is short, samples native suggestions and re-challenges (round 2).
-// 5. Populates StaticIndexers with majority-confirmed indexers.
+// nativeHeartbeatOnce ensures we start exactly one long-lived heartbeat goroutine +// toward the native mesh, even when ConnectToNatives is called from recovery paths. +var nativeHeartbeatOnce sync.Once + +// nativeMeshHeartbeatOnce guards the native-to-native heartbeat goroutine started +// by EnsureNativePeers so only one goroutine covers the whole StaticNatives map. +var nativeMeshHeartbeatOnce sync.Once + +// ConnectToNatives is the initial setup for nodes/indexers in native mode: +// 1. Parses native addresses → StaticNatives. +// 2. Starts a single long-lived heartbeat goroutine toward the native mesh. +// 3. Fetches an initial indexer pool from the first responsive native. +// 4. Runs consensus when real (non-fallback) indexers are returned. +// 5. Replaces StaticIndexers with the confirmed pool. func ConnectToNatives(h host.Host, minIndexer int, maxIndexer int, myPID pp.ID) error { logger := oclib.GetLogger() + logger.Info().Msg("[native] step 1 — parsing native addresses") - // Parse in config order: the first entry is the primary pool source. + // Parse native addresses — safe to call multiple times. 
+ StreamNativeMu.Lock() orderedAddrs := []string{} for _, addr := range strings.Split(conf.GetConfig().NativeIndexerAddresses, ",") { addr = strings.TrimSpace(addr) @@ -88,106 +99,208 @@ func ConnectToNatives(h host.Host, minIndexer int, maxIndexer int, myPID pp.ID) } ad, err := pp.AddrInfoFromString(addr) if err != nil { - logger.Err(err).Msg("ConnectToNatives: invalid addr") + logger.Err(err).Msg("[native] step 1 — invalid native addr") continue } StaticNatives[addr] = ad orderedAddrs = append(orderedAddrs, addr) + logger.Info().Str("addr", addr).Msg("[native] step 1 — native registered") } if len(StaticNatives) == 0 { + StreamNativeMu.Unlock() return errors.New("no valid native addresses configured") } + StreamNativeMu.Unlock() + logger.Info().Int("count", len(orderedAddrs)).Msg("[native] step 1 — natives parsed") - // Long-lived heartbeat connections to keep the native mesh active. - SendHeartbeat(context.Background(), ProtocolHeartbeat, - conf.GetConfig().Name, h, StreamNatives, StaticNatives, 20*time.Second) - - // Step 1: get an initial pool from the FIRST responsive native (in config order). - var candidates []string - var isFallback bool - for _, addr := range orderedAddrs { - ad := StaticNatives[addr] - ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) - if err := h.Connect(ctx, *ad); err != nil { - cancel() - continue - } - s, err := h.NewStream(ctx, ad.ID, ProtocolNativeGetIndexers) - cancel() - if err != nil { - continue - } - req := GetIndexersRequest{Count: maxIndexer} - if encErr := json.NewEncoder(s).Encode(req); encErr != nil { - s.Close() - continue - } - var resp GetIndexersResponse - if decErr := json.NewDecoder(s).Decode(&resp); decErr != nil { - s.Close() - continue - } - s.Close() - candidates = resp.Indexers - isFallback = resp.IsSelfFallback - break // first responsive native only - } + // Step 1: one long-lived heartbeat to each native. 
+ nativeHeartbeatOnce.Do(func() { + logger.Info().Msg("[native] step 1 — starting long-lived heartbeat to native mesh") + SendHeartbeat(context.Background(), ProtocolHeartbeat, + conf.GetConfig().Name, h, StreamNatives, StaticNatives, &StreamNativeMu, 20*time.Second) + }) + // Fetch initial pool from the first responsive native. + logger.Info().Int("want", maxIndexer).Msg("[native] step 1 — fetching indexer pool from native") + candidates, isFallback := fetchIndexersFromNative(h, orderedAddrs, maxIndexer) if len(candidates) == 0 { + logger.Warn().Msg("[native] step 1 — no candidates returned by any native") if minIndexer > 0 { return errors.New("ConnectToNatives: no indexers available from any native") } return nil } + logger.Info().Int("candidates", len(candidates)).Bool("fallback", isFallback).Msg("[native] step 1 — pool received") - // If the native is already the fallback indexer, use it directly — no consensus needed. + // Step 2: populate StaticIndexers — consensus for real indexers, direct for fallback. + pool := resolvePool(h, candidates, isFallback, maxIndexer) + replaceStaticIndexers(pool) + + StreamMuIndexes.RLock() + indexerCount := len(StaticIndexers) + StreamMuIndexes.RUnlock() + logger.Info().Int("pool_size", indexerCount).Msg("[native] step 2 — StaticIndexers replaced") + + if minIndexer > 0 && indexerCount < minIndexer { + return errors.New("not enough majority-confirmed indexers available") + } + return nil +} + +// replenishIndexersFromNative is called when an indexer heartbeat fails (step 3→4). +// It asks the native for exactly `need` replacement indexers, runs consensus when +// real indexers are returned, and adds the results to StaticIndexers without +// clearing the existing pool. 
+func replenishIndexersFromNative(h host.Host, need int) { + if need <= 0 { + return + } + logger := oclib.GetLogger() + logger.Info().Int("need", need).Msg("[native] step 4 — replenishing indexer pool from native") + + StreamNativeMu.RLock() + addrs := make([]string, 0, len(StaticNatives)) + for addr := range StaticNatives { + addrs = append(addrs, addr) + } + StreamNativeMu.RUnlock() + + candidates, isFallback := fetchIndexersFromNative(h, addrs, need) + if len(candidates) == 0 { + logger.Warn().Msg("[native] step 4 — no candidates returned by any native") + return + } + logger.Info().Int("candidates", len(candidates)).Bool("fallback", isFallback).Msg("[native] step 4 — candidates received") + + pool := resolvePool(h, candidates, isFallback, need) + if len(pool) == 0 { + logger.Warn().Msg("[native] step 4 — consensus yielded no confirmed indexers") + return + } + + // Add new indexers to the pool — do NOT clear existing ones. + StreamMuIndexes.Lock() + for addr, ad := range pool { + StaticIndexers[addr] = ad + } + total := len(StaticIndexers) + + StreamMuIndexes.Unlock() + logger.Info().Int("added", len(pool)).Int("total", total).Msg("[native] step 4 — pool replenished") + + // Nudge the heartbeat goroutine to connect immediately instead of waiting + // for the next 20s tick. + NudgeIndexerHeartbeat() + logger.Info().Msg("[native] step 4 — heartbeat goroutine nudged") +} + +// fetchIndexersFromNative opens a ProtocolNativeGetIndexers stream to the first +// responsive native and returns the candidate list and fallback flag. 
+func fetchIndexersFromNative(h host.Host, nativeAddrs []string, count int) (candidates []string, isFallback bool) { + logger := oclib.GetLogger() + for _, addr := range nativeAddrs { + ad, err := pp.AddrInfoFromString(addr) + if err != nil { + logger.Warn().Str("addr", addr).Msg("[native] fetch — skipping invalid addr") + continue + } + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + if err := h.Connect(ctx, *ad); err != nil { + cancel() + logger.Warn().Str("addr", addr).Err(err).Msg("[native] fetch — connect failed") + continue + } + s, err := h.NewStream(ctx, ad.ID, ProtocolNativeGetIndexers) + cancel() + if err != nil { + logger.Warn().Str("addr", addr).Err(err).Msg("[native] fetch — stream open failed") + continue + } + req := GetIndexersRequest{Count: count, From: h.ID().String()} + if encErr := json.NewEncoder(s).Encode(req); encErr != nil { + s.Close() + logger.Warn().Str("addr", addr).Err(encErr).Msg("[native] fetch — encode request failed") + continue + } + var resp GetIndexersResponse + if decErr := json.NewDecoder(s).Decode(&resp); decErr != nil { + s.Close() + logger.Warn().Str("addr", addr).Err(decErr).Msg("[native] fetch — decode response failed") + continue + } + s.Close() + logger.Info().Str("native", addr).Int("indexers", len(resp.Indexers)).Bool("fallback", resp.IsSelfFallback).Msg("[native] fetch — response received") + return resp.Indexers, resp.IsSelfFallback + } + logger.Warn().Msg("[native] fetch — no native responded") + return nil, false +} + +// resolvePool converts a candidate list to a validated addr→AddrInfo map. +// When isFallback is true the native itself is the indexer — no consensus needed. +// When isFallback is false, consensus is run before accepting the candidates. 
+func resolvePool(h host.Host, candidates []string, isFallback bool, maxIndexer int) map[string]*pp.AddrInfo { + logger := oclib.GetLogger() if isFallback { + logger.Info().Strs("addrs", candidates).Msg("[native] resolve — fallback mode, skipping consensus") + pool := make(map[string]*pp.AddrInfo, len(candidates)) for _, addr := range candidates { ad, err := pp.AddrInfoFromString(addr) if err != nil { continue } - StaticIndexers[addr] = ad + pool[addr] = ad } - return nil + return pool } - // Step 2: challenge the pool to ALL configured natives and score by majority vote. + // Round 1. + logger.Info().Int("candidates", len(candidates)).Msg("[native] resolve — consensus round 1") confirmed, suggestions := clientSideConsensus(h, candidates) + logger.Info().Int("confirmed", len(confirmed)).Int("suggestions", len(suggestions)).Msg("[native] resolve — consensus round 1 done") - // Step 3: if we still have gaps, sample from suggestions and re-challenge. + // Round 2: fill gaps from suggestions if below target. if len(confirmed) < maxIndexer && len(suggestions) > 0 { rand.Shuffle(len(suggestions), func(i, j int) { suggestions[i], suggestions[j] = suggestions[j], suggestions[i] }) gap := maxIndexer - len(confirmed) if gap > len(suggestions) { gap = len(suggestions) } + logger.Info().Int("gap", gap).Msg("[native] resolve — consensus round 2 (filling gaps)") confirmed2, _ := clientSideConsensus(h, append(confirmed, suggestions[:gap]...)) if len(confirmed2) > 0 { confirmed = confirmed2 } + logger.Info().Int("confirmed", len(confirmed)).Msg("[native] resolve — consensus round 2 done") } - // Step 4: populate StaticIndexers with confirmed addresses. 
+	pool := make(map[string]*pp.AddrInfo, len(confirmed))
 	for _, addr := range confirmed {
 		ad, err := pp.AddrInfoFromString(addr)
 		if err != nil {
 			continue
 		}
+		pool[addr] = ad
+	}
+	logger.Info().Int("pool_size", len(pool)).Msg("[native] resolve — pool ready")
+	return pool
+}
+
+// replaceStaticIndexers atomically replaces the active indexer pool.
+// Peers no longer in next have their heartbeat streams closed so the SendHeartbeat
+// goroutine stops sending to them on the next tick.
+func replaceStaticIndexers(next map[string]*pp.AddrInfo) {
+	StreamMuIndexes.Lock()
+	defer StreamMuIndexes.Unlock()
+	// Evict peers absent from the new pool and close their heartbeat streams;
+	// without this the function only merges and never actually replaces.
+	for addr, ad := range StaticIndexers {
+		if _, keep := next[addr]; keep {
+			continue
+		}
+		delete(StaticIndexers, addr)
+		if s := StreamIndexers[ProtocolHeartbeat]; s != nil {
+			if stream, ok := s[ad.ID]; ok {
+				if stream.Stream != nil {
+					stream.Stream.Close()
+				}
+				delete(s, ad.ID)
+			}
+		}
+	}
+	for addr, ad := range next {
 		StaticIndexers[addr] = ad
 	}
-
-	if minIndexer > 0 && len(StaticIndexers) < minIndexer {
-		return errors.New("not enough majority-confirmed indexers available")
-	}
-	return nil
 }
 
 // clientSideConsensus challenges a candidate list to ALL configured native peers
 // in parallel. Each native replies with the candidates it trusts plus extras it
 // recommends. An indexer is confirmed when strictly more than 50% of responding
-// natives trust it. The remaining addresses from native suggestions are returned
-// as suggestions for a possible second round.
+// natives trust it.
 func clientSideConsensus(h host.Host, candidates []string) (confirmed []string, suggestions []string) {
 	if len(candidates) == 0 {
 		return nil, nil
@@ -201,7 +314,6 @@ func clientSideConsensus(h host.Host, candidates []string,
 	StreamNativeMu.RUnlock()
 
 	if len(peers) == 0 {
-		// No natives to challenge: trust candidates as-is.
 		return candidates, nil
 	}
 
@@ -239,13 +351,12 @@ func clientSideConsensus(h host.Host, candidates []string,
 		}(ad)
 	}
 
-	// Collect responses up to consensusCollectTimeout.
	timer := time.NewTimer(consensusCollectTimeout)
 	defer timer.Stop()
 
 	trustedCounts := map[string]int{}
 	suggestionPool := map[string]struct{}{}
-	total := 0 // counts only natives that actually responded
+	total := 0
 	collected := 0
 
 collect:
@@ -254,7 +365,7 @@ collect:
 		case r := <-ch:
 			collected++
 			if !r.responded {
-				continue // timeout / error: skip, do not count as vote
+				continue
 			}
 			total++
 			seen := map[string]struct{}{}
@@ -273,13 +384,12 @@ collect:
 	}
 
 	if total == 0 {
-		// No native responded: fall back to trusting the candidates as-is.
 		return candidates, nil
 	}
 
 	confirmedSet := map[string]struct{}{}
 	for addr, count := range trustedCounts {
-		if count*2 > total { // strictly >50%
+		if count*2 > total {
 			confirmed = append(confirmed, addr)
 			confirmedSet[addr] = struct{}{}
 		}
@@ -292,15 +402,17 @@ collect:
 	return
 }
 
-const ProtocolIndexerHeartbeat = "/opencloud/heartbeat/indexer/1.0"
-
 // RegisterWithNative sends a one-shot registration to each configured native indexer.
 // Should be called periodically every RecommendedHeartbeatInterval.
 func RegisterWithNative(h host.Host, nativeAddressesStr string) {
 	logger := oclib.GetLogger()
 	myAddr := ""
-	if len(h.Addrs()) > 0 {
-		myAddr = h.Addrs()[0].String() + "/p2p/" + h.ID().String()
+	// h.Addrs() can be empty right after startup; guard before indexing the
+	// last address, and skip it if it is loopback.
+	if n := len(h.Addrs()); n > 0 && !strings.Contains(h.Addrs()[n-1].String(), "127.0.0.1") {
+		myAddr = h.Addrs()[n-1].String() + "/p2p/" + h.ID().String()
+	}
+	if myAddr == "" {
+		logger.Warn().Msg("RegisterWithNative: no routable address yet, skipping")
+		return
 	}
 	reg := IndexerRegistration{
 		PeerID: h.ID().String(),
@@ -334,16 +446,16 @@ func RegisterWithNative(h host.Host, nativeAddressesStr string) {
 	}
 }
 
-// EnsureNativePeers populates StaticNatives from config and starts heartbeat
-// connections to other natives. Safe to call multiple times; heartbeat is only
-// started once (when StaticNatives transitions from empty to non-empty).
+// EnsureNativePeers populates StaticNatives from config and starts a single +// heartbeat goroutine toward the native mesh. Safe to call multiple times; +// the heartbeat goroutine is started at most once (nativeMeshHeartbeatOnce). func EnsureNativePeers(h host.Host) { + logger := oclib.GetLogger() nativeAddrs := conf.GetConfig().NativeIndexerAddresses if nativeAddrs == "" { return } StreamNativeMu.Lock() - wasEmpty := len(StaticNatives) == 0 for _, addr := range strings.Split(nativeAddrs, ",") { addr = strings.TrimSpace(addr) if addr == "" { @@ -354,11 +466,312 @@ func EnsureNativePeers(h host.Host) { continue } StaticNatives[addr] = ad + logger.Info().Str("addr", addr).Msg("native: registered peer in native mesh") } StreamNativeMu.Unlock() + // One heartbeat goroutine iterates over all of StaticNatives on each tick; + // starting one per address would multiply heartbeats by the native count. + nativeMeshHeartbeatOnce.Do(func() { + logger.Info().Msg("native: starting mesh heartbeat goroutine") + SendHeartbeat(context.Background(), ProtocolHeartbeat, + conf.GetConfig().Name, h, StreamNatives, StaticNatives, &StreamNativeMu, 20*time.Second) + }) +} - if wasEmpty && len(StaticNatives) > 0 { - SendHeartbeat(context.Background(), ProtocolIndexerHeartbeat, - conf.GetConfig().Name, h, StreamNatives, StaticNatives, 20*time.Second) +func StartNativeRegistration(h host.Host, nativeAddressesStr string) { + go func() { + // Poll until a routable (non-loopback) address is available before the first + // registration attempt. libp2p may not have discovered external addresses yet + // at startup. Cap at 12 retries (~1 minute) so we don't spin indefinitely. 
+	for i := 0; i < 12; i++ {
+		hasRoutable := false
+		for _, a := range h.Addrs() {
+			if !strings.Contains(a.String(), "127.0.0.1") {
+				hasRoutable = true
+				break
+			}
+		}
+		if hasRoutable {
+			break
+		}
+		time.Sleep(5 * time.Second)
+	}
+	RegisterWithNative(h, nativeAddressesStr)
+	t := time.NewTicker(RecommendedHeartbeatInterval)
+	defer t.Stop()
+	for range t.C {
+		RegisterWithNative(h, nativeAddressesStr)
+	}
+	}()
+}
+
+// ── Lost-native replacement ───────────────────────────────────────────────────
+
+const (
+	// ProtocolNativeGetPeers lets a node/indexer ask a native for a random
+	// selection of that native's own native contacts (to replace a dead native).
+	ProtocolNativeGetPeers = "/opencloud/native/peers/1.0"
+	// ProtocolIndexerGetNatives lets nodes/indexers ask a connected indexer for
+	// its configured native addresses (fallback when no alive native responds).
+	ProtocolIndexerGetNatives = "/opencloud/indexer/natives/1.0"
+	// retryNativeInterval is how often retryLostNative polls a dead native.
+	retryNativeInterval = 30 * time.Second
+)
+
+// GetNativePeersRequest is sent to a native to ask for its known native contacts.
+type GetNativePeersRequest struct {
+	Exclude []string `json:"exclude"`
+	Count   int      `json:"count"`
+}
+
+// GetNativePeersResponse carries native addresses returned by a native's peer list.
+type GetNativePeersResponse struct {
+	Peers []string `json:"peers"`
+}
+
+// GetIndexerNativesRequest is sent to an indexer to ask for its configured native addresses.
+type GetIndexerNativesRequest struct {
+	Exclude []string `json:"exclude"`
+}
+
+// GetIndexerNativesResponse carries native addresses returned by an indexer.
+type GetIndexerNativesResponse struct {
+	Natives []string `json:"natives"`
+}
+
+// nativeHeartbeatNudge allows replenishNativesFromPeers to trigger an immediate
+// native heartbeat tick after adding a replacement native to the pool.
+var nativeHeartbeatNudge = make(chan struct{}, 1) + +// NudgeNativeHeartbeat signals the native heartbeat goroutine to fire immediately. +func NudgeNativeHeartbeat() { + select { + case nativeHeartbeatNudge <- struct{}{}: + default: // nudge already pending, skip + } +} + +// replenishIndexersIfNeeded checks if the indexer pool is below the configured +// minimum (or empty) and, if so, asks the native mesh for replacements. +// Called whenever a native is recovered so the indexer pool is restored. +func replenishIndexersIfNeeded(h host.Host) { + logger := oclib.GetLogger() + minIdx := conf.GetConfig().MinIndexer + if minIdx < 1 { + minIdx = 1 + } + StreamMuIndexes.RLock() + indexerCount := len(StaticIndexers) + StreamMuIndexes.RUnlock() + if indexerCount < minIdx { + need := minIdx - indexerCount + logger.Info().Int("need", need).Int("current", indexerCount).Msg("[native] native recovered — replenishing indexer pool") + go replenishIndexersFromNative(h, need) + } +} + +// replenishNativesFromPeers is called when the heartbeat to a native fails. +// Flow: +// 1. Ask other alive natives for one of their native contacts (ProtocolNativeGetPeers). +// 2. If none respond or return a new address, ask connected indexers (ProtocolIndexerGetNatives). +// 3. If no replacement found: +// - remaining > 1 → ignore (enough natives remain). +// - remaining ≤ 1 → start periodic retry (retryLostNative). +func replenishNativesFromPeers(h host.Host, lostAddr string, proto protocol.ID) { + if lostAddr == "" { + return + } + logger := oclib.GetLogger() + logger.Info().Str("lost", lostAddr).Msg("[native] replenish natives — start") + + // Build exclude list: the lost addr + all currently alive natives. + // lostAddr has already been removed from StaticNatives by doTick. 
+ StreamNativeMu.RLock() + remaining := len(StaticNatives) + exclude := make([]string, 0, remaining+1) + exclude = append(exclude, lostAddr) + for addr := range StaticNatives { + exclude = append(exclude, addr) + } + StreamNativeMu.RUnlock() + + logger.Info().Int("remaining", remaining).Msg("[native] replenish natives — step 1: ask alive natives for a peer") + + // Step 1: ask other alive natives for a replacement. + newAddr := fetchNativeFromNatives(h, exclude) + + // Step 2: fallback — ask connected indexers for their native addresses. + if newAddr == "" { + logger.Info().Msg("[native] replenish natives — step 2: ask indexers for their native addresses") + newAddr = fetchNativeFromIndexers(h, exclude) + } + + if newAddr != "" { + ad, err := pp.AddrInfoFromString(newAddr) + if err == nil { + StreamNativeMu.Lock() + StaticNatives[newAddr] = ad + StreamNativeMu.Unlock() + logger.Info().Str("new", newAddr).Msg("[native] replenish natives — replacement added, nudging heartbeat") + NudgeNativeHeartbeat() + replenishIndexersIfNeeded(h) + return + } + } + + // Step 3: no replacement found. + logger.Warn().Int("remaining", remaining).Msg("[native] replenish natives — no replacement found") + if remaining > 1 { + logger.Info().Msg("[native] replenish natives — enough natives remain, ignoring loss") + return + } + // Last (or only) native — retry periodically. + logger.Info().Str("addr", lostAddr).Msg("[native] replenish natives — last native lost, starting periodic retry") + go retryLostNative(h, lostAddr, proto) +} + +// fetchNativeFromNatives asks each alive native for one of its own native contacts +// not in exclude. Returns the first new address found or "" if none. 
+func fetchNativeFromNatives(h host.Host, exclude []string) string { + logger := oclib.GetLogger() + excludeSet := make(map[string]struct{}, len(exclude)) + for _, e := range exclude { + excludeSet[e] = struct{}{} + } + + StreamNativeMu.RLock() + natives := make([]*pp.AddrInfo, 0, len(StaticNatives)) + for _, ad := range StaticNatives { + natives = append(natives, ad) + } + StreamNativeMu.RUnlock() + + rand.Shuffle(len(natives), func(i, j int) { natives[i], natives[j] = natives[j], natives[i] }) + + for _, ad := range natives { + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + if err := h.Connect(ctx, *ad); err != nil { + cancel() + logger.Warn().Str("native", ad.ID.String()).Err(err).Msg("[native] fetch native peers — connect failed") + continue + } + s, err := h.NewStream(ctx, ad.ID, ProtocolNativeGetPeers) + cancel() + if err != nil { + logger.Warn().Str("native", ad.ID.String()).Err(err).Msg("[native] fetch native peers — stream failed") + continue + } + req := GetNativePeersRequest{Exclude: exclude, Count: 1} + if encErr := json.NewEncoder(s).Encode(req); encErr != nil { + s.Close() + continue + } + var resp GetNativePeersResponse + if decErr := json.NewDecoder(s).Decode(&resp); decErr != nil { + s.Close() + continue + } + s.Close() + for _, peer := range resp.Peers { + if _, excluded := excludeSet[peer]; !excluded && peer != "" { + logger.Info().Str("from", ad.ID.String()).Str("new", peer).Msg("[native] fetch native peers — got replacement") + return peer + } + } + logger.Debug().Str("native", ad.ID.String()).Msg("[native] fetch native peers — no new native from this peer") + } + return "" +} + +// fetchNativeFromIndexers asks connected indexers for their configured native addresses, +// returning the first one not in exclude. 
+func fetchNativeFromIndexers(h host.Host, exclude []string) string { + logger := oclib.GetLogger() + excludeSet := make(map[string]struct{}, len(exclude)) + for _, e := range exclude { + excludeSet[e] = struct{}{} + } + + StreamMuIndexes.RLock() + indexers := make([]*pp.AddrInfo, 0, len(StaticIndexers)) + for _, ad := range StaticIndexers { + indexers = append(indexers, ad) + } + StreamMuIndexes.RUnlock() + + rand.Shuffle(len(indexers), func(i, j int) { indexers[i], indexers[j] = indexers[j], indexers[i] }) + + for _, ad := range indexers { + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + if err := h.Connect(ctx, *ad); err != nil { + cancel() + continue + } + s, err := h.NewStream(ctx, ad.ID, ProtocolIndexerGetNatives) + cancel() + if err != nil { + logger.Warn().Str("indexer", ad.ID.String()).Err(err).Msg("[native] fetch indexer natives — stream failed") + continue + } + req := GetIndexerNativesRequest{Exclude: exclude} + if encErr := json.NewEncoder(s).Encode(req); encErr != nil { + s.Close() + continue + } + var resp GetIndexerNativesResponse + if decErr := json.NewDecoder(s).Decode(&resp); decErr != nil { + s.Close() + continue + } + s.Close() + for _, nativeAddr := range resp.Natives { + if _, excluded := excludeSet[nativeAddr]; !excluded && nativeAddr != "" { + logger.Info().Str("indexer", ad.ID.String()).Str("native", nativeAddr).Msg("[native] fetch indexer natives — got native") + return nativeAddr + } + } + } + logger.Warn().Msg("[native] fetch indexer natives — no native found from indexers") + return "" +} + +// retryLostNative periodically retries connecting to a lost native address until +// it becomes reachable again or was already restored by another path. 
+func retryLostNative(h host.Host, addr string, nativeProto protocol.ID) { + logger := oclib.GetLogger() + logger.Info().Str("addr", addr).Msg("[native] retry — periodic retry for lost native started") + t := time.NewTicker(retryNativeInterval) + defer t.Stop() + for range t.C { + StreamNativeMu.RLock() + _, alreadyRestored := StaticNatives[addr] + StreamNativeMu.RUnlock() + if alreadyRestored { + logger.Info().Str("addr", addr).Msg("[native] retry — native already restored, stopping retry") + return + } + + ad, err := pp.AddrInfoFromString(addr) + if err != nil { + logger.Warn().Str("addr", addr).Msg("[native] retry — invalid addr, stopping retry") + return + } + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + err = h.Connect(ctx, *ad) + cancel() + if err != nil { + logger.Warn().Str("addr", addr).Msg("[native] retry — still unreachable") + continue + } + // Reachable again — add back to pool. + StreamNativeMu.Lock() + StaticNatives[addr] = ad + StreamNativeMu.Unlock() + logger.Info().Str("addr", addr).Msg("[native] retry — native reconnected and added back to pool") + NudgeNativeHeartbeat() + replenishIndexersIfNeeded(h) + if nativeProto == ProtocolNativeGetIndexers { + StartNativeRegistration(h, addr) // register back + } + return } } diff --git a/daemons/node/common/utils.go b/daemons/node/common/utils.go index 8dc9079..8cb7d65 100644 --- a/daemons/node/common/utils.go +++ b/daemons/node/common/utils.go @@ -24,17 +24,16 @@ func ExtractIP(addr string) (net.IP, error) { if err != nil { return nil, err } - ips, err := ma.ValueForProtocol(multiaddr.P_IP4) // or P_IP6 + ipStr, err := ma.ValueForProtocol(multiaddr.P_IP4) if err != nil { - return nil, err + ipStr, err = ma.ValueForProtocol(multiaddr.P_IP6) + if err != nil { + return nil, err + } } - host, _, err := net.SplitHostPort(ips) - if err != nil { - return nil, err - } - ip := net.ParseIP(host) + ip := net.ParseIP(ipStr) if ip == nil { - return nil, fmt.Errorf("invalid IP: %s", host) 
+ return nil, fmt.Errorf("invalid IP: %s", ipStr) } return ip, nil } diff --git a/daemons/node/indexer/handler.go b/daemons/node/indexer/handler.go index a8346ac..81c00fa 100644 --- a/daemons/node/indexer/handler.go +++ b/daemons/node/indexer/handler.go @@ -5,8 +5,9 @@ import ( "encoding/base64" "encoding/json" "errors" - "fmt" + "oc-discovery/conf" "oc-discovery/daemons/node/common" + "strings" "time" oclib "cloud.o-forge.io/core/oc-lib" @@ -18,17 +19,21 @@ import ( "github.com/libp2p/go-libp2p/core/peer" ) +type PeerRecordPayload struct { + Name string `json:"name"` + DID string `json:"did"` + PubKey []byte `json:"pub_key"` + ExpiryDate time.Time `json:"expiry_date"` +} + type PeerRecord struct { - Name string `json:"name"` - DID string `json:"did"` // real PEER ID - PeerID string `json:"peer_id"` - PubKey []byte `json:"pub_key"` - APIUrl string `json:"api_url"` - StreamAddress string `json:"stream_address"` - NATSAddress string `json:"nats_address"` - WalletAddress string `json:"wallet_address"` - Signature []byte `json:"signature"` - ExpiryDate time.Time `json:"expiry_date"` + PeerRecordPayload + PeerID string `json:"peer_id"` + APIUrl string `json:"api_url"` + StreamAddress string `json:"stream_address"` + NATSAddress string `json:"nats_address"` + WalletAddress string `json:"wallet_address"` + Signature []byte `json:"signature"` } func (p *PeerRecord) Sign() error { @@ -36,13 +41,7 @@ func (p *PeerRecord) Sign() error { if err != nil { return err } - dht := PeerRecord{ - Name: p.Name, - DID: p.DID, - PubKey: p.PubKey, - ExpiryDate: p.ExpiryDate, - } - payload, _ := json.Marshal(dht) + payload, _ := json.Marshal(p.PeerRecordPayload) b, err := common.Sign(priv, payload) p.Signature = b return err @@ -51,19 +50,11 @@ func (p *PeerRecord) Sign() error { func (p *PeerRecord) Verify() (crypto.PubKey, error) { pubKey, err := crypto.UnmarshalPublicKey(p.PubKey) // retrieve pub key in message if err != nil { - fmt.Println("UnmarshalPublicKey") return pubKey, err } - 
	dht := PeerRecord{
-		Name:       p.Name,
-		DID:        p.DID,
-		PubKey:     p.PubKey,
-		ExpiryDate: p.ExpiryDate,
-	}
-	payload, _ := json.Marshal(dht)
+	payload, _ := json.Marshal(p.PeerRecordPayload)

-	if ok, _ := common.Verify(pubKey, payload, p.Signature); !ok { // verify minimal message was sign per pubKey
-		fmt.Println("Verify")
+	if ok, _ := pubKey.Verify(payload, p.Signature); !ok { // verify the minimal payload was signed by this pubKey
		return pubKey, errors.New("invalid signature")
	}
	return pubKey, nil
@@ -114,6 +105,8 @@ func (pr *PeerRecord) ExtractPeer(ourkey string, key string, pubKey crypto.PubKe
 type GetValue struct {
	Key    string  `json:"key"`
	PeerID peer.ID `json:"peer_id"`
+	Name   string  `json:"name,omitempty"`
+	Search bool    `json:"search,omitempty"`
}

type GetResponse struct {
@@ -125,122 +118,233 @@ func (ix *IndexerService) genKey(did string) string {
	return "/node/" + did
}

+func (ix *IndexerService) genNameKey(name string) string {
+	return "/name/" + name
+}
+
+func (ix *IndexerService) genPIDKey(peerID string) string {
+	return "/pid/" + peerID
+}
+
 func (ix *IndexerService) initNodeHandler() {
-	ix.Host.SetStreamHandler(common.ProtocolHeartbeat, ix.HandleNodeHeartbeat)
+	logger := oclib.GetLogger()
+	logger.Info().Msg("Init Node Handler")
+	// Each heartbeat from a node carries a freshly signed PeerRecord.
+	// Republish it to the DHT so the record never expires as long as the node
+	// is alive — no separate publish stream needed from the node side.
+	ix.AfterHeartbeat = func(pid peer.ID) {
+		ctx1, cancel1 := context.WithTimeout(context.Background(), 10*time.Second)
+		defer cancel1()
+		res, err := ix.DHT.GetValue(ctx1, ix.genPIDKey(pid.String()))
+		if err != nil {
+			logger.Warn().Err(err).Str("peer", pid.String()).Msg("indexer: pid index lookup failed")
+			return
+		}
+		did := string(res)
+		ctx2, cancel2 := context.WithTimeout(context.Background(), 10*time.Second)
+		defer cancel2()
+		res, err = ix.DHT.GetValue(ctx2, ix.genKey(did))
+		if err != nil {
+			logger.Warn().Err(err).Str("did", did).Msg("indexer: record lookup failed")
+			return
+		}
+		var rec PeerRecord
+		if err := json.Unmarshal(res, &rec); err != nil {
+			logger.Warn().Err(err).Str("peer", pid.String()).Msg("indexer: heartbeat record unmarshal failed")
+			return
+		}
+		if _, err := rec.Verify(); err != nil {
+			logger.Warn().Err(err).Str("peer", pid.String()).Msg("indexer: heartbeat record signature invalid")
+			return
+		}
+		data, err := json.Marshal(rec)
+		if err != nil {
+			return
+		}
+		ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
+		defer cancel()
+		logger.Debug().Str("key", ix.genKey(rec.DID)).Msg("indexer: refreshing DHT record")
+		if err := ix.DHT.PutValue(ctx, ix.genKey(rec.DID), data); err != nil {
+			logger.Warn().Err(err).Str("did", rec.DID).Msg("indexer: DHT refresh failed")
+			return
+		}
+		if rec.Name != "" {
+			ctx2, cancel2 := context.WithTimeout(context.Background(), 10*time.Second)
+			ix.DHT.PutValue(ctx2, ix.genNameKey(rec.Name), []byte(rec.DID))
+			cancel2()
+		}
+		if rec.PeerID != "" {
+			ctx3, cancel3 := context.WithTimeout(context.Background(), 10*time.Second)
+			ix.DHT.PutValue(ctx3, ix.genPIDKey(rec.PeerID), []byte(rec.DID))
+			cancel3()
+		}
+	}
+	ix.Host.SetStreamHandler(common.ProtocolHeartbeat, ix.HandleHeartbeat)
 	ix.Host.SetStreamHandler(common.ProtocolPublish, ix.handleNodePublish)
 	ix.Host.SetStreamHandler(common.ProtocolGet, ix.handleNodeGet)
+	ix.Host.SetStreamHandler(common.ProtocolIndexerGetNatives, ix.handleGetNatives)
}

func (ix *IndexerService) handleNodePublish(s network.Stream) {
	defer s.Close()
	logger := oclib.GetLogger()
-	for {
-		var rec
PeerRecord - if err := json.NewDecoder(s).Decode(&rec); err != nil { - logger.Err(err) - continue - } - rec2 := PeerRecord{ - Name: rec.Name, - DID: rec.DID, // REAL PEER ID - PubKey: rec.PubKey, - PeerID: rec.PeerID, - } - if _, err := rec2.Verify(); err != nil { - logger.Err(err) - continue - } - if rec.PeerID == "" || rec.ExpiryDate.Before(time.Now().UTC()) { // already expired - logger.Err(errors.New(rec.PeerID + " is expired.")) - continue - } - pid, err := peer.Decode(rec.PeerID) - if err != nil { - continue - } - ix.StreamMU.Lock() + var rec PeerRecord + if err := json.NewDecoder(s).Decode(&rec); err != nil { + logger.Err(err) + return + } + if _, err := rec.Verify(); err != nil { + logger.Err(err) + return + } + if rec.PeerID == "" || rec.ExpiryDate.Before(time.Now().UTC()) { + logger.Err(errors.New(rec.PeerID + " is expired.")) + return + } + pid, err := peer.Decode(rec.PeerID) + if err != nil { + return + } - if ix.StreamRecords[common.ProtocolHeartbeat] == nil { - ix.StreamRecords[common.ProtocolHeartbeat] = map[peer.ID]*common.StreamRecord[PeerRecord]{} - } - streams := ix.StreamRecords[common.ProtocolHeartbeat] + ix.StreamMU.Lock() + defer ix.StreamMU.Unlock() + if ix.StreamRecords[common.ProtocolHeartbeat] == nil { + ix.StreamRecords[common.ProtocolHeartbeat] = map[peer.ID]*common.StreamRecord[PeerRecord]{} + } + streams := ix.StreamRecords[common.ProtocolHeartbeat] + if srec, ok := streams[pid]; ok { + srec.DID = rec.DID + srec.Record = rec + srec.HeartbeatStream.UptimeTracker.LastSeen = time.Now().UTC() + } - if srec, ok := streams[pid]; ok { - srec.DID = rec.DID - srec.Record = rec - srec.HeartbeatStream.UptimeTracker.LastSeen = time.Now().UTC() - } else { - ix.StreamMU.Unlock() - logger.Err(errors.New("no heartbeat")) - continue - } - ix.StreamMU.Unlock() - - key := ix.genKey(rec.DID) - - data, err := json.Marshal(rec) - if err != nil { - logger.Err(err) - continue - } - - ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) - 
if err := ix.DHT.PutValue(ctx, key, data); err != nil { - logger.Err(err) - cancel() - continue - } + key := ix.genKey(rec.DID) + data, err := json.Marshal(rec) + if err != nil { + logger.Err(err) + return + } + ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) + if err := ix.DHT.PutValue(ctx, key, data); err != nil { + logger.Err(err) cancel() - break // response... so quit + return + } + cancel() + + // Secondary index: /name/ → DID, so peers can resolve by human-readable name. + if rec.Name != "" { + ctx2, cancel2 := context.WithTimeout(context.Background(), 10*time.Second) + if err := ix.DHT.PutValue(ctx2, ix.genNameKey(rec.Name), []byte(rec.DID)); err != nil { + logger.Err(err).Str("name", rec.Name).Msg("indexer: failed to write name index") + } + cancel2() + } + // Secondary index: /pid/ → DID, so peers can resolve by libp2p PeerID. + if rec.PeerID != "" { + ctx3, cancel3 := context.WithTimeout(context.Background(), 10*time.Second) + if err := ix.DHT.PutValue(ctx3, ix.genPIDKey(rec.PeerID), []byte(rec.DID)); err != nil { + logger.Err(err).Str("pid", rec.PeerID).Msg("indexer: failed to write pid index") + } + cancel3() } } func (ix *IndexerService) handleNodeGet(s network.Stream) { + defer s.Close() logger := oclib.GetLogger() - for { - var req GetValue - if err := json.NewDecoder(s).Decode(&req); err != nil { - logger.Err(err) + + var req GetValue + if err := json.NewDecoder(s).Decode(&req); err != nil { + logger.Err(err) + return + } + + resp := GetResponse{Found: false, Records: map[string]PeerRecord{}} + + keys := []string{} + // Name substring search — scan in-memory connected nodes first, then DHT exact match. + if req.Name != "" { + if req.Search { + for _, did := range ix.LookupNameIndex(strings.ToLower(req.Name)) { + keys = append(keys, did) + } + } else { + // 2. DHT exact-name lookup: covers nodes that published but aren't currently connected. 
+ nameCtx, nameCancel := context.WithTimeout(context.Background(), 5*time.Second) + if ch, err := ix.DHT.SearchValue(nameCtx, ix.genNameKey(req.Name)); err == nil { + for did := range ch { + keys = append(keys, string(did)) + break + } + } + nameCancel() + } + } else if req.PeerID != "" { + pidCtx, pidCancel := context.WithTimeout(context.Background(), 5*time.Second) + if did, err := ix.DHT.GetValue(pidCtx, ix.genPIDKey(req.PeerID.String())); err == nil { + keys = append(keys, string(did)) + } + pidCancel() + } else { + keys = append(keys, req.Key) + } + + // DHT record fetch by DID key (covers exact-name and PeerID paths). + if len(keys) > 0 { + for _, k := range keys { + ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) + c, err := ix.DHT.GetValue(ctx, ix.genKey(k)) + cancel() + if err == nil { + var rec PeerRecord + if json.Unmarshal(c, &rec) == nil { + // Filter by PeerID only when one was explicitly specified. + if req.PeerID == "" || rec.PeerID == req.PeerID.String() { + resp.Records[rec.PeerID] = rec + } + } + } else if req.Name == "" && req.PeerID == "" { + logger.Err(err).Msg("Failed to fetch PeerRecord from DHT " + req.Key) + } + } + } + + resp.Found = len(resp.Records) > 0 + _ = json.NewEncoder(s).Encode(resp) +} + +// handleGetNatives returns this indexer's configured native addresses, +// excluding any in the request's Exclude list. 
+func (ix *IndexerService) handleGetNatives(s network.Stream) { + defer s.Close() + logger := oclib.GetLogger() + + var req common.GetIndexerNativesRequest + if err := json.NewDecoder(s).Decode(&req); err != nil { + logger.Err(err).Msg("indexer get natives: decode") + return + } + + excludeSet := make(map[string]struct{}, len(req.Exclude)) + for _, e := range req.Exclude { + excludeSet[e] = struct{}{} + } + + resp := common.GetIndexerNativesResponse{} + for _, addr := range strings.Split(conf.GetConfig().NativeIndexerAddresses, ",") { + addr = strings.TrimSpace(addr) + if addr == "" { continue } - ix.StreamMU.Lock() + if _, excluded := excludeSet[addr]; !excluded { + resp.Natives = append(resp.Natives, addr) + } + } - if ix.StreamRecords[common.ProtocolHeartbeat] == nil { - ix.StreamRecords[common.ProtocolHeartbeat] = map[peer.ID]*common.StreamRecord[PeerRecord]{} - } - resp := GetResponse{ - Found: false, - Records: map[string]PeerRecord{}, - } - streams := ix.StreamRecords[common.ProtocolHeartbeat] - - key := ix.genKey(req.Key) - // simple lookup by PeerID (or DID) - ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) - recBytes, err := ix.DHT.SearchValue(ctx, key) - if err != nil { - logger.Err(err).Msg("Failed to fetch PeerRecord from DHT") - cancel() - } - cancel() - for c := range recBytes { - var rec PeerRecord - if err := json.Unmarshal(c, &rec); err != nil || rec.PeerID != req.PeerID.String() { - continue - } - resp.Found = true - resp.Records[rec.PeerID] = rec - if srec, ok := streams[req.PeerID]; ok { - srec.DID = rec.DID - srec.Record = rec - srec.HeartbeatStream.UptimeTracker.LastSeen = time.Now().UTC() - } - } - // Not found - _ = json.NewEncoder(s).Encode(resp) - ix.StreamMU.Unlock() - break // response... 
so quit + if err := json.NewEncoder(s).Encode(resp); err != nil { + logger.Err(err).Msg("indexer get natives: encode response") } } diff --git a/daemons/node/indexer/nameindex.go b/daemons/node/indexer/nameindex.go new file mode 100644 index 0000000..19bed90 --- /dev/null +++ b/daemons/node/indexer/nameindex.go @@ -0,0 +1,168 @@ +package indexer + +import ( + "context" + "encoding/json" + "strings" + "sync" + "time" + + "oc-discovery/daemons/node/common" + + oclib "cloud.o-forge.io/core/oc-lib" + pubsub "github.com/libp2p/go-libp2p-pubsub" + pp "github.com/libp2p/go-libp2p/core/peer" +) + +// TopicNameIndex is the GossipSub topic shared by regular indexers to exchange +// add/delete events for the distributed name→peerID mapping. +const TopicNameIndex = "oc-name-index" + +// nameIndexDedupWindow suppresses re-emission of the same (action, name, peerID) +// tuple within this window, reducing duplicate events when a node is registered +// with multiple indexers simultaneously. +const nameIndexDedupWindow = 30 * time.Second + +// NameIndexAction indicates whether a name mapping is being added or removed. +type NameIndexAction string + +const ( + NameIndexAdd NameIndexAction = "add" + NameIndexDelete NameIndexAction = "delete" +) + +// NameIndexEvent is published on TopicNameIndex by each indexer when a node +// registers (add) or is evicted by the GC (delete). +type NameIndexEvent struct { + Action NameIndexAction `json:"action"` + Name string `json:"name"` + PeerID string `json:"peer_id"` + DID string `json:"did"` +} + +// nameIndexState holds the local in-memory name index and the sender-side +// deduplication tracker. +type nameIndexState struct { + // index: name → peerID → DID, built from events received from all indexers. + index map[string]map[string]string + indexMu sync.RWMutex + + // emitted tracks the last emission time for each (action, name, peerID) key + // to suppress duplicates within nameIndexDedupWindow. 
+ emitted map[string]time.Time + emittedMu sync.Mutex +} + +// shouldEmit returns true if the (action, name, peerID) tuple has not been +// emitted within nameIndexDedupWindow, updating the tracker if so. +func (s *nameIndexState) shouldEmit(action NameIndexAction, name, peerID string) bool { + key := string(action) + ":" + name + ":" + peerID + s.emittedMu.Lock() + defer s.emittedMu.Unlock() + if t, ok := s.emitted[key]; ok && time.Since(t) < nameIndexDedupWindow { + return false + } + s.emitted[key] = time.Now() + return true +} + +// onEvent applies a received NameIndexEvent to the local index. +// "add" inserts/updates the mapping; "delete" removes it. +// Operations are idempotent — duplicate events from multiple indexers are harmless. +func (s *nameIndexState) onEvent(evt NameIndexEvent) { + if evt.Name == "" || evt.PeerID == "" { + return + } + s.indexMu.Lock() + defer s.indexMu.Unlock() + switch evt.Action { + case NameIndexAdd: + if s.index[evt.Name] == nil { + s.index[evt.Name] = map[string]string{} + } + s.index[evt.Name][evt.PeerID] = evt.DID + case NameIndexDelete: + if s.index[evt.Name] != nil { + delete(s.index[evt.Name], evt.PeerID) + if len(s.index[evt.Name]) == 0 { + delete(s.index, evt.Name) + } + } + } +} + +// initNameIndex joins TopicNameIndex and starts consuming events. +// Must be called after ix.PS is ready. 
+func (ix *IndexerService) initNameIndex(ps *pubsub.PubSub) { + logger := oclib.GetLogger() + ix.nameIndex = &nameIndexState{ + index: map[string]map[string]string{}, + emitted: map[string]time.Time{}, + } + + ps.RegisterTopicValidator(TopicNameIndex, func(_ context.Context, _ pp.ID, _ *pubsub.Message) bool { + return true + }) + topic, err := ps.Join(TopicNameIndex) + if err != nil { + logger.Err(err).Msg("name index: failed to join topic") + return + } + ix.LongLivedStreamRecordedService.LongLivedPubSubService.PubsubMu.Lock() + ix.LongLivedStreamRecordedService.LongLivedPubSubService.LongLivedPubSubs[TopicNameIndex] = topic + ix.LongLivedStreamRecordedService.LongLivedPubSubService.PubsubMu.Unlock() + + common.SubscribeEvents( + ix.LongLivedStreamRecordedService.LongLivedPubSubService, + context.Background(), + TopicNameIndex, + -1, + func(_ context.Context, evt NameIndexEvent, _ string) { + ix.nameIndex.onEvent(evt) + }, + ) +} + +// publishNameEvent emits a NameIndexEvent on TopicNameIndex, subject to the +// sender-side deduplication window. +func (ix *IndexerService) publishNameEvent(action NameIndexAction, name, peerID, did string) { + if ix.nameIndex == nil || name == "" || peerID == "" { + return + } + if !ix.nameIndex.shouldEmit(action, name, peerID) { + return + } + ix.LongLivedStreamRecordedService.LongLivedPubSubService.PubsubMu.RLock() + topic := ix.LongLivedStreamRecordedService.LongLivedPubSubService.LongLivedPubSubs[TopicNameIndex] + ix.LongLivedStreamRecordedService.LongLivedPubSubService.PubsubMu.RUnlock() + if topic == nil { + return + } + evt := NameIndexEvent{Action: action, Name: name, PeerID: peerID, DID: did} + b, err := json.Marshal(evt) + if err != nil { + return + } + _ = topic.Publish(context.Background(), b) +} + +// LookupNameIndex searches the distributed name index for peers whose name +// contains needle (case-insensitive). Returns peerID → DID for matched peers. +// Returns nil if the name index is not initialised (e.g. 
native indexers). +func (ix *IndexerService) LookupNameIndex(needle string) map[string]string { + if ix.nameIndex == nil { + return nil + } + result := map[string]string{} + needleLow := strings.ToLower(needle) + ix.nameIndex.indexMu.RLock() + defer ix.nameIndex.indexMu.RUnlock() + for name, peers := range ix.nameIndex.index { + if strings.Contains(strings.ToLower(name), needleLow) { + for peerID, did := range peers { + result[peerID] = did + } + } + } + return result +} diff --git a/daemons/node/indexer/native.go b/daemons/node/indexer/native.go index 8dabb0c..880088c 100644 --- a/daemons/node/indexer/native.go +++ b/daemons/node/indexer/native.go @@ -4,7 +4,10 @@ import ( "context" "encoding/json" "errors" + "fmt" "math/rand" + "slices" + "strings" "sync" "time" @@ -12,19 +15,24 @@ import ( oclib "cloud.o-forge.io/core/oc-lib" pubsub "github.com/libp2p/go-libp2p-pubsub" - "github.com/libp2p/go-libp2p/core/host" "github.com/libp2p/go-libp2p/core/network" pp "github.com/libp2p/go-libp2p/core/peer" ) const ( - // IndexerTTL is 10% above the recommended 60s heartbeat interval. - IndexerTTL = 66 * time.Second + // IndexerTTL is the lifetime of a live-indexer cache entry. Set to 50% above + // the recommended 60s heartbeat interval so a single delayed renewal does not + // evict a healthy indexer from the native's cache. + IndexerTTL = 90 * time.Second // offloadInterval is how often the native checks if it can release responsible peers. offloadInterval = 30 * time.Second // dhtRefreshInterval is how often the background goroutine queries the DHT for // known-but-expired indexer entries (written by neighbouring natives). dhtRefreshInterval = 30 * time.Second + // maxFallbackPeers caps how many peers the native will accept in self-delegation + // mode. Beyond this limit the native refuses to act as a fallback indexer so it + // is not overwhelmed during prolonged indexer outages. 
+ maxFallbackPeers = 50 ) // liveIndexerEntry tracks a registered indexer in the native's in-memory cache and DHT. @@ -43,7 +51,7 @@ type NativeState struct { // knownPeerIDs accumulates all indexer PeerIDs ever seen (local stream or gossip). // Used by refreshIndexersFromDHT to re-hydrate expired entries from the shared DHT, // including entries written by other natives. - knownPeerIDs map[string]struct{} + knownPeerIDs map[string]string knownMu sync.RWMutex } @@ -51,7 +59,7 @@ func newNativeState() *NativeState { return &NativeState{ liveIndexers: map[string]*liveIndexerEntry{}, responsiblePeers: map[pp.ID]struct{}{}, - knownPeerIDs: map[string]struct{}{}, + knownPeerIDs: map[string]string{}, } } @@ -92,10 +100,12 @@ func (v IndexerRecordValidator) Select(_ string, values [][]byte) (int, error) { // Must be called after DHT is initialized. func (ix *IndexerService) InitNative() { ix.Native = newNativeState() - ix.Host.SetStreamHandler(common.ProtocolIndexerHeartbeat, ix.HandleNodeHeartbeat) // specific heartbeat for Indexer. + ix.Host.SetStreamHandler(common.ProtocolHeartbeat, ix.HandleHeartbeat) // specific heartbeat for Indexer. ix.Host.SetStreamHandler(common.ProtocolNativeSubscription, ix.handleNativeSubscription) ix.Host.SetStreamHandler(common.ProtocolNativeGetIndexers, ix.handleNativeGetIndexers) ix.Host.SetStreamHandler(common.ProtocolNativeConsensus, ix.handleNativeConsensus) + ix.Host.SetStreamHandler(common.ProtocolNativeGetPeers, ix.handleNativeGetPeers) + ix.Host.SetStreamHandler(common.ProtocolIndexerGetNatives, ix.handleGetNatives) ix.subscribeIndexerRegistry() // Ensure long connections to other configured natives (native-to-native mesh). common.EnsureNativePeers(ix.Host) @@ -107,8 +117,15 @@ func (ix *IndexerService) InitNative() { // registered indexer PeerIDs to one another, enabling cross-native DHT discovery. 
func (ix *IndexerService) subscribeIndexerRegistry() { logger := oclib.GetLogger() - ix.PS.RegisterTopicValidator(common.TopicIndexerRegistry, func(_ context.Context, _ pp.ID, _ *pubsub.Message) bool { - return true + ix.PS.RegisterTopicValidator(common.TopicIndexerRegistry, func(_ context.Context, _ pp.ID, msg *pubsub.Message) bool { + // Reject empty or syntactically invalid multiaddrs before they reach the + // message loop. A compromised native could otherwise gossip arbitrary data. + addr := string(msg.Data) + if addr == "" { + return false + } + _, err := pp.AddrInfoFromString(addr) + return err == nil }) topic, err := ix.PS.Join(common.TopicIndexerRegistry) if err != nil { @@ -130,29 +147,38 @@ if err != nil { return } - peerID := string(msg.Data) - if peerID == "" { + addr := string(msg.Data) + if addr == "" { continue } + // A neighbouring native registered this peer; add it to the known set for DHT refresh. + if peer, err := pp.AddrInfoFromString(addr); err == nil { + ix.Native.knownMu.Lock() + ix.Native.knownPeerIDs[peer.ID.String()] = addr + ix.Native.knownMu.Unlock() + } - // A neighbouring native registered this PeerID; add to known set for DHT refresh. - ix.Native.knownMu.Lock() - ix.Native.knownPeerIDs[peerID] = struct{}{} - ix.Native.knownMu.Unlock() } }() } -// handleNativeSubscription stores an indexer's alive registration in the DHT cache. +// handleNativeSubscription stores an indexer's alive registration in the local cache +// immediately, then persists it to the DHT asynchronously. // The stream is temporary: indexer sends one IndexerRegistration and closes.
func (ix *IndexerService) handleNativeSubscription(s network.Stream) { defer s.Close() logger := oclib.GetLogger() var reg common.IndexerRegistration if err := json.NewDecoder(s).Decode(&reg); err != nil { logger.Err(err).Msg("native subscription: decode") return } if reg.Addr == "" { logger.Error().Msg("native subscription: missing addr") return @@ -166,30 +192,23 @@ func (ix *IndexerService) handleNativeSubscription(s network.Stream) { reg.PeerID = ad.ID.String() } - expiry := time.Now().UTC().Add(IndexerTTL) + // Build the entry with a fresh TTL before the cache write, so the IndexerTTL + // window is not consumed by DHT retries. entry := &liveIndexerEntry{ PeerID: reg.PeerID, Addr: reg.Addr, - ExpiresAt: expiry, + ExpiresAt: time.Now().UTC().Add(IndexerTTL), } - // Persist in DHT with 66s TTL. - key := ix.genIndexerKey(reg.PeerID) - if data, err := json.Marshal(entry); err == nil { - ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) - if err := ix.DHT.PutValue(ctx, key, data); err != nil { - logger.Err(err).Msg("native subscription: DHT put") - } - cancel() - } - - // Update local cache and known set. + // Update local cache and known set immediately so concurrent GetIndexers calls + // can already see this indexer without waiting for the DHT write to complete. ix.Native.liveIndexersMu.Lock() + _, isRenewal := ix.Native.liveIndexers[reg.PeerID] ix.Native.liveIndexers[reg.PeerID] = entry ix.Native.liveIndexersMu.Unlock() ix.Native.knownMu.Lock() - ix.Native.knownPeerIDs[reg.PeerID] = struct{}{} + ix.Native.knownPeerIDs[reg.PeerID] = reg.Addr ix.Native.knownMu.Unlock() // Gossip PeerID to neighbouring natives so they discover it via DHT.
@@ -197,16 +216,46 @@ func (ix *IndexerService) handleNativeSubscription(s network.Stream) { topic := ix.LongLivedPubSubs[common.TopicIndexerRegistry] ix.PubsubMu.RUnlock() if topic != nil { - if err := topic.Publish(context.Background(), []byte(reg.PeerID)); err != nil { + if err := topic.Publish(context.Background(), []byte(reg.Addr)); err != nil { logger.Err(err).Msg("native subscription: registry gossip publish") } } - logger.Info().Str("peer", reg.PeerID).Msg("native: indexer registered") + // Read the live count under the lock; an unguarded len() would race with + // concurrent registrations. + ix.Native.liveIndexersMu.RLock() + live := len(ix.Native.liveIndexers) + ix.Native.liveIndexersMu.RUnlock() + if isRenewal { + logger.Debug().Str("peer", reg.PeerID).Int("live", live).Msg("native: indexer TTL renewed") + } else { + logger.Info().Str("peer", reg.PeerID).Int("live", live).Msg("native: indexer registered") + } + + // Persist in DHT asynchronously; retries must not block the handler or consume + // the local cache TTL. + key := ix.genIndexerKey(reg.PeerID) + data, err := json.Marshal(entry) + if err != nil { + logger.Err(err).Msg("native subscription: marshal entry") + return + } + go func() { + for { + ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) + if err := ix.DHT.PutValue(ctx, key, data); err != nil { + cancel() + logger.Err(err).Str("key", key).Msg("native subscription: DHT put") + if strings.Contains(err.Error(), "failed to find any peer in table") { + time.Sleep(10 * time.Second) + continue + } + return + } + cancel() + return + } + }() } // handleNativeGetIndexers returns this native's own list of reachable indexers. -// If none are available, it self-delegates (becomes the fallback indexer for the caller). +// Self-delegation (native acting as temporary fallback indexer) is only permitted +// for nodes, never for peers that are themselves registered indexers in knownPeerIDs. // The consensus across natives is the responsibility of the requesting node/indexer.
func (ix *IndexerService) handleNativeGetIndexers(s network.Stream) { defer s.Close() @@ -220,14 +269,20 @@ func (ix *IndexerService) handleNativeGetIndexers(s network.Stream) { if req.Count <= 0 { req.Count = 3 } - - reachable := ix.reachableLiveIndexers() + callerPeerID := s.Conn().RemotePeer().String() + reachable := ix.reachableLiveIndexers(req.Count, callerPeerID) var resp common.GetIndexersResponse if len(reachable) == 0 { - // No indexers known: become temporary fallback for this caller. - ix.selfDelegate(s.Conn().RemotePeer(), &resp) - logger.Info().Str("peer", s.Conn().RemotePeer().String()).Msg("native: no indexers, acting as fallback") + // No live indexers reachable — try to self-delegate. + if ix.selfDelegate(s.Conn().RemotePeer(), &resp) { + logger.Info().Str("peer", callerPeerID).Msg("native: no indexers, acting as fallback for node") + } else { + // Fallback pool saturated: return empty so the caller retries another + // native instead of piling more load onto this one. + logger.Warn().Str("peer", callerPeerID).Int("pool", maxFallbackPeers).Msg( + "native: fallback pool saturated, refusing self-delegation") + } } else { rand.Shuffle(len(reachable), func(i, j int) { reachable[i], reachable[j] = reachable[j], reachable[i] }) if req.Count > len(reachable) { @@ -255,7 +310,7 @@ func (ix *IndexerService) handleNativeConsensus(s network.Stream) { return } - myList := ix.reachableLiveIndexers() + myList := ix.reachableLiveIndexers(-1, s.Conn().RemotePeer().String()) mySet := make(map[string]struct{}, len(myList)) for _, addr := range myList { mySet[addr] = struct{}{} @@ -285,31 +340,56 @@ func (ix *IndexerService) handleNativeConsensus(s network.Stream) { } // selfDelegate marks the caller as a responsible peer and exposes this native's own -// address as its temporary indexer. -func (ix *IndexerService) selfDelegate(remotePeer pp.ID, resp *common.GetIndexersResponse) { +// address as its temporary indexer. 
Returns false when the fallback pool is saturated +// (maxFallbackPeers reached) or the native has no listen address; the caller must +// return an empty response so the node retries later instead of pinning +// indefinitely to an overloaded native. +func (ix *IndexerService) selfDelegate(remotePeer pp.ID, resp *common.GetIndexersResponse) bool { ix.Native.responsibleMu.Lock() - ix.Native.responsiblePeers[remotePeer] = struct{}{} - ix.Native.responsibleMu.Unlock() - resp.IsSelfFallback = true - for _, a := range ix.Host.Addrs() { - resp.Indexers = []string{a.String() + "/p2p/" + ix.Host.ID().String()} - break + defer ix.Native.responsibleMu.Unlock() + if len(ix.Native.responsiblePeers) >= maxFallbackPeers { + return false } + // Check the address list before mutating any state; indexing Addrs() + // directly would panic when the host has no listen address yet. + addrs := ix.Host.Addrs() + if len(addrs) == 0 { + return false + } + ix.Native.responsiblePeers[remotePeer] = struct{}{} + resp.IsSelfFallback = true + resp.Indexers = []string{addrs[0].String() + "/p2p/" + ix.Host.ID().String()} + return true } // reachableLiveIndexers returns the multiaddrs of non-expired, pingable indexers // from the local cache (kept fresh by refreshIndexersFromDHT in background). -func (ix *IndexerService) reachableLiveIndexers() []string { +func (ix *IndexerService) reachableLiveIndexers(count int, from ...string) []string { ix.Native.liveIndexersMu.RLock() now := time.Now().UTC() candidates := []*liveIndexerEntry{} for _, e := range ix.Native.liveIndexers { - if e.ExpiresAt.After(now) { + if e.ExpiresAt.After(now) && !slices.Contains(from, e.PeerID) { candidates = append(candidates, e) } } ix.Native.liveIndexersMu.RUnlock() + + if (count > 0 && len(candidates) < count) || count < 0 { + ix.Native.knownMu.RLock() + for k, v := range ix.Native.knownPeerIDs { + // Include peers whose liveIndexers entry is absent OR expired.
+ // A non-nil but expired entry means the peer was once known but + // has since timed out; PeerIsAlive below will decide if it's back. + if slices.Contains(from, k) { + continue + } + // Skip peers already present as live candidates to avoid duplicates. + if slices.ContainsFunc(candidates, func(c *liveIndexerEntry) bool { return c.PeerID == k }) { + continue + } + candidates = append(candidates, &liveIndexerEntry{ + PeerID: k, + Addr: v, + }) + } + ix.Native.knownMu.RUnlock() + } + reachable := []string{} for _, e := range candidates { ad, err := pp.AddrInfoFromString(e.Addr) @@ -371,6 +451,12 @@ func (ix *IndexerService) refreshIndexersFromDHT() { ix.Native.liveIndexers[best.PeerID] = best ix.Native.liveIndexersMu.Unlock() logger.Info().Str("peer", best.PeerID).Msg("native: refreshed indexer from DHT") + } else { + // DHT has no fresh entry; the peer is gone, prune it from the known set. + ix.Native.knownMu.Lock() + delete(ix.Native.knownPeerIDs, pid) + ix.Native.knownMu.Unlock() + logger.Info().Str("peer", pid).Msg("native: pruned stale peer from knownPeerIDs") } } } @@ -387,30 +473,107 @@ func (ix *IndexerService) runOffloadLoop() { defer t.Stop() logger := oclib.GetLogger() for range t.C { ix.Native.responsibleMu.RLock() count := len(ix.Native.responsiblePeers) ix.Native.responsibleMu.RUnlock() if count == 0 { continue } - if len(ix.reachableLiveIndexers()) > 0 { - ix.Native.responsibleMu.Lock() - ix.Native.responsiblePeers = map[pp.ID]struct{}{} - ix.Native.responsibleMu.Unlock() + ix.Native.responsibleMu.RLock() + peerIDS := []string{} + for p := range ix.Native.responsiblePeers { + peerIDS = append(peerIDS, p.String()) + } + ix.Native.responsibleMu.RUnlock() + if len(ix.reachableLiveIndexers(-1, peerIDS...)) > 0 { + // Snapshot the map under the read lock: assigning the map aliases it, and + // ranging over the live map while other goroutines (selfDelegate) mutate + // it would be a data race. + ix.Native.responsibleMu.RLock() + released := make(map[pp.ID]struct{}, len(ix.Native.responsiblePeers)) + for p := range ix.Native.responsiblePeers { + released[p] = struct{}{} + } + ix.Native.responsibleMu.RUnlock() + + // Reset (not Close) heartbeat streams of released peers.
+ // Close() only half-closes the native's write direction — the peer's write + // direction stays open and sendHeartbeat never sees an error. + // Reset() abruptly terminates both directions, making the peer's next + // json.Encode return an error which triggers replenishIndexersFromNative. + ix.StreamMU.Lock() + if streams := ix.StreamRecords[common.ProtocolHeartbeat]; streams != nil { + for pid := range released { + if rec, ok := streams[pid]; ok { + if rec.HeartbeatStream != nil && rec.HeartbeatStream.Stream != nil { + rec.HeartbeatStream.Stream.Reset() + } + ix.Native.responsibleMu.Lock() + delete(ix.Native.responsiblePeers, pid) + ix.Native.responsibleMu.Unlock() + + delete(streams, pid) + logger.Info().Str("peer", pid.String()).Str("proto", string(common.ProtocolHeartbeat)).Msg( + "native: offload — stream reset, peer will reconnect to real indexer") + } else { + // No recorded heartbeat stream for this peer: either it never + // passed the score check (new peer, uptime=0 → score<75) or the + // stream was GC'd. We cannot send a Reset signal, so close the + // whole connection instead — this makes the peer's sendHeartbeat + // return an error, which triggers replenishIndexersFromNative and + // migrates it to a real indexer. + ix.Native.responsibleMu.Lock() + delete(ix.Native.responsiblePeers, pid) + ix.Native.responsibleMu.Unlock() + go ix.Host.Network().ClosePeer(pid) + logger.Info().Str("peer", pid.String()).Msg( + "native: offload — no heartbeat stream, closing connection so peer re-requests real indexers") + } + } + + } + ix.StreamMU.Unlock() + logger.Info().Int("released", count).Msg("native: offloaded responsible peers to real indexers") } } } +// handleNativeGetPeers returns a random selection of this native's known native +// contacts, excluding any in the request's Exclude list. 
+func (ix *IndexerService) handleNativeGetPeers(s network.Stream) { + defer s.Close() + logger := oclib.GetLogger() + + var req common.GetNativePeersRequest + if err := json.NewDecoder(s).Decode(&req); err != nil { + logger.Err(err).Msg("native get peers: decode") + return + } + if req.Count <= 0 { + req.Count = 1 + } + + excludeSet := make(map[string]struct{}, len(req.Exclude)) + for _, e := range req.Exclude { + excludeSet[e] = struct{}{} + } + + common.StreamNativeMu.RLock() + candidates := make([]string, 0, len(common.StaticNatives)) + for addr := range common.StaticNatives { + if _, excluded := excludeSet[addr]; !excluded { + candidates = append(candidates, addr) + } + } + common.StreamNativeMu.RUnlock() + + rand.Shuffle(len(candidates), func(i, j int) { candidates[i], candidates[j] = candidates[j], candidates[i] }) + if req.Count > len(candidates) { + req.Count = len(candidates) + } + + resp := common.GetNativePeersResponse{Peers: candidates[:req.Count]} + if err := json.NewEncoder(s).Encode(resp); err != nil { + logger.Err(err).Msg("native get peers: encode response") + } +} + // StartNativeRegistration starts a goroutine that periodically registers this // indexer with all configured native indexers (every RecommendedHeartbeatInterval). 
-func StartNativeRegistration(h host.Host, nativeAddressesStr string) { - go func() { - common.RegisterWithNative(h, nativeAddressesStr) - t := time.NewTicker(common.RecommendedHeartbeatInterval) - defer t.Stop() - for range t.C { - common.RegisterWithNative(h, nativeAddressesStr) - } - }() -} diff --git a/daemons/node/indexer/service.go b/daemons/node/indexer/service.go index 79d6f9f..708ff49 100644 --- a/daemons/node/indexer/service.go +++ b/daemons/node/indexer/service.go @@ -11,6 +11,7 @@ import ( pubsub "github.com/libp2p/go-libp2p-pubsub" record "github.com/libp2p/go-libp2p-record" "github.com/libp2p/go-libp2p/core/host" + pp "github.com/libp2p/go-libp2p/core/peer" ) // IndexerService manages the indexer node's state: stream records, DHT, pubsub. @@ -22,6 +23,7 @@ type IndexerService struct { mu sync.RWMutex IsNative bool Native *NativeState // non-nil when IsNative == true + nameIndex *nameIndexState } // NewIndexerService creates an IndexerService. @@ -43,22 +45,34 @@ func NewIndexerService(h host.Host, ps *pubsub.PubSub, maxNode int, isNative boo } ix.PS = ps - if ix.isStrictIndexer { + if ix.isStrictIndexer && !isNative { logger.Info().Msg("connect to indexers as strict indexer...") - common.ConnectToIndexers(h, 0, 5, ix.Host.ID()) + common.ConnectToIndexers(h, conf.GetConfig().MinIndexer, conf.GetConfig().MaxIndexer, ix.Host.ID()) logger.Info().Msg("subscribe to decentralized search flow as strict indexer...") - ix.SubscribeToSearch(ix.PS, nil) + go ix.SubscribeToSearch(ix.PS, nil) + } + + if !isNative { + logger.Info().Msg("init distributed name index...") + ix.initNameIndex(ps) + ix.LongLivedStreamRecordedService.AfterDelete = func(pid pp.ID, name, did string) { + ix.publishNameEvent(NameIndexDelete, name, pid.String(), did) + } } if ix.DHT, err = dht.New( context.Background(), ix.Host, dht.Mode(dht.ModeServer), + dht.ProtocolPrefix("oc"), // private network: isolate the DHT under the "oc" prefix dht.Validator(record.NamespacedValidator{ "node": PeerRecordValidator{}, "indexer":
IndexerRecordValidator{}, // for native indexer registry + "name": DefaultValidator{}, + "pid": DefaultValidator{}, }), ); err != nil { + logger.Err(err).Msg("indexer: DHT init") return nil } @@ -67,11 +81,10 @@ func NewIndexerService(h host.Host, ps *pubsub.PubSub, maxNode int, isNative boo ix.InitNative() } else { ix.initNodeHandler() - } - - // Register with configured natives so this indexer appears in their cache - if nativeAddrs := conf.GetConfig().NativeIndexerAddresses; nativeAddrs != "" { - StartNativeRegistration(ix.Host, nativeAddrs) + // Register with configured natives so this indexer appears in their cache + if nativeAddrs := conf.GetConfig().NativeIndexerAddresses; nativeAddrs != "" { + common.StartNativeRegistration(ix.Host, nativeAddrs) + } } return ix } @@ -79,6 +92,9 @@ func NewIndexerService(h host.Host, ps *pubsub.PubSub, maxNode int, isNative boo func (ix *IndexerService) Close() { ix.DHT.Close() ix.PS.UnregisterTopicValidator(common.TopicPubSubSearch) + if ix.nameIndex != nil { + ix.PS.UnregisterTopicValidator(TopicNameIndex) + } for _, s := range ix.StreamRecords { for _, ss := range s { ss.HeartbeatStream.Stream.Close() diff --git a/daemons/node/indexer/validator.go b/daemons/node/indexer/validator.go index 6a8c536..11d5371 100644 --- a/daemons/node/indexer/validator.go +++ b/daemons/node/indexer/validator.go @@ -6,6 +6,16 @@ import ( "time" ) +// DefaultValidator accepts any record and keeps the first value offered; used for +// the "name" and "pid" namespaces, whose records carry no signature to verify. +type DefaultValidator struct{} + +func (v DefaultValidator) Validate(key string, value []byte) error { + return nil +} + +func (v DefaultValidator) Select(key string, values [][]byte) (int, error) { + return 0, nil +} + type PeerRecordValidator struct{} func (v PeerRecordValidator) Validate(key string, value []byte) error { @@ -26,14 +36,7 @@ func (v PeerRecordValidator) Validate(key string, value []byte) error { } // Signature verification - rec2 := PeerRecord{ - Name: rec.Name, - DID: rec.DID, - PubKey: rec.PubKey, - PeerID: rec.PeerID, - } - - if _, err := rec2.Verify(); err != nil { + if _, err :=
rec.Verify(); err != nil { return errors.New("invalid signature") } diff --git a/daemons/node/nats.go b/daemons/node/nats.go index 1b1a0cc..4222acc 100644 --- a/daemons/node/nats.go +++ b/daemons/node/nats.go @@ -106,10 +107,10 @@ func ListenNATS(n *Node) { dtt := tools.DataType(propalgation.DataType) dt = &dtt } if err == nil { switch propalgation.Action { - case tools.PB_ADMIRALTY_CONFIG: - case tools.PB_MINIO_CONFIG: + case tools.PB_ADMIRALTY_CONFIG, tools.PB_MINIO_CONFIG: var m configPayload var proto protocol.ID = stream.ProtocolAdmiraltyConfigResource if propalgation.Action == tools.PB_MINIO_CONFIG { @@ -122,20 +123,17 @@ p.PeerID, proto, resp.Payload) } } - case tools.PB_CREATE: - case tools.PB_UPDATE: - case tools.PB_DELETE: + case tools.PB_CREATE, tools.PB_UPDATE, tools.PB_DELETE: n.StreamService.ToPartnerPublishEvent( context.Background(), propalgation.Action, dt, resp.User, propalgation.Payload, ) case tools.PB_CONSIDERS: switch resp.Datatype { - case tools.BOOKING: - case tools.PURCHASE_RESOURCE: - case tools.WORKFLOW_EXECUTION: + case tools.BOOKING, tools.PURCHASE_RESOURCE, tools.WORKFLOW_EXECUTION: var m executionConsidersPayload if err := json.Unmarshal(resp.Payload, &m); err == nil { for _, p := range m.PeerIDs { diff --git a/daemons/node/node.go b/daemons/node/node.go index 04d9a3a..10d2de7 100644 --- a/daemons/node/node.go +++ b/daemons/node/node.go @@ -2,10 +2,10 @@ package node import ( "context" - "crypto/sha256" "encoding/json" "errors" "fmt" + "maps" "oc-discovery/conf" "oc-discovery/daemons/node/common"
"oc-discovery/daemons/node/indexer" @@ -15,6 +15,7 @@ import ( "time" oclib "cloud.o-forge.io/core/oc-lib" + "cloud.o-forge.io/core/oc-lib/dbs" "cloud.o-forge.io/core/oc-lib/models/peer" "cloud.o-forge.io/core/oc-lib/tools" "github.com/google/uuid" @@ -33,6 +34,7 @@ type Node struct { StreamService *stream.StreamService PeerID pp.ID isIndexer bool + peerRecord *indexer.PeerRecord Mu sync.RWMutex } @@ -69,6 +71,9 @@ func InitNode(isNode bool, isIndexer bool, isNativeIndexer bool) (*Node, error) isIndexer: isIndexer, LongLivedStreamRecordedService: common.NewStreamRecordedService[interface{}](h, 1000), } + // Register the bandwidth probe handler so any peer measuring this node's + // throughput can open a dedicated probe stream and read the echo. + h.SetStreamHandler(common.ProtocolBandwidthProbe, common.HandleBandwidthProbe) var ps *pubsubs.PubSub if isNode { logger.Info().Msg("generate opencloud node...") @@ -77,8 +82,30 @@ func InitNode(isNode bool, isIndexer bool, isNativeIndexer bool) (*Node, error) panic(err) // can't run your node without a propalgation pubsub, of state of node. } node.PS = ps + // buildRecord returns a fresh signed PeerRecord as JSON, embedded in each + // heartbeat so the receiving indexer can republish it to the DHT directly. + // peerRecord is nil until claimInfo runs, so the first ~20s heartbeats carry + // no record — that's fine, claimInfo publishes once synchronously at startup. 
+ buildRecord := func() json.RawMessage { + if node.peerRecord == nil { + return nil + } + priv, err := tools.LoadKeyFromFilePrivate() + if err != nil { + return nil + } + fresh := *node.peerRecord + fresh.PeerRecordPayload.ExpiryDate = time.Now().UTC().Add(2 * time.Minute) + payload, _ := json.Marshal(fresh.PeerRecordPayload) + fresh.Signature, err = priv.Sign(payload) + if err != nil { + return nil + } + b, _ := json.Marshal(fresh) + return json.RawMessage(b) + } logger.Info().Msg("connect to indexers...") - common.ConnectToIndexers(node.Host, 0, 5, node.PeerID) // TODO : make var to change how many indexers are allowed. + common.ConnectToIndexers(node.Host, conf.GetConfig().MinIndexer, conf.GetConfig().MaxIndexer, node.PeerID, buildRecord) logger.Info().Msg("claims my node...") if _, err := node.claimInfo(conf.GetConfig().Name, conf.GetConfig().Hostname); err != nil { panic(err) @@ -100,14 +127,14 @@ func InitNode(isNode bool, isIndexer bool, isNativeIndexer bool) (*Node, error) } } node.SubscribeToSearch(node.PS, &f) + logger.Info().Msg("connect to NATS") + go ListenNATS(node) + logger.Info().Msg("Node is actually running.") } if isIndexer { logger.Info().Msg("generate opencloud indexer...") - node.IndexerService = indexer.NewIndexerService(node.Host, ps, 5, isNativeIndexer) + node.IndexerService = indexer.NewIndexerService(node.Host, ps, 500, isNativeIndexer) } - logger.Info().Msg("connect to NATS") - ListenNATS(node) - logger.Info().Msg("Node is actually running.") return node, nil } @@ -127,24 +154,29 @@ func (d *Node) publishPeerRecord( if err != nil { return err } + common.StreamMuIndexes.RLock() + indexerSnapshot := make([]*pp.AddrInfo, 0, len(common.StaticIndexers)) for _, ad := range common.StaticIndexers { + indexerSnapshot = append(indexerSnapshot, ad) + } + common.StreamMuIndexes.RUnlock() + + for _, ad := range indexerSnapshot { var err error if common.StreamIndexers, err = common.TempStream(d.Host, *ad, common.ProtocolPublish, "", 
common.StreamIndexers, map[protocol.ID]*common.ProtocolInfo{}, &common.StreamMuIndexes); err != nil { continue } stream := common.StreamIndexers[common.ProtocolPublish][ad.ID] - base := indexer.PeerRecord{ + base := indexer.PeerRecordPayload{ Name: rec.Name, DID: rec.DID, PubKey: rec.PubKey, ExpiryDate: time.Now().UTC().Add(2 * time.Minute), } payload, _ := json.Marshal(base) - hash := sha256.Sum256(payload) - - rec.ExpiryDate = base.ExpiryDate - rec.Signature, err = priv.Sign(hash[:]) + rec.PeerRecordPayload = base + rec.Signature, err = priv.Sign(payload) if err := json.NewEncoder(stream.Stream).Encode(&rec); err != nil { // then publish on stream return err } @@ -156,38 +188,50 @@ func (d *Node) GetPeerRecord( ctx context.Context, pidOrdid string, ) ([]*peer.Peer, error) { - did := pidOrdid // if known pidOrdid is did - pid := pidOrdid // if not known pidOrdid is pid - access := oclib.NewRequestAdmin(oclib.LibDataEnum(oclib.PEER), nil) - if data := access.Search(nil, did, true); len(data.Data) > 0 { - did = data.Data[0].GetID() - pid = data.Data[0].(*peer.Peer).PeerID - } var err error var info map[string]indexer.PeerRecord + common.StreamMuIndexes.RLock() + indexerSnapshot2 := make([]*pp.AddrInfo, 0, len(common.StaticIndexers)) for _, ad := range common.StaticIndexers { + indexerSnapshot2 = append(indexerSnapshot2, ad) + } + common.StreamMuIndexes.RUnlock() + + // Build the GetValue request: if pidOrdid is neither a UUID DID nor a libp2p + // PeerID, treat it as a human-readable name and let the indexer resolve it. + getReq := indexer.GetValue{Key: pidOrdid} + isNameSearch := false + if pidR, pidErr := pp.Decode(pidOrdid); pidErr == nil { + getReq.PeerID = pidR + } else if _, uuidErr := uuid.Parse(pidOrdid); uuidErr != nil { + // Not a UUID DID → treat pidOrdid as a name substring search. 
+ getReq.Name = pidOrdid + getReq.Key = "" + isNameSearch = true + } + + for _, ad := range indexerSnapshot2 { if common.StreamIndexers, err = common.TempStream(d.Host, *ad, common.ProtocolGet, "", common.StreamIndexers, map[protocol.ID]*common.ProtocolInfo{}, &common.StreamMuIndexes); err != nil { continue } - pidR, err := pp.Decode(pid) - if err != nil { + stream := common.StreamIndexers[common.ProtocolGet][ad.ID] + if err := json.NewEncoder(stream.Stream).Encode(getReq); err != nil { continue } - stream := common.StreamIndexers[common.ProtocolGet][ad.ID] - if err := json.NewEncoder(stream.Stream).Encode(indexer.GetValue{ - Key: did, - PeerID: pidR, - }); err != nil { - return nil, err + var resp indexer.GetResponse + if err := json.NewDecoder(stream.Stream).Decode(&resp); err != nil { + continue } - for { - var resp indexer.GetResponse - if err := json.NewDecoder(stream.Stream).Decode(&resp); err != nil { - return nil, err - } - if resp.Found { + if resp.Found { + if info == nil { info = resp.Records + } else { + // Aggregate results from all indexers for name searches. + maps.Copy(info, resp.Records) + } + // For exact lookups (PeerID / DID) stop at the first hit. 
+ if !isNameSearch { break } } @@ -196,7 +240,7 @@ for _, pr := range info { if pk, err := pr.Verify(); err != nil { return nil, err - } else if ok, p, err := pr.ExtractPeer(d.PeerID.String(), did, pk); err != nil { + } else if ok, p, err := pr.ExtractPeer(d.PeerID.String(), pr.PeerID, pk); err != nil { return nil, err } else { if ok { @@ -218,7 +262,11 @@ func (d *Node) claimInfo( } did := uuid.New().String() - peers := oclib.NewRequestAdmin(oclib.LibDataEnum(oclib.PEER), nil).Search(nil, fmt.Sprintf("%v", peer.SELF), false) + peers := oclib.NewRequestAdmin(oclib.LibDataEnum(oclib.PEER), nil).Search(&dbs.Filters{ + And: map[string][]dbs.Filter{ // exact match on our own libp2p peer ID + "peer_id": {{Operator: dbs.EQUAL.String(), Value: d.Host.ID().String()}}, + }, + }, "", false) if len(peers.Data) > 0 { did = peers.Data[0].GetID() // if already existing set up did as made } @@ -238,39 +286,38 @@ now := time.Now().UTC() expiry := now.Add(150 * time.Second) - rec := &indexer.PeerRecord{ - Name: name, - DID: did, // REAL PEER ID - PubKey: pubBytes, + pRec := indexer.PeerRecordPayload{ + Name: name, + DID: did, // UUID DID (reused when the peer already exists) + PubKey: pubBytes, + ExpiryDate: expiry, } - - rec.PeerID = d.Host.ID().String() d.PeerID = d.Host.ID() + payload, _ := json.Marshal(pRec) - payload, _ := json.Marshal(rec) - hash := sha256.Sum256(payload) - - rec.Signature, err = priv.Sign(hash[:]) + rec := &indexer.PeerRecord{ + PeerRecordPayload: pRec, + } + rec.Signature, err = priv.Sign(payload) if err != nil { return nil, err } - + rec.PeerID = d.Host.ID().String() rec.APIUrl = endPoint rec.StreamAddress = "/ip4/" + conf.GetConfig().Hostname + "/tcp/" + fmt.Sprintf("%v", conf.GetConfig().NodeEndpointPort) + "/p2p/" + rec.PeerID rec.NATSAddress = oclib.GetConfig().NATSUrl rec.WalletAddress = "my-wallet" - rec.ExpiryDate = expiry if err := d.publishPeerRecord(rec); err != nil { return nil, err } - /*if pk, err := rec.Verify(); err
!= nil { - fmt.Println("Verify") + d.peerRecord = rec + if _, err := rec.Verify(); err != nil { return nil, err - } else {*/ - _, p, err := rec.ExtractPeer(did, did, pub) - return p, err - //} + } else { + _, p, err := rec.ExtractPeer(did, did, pub) + return p, err + } } /* diff --git a/daemons/node/pubsub/publish.go b/daemons/node/pubsub/publish.go index 747ec20..d91d961 100644 --- a/daemons/node/pubsub/publish.go +++ b/daemons/node/pubsub/publish.go @@ -4,47 +4,56 @@ import ( "context" "encoding/json" "errors" + "oc-discovery/daemons/node/stream" "oc-discovery/models" - oclib "cloud.o-forge.io/core/oc-lib" + "cloud.o-forge.io/core/oc-lib/dbs" + "cloud.o-forge.io/core/oc-lib/models/peer" "cloud.o-forge.io/core/oc-lib/tools" ) func (ps *PubSubService) SearchPublishEvent( ctx context.Context, dt *tools.DataType, typ string, user string, search string) error { + b, err := json.Marshal(map[string]string{"search": search}) + if err != nil { + return err + } switch typ { case "known": // define Search Strategy - return ps.StreamService.SearchKnownPublishEvent(dt, user, search) //if partners focus only them*/ + return ps.StreamService.PublishesCommon(dt, user, &dbs.Filters{ // filter by like name, short_description, description, owner, url if no filters are provided + And: map[string][]dbs.Filter{ + "": {{Operator: dbs.NOT.String(), Value: dbs.Filters{ // filter by like name, short_description, description, owner, url if no filters are provided + And: map[string][]dbs.Filter{ + "relation": {{Operator: dbs.EQUAL.String(), Value: peer.BLACKLIST}}, + }, + }}}, + }, + }, b, stream.ProtocolSearchResource) //if partners focus only them*/ case "partner": // define Search Strategy - return ps.StreamService.SearchPartnersPublishEvent(dt, user, search) //if partners focus only them*/ + return ps.StreamService.PublishesCommon(dt, user, &dbs.Filters{ // filter by like name, short_description, description, owner, url if no filters are provided + And: map[string][]dbs.Filter{ + 
"relation": {{Operator: dbs.EQUAL.String(), Value: peer.PARTNER}}, + }, + }, b, stream.ProtocolSearchResource) case "all": // Gossip PubSub b, err := json.Marshal(map[string]string{"search": search}) if err != nil { return err } - return ps.searchPublishEvent(ctx, dt, user, b) + return ps.publishEvent(ctx, dt, tools.PB_SEARCH, user, b) default: return errors.New("no type of research found") } } -func (ps *PubSubService) searchPublishEvent( - ctx context.Context, dt *tools.DataType, user string, payload []byte) error { - return ps.publishEvent(ctx, dt, tools.PB_SEARCH, user, payload) -} - func (ps *PubSubService) publishEvent( ctx context.Context, dt *tools.DataType, action tools.PubSubAction, user string, payload []byte, ) error { - from, err := oclib.GenerateNodeID() - if err != nil { - return err - } priv, err := tools.LoadKeyFromFilePrivate() if err != nil { return err } - msg, _ := json.Marshal(models.NewEvent(action.String(), from, dt, user, payload, priv)) + msg, _ := json.Marshal(models.NewEvent(action.String(), ps.Host.ID().String(), dt, user, payload, priv)) topic, err := ps.PS.Join(action.String()) if err != nil { return err diff --git a/daemons/node/stream/handler.go b/daemons/node/stream/handler.go index c14be49..356b3c3 100644 --- a/daemons/node/stream/handler.go +++ b/daemons/node/stream/handler.go @@ -148,14 +150,6 @@ func (abs *StreamService) pass(event *common.Event, action tools.PubSubAction) e } func (ps *StreamService) handleEventFromPartner(evt *common.Event, protocol string) error { - resource, err := resources.ToResource(int(evt.DataType), evt.Payload) - if err != nil { - return err - } - b, err := json.Marshal(resource) - if err != nil { - return err - } switch protocol { case ProtocolSearchResource: if evt.DataType < 0 { @@ -169,20 +163,20 @@ func (ps *StreamService) handleEventFromPartner(evt *common.Event, protocol stri ps.SendResponse(p[0], evt) } } - case ProtocolCreateResource: - case ProtocolUpdateResource: + case ProtocolCreateResource, ProtocolUpdateResource: go tools.NewNATSCaller().SetNATSPub(tools.CREATE_RESOURCE, tools.NATSResponse{ FromApp: "oc-discovery", Datatype: tools.DataType(evt.DataType), Method: int(tools.CREATE_RESOURCE), - Payload: b, + Payload: evt.Payload, }) case ProtocolDeleteResource: go tools.NewNATSCaller().SetNATSPub(tools.REMOVE_RESOURCE, tools.NATSResponse{ FromApp: "oc-discovery", Datatype: tools.DataType(evt.DataType), Method: int(tools.REMOVE_RESOURCE), - Payload: b, + Payload: evt.Payload, }) default: return errors.New("no action authorized available : " + protocol) @@ -213,9 +207,9 @@ func (abs *StreamService) SendResponse(p *peer.Peer, event *common.Event) error if j, err := json.Marshal(ss); err == nil { if event.DataType != -1 { ndt := tools.DataType(dt.EnumIndex()) - abs.PublishResources(&ndt, event.User, peerID, j) + abs.PublishCommon(&ndt, event.User, peerID, ProtocolSearchResource, j) } else { - abs.PublishResources(nil, event.User, peerID, j) + abs.PublishCommon(nil, event.User, peerID, ProtocolSearchResource, j) } } } diff --git a/daemons/node/stream/publish.go b/daemons/node/stream/publish.go index 8643721..887aa3e 100644 --- a/daemons/node/stream/publish.go +++ b/daemons/node/stream/publish.go @@ -15,81 +15,45 @@ import ( "github.com/libp2p/go-libp2p/core/protocol" ) -func (ps *StreamService) PublishCommon(dt *tools.DataType, user string, toPeerID string, proto protocol.ID, resource []byte) (*common.Stream, error) { +func (ps *StreamService) PublishesCommon(dt *tools.DataType, user string, filter
*dbs.Filters, resource []byte, protos ...protocol.ID) error { access := oclib.NewRequestAdmin(oclib.LibDataEnum(oclib.PEER), nil) - p := access.LoadOne(toPeerID) - if p.Err != "" { - return nil, errors.New(p.Err) - } else { - ad, err := pp.AddrInfoFromString(p.Data.(*peer.Peer).StreamAddress) + p := access.Search(filter, "", false) + for _, pes := range p.Data { + for _, proto := range protos { + if _, err := ps.PublishCommon(dt, user, pes.(*peer.Peer).PeerID, proto, resource); err != nil { + return err + } + } + } + return nil +} + +func (ps *StreamService) PublishCommon(dt *tools.DataType, user string, toPeerID string, proto protocol.ID, resource []byte) (*common.Stream, error) { + fmt.Println("PublishCommon") + if toPeerID == ps.Key.String() { + return nil, errors.New("can't send to ourselves") + } + + access := oclib.NewRequestAdmin(oclib.LibDataEnum(oclib.PEER), nil) + p := access.Search(&dbs.Filters{ + And: map[string][]dbs.Filter{ // exact match on peer_id + "peer_id": {{Operator: dbs.EQUAL.String(), Value: toPeerID}}, + }, + }, toPeerID, false) + var pe *peer.Peer + if len(p.Data) > 0 && p.Data[0].(*peer.Peer).Relation != peer.BLACKLIST { + pe = p.Data[0].(*peer.Peer) + } else if pps, err := ps.Node.GetPeerRecord(context.Background(), toPeerID); err == nil && len(pps) > 0 { + pe = pps[0] + } + if pe != nil { + ad, err := pp.AddrInfoFromString(pe.StreamAddress) // use pe: p.Data may be empty when the record came from the DHT if err != nil { return nil, err } return ps.write(toPeerID, ad, dt, user, resource, proto) } -} - -func (ps *StreamService) PublishResources(dt *tools.DataType, user string, toPeerID string, resource []byte) error { - access := oclib.NewRequestAdmin(oclib.LibDataEnum(oclib.PEER), nil) - p := access.LoadOne(toPeerID) - if p.Err != "" { - return errors.New(p.Err) - } else { - ad, err := pp.AddrInfoFromString(p.Data.(*peer.Peer).StreamAddress) - if err != nil { - return err - } - ps.write(toPeerID, ad, dt, user, resource, ProtocolSearchResource) - } - return nil -}
- -func (ps *StreamService) SearchKnownPublishEvent(dt *tools.DataType, user string, search string) error { - access := oclib.NewRequestAdmin(oclib.LibDataEnum(oclib.PEER), nil) - peers := access.Search(&dbs.Filters{ // filter by like name, short_description, description, owner, url if no filters are provided - And: map[string][]dbs.Filter{ - "": {{Operator: dbs.NOT.String(), Value: dbs.Filters{ // filter by like name, short_description, description, owner, url if no filters are provided - And: map[string][]dbs.Filter{ - "relation": {{Operator: dbs.EQUAL.String(), Value: peer.BLACKLIST}}, - }, - }}}, - }, - }, search, false) - if peers.Err != "" { - return errors.New(peers.Err) - } else { - b, err := json.Marshal(map[string]string{"search": search}) - if err != nil { - return err - } - for _, p := range peers.Data { - ad, err := pp.AddrInfoFromString(p.(*peer.Peer).StreamAddress) - if err != nil { - continue - } - ps.write(p.GetID(), ad, dt, user, b, ProtocolSearchResource) - } - } - return nil -} - -func (ps *StreamService) SearchPartnersPublishEvent(dt *tools.DataType, user string, search string) error { - if peers, err := ps.searchPeer(fmt.Sprintf("%v", peer.PARTNER.EnumIndex())); err != nil { - return err - } else { - b, err := json.Marshal(map[string]string{"search": search}) - if err != nil { - return err - } - for _, p := range peers { - ad, err := pp.AddrInfoFromString(p.StreamAddress) - if err != nil { - continue - } - ps.write(p.GetID(), ad, dt, user, b, ProtocolSearchResource) - } - } - return nil + return nil, errors.New("invalid peer " + toPeerID) } func (ps *StreamService) ToPartnerPublishEvent( @@ -103,35 +67,44 @@ func (ps *StreamService) ToPartnerPublishEvent( if err != nil { return err } - ps.Mu.Lock() - defer ps.Mu.Unlock() - if p.Relation == peer.PARTNER { - if ps.Streams[ProtocolHeartbeatPartner] == nil { - ps.Streams[ProtocolHeartbeatPartner] = map[pp.ID]*common.Stream{} - } - ps.ConnectToPartner(p.StreamAddress) - } else if 
ps.Streams[ProtocolHeartbeatPartner] != nil && ps.Streams[ProtocolHeartbeatPartner][pid] != nil { - for _, pids := range ps.Streams { - if pids[pid] != nil { - delete(pids, pid) + + if pe, err := oclib.GetMySelf(); err != nil { + return err + } else if pe.GetID() == p.GetID() { + return fmt.Errorf("can't send to ourselves") + } else { + pe.Relation = p.Relation + pe.Verify = false + if b2, err := json.Marshal(pe); err == nil { + if _, err := ps.PublishCommon(dt, user, p.PeerID, ProtocolUpdateResource, b2); err != nil { + return err + } + if p.Relation == peer.PARTNER { + if ps.Streams[ProtocolHeartbeatPartner] == nil { + ps.Streams[ProtocolHeartbeatPartner] = map[pp.ID]*common.Stream{} + } + fmt.Println("SHOULD CONNECT") + ps.ConnectToPartner(p.StreamAddress) + } else if ps.Streams[ProtocolHeartbeatPartner] != nil && ps.Streams[ProtocolHeartbeatPartner][pid] != nil { + for _, pids := range ps.Streams { + if pids[pid] != nil { + delete(pids, pid) + } + } + } + } + } + return nil } - if peers, err := ps.searchPeer(fmt.Sprintf("%v", peer.PARTNER.EnumIndex())); err != nil { - return err - } else { - for _, p := range peers { - for protocol := range protocolsPartners { - ad, err := pp.AddrInfoFromString(p.StreamAddress) - if err != nil { - continue - } - ps.write(p.GetID(), ad, dt, user, payload, protocol) - } - } + ks := []protocol.ID{} + for k := range protocolsPartners { + ks = append(ks, k) } + if err := ps.PublishesCommon(dt, user, &dbs.Filters{ // restrict to PARTNER peers + And: map[string][]dbs.Filter{ + "relation": {{Operator: dbs.EQUAL.String(), Value: peer.PARTNER}}, + }, + }, payload, ks...); err != nil { + return err + } 
return nil } @@ -158,6 +131,7 @@ func (s *StreamService) write( } stream := s.Streams[proto][peerID.ID] evt := common.NewEvent(string(proto), peerID.ID.String(), dt, user, payload) + fmt.Println("SEND EVENT ", evt.From, evt.DataType, evt.Timestamp) if err := json.NewEncoder(stream.Stream).Encode(evt); err != nil { stream.Stream.Close() logger.Err(err) diff --git a/daemons/node/stream/service.go b/daemons/node/stream/service.go index 53378f4..18bd5d0 100644 --- a/daemons/node/stream/service.go +++ b/daemons/node/stream/service.go @@ -116,7 +116,7 @@ func (s *StreamService) HandlePartnerHeartbeat(stream network.Stream) { streamsAnonym[k] = v } s.Mu.Unlock() - pid, hb, err := common.CheckHeartbeat(s.Host, stream, streamsAnonym, &s.Mu, s.maxNodesConn) + pid, hb, err := common.CheckHeartbeat(s.Host, stream, json.NewDecoder(stream), streamsAnonym, &s.Mu, s.maxNodesConn) if err != nil { return } @@ -132,10 +132,12 @@ func (s *StreamService) HandlePartnerHeartbeat(stream network.Stream) { s.ConnectToPartner(val) } } - go s.StartGC(30 * time.Second) + // GC is already running via InitStream — starting a new ticker goroutine on + // every heartbeat would leak an unbounded number of goroutines. 
} func (s *StreamService) connectToPartners() error { + logger := oclib.GetLogger() for proto, info := range protocolsPartners { f := func(ss network.Stream) { if s.Streams[proto] == nil { @@ -147,11 +149,12 @@ func (s *StreamService) connectToPartners() error { } go s.readLoop(s.Streams[proto][ss.Conn().RemotePeer()], ss.Conn().RemotePeer(), proto, info) } - fmt.Println("SetStreamHandler", proto) + logger.Info().Msg("SetStreamHandler " + string(proto)) s.Host.SetStreamHandler(proto, f) } peers, err := s.searchPeer(fmt.Sprintf("%v", peer.PARTNER.EnumIndex())) if err != nil { + logger.Err(err) return err } for _, p := range peers { @@ -161,19 +164,19 @@ func (s *StreamService) connectToPartners() error { } func (s *StreamService) ConnectToPartner(address string) { + logger := oclib.GetLogger() if ad, err := pp.AddrInfoFromString(address); err == nil { + logger.Info().Msg("Connect to Partner " + ProtocolHeartbeatPartner + " " + address) common.SendHeartbeat(context.Background(), ProtocolHeartbeatPartner, conf.GetConfig().Name, - s.Host, s.Streams, map[string]*pp.AddrInfo{address: ad}, 20*time.Second) + s.Host, s.Streams, map[string]*pp.AddrInfo{address: ad}, nil, 20*time.Second) } } func (s *StreamService) searchPeer(search string) ([]*peer.Peer, error) { - /* TODO FOR TEST ONLY A VARS THAT DEFINE ADDRESS... 
deserialize */ ps := []*peer.Peer{} if conf.GetConfig().PeerIDS != "" { for _, peerID := range strings.Split(conf.GetConfig().PeerIDS, ",") { ppID := strings.Split(peerID, "/") - fmt.Println(ppID, peerID) ps = append(ps, &peer.Peer{ AbstractObject: utils.AbstractObject{ UUID: uuid.New().String(), @@ -185,7 +188,6 @@ func (s *StreamService) searchPeer(search string) ([]*peer.Peer, error) { }) } } - access := oclib.NewRequestAdmin(oclib.LibDataEnum(oclib.PEER), nil) peers := access.Search(nil, search, false) for _, p := range peers.Data { @@ -252,8 +254,9 @@ func (ps *StreamService) readLoop(s *common.Stream, id pp.ID, proto protocol.ID, } var evt common.Event if err := json.NewDecoder(s.Stream).Decode(&evt); err != nil { - s.Stream.Close() - continue + // Any decode error (EOF, reset, malformed JSON) terminates the loop; + // continuing on a dead/closed stream creates an infinite spin. + return } ps.handleEvent(evt.Type, &evt) if protocolInfo.WaitResponse && !protocolInfo.PersistantStream { diff --git a/demo-discovery.sh b/demo-discovery.sh index 4b4412a..39cc75c 100755 --- a/demo-discovery.sh +++ b/demo-discovery.sh @@ -1,23 +1,33 @@ #!/bin/bash - IMAGE_BASE_NAME="oc-discovery" DOCKERFILE_PATH="." 
-for i in {0..3}; do +docker network create \ + --subnet=172.40.0.0/24 \ + discovery + +for i in $(seq ${1:-0} ${2:-3}); do NUM=$((i + 1)) PORT=$((4000 + $NUM)) IMAGE_NAME="${IMAGE_BASE_NAME}:${NUM}" + echo "▶ Building image ${IMAGE_BASE_NAME}_${NUM} with CONF_NUM=${NUM}" docker build \ --build-arg CONF_NUM=${NUM} \ - -t ${IMAGE_NAME} \ + -t "${IMAGE_BASE_NAME}_${NUM}" \ ${DOCKERFILE_PATH} + docker kill "${IMAGE_BASE_NAME}_${NUM}" || true + docker rm "${IMAGE_BASE_NAME}_${NUM}" || true + echo "▶ Running container ${IMAGE_BASE_NAME}_${NUM} on port ${PORT}:${PORT}" docker run -d \ + --network="${3:-oc}" \ -p ${PORT}:${PORT} \ --name "${IMAGE_BASE_NAME}_${NUM}" \ - ${IMAGE_NAME} + "${IMAGE_BASE_NAME}_${NUM}" + + docker network connect --ip "172.40.0.${NUM}" discovery "${IMAGE_BASE_NAME}_${NUM}" done \ No newline at end of file diff --git a/docker_discovery10.json b/docker_discovery10.json new file mode 100644 index 0000000..df06bbd --- /dev/null +++ b/docker_discovery10.json @@ -0,0 +1,10 @@ +{ + "MONGO_URL":"mongodb://mongo:27017/", + "MONGO_DATABASE":"DC_myDC", + "NATS_URL": "nats://nats:4222", + "NODE_MODE": "node", + "NODE_ENDPOINT_PORT": 4010, + "NATIVE_INDEXER_ADDRESSES": "/ip4/172.40.0.5/tcp/4005/p2p/12D3KooWGn3j4XqTSrjJDGGpTQERdDV5TPZdhQp87rAUnvQssvQu", + "MIN_INDEXER": 2, + "PEER_IDS": "/ip4/172.40.0.9/tcp/4009/p2p/12D3KooWGnQfKwX9E4umCPE8dUKZuig4vw5BndDowRLEbGmcZyta" +} \ No newline at end of file diff --git a/docker_discovery2.json b/docker_discovery2.json index 0f19bfb..12ab3a6 100644 --- a/docker_discovery2.json +++ b/docker_discovery2.json @@ -4,5 +4,5 @@ "NATS_URL": "nats://nats:4222", "NODE_MODE": "indexer", "NODE_ENDPOINT_PORT": 4002, - "INDEXER_ADDRESSES": "/ip4/172.19.0.2/tcp/4001/p2p/12D3KooWGn3j4XqTSrjJDGGpTQERdDV5TPZdhQp87rAUnvQssvQu" + "INDEXER_ADDRESSES": "/ip4/172.40.0.1/tcp/4001/p2p/12D3KooWGn3j4XqTSrjJDGGpTQERdDV5TPZdhQp87rAUnvQssvQu" } \ No newline at end of file diff --git a/docker_discovery3.json b/docker_discovery3.json index de50e4a..89649bc 100644 --- 
a/docker_discovery3.json +++ b/docker_discovery3.json @@ -4,5 +4,5 @@ "NATS_URL": "nats://nats:4222", "NODE_MODE": "node", "NODE_ENDPOINT_PORT": 4003, - "INDEXER_ADDRESSES": "/ip4/172.19.0.3/tcp/4002/p2p/12D3KooWC3GNStak8KCYtJq11Dxiq45EJV53z1ZvKetMcZBeBX6u" + "INDEXER_ADDRESSES": "/ip4/172.40.0.2/tcp/4002/p2p/12D3KooWC3GNStak8KCYtJq11Dxiq45EJV53z1ZvKetMcZBeBX6u" } \ No newline at end of file diff --git a/docker_discovery4.json b/docker_discovery4.json index eeb4ba9..9fe4eab 100644 --- a/docker_discovery4.json +++ b/docker_discovery4.json @@ -4,6 +4,6 @@ "NATS_URL": "nats://nats:4222", "NODE_MODE": "node", "NODE_ENDPOINT_PORT": 4004, - "INDEXER_ADDRESSES": "/ip4/172.19.0.2/tcp/4001/p2p/12D3KooWGn3j4XqTSrjJDGGpTQERdDV5TPZdhQp87rAUnvQssvQu", - "PEER_IDS": "/ip4/172.19.0.4/tcp/4003/p2p/12D3KooWBh9kZrekBAE5G33q4jCLNRAzygem3gP1mMdK8mhoCTaw" + "INDEXER_ADDRESSES": "/ip4/172.40.0.1/tcp/4001/p2p/12D3KooWGn3j4XqTSrjJDGGpTQERdDV5TPZdhQp87rAUnvQssvQu", + "PEER_IDS": "/ip4/172.40.0.3/tcp/4003/p2p/12D3KooWBh9kZrekBAE5G33q4jCLNRAzygem3gP1mMdK8mhoCTaw" } diff --git a/docker_discovery5.json b/docker_discovery5.json new file mode 100644 index 0000000..1adaac2 --- /dev/null +++ b/docker_discovery5.json @@ -0,0 +1,7 @@ +{ + "MONGO_URL":"mongodb://mongo:27017/", + "MONGO_DATABASE":"DC_myDC", + "NATS_URL": "nats://nats:4222", + "NODE_MODE": "native-indexer", + "NODE_ENDPOINT_PORT": 4005 +} diff --git a/docker_discovery6.json b/docker_discovery6.json new file mode 100644 index 0000000..f4c7665 --- /dev/null +++ b/docker_discovery6.json @@ -0,0 +1,8 @@ +{ + "MONGO_URL":"mongodb://mongo:27017/", + "MONGO_DATABASE":"DC_myDC", + "NATS_URL": "nats://nats:4222", + "NODE_MODE": "native-indexer", + "NODE_ENDPOINT_PORT": 4006, + "NATIVE_INDEXER_ADDRESSES": "/ip4/172.40.0.5/tcp/4005/p2p/12D3KooWGn3j4XqTSrjJDGGpTQERdDV5TPZdhQp87rAUnvQssvQu" +} diff --git a/docker_discovery7.json b/docker_discovery7.json new file mode 100644 index 0000000..e93e1c6 --- /dev/null +++ b/docker_discovery7.json @@ -0,0 
+1,8 @@ +{ + "MONGO_URL":"mongodb://mongo:27017/", + "MONGO_DATABASE":"DC_myDC", + "NATS_URL": "nats://nats:4222", + "NODE_MODE": "indexer", + "NODE_ENDPOINT_PORT": 4007, + "NATIVE_INDEXER_ADDRESSES": "/ip4/172.40.0.6/tcp/4006/p2p/12D3KooWC3GNStak8KCYtJq11Dxiq45EJV53z1ZvKetMcZBeBX6u" +} \ No newline at end of file diff --git a/docker_discovery8.json b/docker_discovery8.json new file mode 100644 index 0000000..dd5cd29 --- /dev/null +++ b/docker_discovery8.json @@ -0,0 +1,8 @@ +{ + "MONGO_URL":"mongodb://mongo:27017/", + "MONGO_DATABASE":"DC_myDC", + "NATS_URL": "nats://nats:4222", + "NODE_MODE": "indexer", + "NODE_ENDPOINT_PORT": 4008, + "NATIVE_INDEXER_ADDRESSES": "/ip4/172.40.0.5/tcp/4005/p2p/12D3KooWGn3j4XqTSrjJDGGpTQERdDV5TPZdhQp87rAUnvQssvQu" +} \ No newline at end of file diff --git a/docker_discovery9.json b/docker_discovery9.json new file mode 100644 index 0000000..a2ceb14 --- /dev/null +++ b/docker_discovery9.json @@ -0,0 +1,8 @@ +{ + "MONGO_URL":"mongodb://mongo:27017/", + "MONGO_DATABASE":"DC_myDC", + "NATS_URL": "nats://nats:4222", + "NODE_MODE": "node", + "NODE_ENDPOINT_PORT": 4009, + "NATIVE_INDEXER_ADDRESSES": "/ip4/172.40.0.6/tcp/4006/p2p/12D3KooWC3GNStak8KCYtJq11Dxiq45EJV53z1ZvKetMcZBeBX6u,/ip4/172.40.0.5/tcp/4005/p2p/12D3KooWGn3j4XqTSrjJDGGpTQERdDV5TPZdhQp87rAUnvQssvQu" +} \ No newline at end of file diff --git a/go.mod b/go.mod index 3bf0704..dc59d05 100644 --- a/go.mod +++ b/go.mod @@ -3,7 +3,7 @@ module oc-discovery go 1.25.0 require ( - cloud.o-forge.io/core/oc-lib v0.0.0-20260224130821-ce8ef70516f7 + cloud.o-forge.io/core/oc-lib v0.0.0-20260302152414-542b0b73aba5 github.com/libp2p/go-libp2p v0.47.0 github.com/libp2p/go-libp2p-record v0.3.1 github.com/multiformats/go-multiaddr v0.16.1 diff --git a/go.sum b/go.sum index baadcec..703622a 100644 --- a/go.sum +++ b/go.sum @@ -1,5 +1,13 @@ cloud.o-forge.io/core/oc-lib v0.0.0-20260224130821-ce8ef70516f7 h1:p9uJjMY+QkE4neA+xRmIRtAm9us94EKZqgajDdLOd0Y= cloud.o-forge.io/core/oc-lib 
v0.0.0-20260224130821-ce8ef70516f7/go.mod h1:+ENuvBfZdESSvecoqGY/wSvRlT3vinEolxKgwbOhUpA= +cloud.o-forge.io/core/oc-lib v0.0.0-20260226084851-959fce48ef6c h1:FTUu9tdEfib6J+fuc7e5wYTe++EIlB70bVNpOeFjnyU= +cloud.o-forge.io/core/oc-lib v0.0.0-20260226084851-959fce48ef6c/go.mod h1:+ENuvBfZdESSvecoqGY/wSvRlT3vinEolxKgwbOhUpA= +cloud.o-forge.io/core/oc-lib v0.0.0-20260226085754-f4e2d8057df0 h1:lvrRF4ToIMl/5k1q4AiPEy6ycjwRtOaDhWnQ/LrW1ZA= +cloud.o-forge.io/core/oc-lib v0.0.0-20260226085754-f4e2d8057df0/go.mod h1:+ENuvBfZdESSvecoqGY/wSvRlT3vinEolxKgwbOhUpA= +cloud.o-forge.io/core/oc-lib v0.0.0-20260226091217-cb3771c17a31 h1:hvkvJibS9NmImw73j79Ov5VpIYs4WbP4SYGlK/XO82Q= +cloud.o-forge.io/core/oc-lib v0.0.0-20260226091217-cb3771c17a31/go.mod h1:+ENuvBfZdESSvecoqGY/wSvRlT3vinEolxKgwbOhUpA= +cloud.o-forge.io/core/oc-lib v0.0.0-20260302152414-542b0b73aba5 h1:h+Fkyj6cfwAirc0QGCBEkZSSrgcyThXswg7ytOLm948= +cloud.o-forge.io/core/oc-lib v0.0.0-20260302152414-542b0b73aba5/go.mod h1:+ENuvBfZdESSvecoqGY/wSvRlT3vinEolxKgwbOhUpA= github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU= github.com/Masterminds/semver/v3 v3.4.0 h1:Zog+i5UMtVoCU8oKka5P7i9q9HgrJeGzI9SA1Xbatp0= github.com/Masterminds/semver/v3 v3.4.0/go.mod h1:4V+yj/TJE1HU9XfppCwVMZq3I84lprf4nC11bSS5beM= diff --git a/main.go b/main.go index c20dc83..6354b77 100644 --- a/main.go +++ b/main.go @@ -28,11 +28,15 @@ func main() { conf.GetConfig().PSKPath = o.GetStringDefault("PSK_PATH", "./psk/psk.key") conf.GetConfig().NodeEndpointPort = o.GetInt64Default("NODE_ENDPOINT_PORT", 4001) conf.GetConfig().IndexerAddresses = o.GetStringDefault("INDEXER_ADDRESSES", "") + conf.GetConfig().NativeIndexerAddresses = o.GetStringDefault("NATIVE_INDEXER_ADDRESSES", "") conf.GetConfig().PeerIDS = o.GetStringDefault("PEER_IDS", "") conf.GetConfig().NodeMode = o.GetStringDefault("NODE_MODE", "node") + conf.GetConfig().MinIndexer = o.GetIntDefault("MIN_INDEXER", 1) + conf.GetConfig().MaxIndexer = 
o.GetIntDefault("MAX_INDEXER", 5) + ctx, stop := signal.NotifyContext( context.Background(), os.Interrupt, @@ -47,7 +51,7 @@ func main() { if n, err := node.InitNode(isNode, isIndexer, isNativeIndexer); err != nil { panic(err) } else { - <-ctx.Done() // 👈 the only blocking point + <-ctx.Done() // the only blocking point log.Println("shutting down") n.Close() } diff --git a/pem/private10.pem b/pem/private10.pem new file mode 100644 index 0000000..3e4d2bf --- /dev/null +++ b/pem/private10.pem @@ -0,0 +1,3 @@ +-----BEGIN PRIVATE KEY----- +MC4CAQAwBQYDK2VwBCIEIPc7D3Mgb1U2Ipyb/85hA4Ew7dC8zHDEuQYSjqzzRgLK +-----END PRIVATE KEY----- diff --git a/pem/private5.pem b/pem/private5.pem new file mode 100644 index 0000000..c7b53e8 --- /dev/null +++ b/pem/private5.pem @@ -0,0 +1,3 @@ +-----BEGIN PRIVATE KEY----- +MC4CAQAwBQYDK2VwBCIEIK2oBaOtGNchE09MBRtPd5oEOUcVUQG2ndym5wKExj7R +-----END PRIVATE KEY----- diff --git a/pem/private6.pem b/pem/private6.pem new file mode 100644 index 0000000..1b70fcd --- /dev/null +++ b/pem/private6.pem @@ -0,0 +1,3 @@ +-----BEGIN PRIVATE KEY----- +MC4CAQAwBQYDK2VwBCIEIE58GDazCyF1jp796ivSmHiCepbkC8TpzliIaQ7eGEpu +-----END PRIVATE KEY----- diff --git a/pem/private7.pem b/pem/private7.pem new file mode 100644 index 0000000..06e96ed --- /dev/null +++ b/pem/private7.pem @@ -0,0 +1,3 @@ +-----BEGIN PRIVATE KEY----- +MC4CAQAwBQYDK2VwBCIEIAeX4O7ldwehRSnPkbzuE6csyo63vjvqAcNNujENOKUC +-----END PRIVATE KEY----- diff --git a/pem/private8.pem b/pem/private8.pem new file mode 100644 index 0000000..fb35164 --- /dev/null +++ b/pem/private8.pem @@ -0,0 +1,3 @@ +-----BEGIN PRIVATE KEY----- +MC4CAQAwBQYDK2VwBCIEIEkgqINXDLnxIJZs2LEK9O4vdsqk43dwbULGUE25AWuR +-----END PRIVATE KEY----- diff --git a/pem/private9.pem b/pem/private9.pem new file mode 100644 index 0000000..9614102 --- /dev/null +++ b/pem/private9.pem @@ -0,0 +1,3 @@ +-----BEGIN PRIVATE KEY----- +MC4CAQAwBQYDK2VwBCIEIBcflxGlZYyUVJoExC94rHZbIyKMwZ+Oh7EDkb0qUlxd +-----END PRIVATE KEY----- diff --git 
a/pem/public10.pem b/pem/public10.pem new file mode 100644 index 0000000..e94c88c --- /dev/null +++ b/pem/public10.pem @@ -0,0 +1,3 @@ +-----BEGIN PUBLIC KEY----- +MCowBQYDK2VwAyEAEomuEQGmGsYVw35C6DB5tfY8LI8jm359ceAxRX8eQ0o= +-----END PUBLIC KEY----- diff --git a/pem/public5.pem b/pem/public5.pem new file mode 100644 index 0000000..50d91ba --- /dev/null +++ b/pem/public5.pem @@ -0,0 +1,3 @@ +-----BEGIN PUBLIC KEY----- +MCowBQYDK2VwAyEAZ2nLJBL8a5opfa8nFeVj0SZToW8pl4+zgcSUkeZFRO4= +-----END PUBLIC KEY----- diff --git a/pem/public6.pem b/pem/public6.pem new file mode 100644 index 0000000..ce3b5c4 --- /dev/null +++ b/pem/public6.pem @@ -0,0 +1,3 @@ +-----BEGIN PUBLIC KEY----- +MCowBQYDK2VwAyEAIQVeSGwsjPjyepPTnzzYqVxIxviSEjZXU7C7zuNTui4= +-----END PUBLIC KEY----- diff --git a/pem/public7.pem b/pem/public7.pem new file mode 100644 index 0000000..3c2550d --- /dev/null +++ b/pem/public7.pem @@ -0,0 +1,3 @@ +-----BEGIN PUBLIC KEY----- +MCowBQYDK2VwAyEAG95Ettl3jTi41HM8le1A9WDmOEq0ANEqpLF7zTZrfXA= +-----END PUBLIC KEY----- diff --git a/pem/public8.pem b/pem/public8.pem new file mode 100644 index 0000000..b7d538d --- /dev/null +++ b/pem/public8.pem @@ -0,0 +1,3 @@ +-----BEGIN PUBLIC KEY----- +MCowBQYDK2VwAyEA/ymOIb0sJ0qCWrf3mKz7ACCvsMXLog/EK533JfNXZTM= +-----END PUBLIC KEY----- diff --git a/pem/public9.pem b/pem/public9.pem new file mode 100644 index 0000000..490cc05 --- /dev/null +++ b/pem/public9.pem @@ -0,0 +1,3 @@ +-----BEGIN PUBLIC KEY----- +MCowBQYDK2VwAyEAZ4F3KqOp/5QrPdZGqqX6PYYEGd2snX4Q3AUt9XAG3v8= +-----END PUBLIC KEY-----