This commit is contained in:
mr
2026-03-09 14:57:41 +01:00
parent 3751ec554d
commit 83cef6e6f6
47 changed files with 2704 additions and 1034 deletions

View File

@@ -0,0 +1,55 @@
@startuml failure_indexer_crash
title Indexer Failure → replenish from a Native
participant "Node" as N
participant "Indexer A (alive)" as IA
participant "Indexer B (crashed)" as IB
participant "Native A" as NA
participant "Native B" as NB
note over N: Active Pool : Indexers = [IA, IB]\\nActive Heartbeat long-lived from IA & IB
== IB Failure ==
IB ->x N: heartbeat fails (sendHeartbeat err)
note over N: doTick() dans SendHeartbeat triggers failure\\n→ delete(Indexers[IB])\\n→ delete(IndexerMeta[IB])\\nUnique heartbeat goroutine continue
N -> N: go replenishIndexersFromNative(need=1)
note over N: Reduced Pool to 1 indexers.\\nReplenish triggers with goroutine.
== Replenish from natives ==
par Fetch pool (timeout 6s)
N -> NA: GET /opencloud/native/indexers/1.0\\nGetIndexersRequest{Count: max}
NA -> NA: reachableLiveIndexers()\\n(IB absent because of a expired heartbeat)
NA --> N: GetIndexersResponse{Indexers:[IA,IC], FillRates:{IA:0.4,IC:0.2}}
else
N -> NB: GET /opencloud/native/indexers/1.0
NB --> N: GetIndexersResponse{Indexers:[IA,IC]}
end par
note over N: Fusion + duplication → candidates = [IA, IC]\\n(IA already in pool → IC new candidate)
par Consensus Phase 1 (timeout 4s)
N -> NA: /opencloud/native/consensus/1.0\\nConsensusRequest{Candidates:[IA,IC]}
NA --> N: ConsensusResponse{Trusted:[IA,IC]}
else
N -> NB: /opencloud/native/consensus/1.0
NB --> N: ConsensusResponse{Trusted:[IA,IC]}
end par
note over N: IC → 2/2 votes → admit\\nadmittedAt = time.Now()
par Phase 2 — liveness vote (if stable voters )
N -> IA: /opencloud/indexer/consensus/1.0\\nIndexerConsensusRequest{Candidates:[IC]}
IA -> IA: StreamRecords[ProtocolHB][IC]\\nLastSeen ≤ 120s && LastScore ≥ 30
IA --> N: IndexerConsensusResponse{Alive:[IC]}
end par
note over N: IC confirmed alive → add to pool
N -> N: replaceStaticIndexers(pool={IA,IC})
N -> IC: SendHeartbeat /opencloud/heartbeat/1.0 (goroutine long-live)
note over N: Pool restaured to 2 indexers.
@enduml