2026-05-31: TanStack npm supply-chain postmortem; the Kubernetes Prometheus + Cilium integration tax

TanStack npm supply-chain compromise postmortem: how a three-layer attack chain hit 42 packages in under 26 minutes and spread to 160+

TanStack · 2026-05-21

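On May 11 and 12, 2026, the threat group TeamPCP ran a campaign dubbed "Mini Shai-Hulud" that published malicious versions of 42 @tanstack/* npm packages in under 26 minutes. The attack chained three composable vulnerabilities into a complete supply-chain compromise and ultimately reached more than 160 npm/PyPI packages, including codebases belonging to Grafana Labs, Mistral AI, and UiPath.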

Breaking down the three-layer attack chain

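Layer 1: Pwn Request. TanStack's bundle-size.yml GitHub Actions workflow used the pull_request_target trigger, which runs in the context of the base repo even when the triggering PR comes from a fork. The attacker submitted a PR containing malicious workflow steps and got arbitrary commands executed on the base repo's runner.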
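The Layer 1 pattern can be spotted mechanically. pull_request_target becomes dangerous when the workflow also checks out or runs code from the PR head, because that code then executes with the base repo's privileges. Below is a minimal heuristic sketch in Python that flags this combination in a workflow file's text. The function name audit_workflow and both sample workflows are hypothetical illustrations, not TanStack's actual files, and a real audit would parse the YAML rather than pattern-match it.

```python
import re

# Expressions that pull the PR author's commit or branch into the job.
PR_HEAD = re.compile(
    r"github\.event\.pull_request\.head\.(sha|ref)|github\.head_ref"
)


def _without_comments(text: str) -> str:
    # Crude: drops everything after '#'. Good enough for a heuristic scan.
    return "\n".join(line.split("#", 1)[0] for line in text.splitlines())


def audit_workflow(text: str) -> list[str]:
    """Return findings for the contents of one workflow file (hypothetical helper)."""
    body = _without_comments(text)
    findings = []
    if re.search(r"\bpull_request_target\b", body):
        findings.append("runs on pull_request_target")
        if PR_HEAD.search(body):
            findings.append("checks out or references PR head code")
    return findings


# Hypothetical workflow showing the risky combination.
RISKY = """\
name: example-size-check
on:
  pull_request_target:
jobs:
  size:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}
      - run: npm ci && npm run build
"""

# Same job on the plain pull_request trigger.
SAFE = """\
name: example-size-check
on:
  pull_request:
jobs:
  size:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm run build
"""
```

Calling audit_workflow(RISKY) returns both findings, while audit_workflow(SAFE) returns an empty list. A scan like this only narrows down which workflows deserve manual review; it does not prove a workflow is safe.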

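Layer 2: GitHub Actions cache poisoning. The malicious step on the runner wrote a compromised build toolchain into a shared cache entry. Later, legitimate workflow runs loaded the poisoned binaries during the actions/cache restore step, which gave the attacker a persistent backdoor.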

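Layer 3: OIDC token extraction from memory. By dumping the runner process's memory, the attacker obtained the OIDC token used for npm Trusted Publishing. That token allows publishing without 2FA, so the attacker was able to publish 84 malicious versions under TanStack's identity.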

Malicious payload behavior

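The payload stole credentials for AWS, GCP, Kubernetes, HashiCorp Vault, GitHub, and npm, along with SSH keys, and exfiltrated them over the Session/Oxen encrypted messenger. It also behaved as a worm: it replicated itself into other packages under the identity of each compromised maintainer, which is the main reason more than 160 packages were infected. The payload the attacker chose happened to break tests, which inadvertently sped up detection. External researchers at StepSecurity spotted the anomaly 20 to 26 minutes in, and every affected version was deprecated within 1 hour 43 minutes.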

Root-cause fixes

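Replace pull_request_target with pull_request (which runs in the isolated fork context), or add a strict actor allowlist
Configure GitHub Actions cache isolation so that a cache can only be restored by the same repository/branch
Set environment protection rules for npm Trusted Publishing so that the OIDC token can only be obtained from a protected release workflow
Enable npm Provenance to bind each package version to its CI job in an externally verifiable way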
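To make the third fix concrete, here is a minimal Python sketch of the kind of claim check that ties a GitHub Actions OIDC token to one release workflow and one protected environment. The claim names repository, environment, and workflow_ref are standard GitHub OIDC claims. The repo name, workflow path, environment name, and the helpers allowed_to_publish and make_unsigned_token are hypothetical. The sketch decodes the token payload without verifying its signature, which a real verifier must never skip, and it is not a description of how npm implements Trusted Publishing.

```python
import base64
import json


def decode_claims(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying the signature (illustration only)."""
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))


def allowed_to_publish(claims: dict, repo: str, workflow_path: str, environment: str) -> bool:
    """Hypothetical policy: only the named workflow in the named environment may publish."""
    expected_workflow = f"{repo}/{workflow_path}@"
    return (
        claims.get("repository") == repo
        and claims.get("environment") == environment
        and str(claims.get("workflow_ref", "")).startswith(expected_workflow)
    )


def make_unsigned_token(claims: dict) -> str:
    """Build a fake, unsigned token so the sketch can run offline."""
    def enc(obj: dict) -> str:
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return ".".join([enc({"alg": "none"}), enc(claims), ""])


REPO = "example-org/example-repo"  # hypothetical repository

release_token = make_unsigned_token({
    "repository": REPO,
    "environment": "npm-publish",  # hypothetical protected environment
    "workflow_ref": f"{REPO}/.github/workflows/release.yml@refs/heads/main",
})

pr_token = make_unsigned_token({
    "repository": REPO,
    "workflow_ref": f"{REPO}/.github/workflows/bundle-size.yml@refs/heads/main",
})
```

With these inputs, allowed_to_publish(decode_claims(release_token), REPO, ".github/workflows/release.yml", "npm-publish") is True and the same call on pr_token is False. A check like this limits which job can mint a publishable token. It does not help if the release job itself runs poisoned code, which is why the cache isolation fix matters alongside it.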

Original sources: TanStack Postmortem, Grafana Labs, Orca Security

The Kubernetes integration tax: scrape failures with Prometheus + Cilium in kube-proxy-free mode

CNCF · 2026-05-28

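When Prometheus monitoring and Cilium eBPF networking are deployed in the same Kubernetes cluster, each system works well on its own, but the "integration tax" of combining them only surfaces at production scale. The CNCF article uses a real-world 3,000-pod cluster to document the specific problems seen in kube-proxy-free mode and the strategies for mitigating them.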

Root problem: diverging assumptions about ClusterIP routing

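By default, Prometheus Operator's scrape routing forwards traffic through ClusterIP + iptables NAT. When Cilium runs with kubeProxyReplacement: strict, the iptables KUBE-SVC rules no longer exist and packets are handled directly by Cilium's eBPF maps. Some Prometheus Operator versions try to query iptables rules at initialization to verify ClusterIP reachability. When that check fails they degrade silently, which pushed the scrape failure rate as high as 12% without producing any explicit error message.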
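Because the degradation is silent, the failure rate has to be measured rather than read from logs. Prometheus records an up sample for every scrape, 1 on success and 0 on failure, so the rate is simply the share of zero samples. The Python sketch below computes it overall and per target. The function names and the sample data are hypothetical, and the numbers are chosen only to reproduce the 12% figure arithmetically.

```python
from collections import defaultdict
from typing import Iterable


def overall_failure_rate(samples: Iterable[tuple[str, int]]) -> float:
    """Share of scrapes whose `up` value was 0, across all targets."""
    values = [up for _, up in samples]
    if not values:
        raise ValueError("no samples")
    return sum(1 for up in values if up == 0) / len(values)


def failure_rate_by_target(samples: Iterable[tuple[str, int]]) -> dict[str, float]:
    """Per-target failure rate, useful for finding which endpoints degrade silently."""
    totals: dict[str, int] = defaultdict(int)
    failures: dict[str, int] = defaultdict(int)
    for target, up in samples:
        totals[target] += 1
        if up == 0:
            failures[target] += 1
    return {target: failures[target] / totals[target] for target in totals}


# Hypothetical data: 100 scrapes, 12 of them failed, all on one target.
SAMPLES = (
    [("pod-a:9090", 1)] * 60
    + [("pod-b:9090", 1)] * 28
    + [("pod-b:9090", 0)] * 12
)
```

Here overall_failure_rate(SAMPLES) is 0.12, and failure_rate_by_target(SAMPLES) shows that pod-a never fails while pod-b fails 30% of the time. In PromQL the per-target figure corresponds to 1 - avg_over_time(up[1h]).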

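The second point of friction is a conflict between the metric label schemas of Hubble (Cilium's observability component) and Prometheus, which means cross-dimensional correlation requires extra relabeling rules.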

Mitigations

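Prometheus scraper with hostNetwork set to true: runs the scraper pod directly on the node's network stack, bypassing the ClusterIP layer of the eBPF datapath; the scrape failure rate dropped to 0.3%
Cilium socketLB mode: keeps the iptables KUBE-SVC rules as a fallback; the most compatible option, but it gives up part of the performance advantage of going kube-proxy-free
Hubble + Prometheus label conflict: unify the format at scrape time with metricRelabelings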
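The third mitigation relies on Prometheus relabeling, where a replace rule copies a matched value into a target label and a labeldrop rule removes the old names. The Python sketch below emulates those two actions on a plain dict to show the effect of such rules. It is a simplified model: Prometheus uses RE2 regular expressions and supports more actions. The label names k8s_namespace and pod_name are hypothetical stand-ins, since the source does not list the actual conflicting labels.

```python
import re


def relabel_replace(labels: dict, source_labels: list, target_label: str,
                    regex: str = "(.*)", replacement: str = "$1",
                    separator: str = ";") -> dict:
    """Simplified model of the Prometheus `replace` relabel action."""
    value = separator.join(labels.get(name, "") for name in source_labels)
    match = re.fullmatch(regex, value)  # Prometheus anchors the regex on both ends
    out = dict(labels)
    if match is None:
        return out  # no match: labels are left unchanged
    # Translate Prometheus-style $1 into Python's \g<1> before expanding.
    template = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    new_value = match.expand(template)
    if new_value == "":
        out.pop(target_label, None)  # an empty result removes the label
    else:
        out[target_label] = new_value
    return out


def labeldrop(labels: dict, regex: str) -> dict:
    """Simplified model of `labeldrop`: remove labels whose name matches."""
    return {k: v for k, v in labels.items() if not re.fullmatch(regex, k)}


def normalize(labels: dict) -> dict:
    """Hypothetical rule chain mapping one schema onto another."""
    step = relabel_replace(labels, ["k8s_namespace"], "namespace")
    step = relabel_replace(step, ["pod_name"], "pod")
    return labeldrop(step, "k8s_namespace|pod_name")


EXAMPLE = {"k8s_namespace": "payments", "pod_name": "api-0", "verdict": "FORWARDED"}
```

normalize(EXAMPLE) yields the labels namespace="payments", pod="api-0", and verdict="FORWARDED", so series from both systems can be joined on the same label names.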

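The core of the hidden cost is that these problems usually do not appear in small-scale test environments. They are only triggered when production traffic coexists with complex network policies.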

Original source: CNCF Blog
