On Call Brief

On Call Brief – Week of 2026-02-08

2026-02-08

This week's top stories

1. Signal Outage [Ongoing]

Category: Community
What happened: Signal is currently experiencing an ongoing outage affecting its messaging services.
Worth reading: This outage may impact user communication and could lead to increased support tickets or user dissatisfaction.
Source: Hacker News (incidents)
Discussion: https://news.ycombinator.com/item?id=46872945

Tags: outage signal messaging

2. Hetzner Outage

Category: Community
What happened: Hetzner experienced an outage that affected multiple services, leading to downtime for customers.
Worth reading: This outage may impact users relying on Hetzner for hosting, potentially affecting service availability and performance.
Source: Hacker News (incidents)
Discussion: https://news.ycombinator.com/item?id=46870305

Tags: outage hetzner

CVE & Security

1. Privilege Escalation in Aurora PostgreSQL using AWS JDBC Wrapper, AWS Go Wrapper, AWS NodeJS Wrapper, AWS Python Wrapper, AWS PGSQL ODBC driver

Category: Security / Patch
What happened: A critical privilege escalation vulnerability (CVE-2025-12967) has been identified in AWS wrappers for Amazon Aurora PostgreSQL, allowing low-privileged users to escalate their privileges to the rds_superuser role. This affects several AWS wrapper versions, necessitating immediate attention and updates.
Do this Monday: If your organization uses affected AWS wrappers for Aurora PostgreSQL, you must upgrade to the specified versions to mitigate the risk of unauthorized privilege escalation, which could lead to significant security breaches.
Source: AWS Security Bulletins

Tags: cve-2025-12967 aws-jdbc-wrapper aws-go-wrapper aws-nodejs-wrapper aws-python-wrapper aurora-postgresql

2. CVE-2025-9039 - Issue with Amazon ECS agent introspection server

Category: Security / Patch
What happened: CVE-2025-9039 affects the Amazon ECS agent, potentially allowing unauthorized off-host access to the introspection server under specific security group configurations. This vulnerability is critical for users relying on ECS for container orchestration, especially if they have not disabled off-host access.
Do this Monday: Operators using Amazon ECS should review their security group settings and ensure that off-host access to the introspection server is disabled to mitigate potential unauthorized access. Immediate action is recommended for affected ECS Agent versions.
Source: AWS Security Bulletins

Tags: cve-2025-9039 amazon-ecs
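
The security group review described above can be partly scripted. The sketch below is a minimal illustration, not AWS tooling. It assumes the ECS agent introspection server listens on TCP port 51678 (the agent's usual default, which the bulletin summary above does not state) and that rules are dicts shaped like the IpPermissions entries returned by the EC2 DescribeSecurityGroups API. The function names and the sample data are hypothetical.

```python
INTROSPECTION_PORT = 51678  # assumption: default ECS agent introspection port


def exposes_introspection(rule: dict, port: int = INTROSPECTION_PORT) -> bool:
    """Return True if an ingress rule allows TCP traffic to `port`."""
    protocol = rule.get("IpProtocol")
    if protocol == "-1":  # "-1" means all protocols and all ports
        return True
    if protocol != "tcp":
        return False
    from_port = rule.get("FromPort")
    to_port = rule.get("ToPort")
    if from_port is None or to_port is None:
        return False
    return from_port <= port <= to_port


def flag_groups(groups: list[dict]) -> list[str]:
    """Return IDs of security groups with at least one rule reaching the port."""
    flagged = []
    for group in groups:
        rules = group.get("IpPermissions", [])
        if any(exposes_introspection(rule) for rule in rules):
            flagged.append(group["GroupId"])
    return flagged


# Hypothetical sample data, for illustration only.
sample_groups = [
    {"GroupId": "sg-web", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443}]},
    {"GroupId": "sg-wide", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 1024, "ToPort": 65535}]},
    {"GroupId": "sg-all", "IpPermissions": [{"IpProtocol": "-1"}]},
]

print(flag_groups(sample_groups))
```

A flagged group is only a candidate for review: whether the port is actually reachable off-host also depends on the rule's source ranges and on the agent configuration, which this sketch does not inspect.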

3. IngressNightmare Vulnerabilities: All You Need to Know

Category: Security / Patch
What happened: A set of critical vulnerabilities, collectively called IngressNightmare, has been identified in the ingress-nginx controller for Kubernetes: CVE-2025-1097, CVE-2025-1098, CVE-2025-24514, and CVE-2025-1974. They could allow attackers to compromise Kubernetes environments. Separately, according to the Kubernetes Steering and Security Response Committees, Ingress NGINX will be retired in March 2026 due to insufficient contributors and maintainers, after which it will no longer receive updates or security patches. Operators are advised to patch immediately and to plan a move to an alternative ingress solution before the retirement date.
Do this Monday: Operators using ingress-nginx should prioritize patching these vulnerabilities to prevent unauthorized access to sensitive data and mitigate the risk of a cluster takeover.
Sources: Aqua Security Blog, Kubernetes Blog

Tags: ingress-nginx cve-2025-1097 cve-2025-1098 cve-2025-24514 cve-2025-1974 ingress-nginx migration kubernetes

4. Key Commitment Issues in S3 Encryption Clients

Category: Security / Patch
What happened: Multiple CVEs have been identified in AWS S3 Encryption Clients across various programming languages, exposing encrypted data keys to potential attacks. This issue affects Java, Go, .NET, C++, PHP, and Ruby clients, necessitating immediate attention and updates to the specified versions to mitigate risks.
Do this Monday: Operators using the affected S3 Encryption Clients must update to the specified versions to prevent potential exposure of encrypted data keys, which could lead to unauthorized access to sensitive data stored in S3.
Source: AWS Security Bulletins

Tags: cve-2025-14763 cve-2025-14764 cve-2025-14759 cve-2025-14760 cve-2025-14761 cve-2025-14762 aws-s3

5. Docker Compose vulnerability opens door to host-level writes – patch pronto

Category: Security / Patch
What happened: A critical vulnerability in Docker Compose has been identified, allowing for potential path traversal attacks that could lead to host-level writes. Users are urged to upgrade immediately to mitigate this risk.
Do this Monday: Upgrade Docker Compose to the patched release. Exploitation could give an attacker unauthorized write access to the host system, a significant risk for any environment that uses Docker Compose.
Source: The Register (DevOps)

Tags: docker-compose cve-2025-xxxx

6. Microsoft kills 9.9-rated ASP.NET Core bug – 'our highest ever' score

Category: Security / Patch
What happened: Microsoft has released a critical patch for a vulnerability in the Kestrel web server component of ASP.NET Core, which has a CVSS score of 9.9, marking it as their highest severity rating ever. The flaw allows for request smuggling, and its impact varies based on the hosting setup and application code.
Do this Monday: Apply the patch to applications running on ASP.NET Core, starting with those that rely on Kestrel to handle requests.
Source: The Register (DevOps)

Tags: aspnet-core kestrel cve-2025-xxxx

7. GitLab: 18.8.4, 18.6.2

Category: Security / Patch
What happened: GitLab has issued patch releases 18.8.4, 18.7.4, 18.6.6, 18.6.2, 18.5.4, and 18.4.6, addressing high-severity vulnerabilities that could lead to denial of service or unauthorized code execution. Notably, the GitLab AI Gateway is affected by a severe vulnerability, CVE-2026-1868, which is fixed in versions 18.6.2, 18.7.1, and 18.8.1. The fixes apply to both Community and Enterprise editions.
Do this Monday: Upgrade self-managed GitLab installations to a patched version. Unpatched instances remain exposed to these vulnerabilities, with a risk of service disruption or data breach.
Source: GitLab Security Releases

Tags: gitlab cve-2025-7659 cve-2025-8099 cve-2026-0958 cve-2025-14560 cve-2026-0595 security patch

8. Helm v4.0.4

Category: Security / Patch
What happened: Helm v4.0.4 is a patch release that fixes a security vulnerability stemming from a Go CVE present in the previous version.
Do this Monday: Operators using Helm v4 should upgrade to v4.0.4 to keep their deployments secure.
Source: Helm releases

Tags: helm cve-2023-xxxx

9. Traefik v3.6.4: Critical Vulnerabilities Fixed and Migration Required

Category: Security / Patch
What happened: Traefik v3.6.4 has been released, addressing critical vulnerabilities including CVE-2025-66490 and CVE-2025-66491. The release includes a breaking change that requires users to follow a migration guide. Additionally, several bug fixes and documentation improvements have been made.
Do this Monday: Operators using Traefik should prioritize upgrading to v3.6.4 to mitigate security vulnerabilities. The breaking change necessitates careful migration to avoid disruptions in service. Failure to upgrade could expose systems to potential exploits.
Source: Traefik releases

Tags: traefik cve-2025-66490 cve-2025-66491 kubernetes ingress-nginx

10. KEDA v2.17.3: Security Vulnerability Fix for CVE-2025-68476 Released

Category: Security / Patch
What happened: KEDA v2.17.3 has been released, addressing a security vulnerability identified as CVE-2025-68476.
Do this Monday: This release includes a fix for a critical security vulnerability, which may affect deployments using KEDA. Operators should upgrade to this version to mitigate potential risks.
Source: KEDA releases

Tags: keda cve-2025-68476

11. Envoy Proxy: v1.35.7, v1.33.13, v1.36.3

Category: Security / Patch
What happened: Envoy Proxy has released security updates across multiple release branches. Versions v1.33.13, v1.34.11, v1.35.7, and v1.36.3 fix issues including JWT authentication crashes, TLS certificate matching problems, and potential request smuggling.
Do this Monday: Operators using Envoy should upgrade to the patched release for their branch (v1.33.13, v1.34.11, v1.35.7, or v1.36.3), particularly in environments that use JWT authentication or rely on TLS certificate matching.
Source: Envoy Proxy releases

Tags: envoy cve-2025-64527 cve-2025-66220 cve-2025-64763 envoy-proxy
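
To decide whether a given deployment needs attention, the four patched versions listed above can be turned into a small lookup. This is a minimal sketch: the function names are hypothetical, and it assumes plain MAJOR.MINOR.PATCH version strings with an optional leading "v" (no pre-release or build suffixes).

```python
from typing import Optional

# Patched release per branch, taken from the versions listed in this entry.
PATCHED = {(1, 33): 13, (1, 34): 11, (1, 35): 7, (1, 36): 3}


def parse_version(text: str) -> tuple[int, int, int]:
    """Parse 'v1.35.6' or '1.35.6' into (1, 35, 6)."""
    major, minor, patch = text.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def upgrade_target(running: str) -> Optional[str]:
    """Return the patched release to move to, or None if already patched.

    Raises ValueError for branches with no patched release in PATCHED.
    """
    major, minor, patch = parse_version(running)
    fixed_patch = PATCHED.get((major, minor))
    if fixed_patch is None:
        raise ValueError(f"no patched release listed for {major}.{minor}.x")
    if patch >= fixed_patch:
        return None
    return f"v{major}.{minor}.{fixed_patch}"


print(upgrade_target("v1.35.6"))
print(upgrade_target("1.36.3"))
```

Raising on an unlisted branch is a deliberate choice in this sketch: a deployment on a branch with no listed fix should prompt a manual look at the release notes rather than be reported as safe.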

12. Supply Chain Security Risk: GitHub Action tj-actions/changed-files Compromised

Category: Security / Patch
What happened: A critical vulnerability (CVE-2025-30066) was discovered in the GitHub Action tj-actions/changed-files, which could expose CI/CD secrets in build logs. This affects users of the action in public repositories, where logs may be accessible to unauthorized users.
Do this Monday: If your workflows use tj-actions/changed-files, review your CI/CD logs for exposed secrets and consider switching to a more secure alternative until the vulnerability is addressed.
Source: Aqua Security Blog

Tags: github-actions cve-2025-30066 tj-actions
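
A first step in that review is finding which workflows reference the action at all. The sketch below scans workflow file text for uses of tj-actions/changed-files and reports whether each reference is pinned to a full 40-character commit SHA or to a mutable tag or branch. It is an illustration under assumptions: the function name is hypothetical, the matching is a plain regular expression rather than a YAML parse, and the SHA in the sample is a made-up placeholder, not a known-good commit.

```python
import re

# Matches lines such as "uses: tj-actions/changed-files@v45" and captures the ref.
USES_PATTERN = re.compile(
    r"uses:\s*['\"]?tj-actions/changed-files@([^\s'\"#]+)", re.IGNORECASE
)
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def find_references(workflow_text: str) -> list[tuple[int, str, bool]]:
    """Return (line_number, ref, pinned_to_sha) for each use of the action."""
    hits = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_PATTERN.search(line)
        if match:
            ref = match.group(1)
            hits.append((number, ref, FULL_SHA.fullmatch(ref) is not None))
    return hits


# Hypothetical workflow content; the SHA below is a placeholder.
sample = """\
jobs:
  build:
    steps:
      - uses: actions/checkout@v4
      - uses: tj-actions/changed-files@v45
      - uses: tj-actions/changed-files@0123456789abcdef0123456789abcdef01234567
"""

for line_number, ref, pinned in find_references(sample):
    print(line_number, ref, "pinned to SHA" if pinned else "mutable ref")
```

A SHA-pinned reference is not automatically safe; the pinned commit still has to be checked against the advisory. The scan only narrows down which workflows and logs to review.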

13. Blog: May 2022 Security Announcement

Category: Security / Patch
What happened: The Flux Team has identified three critical security vulnerabilities in Flux that affect versions prior to 0.29.0. These vulnerabilities include improper kubeconfig validation allowing arbitrary code execution (CVE-2022-24817), improper path handling in Kustomization files leading to path traversal (CVE-2022-24877), and denial of service (CVE-2022-24878). Users are strongly advised to upgrade their clusters to mitigate these risks.
Do this Monday: Operators using affected versions of Flux should prioritize upgrading to version 0.29.0 or later to avoid potential security risks, especially in multi-tenant environments where the impact could be more severe. Failure to upgrade could lead to unauthorized access and service disruptions.
Source: FluxCD Blog

Tags: flux cve-2022-24817 cve-2022-24877 cve-2022-24878

14. Do nothing for better work 🚶➡️, open-source decisions 📂, self-cleaning JavaScript 🧼

Category: Security / Patch
What happened: The article discusses recent CVEs (Common Vulnerabilities and Exposures) discovered in React and Node.js, highlighting the role of AI in identifying these vulnerabilities. It emphasizes the importance of staying updated on security issues in popular frameworks and libraries used in web development.
Do this Monday: Check applications built on React and Node.js against these CVEs and apply the relevant updates and patches promptly.
Source: TLDR Dev

Tags: cve-2023-xxxx react nodejs

15. Crossplane v2.0.7: Backports fix for shared transitive dependencies and security updates

Category: Security / Patch
What happened: Crossplane v2.0.7 backports a fix for upgrading shared transitive dependencies and includes security updates for its dependencies.
Do this Monday: Upgrade to v2.0.7 to pick up the fix for upgrading shared transitive dependencies along with the dependency security updates.
Source: Crossplane releases

Tags: crossplane release security

Releases

1. Argo CD v3.2.4: Release Invalid - Upgrade to v3.2.5 Recommended

Category: Release
What happened: Release v3.2.4 of Argo CD is invalid and should not be used; users are directed to upgrade to v3.2.5 instead.
Do this Monday: Do not deploy v3.2.4; move to v3.2.5 instead. Using the invalid release could lead to unexpected issues in production environments.
Source: Argo CD releases

Tags: argo-cd release

2. [Release] Antigravity Link v1.0.10 – Fixes for the recent Google IDE update

Category: Release
What happened: Antigravity Link v1.0.10 has been released to address issues caused by a recent update to the Google Antigravity IDE, restoring functionality for message injection and UI elements.
Do this Monday: Engineers should update to v1.0.10 to ensure chat functionality works properly, as previous versions may lead to usability issues.
Source: Reddit r/devops

Tags: antigravity ide update

3. Announcing Claude Opus 4.6 on Vertex AI

Category: Release
What happened: Google Cloud has released Claude Opus 4.6 on Vertex AI, enhancing capabilities for enterprise workflows, financial analysis, coding, and complex task orchestration.
Do this Monday: This update may affect production environments utilizing AI for document generation and coding tasks, potentially improving efficiency but requiring validation of output quality.
Source: Google Cloud Blog

Tags: ai vertex-ai claude-opus

4. Amazon EC2 C8id, M8id, and R8id instances with up to 22.8 TB local NVMe storage are generally available

Category: Release
What happened: AWS has launched new EC2 instance types (C8id, M8id, R8id) that feature significantly increased vCPUs, memory, and up to 22.8 TB of local NVMe storage.
Do this Monday: These instances may provide enhanced performance for workloads requiring high storage and compute capabilities, impacting resource planning and cost management.
Source: AWS What's New

Tags: aws ec2 nvme

Lightning links

Announcing Amazon EC2 G7e instances accelerated by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs (AWS What's New) -- AWS has launched EC2 G7e instances powered by NVIDIA RTX PRO 6000 GPUs, offering up to 2...
Automating AWS SDK for Java v1 to v2 Upgrades with AWS Transform (AWS DevOps Blog) -- AWS has introduced a tool to automate the upgrade from AWS SDK for Java v1 to v2, which matters because v1 reached end-of-support on December 31, 2025.
Noma Security Identifies Security Flaw in Docker AI Assistant (Container Journal) -- Noma Security has identified a critical security flaw in Docker's AI assistant that could allow exploitation of Docker...
Is Claude Opus 4.6 the Best Security Researcher Ever? (DevOps.com) -- Anthropic's Claude Opus 4.6 has identified over 600 new vulnerabilities in popular open source software, showcasing the...
Traefik v2.11.32 (Traefik releases) -- Traefik v2.11.32 includes a critical security fix for CVE-2025-66490, which introduces a breaking change, urging...
Helm v3.19.4 (Helm releases) -- Helm v3.19.4 is a security patch addressing a Go CVE, and users are advised to upgrade to mitigate potential vulnerabilities.
cert-manager v1.18.5 (cert-manager releases) -- cert-manager v1.18.5 addresses a moderate severity DoS vulnerability and improves certificate issuance processes,...
DevOps'ish 295: death of an ingress, Amazon layoffs, my desk, and more (DevOps'ish) -- Ingress NGINX will reach end of life in March 2026, requiring users to migrate to alternative solutions to avoid security risks.
Our plan for a more secure npm supply chain (GitHub Security) -- GitHub has announced a comprehensive plan to enhance the security of the npm supply chain in response to recent...
Edera Advisory Highlights Remote Code Execution Flaw in Kubernetes (Container Journal) -- Edera has published an advisory highlighting a design flaw in Kubernetes that could allow for full remote code execution in any container on a node.

Human Stories

In the whirlwind of incidents, from Signal's ongoing outage to Hetzner's unexpected downtime, we're reminded of the delicate dance between technology and reliability. Each outage tells a story of complexity, where even the most robust systems can falter under unforeseen pressures. These events highlight the importance of resilience and the constant vigilance we must maintain in our roles. As SREs, our mission goes beyond simply fixing issues; it’s about understanding these stories, learning from them, and fortifying our systems against the tides of unpredictability. Let us carry forward these lessons with humility, knowing that while perfection is a moving target, our commitment to improvement is unwavering.

Also worth reading

At what point does reasonable assurance turn into busywork? (Reddit r/sre)

The discussion highlights concerns about audit requests becoming excessive and focused on formatting rather than actual risk assessment, leading to increased overhead for teams.

Bazzite Postmortem (Hacker News (incidents))

The Bazzite incident involved a significant outage due to a misconfiguration that led to service unavailability for users.

The requirement to deliver above all else (Reddit r/sre)

A discussion on the corporate pressure to prioritize delivery over technical integrity highlights the risks of neglecting underlying issues in systems, which can lead to long-term reliability problems.
