On Call Brief – Week of 2026-02-08

Briefing: 2026-02-08

This week's top stories

1. Signal Outage [Ongoing]

  • Tags: outage signal messaging
2. Hetzner Outage

    • Category: Community
    • What happened: Hetzner experienced an outage that affected multiple services, leading to downtime for customers.
    • Worth reading: This outage may impact users relying on Hetzner for hosting, potentially affecting service availability and performance.
    • Source: Hacker News (incidents)
    • Discussion: https://news.ycombinator.com/item?id=46870305
  • Tags: outage hetzner

CVE & Security

1. Privilege Escalation in Aurora PostgreSQL using AWS JDBC Wrapper, AWS Go Wrapper, AWS NodeJS Wrapper, AWS Python Wrapper, AWS PGSQL ODBC driver

    • Category: Security / Patch
    • What happened: A critical privilege escalation vulnerability (CVE-2025-12967) has been identified in the AWS wrappers for Amazon Aurora PostgreSQL. It allows a low-privileged user to escalate to the rds_superuser role and affects several versions of the JDBC, Go, NodeJS, and Python wrappers and the PGSQL ODBC driver.
    • Do this Monday: If you use any of the affected AWS wrappers with Aurora PostgreSQL, upgrade to the fixed versions listed in the AWS bulletin.
    • Source: AWS Security Bulletins
  • Tags: cve-2025-12967 aws-jdbc-wrapper aws-go-wrapper aws-nodejs-wrapper aws-python-wrapper aurora-postgresql
2. CVE-2025-9039 - Issue with Amazon ECS agent introspection server

    • Category: Security / Patch
    • What happened: CVE-2025-9039 affects the Amazon ECS agent, potentially allowing unauthorized off-host access to the introspection server under specific security group configurations. This vulnerability is critical for users relying on ECS for container orchestration, especially if they have not disabled off-host access.
    • Do this Monday: Operators using Amazon ECS should review their security group settings and ensure that off-host access to the introspection server is disabled to mitigate potential unauthorized access. Immediate action is recommended for affected ECS Agent versions.
    • Source: AWS Security Bulletins
  • Tags: cve-2025-9039 amazon-ecs
3. IngressNightmare Vulnerabilities: All You Need to Know

    • Category: Security / Patch
    • What happened: A series of critical vulnerabilities in the ingress-nginx controller for Kubernetes, collectively called IngressNightmare, are tracked as CVE-2025-1097, CVE-2025-1098, CVE-2025-24514, and CVE-2025-1974. They could allow attackers to compromise Kubernetes environments. Separately, according to the Kubernetes Steering and Security Response Committees, Ingress NGINX will be retired in March 2026 because it lacks contributors and maintainers; after that it will receive no updates or security patches.
    • Do this Monday: If you run ingress-nginx, patch these vulnerabilities now to prevent unauthorized access to sensitive data and cluster takeover, and plan a move to an alternative ingress solution before the retirement date.
    • Sources: Aqua Security Blog, Kubernetes Blog
  • Tags: ingress-nginx cve-2025-1097 cve-2025-1098 cve-2025-24514 cve-2025-1974 ingress-nginx migration kubernetes
4. Key Commitment Issues in S3 Encryption Clients

    • Category: Security / Patch
    • What happened: Multiple CVEs have been identified in the AWS S3 Encryption Clients for Java, Go, .NET, C++, PHP, and Ruby. The key commitment issues expose encrypted data keys to potential attacks.
    • Do this Monday: If you use an affected S3 Encryption Client, update to the fixed version listed in the AWS bulletin for your language to prevent exposure of encrypted data keys, which could lead to unauthorized access to data stored in S3.
    • Source: AWS Security Bulletins
  • Tags: cve-2025-14763 cve-2025-14764 cve-2025-14759 cve-2025-14760 cve-2025-14761 cve-2025-14762 aws-s3
5. Docker Compose vulnerability opens door to host-level writes – patch pronto

    • Category: Security / Patch
    • What happened: A critical path traversal vulnerability in Docker Compose could allow writes at the host level.
    • Do this Monday: Upgrade Docker Compose to the patched release wherever it is in use; the flaw could give attackers unauthorized access to the host system.
    • Source: The Register (DevOps)
  • Tags: docker-compose
6. Microsoft kills 9.9-rated ASP.NET Core bug – 'our highest ever' score

    • Category: Security / Patch
    • What happened: Microsoft has patched a request smuggling vulnerability in Kestrel, the web server component of ASP.NET Core. It carries a CVSS score of 9.9, which Microsoft describes as its highest ever. The impact varies with the hosting setup and application code.
    • Do this Monday: Apply the patch to applications running on ASP.NET Core, starting with those that rely on Kestrel to handle requests.
    • Source: The Register (DevOps)
  • Tags: aspnet-core kestrel
7. GitLab: 18.8.4, 18.6.2

    • Category: Security / Patch
    • What happened: GitLab has issued patch releases 18.8.4, 18.7.4, 18.6.6, 18.6.2, 18.5.4, and 18.4.6 for the Community and Enterprise editions. They address high-severity vulnerabilities that could lead to denial of service or unauthorized code execution. A severe vulnerability in the GitLab AI Gateway, CVE-2026-1868, is mitigated in versions 18.6.2, 18.7.1, and 18.8.1.
    • Do this Monday: Upgrade self-managed GitLab installations to a patched version; unpatched instances remain exposed to service disruptions or data breaches.
    • Source: GitLab Security Releases
  • Tags: gitlab cve-2025-7659 cve-2025-8099 cve-2026-0958 cve-2025-14560 cve-2026-0595 security patch
8. Helm v4.0.4

    • Category: Security / Patch
    • What happened: Helm v4.0.4 is a patch release that addresses a Go CVE affecting the previous version.
    • Do this Monday: If you use Helm, upgrade to v4.0.4.
    • Source: Helm releases
  • Tags: helm
9. Traefik v3.6.4: Critical Vulnerabilities Fixed and Migration Required

    • Category: Security / Patch
    • What happened: Traefik v3.6.4 has been released, addressing critical vulnerabilities including CVE-2025-66490 and CVE-2025-66491. The release includes a breaking change that requires users to follow a migration guide. Additionally, several bug fixes and documentation improvements have been made.
    • Do this Monday: Operators using Traefik should prioritize upgrading to v3.6.4 to mitigate security vulnerabilities. The breaking change necessitates careful migration to avoid disruptions in service. Failure to upgrade could expose systems to potential exploits.
    • Source: Traefik releases
  • Tags: traefik cve-2025-66490 cve-2025-66491 kubernetes ingress-nginx
10. KEDA v2.17.3: Security Vulnerability Fix for CVE-2025-68476 Released

    • Category: Security / Patch
    • What happened: KEDA v2.17.3 has been released, addressing a security vulnerability identified as CVE-2025-68476.
    • Do this Monday: If you run KEDA, upgrade to v2.17.3, which fixes this critical vulnerability.
    • Source: KEDA releases
  • Tags: keda cve-2025-68476
11. Envoy Proxy: v1.35.7, v1.33.13, v1.36.3

    • Category: Security / Patch
    • What happened: Envoy Proxy has released v1.33.13, v1.34.11, v1.35.7, and v1.36.3 to address critical security vulnerabilities, including JWT authentication crashes, TLS certificate matching problems, and potential request smuggling.
    • Do this Monday: Upgrade Envoy to the patched release for your minor version (v1.33.13, v1.34.11, v1.35.7, or v1.36.3), giving priority to environments that use JWT authentication or TLS configurations. See the Envoy Proxy release notes for details.
    • Source: Envoy Proxy releases
  • Tags: envoy cve-2025-64527 cve-2025-66220 cve-2025-64763 envoy-proxy
12. Supply Chain Security Risk: GitHub Action tj-actions/changed-files Compromised

    • Category: Security / Patch
    • What happened: A critical vulnerability (CVE-2025-30066) was discovered in the GitHub Action tj-actions/changed-files, which could expose CI/CD secrets in build logs. This affects users of the action in public repositories, where logs may be accessible to unauthorized users.
    • Do this Monday: If your workflows use tj-actions/changed-files, review your CI/CD logs for exposed secrets and consider switching to a more secure alternative until the vulnerability is addressed.
    • Source: Aqua Security Blog
  • Tags: github-actions cve-2025-30066 tj-actions
13. Blog: May 2022 Security Announcement

    • Category: Security / Patch
    • What happened: The Flux Team has identified three critical security vulnerabilities in Flux that affect versions prior to 0.29.0. These vulnerabilities include improper kubeconfig validation allowing arbitrary code execution (CVE-2022-24817), improper path handling in Kustomization files leading to path traversal (CVE-2022-24877), and denial of service (CVE-2022-24878). Users are strongly advised to upgrade their clusters to mitigate these risks.
    • Do this Monday: Operators using affected versions of Flux should prioritize upgrading to version 0.29.0 or later to avoid potential security risks, especially in multi-tenant environments where the impact could be more severe. Failure to upgrade could lead to unauthorized access and service disruptions.
    • Source: FluxCD Blog
  • Tags: flux cve-2022-24817 cve-2022-24877 cve-2022-24878
14. Do nothing for better work, open-source decisions, self-cleaning JavaScript

    • Category: Security / Patch
    • What happened: The article covers recent CVEs discovered in React and Node.js and highlights the role of AI in identifying them.
    • Do this Monday: If you build on React or Node.js, check whether these CVEs affect your applications and apply the relevant updates.
    • Source: TLDR Dev
  • Tags: react nodejs
15. Crossplane v2.0.7: Backports fix for shared transitive dependencies and security updates

    • Category: Security / Patch
    • What happened: Crossplane v2.0.7 backports a fix for upgrading shared transitive dependencies and includes security updates for its dependencies.
    • Do this Monday: If you run Crossplane, upgrade to v2.0.7 to pick up the dependency-upgrade fix and the security patches.
    • Source: Crossplane releases
  • Tags: crossplane release security
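
For item 12, the first step is finding which workflows reference tj-actions/changed-files at all. The following is a minimal sketch, not an official tool: it assumes workflows live in the standard .github/workflows directory, and the function names find_action_uses and scan_repository are made up for illustration. It only locates references; it does not inspect build logs or judge whether a given ref is safe.

```python
import re
from pathlib import Path

# Matches a step such as "uses: tj-actions/changed-files@v45".
# The ref after "@" is captured so tags and pinned SHAs can be reviewed by hand.
USES_PATTERN = re.compile(
    r"uses:\s*['\"]?tj-actions/changed-files@([^\s'\"#]+)", re.IGNORECASE
)


def find_action_uses(workflow_text):
    """Return (line_number, ref) pairs for each use of the action in one workflow file."""
    hits = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # ignore commented-out steps
        match = USES_PATTERN.search(stripped)
        if match:
            hits.append((number, match.group(1)))
    return hits


def scan_repository(root):
    """Map each workflow file under root/.github/workflows to its hits (files with none are omitted)."""
    results = {}
    workflow_dir = Path(root) / ".github" / "workflows"
    if not workflow_dir.is_dir():
        return results
    for path in sorted(workflow_dir.iterdir()):
        if path.suffix in {".yml", ".yaml"}:
            hits = find_action_uses(path.read_text(encoding="utf-8"))
            if hits:
                results[str(path)] = hits
    return results
```

Calling scan_repository(".") from a repository checkout returns the files and line numbers to review; an empty result means no direct reference was found in that repository's workflow files.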

Releases

1. Argo CD v3.2.4: Release Invalid - Upgrade to v3.2.5 Recommended

    • Category: Release
    • What happened: Release v3.2.4 of Argo CD is invalid and should not be used; users are directed to upgrade to v3.2.5 instead.
    • Do this Monday: Do not deploy v3.2.4; if it is pinned anywhere, move to v3.2.5 to avoid unexpected issues in production.
    • Source: Argo CD releases
  • Tags: argo-cd release
2. [Release] Antigravity Link v1.0.10 – Fixes for the recent Google IDE update

    • Category: Release
    • What happened: Antigravity Link v1.0.10 has been released to address issues caused by a recent update to the Google Antigravity IDE, restoring functionality for message injection and UI elements.
    • Do this Monday: Engineers should update to v1.0.10 to ensure chat functionality works properly, as previous versions may lead to usability issues.
    • Source: Reddit r/devops
  • Tags: antigravity ide update
3. Announcing Claude Opus 4.6 on Vertex AI

    • Category: Release
    • What happened: Google Cloud has released Claude Opus 4.6 on Vertex AI, enhancing capabilities for enterprise workflows, financial analysis, coding, and complex task orchestration.
    • Do this Monday: If you adopt the model for document generation or coding tasks in production, validate output quality first; the efficiency gains are potential, not guaranteed.
    • Source: Google Cloud Blog
  • Tags: ai vertex-ai claude-opus
4. Amazon EC2 C8id, M8id, and R8id instances with up to 22.8 TB local NVMe storage are generally available

    • Category: Release
    • What happened: AWS has launched new EC2 instance types (C8id, M8id, R8id) that feature significantly increased vCPUs, memory, and up to 22.8 TB of local NVMe storage.
    • Do this Monday: If you run workloads that need high local storage and compute, evaluate these instance types and factor them into resource planning and cost management.
    • Source: AWS What's New
  • Tags: aws ec2 nvme
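
The Argo CD item above names one release to avoid and one to use instead. A pinned version can be checked against that kind of advice before a deploy. This is a rough sketch under assumptions: the BLOCKED_RELEASES table and the check_pin helper are hypothetical names, only the Argo CD entry comes from this brief, and the parser handles plain MAJOR.MINOR.PATCH tags only.

```python
import re

# Releases to avoid, mapped to the replacement named in the brief.
# Only the Argo CD entry is taken from the brief; any other entry would be your own.
BLOCKED_RELEASES = {
    ("argo-cd", "v3.2.4"): "v3.2.5",
}


def parse_version(tag):
    """Turn 'v3.2.4' or '3.2.4' into a comparable tuple such as (3, 2, 4)."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", tag.strip())
    if match is None:
        raise ValueError(f"unsupported version tag: {tag!r}")
    return tuple(int(part) for part in match.groups())


def check_pin(component, tag):
    """Return the recommended replacement if the pinned tag is blocked, else None."""
    wanted = parse_version(tag)
    for (name, blocked), replacement in BLOCKED_RELEASES.items():
        if name == component and parse_version(blocked) == wanted:
            return replacement
    return None
```

For example, check_pin("argo-cd", "v3.2.4") returns "v3.2.5", while a pin that is not in the table returns None. Comparing parsed tuples rather than raw strings means "3.2.4" and "v3.2.4" are treated as the same release.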

Lightning links

    Human Stories

    In the whirlwind of incidents, from Signal's ongoing outage to Hetzner's unexpected downtime, we're reminded of the delicate dance between technology and reliability. Each outage tells a story of complexity, where even the most robust systems can falter under unforeseen pressures. These events highlight the importance of resilience and the constant vigilance we must maintain in our roles. As SREs, our mission goes beyond simply fixing issues; it’s about understanding these stories, learning from them, and fortifying our systems against the tides of unpredictability. Let us carry forward these lessons with humility, knowing that while perfection is a moving target, our commitment to improvement is unwavering.

    Also worth reading

    At what point does reasonable assurance turn into busywork? (Reddit r/sre)

    The discussion highlights concerns about audit requests becoming excessive and focused on formatting rather than actual risk assessment, leading to increased overhead for teams.

    Bazzite Postmortem (Hacker News (incidents))

    The Bazzite incident involved a significant outage due to a misconfiguration that led to service unavailability for users.

    The requirement to deliver above all else (Reddit r/sre)

    A discussion on the corporate pressure to prioritize delivery over technical integrity highlights the risks of neglecting underlying issues in systems, which can lead to long-term reliability problems.