Nimitz Tech Hearing 9-24-24 - House Homeland

Learn what Congress is doing to address the challenges posed by AI and cybersecurity.

NIMITZ TECH NEWS FLASH

“An Outage Strikes: Assessing the Global Impact of CrowdStrike’s Faulty Software Update”

House Committee on Homeland Security, Cybersecurity and Infrastructure Protection Subcommittee

September 24, 2024 (recording linked here)

HEARING INFORMATION

Witnesses and Written Testimony (linked):

  • Mr. Adam Meyers: Senior Vice President, Counter Adversary Operations, CrowdStrike

HEARING HIGHLIGHTS

AI in Cybersecurity and Threat Detection:

The role of artificial intelligence (AI) in both cybersecurity defense and potential threats was discussed, with concerns raised about the use of AI in malicious cyber activities like disinformation, ransomware, and potentially writing malicious code. The hearing highlighted the need to consider how to regulate AI's use in cybersecurity and ensure that it is deployed responsibly, with countermeasures in place to prevent adversaries from exploiting AI and quantum computing advances.

Update Management and Risk Mitigation:

A significant focus was on how CrowdStrike's update process led to the outage, with Mr. Meyers discussing the deployment of faulty updates and the transition to a phased approach for future updates. The hearing drew attention to how to regulate or incentivize robust update management and risk mitigation practices across tech companies to prevent system-wide disruptions and ensure customer control over update deployment.
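
As a rough illustration of the phased approach described above, the following Python sketch rolls an update out ring by ring and halts at the first failure, so later rings never receive a faulty update. The ring names, the `phased_rollout` function, and the `apply_update` callback are hypothetical and are not drawn from CrowdStrike's actual tooling.

```python
from typing import Callable, Dict, List

# Hypothetical ring names, loosely following the stages named in the testimony.
RINGS: List[str] = ["internal", "early_adopter", "general_availability"]


def phased_rollout(
    hosts_by_ring: Dict[str, List[str]],
    apply_update: Callable[[str], bool],
) -> List[str]:
    """Apply an update ring by ring and stop at the first failure.

    Returns the hosts that received the update, in order.
    """
    updated: List[str] = []
    for ring in RINGS:
        ring_ok = True
        for host in hosts_by_ring.get(ring, []):
            updated.append(host)
            if not apply_update(host):
                ring_ok = False
                break
        if not ring_ok:
            break  # Later rings never receive the faulty update.
    return updated
```

In this sketch, a failure on the first internal host limits the damage to that one host, whereas a simultaneous global push would reach every host before anyone noticed the fault.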

Accountability and Compensation for Cybersecurity Failures:

The hearing touched on the financial impact of the outage, including billions in losses and the disruption of critical infrastructure. These losses highlighted the need to consider regulations or frameworks for accountability, compensation mechanisms, and legal remedies for victims of cybersecurity incidents, so that companies are held accountable for technical failures that cause widespread harm.

IN THEIR WORDS

"AI scares me, and that's why we have to be on top. Because the only defense against AI in the future... is to have AI on your side that writes the counter code just as fast as it's getting... the code."

- Rep. Gimenez

"We need to all be able to share and work off the same sheet of music... Private-public partnership is absolutely essential because this is a team sport, and we all are on the same team."

- Mr. Adam Meyers

SUMMARY OF OPENING STATEMENTS FROM THE COMMITTEE

  • Chairman Garbarino opened the hearing by explaining that the purpose was to examine the global IT outage on July 19, caused by a faulty software update from CrowdStrike. He highlighted the widespread disruptions, such as hospital system failures, flight cancellations, and financial service interruptions. The Chairman emphasized the need to understand the errors that led to this incident and how CrowdStrike responded. He stressed the importance of examining how cybercriminals exploited the outage and noted that this event raised concerns about future vulnerabilities.

  • Ranking Member Swalwell expressed the importance of CrowdStrike's success for its clients and acknowledged the company's presence at the hearing. He pointed out the severe impact of the July 19 outage, which affected millions of devices globally, causing billions in losses. The Ranking Member called for an examination of CrowdStrike's processes, including its update procedures, and emphasized the need to learn from similar incidents in the past. He concluded by noting the importance of cooperation between companies and government entities to prevent such failures from recurring.

  • Chairman Green (Full-Committee) described the July 19 outage as a shocking event that disrupted essential services across the globe, including flights, medical procedures, and emergency calls. He stressed that while mistakes happen, an error of this magnitude must not be repeated. Green highlighted the interconnectedness of networks and the importance of strong public-private partnerships in cybersecurity. He expressed hope that the hearing would provide answers on what went wrong and how such incidents could be avoided in the future.

SUMMARY OF WITNESS STATEMENTS

  • Mr. Adam Meyers, Senior Vice President at CrowdStrike, began by apologizing for the July 19 incident, acknowledging that the faulty update caused widespread system crashes. He assured the committee that CrowdStrike worked diligently to restore systems quickly and has taken steps to prevent similar issues in the future. Mr. Meyers emphasized that the incident was not a cyberattack but a technical failure, and CrowdStrike has been transparent in its response. He concluded by discussing the broader cybersecurity threat landscape, highlighting nation-state adversaries and ransomware as ongoing concerns.

SUMMARY OF Q&A

  • Chairman Green emphasized the importance of the cybersecurity workforce and mentioned that he had recently pushed a bill to address the workforce shortage. He then asked whether AI or an individual made the decision to launch the faulty update and how that decision was reached. Mr. Meyers responded by clarifying that AI did not make the decision. Instead, the update was part of a standard process where CrowdStrike releases 10 to 12 content updates daily, and this was a part of their routine operating procedure. The Chairman followed up by asking if these updates were distributed automatically and globally all at once. Mr. Meyers confirmed that the updates were indeed sent to all customers at once during this particular incident, but CrowdStrike has since changed this process. He referred to a graphic included in his full testimony that depicted their revised update distribution method. The Chairman expressed relief that CrowdStrike was no longer deploying updates globally and simultaneously, acknowledging that this change could have prevented the incident. He compared it to how different airlines' systems handled disruptions differently based on their infrastructure.

  • Rep. Carter asked if giving customers control over when they receive updates could have reduced the impact of the outage and requested details on the new options for update control. Mr. Meyers explained that CrowdStrike has implemented a system of concentric rings, allowing customers to choose between an Early Adopter Program for quick updates, general availability, or waiting longer before updates are pushed out. Rep. Carter inquired about the safest and most efficient update process for consumer protection. Mr. Meyers responded that the Early Adopter option is suitable for testing systems, while general availability is best for mission-critical systems, but delaying updates comes with the risk of not having the latest threat intelligence. Rep. Carter asked if CrowdStrike could override customer update preferences to push critical updates. Mr. Meyers confirmed that CrowdStrike does not have the ability to override customer decisions. Rep. Carter asked if there was increased cooperation between CrowdStrike, Microsoft, and other companies to prevent similar incidents. Mr. Meyers confirmed that CrowdStrike worked closely with Microsoft after the July 19 incident and continues to collaborate to ensure resilience. Rep. Carter asked how to raise public awareness of cybersecurity threats. Mr. Meyers emphasized the importance of public awareness and expressed willingness to work with Congress to develop strategies for educating the public on cybersecurity threats.

  • Rep. Ezell asked why the fix for the July 19, 2024, outage required a manual reboot, which was especially burdensome for rural areas such as Mississippi. Mr. Meyers explained that manual intervention was required at first, but within a day automated methods were identified that significantly sped up recovery. Rep. Ezell asked what steps were being taken to avoid a difficult recovery process in the future. Mr. Meyers responded that new systems now allow customers to opt in and control when they receive updates, applying best practices to content updates. Rep. Ezell requested details on the support CrowdStrike provided to rural communities in South Mississippi. Mr. Meyers said he would need to provide that information later. Rep. Ezell cited an article reporting that engineers were given only two months to complete work that typically takes a year and asked if staffing shortages influenced this decision. Mr. Meyers highlighted CrowdStrike’s robust internship program and recruitment efforts to fill positions globally. Rep. Ezell asked how CrowdStrike supports its staff to ensure they have the tools and skills to succeed. Mr. Meyers described internal training programs, external industry trainings, and the involvement of CrowdStrike's own researchers in conducting training to develop staff skills. Rep. Ezell asked for recommendations to mitigate single points of failure in cybersecurity systems. Mr. Meyers acknowledged the challenge, emphasizing that identifying and addressing vulnerabilities is an ongoing global effort requiring collaboration across industries.

  • Ranking Member Swalwell asked why CrowdStrike issues updates to the kernel (the core part of an operating system), how it balances the risks, and whether CrowdStrike plans to change its approach to reduce the risk of system crashes. Mr. Meyers explained that the kernel provides essential performance, visibility, and enforcement for cybersecurity, including anti-tampering protections. While there are risks, kernel access is crucial for detecting and preventing system threats. He noted that most security products use kernel access, and securing a system without it would be difficult. The Ranking Member asked if the Cyber Safety Review Board (CSRB) should review the July 19 incident and whether CrowdStrike would cooperate. Mr. Meyers confirmed that CrowdStrike would fully cooperate with any review and had already been working with CISA, ISACs, and congressional staff to provide transparency. The Ranking Member then inquired about the current threat environment and whether any particular adversaries are escalating their attacks as the U.S. heads toward the November 2024 election. Mr. Meyers responded that adversaries like Iran, China, and Russia are closely monitoring U.S. elections. He highlighted espionage, disinformation, and misinformation as key tactics, with adversaries using social networks to influence narratives that serve their interests.

  • Rep. Lee asked whether the software update that caused the global IT outage was released in a phased manner, referencing the "concentric circles" rollout. Mr. Meyers explained that the configuration update involved content, not code. The sensor code had been rolled out in a phased manner, but the content updates had not been treated as code. Since the July 19 incident, CrowdStrike has started treating content updates as code and now uses a phased deployment process, including internal testing and early adopter stages. Rep. Lee questioned whether the failure to implement the content update in a phased approach had catastrophic consequences. Mr. Meyers agreed that the phased approach now in place is meant to prevent similar incidents, allowing customers to choose when and how they receive updates. Rep. Lee asked whether using the user space instead of the kernel for updates could have prevented the incident. Mr. Meyers responded that kernel access is critical for visibility, enforcement, and preventing tampering by threat actors. While user mode can be used, the kernel is necessary for effective security. Rep. Lee further inquired if it is possible to operate outside of the kernel. Mr. Meyers stated that while it is possible, using the kernel is the most effective industry standard for visibility and security. Finally, Rep. Lee asked what other modifications CrowdStrike has made to prevent future incidents. Mr. Meyers highlighted that the primary change has been the introduction of the new phased deployment mechanism, giving customers control over when they receive updates.

  • Rep. Luttrell asked for a more detailed explanation of CrowdStrike's internal testing process, specifically mentioning the "human element" and whether CrowdStrike uses the OODA (Observe, Orient, Decide, Act) loop method in testing. He wanted to understand how the company ensures that human testing is reliable. Mr. Meyers explained that the content updates were tested by validators, with each individual rule within the content file being tested. However, he did not have the exact number of validators at hand. Rep. Luttrell pressed on the issue of whether the updates should have been tested collectively rather than individually, given the broad scope of infrastructure CrowdStrike protects. Mr. Meyers confirmed that CrowdStrike has since moved to testing all updates collectively before releasing them to customers, acknowledging that previously, each configuration was tested individually. Rep. Luttrell then asked for clarification on how the faulty update was launched despite testing, wondering whether it had passed tests and failed after launch, or if it failed but was launched accidentally. Mr. Meyers clarified that the update passed all tests and appeared clean, which is why it was released. However, a complex error occurred when the sensor tried to process the update. Rep. Luttrell sought further clarification on where exactly the system failed. Mr. Meyers described the failure as a "perfect storm," explaining that the content file triggered an issue with the kernel sensor, comparing it to attempting to move a chess piece where no square exists. Rep. Luttrell concluded by asking about the response mechanism in case a similar issue arises in the future. Mr. Meyers stated that any similar issue would now be detected by CrowdStrike before being released, thanks to the new process where content updates are treated as code and subjected to thorough internal testing.

  • Rep. Menendez began by expressing gratitude for the committee's leadership and shared the impact of the recent global IT outage caused by a faulty CrowdStrike sensor update. He described how the incident affected various sectors in his district, including delays at Newark Liberty International Airport, canceled procedures at hospitals, and inoperable 911 dispatch centers. Rep. Menendez emphasized that this disruption was preventable and called for CrowdStrike to implement better quality assurance practices to protect public services. He pointed out the importance of robust cybersecurity measures and asked how providing customers with control over rapid response content deployments could enhance overall security. Mr. Meyers responded by apologizing for the incident's impact and reaffirming CrowdStrike's commitment to threat prevention. He explained that allowing customers to control which systems receive updates enables them to test new updates on select systems first, thus improving security by tailoring the deployment based on their specific needs. Rep. Menendez then asked about the support CrowdStrike offers to customers making individualized decisions, ensuring they have access to the necessary options for their industries. Mr. Meyers highlighted CrowdStrike's customer-centric approach since its inception. He mentioned that the company actively engages with customers through advisory boards and continuous communication, particularly after the July incident. He emphasized the importance of collaboration between CrowdStrike, customers, and government agencies to ensure effective cybersecurity measures.

  • Rep. Gimenez began by confirming that CrowdStrike issues around 10 to 12 content updates daily, aimed at updating threat intelligence to keep their systems current. He sought clarification on whether these updates were routine or if the events of July 19 involved a specific system upgrade. Mr. Meyers confirmed that the July 19 incident was indeed a configuration update, which occurs 10 to 12 times daily. Rep. Gimenez then asked what made this particular update different, emphasizing the need for system tests to ensure updates don't cause harm. Mr. Meyers explained that the problem arose from a mismatch in the configuration update, likening it to moving a chess piece to a nonexistent square, which led to the system's failure to process the update correctly. Rep. Gimenez inquired if such an issue had ever occurred before, to which Mr. Meyers acknowledged that this was the first time this specific problem had manifested. Rep. Gimenez pressed on whether internal processes were followed correctly and whether the issue stemmed from a failure in their procedures or external factors. Mr. Meyers clarified that the processes were followed; however, there was a failure in the content validator. He noted that additional safeguards have been implemented to prevent a recurrence.

    Rep. Gimenez then asked about artificial intelligence (AI) in the context of cybersecurity, questioning whether it poses a threat. Mr. Meyers responded that AI can be both a threat and a benefit, offering tools for cyber defenders to analyze and respond to threats more efficiently. Rep. Gimenez concluded by stating that the nation leading in AI would be better protected against adversaries and asked for Meyers' agreement, which he received.

  • Chairman Garbarino opened the questioning by addressing the term "perfect storm," seeking clarity on what exactly constituted this event and what measures are being taken to ensure it doesn’t recur. He emphasized the importance of understanding the technical details. Mr. Meyers explained that the issue arose from a content validator that allowed 21 fields to be sent out to the sensor fleet. The sensor was looking for a configuration rule that was absent, which caused a system failure, leading to a blue screen. He noted that this detailed explanation is documented in the Root Cause Analysis (RCA). Chairman Garbarino pressed for assurance that such an incident wouldn’t happen again. Mr. Meyers responded that they have changed their methodology, treating configurations as code instead of mere configuration information. This change increases oversight and visibility in how configurations are deployed. Chairman Garbarino asked if the changes ensure that updates do not affect all systems simultaneously. Mr. Meyers confirmed that the new approach prevents widespread issues even if a problem arises. Chairman Garbarino also inquired about the recovery process after the incident, noting that initially, individual systems required manual rebooting and file deletions. Mr. Meyers elaborated that they eventually developed a USB boot disk to automate the recovery process, further improving efficiency with an automated solution. Chairman Garbarino then brought up reports regarding the impact of the outage on federal agencies, questioning why they were affected despite expectations of isolation between government and commercial networks. He asked if the updates differ for government versus commercial clients or if they are the same across the board. Mr. Meyers clarified that the updates were deployed to Microsoft Windows operating system sensors. Therefore, any system running a specific version of CrowdStrike Falcon on a Microsoft operating system during that time was impacted, regardless of whether it was a government or commercial system.

  • Rep. Gonzales expressed disappointment over the situation but appreciated CrowdStrike's quick response to the error. He highlighted the importance of transparency in the industry and asked Mr. Meyers how to ensure effective fixes if a similar incident occurs in the future, especially with vendors that may lack resources or integrity. He also inquired about the reception of CrowdStrike's recovery efforts by the government and whether discussions had occurred with CISA regarding the technical aspects of the incident. In response, Mr. Meyers stated that CrowdStrike had promptly communicated with CISA and other relevant staff after the incident. He mentioned that a preliminary incident report was issued to inform stakeholders about the situation. He noted that a comprehensive root cause analysis was conducted with input from external parties, and he received positive feedback regarding the report's depth and the preventive measures put in place to enhance customer control. Rep. Gonzales raised concerns about the lack of standardized processes among companies, emphasizing the need for solutions to avoid future problems. He introduced legislation for a National Digital Reserve Corps to recruit cybersecurity professionals to aid during major incidents and asked Mr. Meyers if such a reserve would improve response and recovery efforts. Mr. Meyers acknowledged the importance of transparency and stated that every system is unique, which complicates standardization. He affirmed that having skilled operators available in response to cyber threats is beneficial, reiterating that this incident was not the result of a cyber attack.

  • Rep. Timmons expressed appreciation for the testimony regarding the changes made to prevent future incidents, noting that the primary deterrent for recurrence is the significant financial damages estimated to exceed $5 billion for customers. He questioned how CrowdStrike plans to make victims whole in light of the incident's impact on critical infrastructure and air travel. Mr. Meyers responded that CrowdStrike had been actively working with customers to ensure their systems were operational, stating that by July 29, 99% of the sensors were back up and running. He emphasized the company's commitment to supporting affected customers through the recovery process. Rep. Timmons highlighted the severe consequences of the incident, mentioning that many people missed flights and that businesses experienced operational disruptions. He inquired about the accountability mechanisms CrowdStrike might have, including insurance policies, and whether he could speak about any legal remedies in place. Mr. Meyers acknowledged the impact on individuals affected by the incident, expressing remorse for what transpired and affirming the company’s dedication to rebuilding trust with customers. Rep. Timmons pointed out that the distinction between an innocuous error and a security breach might not matter to those who suffered losses, stressing the importance of accountability for all types of cybersecurity failures. He pressed Mr. Meyers on whether there was a difference between a breach and a faulty update in terms of the damages incurred. Mr. Meyers clarified that he recognized a difference between a breach and an error, but Rep. Timmons countered that from the perspective of constituents who experienced flight cancellations, the nature of the error did not lessen the damages they faced. He underscored the necessity of ensuring that victims of such incidents are compensated to deter future cybersecurity failures, concluding that these accountability measures are vital for the broader economic system.

  • Rep. Gimenez initiated the discussion by expressing concerns regarding the current and future threats posed by artificial intelligence (AI). He inquired about the existing threats from AI and how they may evolve, emphasizing the potential for AI to be used maliciously. Mr. Meyers responded by highlighting that the primary threats from AI currently involved disinformation and misinformation, as adversaries were using AI to automate tasks related to intrusions and ransomware operations. He acknowledged that while AI had not yet reached the stage of autonomously writing malicious code, the technology was rapidly advancing and warranted careful monitoring. Rep. Gimenez interrupted to voice his fears that if AI matured to the point of writing malicious code, it would necessitate the development of counter-AI to defend against a potential increase in cyberattacks. He also expressed concern about the combination of AI and quantum computing creating new vulnerabilities and suggested the possibility that society might need to disconnect systems for security reasons. Mr. Meyers countered by stating that he did not foresee a future where disconnection was necessary but stressed the importance of careful implementation of AI solutions. He offered to collaborate further with Rep. Gimenez on this issue. Rep. Gimenez reiterated his concerns about vulnerabilities in critical infrastructure and the potential for adversaries to exploit these weaknesses, causing massive disruptions. Mr. Meyers concluded by suggesting that organizations would increasingly deploy their own AI solutions, making it crucial to secure these AI workloads. He noted that adversaries could target AI by poisoning training data, introducing a new set of threats that needed to be addressed.

  • Rep. Gonzales expressed gratitude to Mr. Meyers for his testimony and acknowledged the committee's commitment to finding solutions regarding cybersecurity issues. He emphasized the need for collaboration between government and industry to effectively address cyber threats and prevent future incidents. Rep. Gonzales raised concerns about potential government missteps and the importance of clear communication in real-world situations, asking Mr. Meyers for insights on improving government responses during cybersecurity incidents. Mr. Meyers responded by highlighting the importance of public-private partnerships in cybersecurity efforts. He stated that CrowdStrike works closely with CISA and other government entities to share information and maintain transparency, especially during incidents. He noted that while CrowdStrike informs the government during a cybersecurity incident, the roles shift depending on the nature of the incident, with CrowdStrike supporting the government in understanding and countering threats. Rep. Gonzales concluded by reiterating the importance of legislative support for cybersecurity initiatives and the need for the committee to work effectively with industry partners. He encouraged ongoing collaboration to ensure that legislation aids rather than hinders cybersecurity efforts.

  • Chairman Garbarino questioned Mr. Meyers regarding CrowdStrike's cybersecurity practices, specifically about their update frequency and access to system kernels. Mr. Meyers defended the company, stating that there were no established industry standards that limit their approach. He emphasized that their mission was to protect clients, acknowledged the past error that led to issues, and asserted that the company had learned from it and would continue updating its systems as necessary to counter threats. Chairman Garbarino inquired whether the incident was due to an unlucky update for Microsoft systems or a broader issue. Mr. Meyers clarified that the fault was on CrowdStrike's part, not Microsoft's. He also discussed the company's efforts to develop agentless technologies for remote scanning to prevent future outages, while asserting the importance of having enforcement mechanisms in place for effective threat detection and prevention. Finally, the Chairman asked whether new processes had been established to catch errors before updates are deployed, and Mr. Meyers confirmed that improvements have been made.

  • Ranking Member Swalwell began by discussing the kernel and the potential reduction of reliance on it, referencing an agreement made at the Microsoft Endpoint Security Summit. He inquired about the timeline for additional capabilities to be made available at the user level by Microsoft and how this would impact risk management between the kernel and user space. Mr. Meyers responded that he did not have a timeline but emphasized that risks exist in both kernel and user spaces. The Ranking Member noted that while the incident only affected Windows systems, Apple's restrictions on kernel access could have prevented a similar issue on Mac systems. Mr. Meyers acknowledged the benefits and drawbacks of different operating systems and their architectures. He agreed that crashes in application space are usually limited to the app, whereas kernel crashes can affect the entire system. The Ranking Member asked if there were any additional points Mr. Meyers wanted to discuss. Mr. Meyers emphasized that the incident was not a cyber attack but an issue during the update process, assuring that measures were being taken to prevent future occurrences. He also highlighted concerns about the evolving tactics of cyber threat actors, particularly in identity theft and ransomware.
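
The "nonexistent square" failure that Mr. Meyers described in the Q&A can be illustrated with a simple bounds check. The Python sketch below rejects a content rule that references a field the sensor does not supply. The function names are hypothetical, and the pairing of 21 referenced fields with 20 supplied values is an assumption for illustration, not CrowdStrike's code or a detail taken from this hearing summary.

```python
from typing import List, Sequence


def out_of_range_fields(rule_field_indexes: Sequence[int], supplied_count: int) -> List[int]:
    """Return the field indexes a rule references that the sensor does not supply."""
    return [i for i in rule_field_indexes if i < 0 or i >= supplied_count]


def validate_rule(rule_field_indexes: Sequence[int], supplied_count: int) -> bool:
    """Accept a rule only if every referenced field index is in range."""
    return not out_of_range_fields(rule_field_indexes, supplied_count)


# Illustrative assumption: the rule references 21 fields, the sensor supplies 20.
rule_fields = list(range(21))
print(validate_rule(rule_fields, supplied_count=20))  # False: index 20 is out of range
```

A validator that runs a check of this kind before release would flag the mismatch at test time rather than leaving the sensor to read a field that is not there.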

© 2024 Nimitz Tech
