AI Safety & Security

How Anthropic’s Jailbreak Challenge Put AI Safety Defenses to the Test

ejames@hackerone.com — Mon, 03 Mar 2025 18:49:33 +0000

How Anthropic’s Jailbreak Challenge Put AI Safety Defenses to the Test H1 Team ejames@hackerone.com Mon, 03/03/2025 - 12:49 Image March 3rd, 2025

Last month, Anthropic partnered with HackerOne to complete an AI red teaming challenge on a demo version of Claude 3.5 Sonnet. The challenge's goal was to test and validate Anthropic’s new Constitutional Classifiers, which block harmful queries, particularly those that could produce outputs related to CBRN (chemical, biological, radioactive, nuclear) weapons and related content. Anthropic invited researchers to try and bypass Claude’s defenses through a “universal” jailbreak — a technique that allows model users to bypass safety defenses with a single input consistently.

The challenge ran from February 3 to February 10 and consisted of eight levels. To pass each level, researchers had to gain answers from Claude about a question related to CBRN topics through jailbreaking. Depending on their findings, researchers earned bounties: $10,000 to the first participant who passed all eight levels with different jailbreaks and $20,000 to the first participant who used a single, universal jailbreak to pass all levels.

Challenge Results

The challenge saw substantial engagement, with more than 300,000 chat interactions from 339 participants. We’d like to thank all the researchers who participated and congratulate those who received bounty rewards. It was no small feat! Four teams earned a total of $55,000 in bounty rewards from Anthropic: one passed all levels using a universal jailbreak, one passed all levels using a borderline-universal jailbreak, and two passed all eight levels using multiple individual jailbreaks.

"This challenge demonstrated the high return on investment for collaborative efforts. Delivering large language models (LLMs) in a safe and aligned manner is a significant challenge—especially given the intricacies of transformer architectures. This experience was a clear reminder that as these models get smarter, our strategies for testing can also evolve to stay ahead of potential risks." — Salia Asanova aka @saltyn

How The Community Contributes to Safer Systems

The diversity of techniques used by the winners and all the researchers who participated contributed to strengthening Claude’s protections. Anthropic noticed a few particularly successful jailbreaking strategies researchers employed:

Using encoded prompts and ciphers to circumvent the AI output classifier
Leveraging role-play scenarios to manipulate system responses
Substituting harmful keywords with benign alternatives
Implementing advanced prompt-injection attacks

These discoveries made by the community identified fringe cases and key areas for Anthropic to reexamine for its safety defenses while validating where guardrails remained effective.

Looking Ahead: Strengthening AI Defenses

The findings demonstrate the value the community can deliver when organizations use AI red teaming work in addition to other AI safety and security best practices:

"Our researcher community’s approach is rooted in curiosity, creativity, and the relentless pursuit of finding flaws others might miss. This mindset is distinct from building and reinforcing technical models, yet it’s an essential complement. While internal teams focus on defending and aligning AI systems, engaging with a community of researchers ensures continuous, real-world testing that validates and strengthens those defenses. Together, these perspectives drive more resilient and trustworthy AI." — Dane Sherrets, Staff Solutions Architect, Emerging Technologies at HackerOne

As AI advances, so must the ways we secure it. We’re committed to collaborating with leaders like Anthropic, who continue to define AI safety best practices that help us all build a more resilient digital world.

Visit here to read more about the challenge and Anthropic’s AI safety work.

Customer Stories AI Safety & Security AI Red Teaming

Proactively testing for risk is a key component of building responsible AI. One way organizations do this is through AI red teaming, which stress tests models to identify potential opportunities for abuse. AI red teaming often taps the broader security and AI researcher community to help find elusive security and safety issues caused by circumventing model guardrails. Model developers can then use these insights to improve or validate existing guardrails.

The UK’s AI Cyber Security Code of Practice: What It Means for Your Business

joseph@hackerone.com — Thu, 27 Feb 2025 20:24:55 +0000

The UK’s AI Cyber Security Code of Practice: What It Means for Your Business Vanessa Booth Policy Analyst Michael Woolslayer Policy Counsel joseph@hackerone.com Thu, 02/27/2025 - 14:24 Image February 27th, 2025

The Code establishes baseline cybersecurity requirements across the AI lifecycle and is expected to inform changes to international standards through the European Telecommunications Standards Institute (ETSI). To assist organizations in applying its principles, the government has also released an Implementation Guide, which expands on specific security measures.

HackerOne offered input during the development of this Code, emphasizing the importance of independent security testing, AI red teaming, and vulnerability disclosure programs (VDPs). HackerOne’s recommendations, submitted during DSIT’s Call for Views on AI Cybersecurity, highlighted the need for external validation, proactive security testing, and structured vulnerability reporting mechanisms to improve AI security.

Who is the Code for?

The Code applies to developers, system operators, and data custodians involved in the creation, deployment, and management of AI systems. It sets out security measures covering five key phases: secure design, secure development, secure deployment, secure maintenance, and secure end of life. AI vendors who solely sell models or components without direct involvement in their implementation are not directly in scope but remain subject to other relevant cybersecurity standards.

How can organizations align with the Code?

The Code introduces 13 principles to safeguard AI from cyber threats, including data poisoning, adversarial attacks, and model exploitation. Organizations that choose to follow the Code need to integrate AI security into system design, assess risks throughout the AI lifecycle, and maintain transparency with end-users. Key provisions include:

Ensuring AI security awareness among employees and stakeholders.
Implementing supply chain security measures to prevent vulnerabilities in AI models.
Conducting adversarial testing to proactively detect security weaknesses.
Providing timely security updates and clear communication to end-users.

How does the Code address Independent Security Testing and Disclosure for AI?

A key focus of the Code is the requirement for independent security validation systems. Developers must ensure AI models undergo security testing before deployment, and the Code stresses the importance of involving independent security testers with expertise in AI-specific risks.

Additionally, the Code mandates the creation and maintenance of a Vulnerability Disclosure Program (VDP) for AI systems. This program is vital for enhancing transparency, allowing security flaws to be responsibly reported and mitigated.

The Implementation Guide further clarifies these expectations, emphasizing proactive security practices such as red teaming and adversarial testing. These techniques are essential for detecting vulnerabilities before they can be exploited, and the Guide offers practical steps to integrate these evaluations into the AI lifecycle. By following both the Code and the Implementation Guide, organizations can ensure a comprehensive, proactive approach to AI security – focusing on external validation, transparency, and ongoing testing to safeguard systems at every stage.

What’s the likely impact?

The Code signals a shift toward stronger regulatory expectations for AI security. As cyber threats targeting AI continue to evolve, organizations that adopt these security principles will be better positioned to comply with future standards and regulations, protect their users, and build trust in AI technologies.

The UK government has stated its intention for this Code to serve as the foundation for future ETSI standards, ensuring a unified and internationally recognized approach to AI cybersecurity. The government also plans to update the Code and the Guide to mirror the future ETSI global standard, reinforcing the alignment with international best practices.

How HackerOne can help:

Organizations navigating AI security challenges need robust testing and vulnerability management solutions. HackerOne helps organizations align with the Code’s security requirements through:

Independent AI security assessments that align with Principles 9.1 and 9.2.1.
Vulnerability Disclosure Programs (VDPs) to help meet Principle 6.4.
Red teaming and adversarial testing to identify weaknesses before they can be exploited as mentioned in the Implementation Guide, sections 9.2, 9.2.1, and 11.2.

Contact HackerOne to learn more about securing your AI systems.

Public Policy Security Compliance Best Practices AI Safety & Security

On January 31, 2025, the UK government published its AI Cyber Security Code of Practice, a voluntary framework aimed at mitigating security risks in AI systems.

Celebrating 10 Years of Partnership: Snap and HackerOne Reach $1M in Bounties

h1_admin — Fri, 14 Feb 2025 17:17:47 +0000

Celebrating 10 Years of Partnership: Snap and HackerOne Reach $1M in Bounties H1 Team h1_admin Fri, 02/14/2025 - 11:17 Image February 14th, 2025

Q: Tell us about your role at Snap and why cybersecurity is vital to your business.

Jim Higgins: I’m Snap's Chief Information Security Officer (CISO). Before joining Snap, I served as CISO at Square and spent over a decade at Google leading their Product Security Information Engineering team. At Snap, we support nearly a half a billion daily active users who use Snapchat every day on average. Keeping our customers safe from the ever-evolving landscape of unknown threats is a deeply personal mission for me.

Q: What does reaching the $1M milestone mean for Snap’s security team?

Jim Higgins: Hitting $1M in bounties is a badge of honor. It reflects our commitment to valuing the intelligent security researchers who help keep us safe. Bug bounty programs are notoriously difficult to build, but HackerOne’s talented community provides us with the expertise and creativity we need to secure our platform.

Q: How has your bug bounty program evolved over the past 10 years?

Vinay Prabhushankar: When we started, our program was more operational and focused on identifying and fixing individual issues. As we matured, we shifted to a strategic approach, identifying systemic problems and building frameworks to resolve them. For instance, our 2025 roadmap includes initiatives that stem directly from vulnerabilities identified through HackerOne. Today, our program influences security, privacy, and safety strategies.

Q: Are there any memorable milestones or moments you’re especially proud of?

Vinay Prabhushankar: Beyond the $1M milestone, we launched one of the first CTF-style challenges focused on the safety of generative AI features.

Q: How has AI Red Teaming influenced Snap’s approach to security?

Ilana Arbisser: We use AI Red Teaming to determine qualitative safety aspects – what’s possible, not necessarily what’s likely. We’re also constantly surprised by what’s possible– we try to keep an open mind while designing exercises. The benefit of working with HackerOne is that human ingenuity is more effective than consistently using adversarial prompt datasets or LLM written attacks. The impact of the AI Red Teaming on our products has been to identify specific safety vulnerabilities and guide the addition of specific mitigations.

Q: Where do you see AI Red Teaming heading in the future?

Ilana Arbisser: Simulated AI red teaming with LLM agents is improving significantly. This approach, when complimented by AI expert-driven testing by humans, is also more useful for getting quantitative results because attacks can be scaled to understand better how small input changes affect output.

Q: With new AI tools constantly emerging, how does your team stay ahead of these technological advancements?

Ilana Arbisser: To keep pace with advancements, we rely on a combination of strategies. This includes staying informed through news and industry sources, attending AI networking and information-sharing events and conferences, and participating in industry-specific gatherings like the Defcon AI Village.

Q: What sets HackerOne apart as a partner?

Jim Higgins: HackerOne’s community is second to none. Over the past decade, they’ve built an ecosystem that values customer and researcher feedback. Their pace of innovation, particularly in AI features, has been impressive. For instance, we were able to use HackerOne’s GenAI copilot, Hai, to translate submissions in 7 different EU languages when we did a private challenge hackathon around Election Safety around our MyAI chatbot.

Beyond technology, the support we’ve received has been phenomenal. HackerOne doesn’t just get us; they get security researchers. It’s like having a trusted partner who’s always in your corner.

Q: What findings is the team most interested in surfacing? What types of bugs are most valuable to Snap?

Jim Higgins: At Snap, we prioritize security and privacy. Protecting sensitive user information is at the core of everything we do. Snap’s team is particularly interested in vulnerabilities that could compromise the integrity of its platform, such as remote code execution (RCE) or privilege escalation. We encourage security researchers to focus their efforts on these critical issues.

Q: What lessons has Snap learned from its bug bounty program?

Vinay Prabhushankar:

Fix low and medium bugs: These might seem minor, but when chained together, they can lead to critical vulnerabilities. Fixing them breaks the chain.
Build trust with security researchers: Trust takes time but pays dividends in high-quality submissions.
Gamify your program: Elements like challenges, swag, and live hacking events encourage creativity and engagement.

Q: What advice would you give companies starting a bug bounty program?

Jim Higgins: Start small with a private program, then expand the scope as you grow. Treat researchers as trusted allies—they’re like an extension of your team. We even have an internal guide on engaging with researchers, which includes concrete examples of dos and don’ts.

Q: What’s next for Snap’s bug bounty program?

Jim Higgins: We plan to expand our scope to include hardware products like AR glasses and double down on AI security. HackerOne AI Red Teaming has proven invaluable, and we’re eager to deepen our collaboration with HackerOne’s community. Our ultimate goal is to make Snap’s bug bounty program a model for others to follow and strengthen the security of our users.

Customer Stories Bug Bounty AI Red Teaming AI Safety & Security

At Snap, security is more than a priority—it’s a core mission. Over the past decade, Snap has partnered with HackerOne to build and sustain a robust bug bounty program. This collaboration has led to major milestones, including paying security researchers over $1M in bounties. To celebrate this achievement and their 10-year partnership, we spoke with Jim Higgins, Snap's Chief Information Security Officer, Vinay Prabhushankar, Snap’s Security Engineering Manager, and Ilana Arbisser, Snap’s Privacy Engineer. Together, they reflect on how this partnership has shaped Snap’s security, privacy, and innovation approach.

Welcome, Hackbots: How AI Is Shaping the Future of Vulnerability Discovery

h1_admin — Mon, 03 Feb 2025 15:21:59 +0000

Welcome, Hackbots: How AI Is Shaping the Future of Vulnerability Discovery Michiel Prins Co-founder & Senior Director, Product Management h1_admin Mon, 02/03/2025 - 09:21 Image February 3rd, 2025

In 2024, we saw the adoption of AI in hacking workflows take off. In a survey of over 2,000 security researchers on the HackerOne Platform, 20% now see AI as an essential part of their work, up from 14% in 2023. Here’s what some Hackbot operators report:

PropertyGPT: “It successfully detected 26 CVEs/attack incidents out of 37 tested and also uncovered 12 zero-day vulnerabilities, resulting in $8,256 bug bounty rewards.”
XBOW: “While developing XBOW over the past three months, we played around with using it for bug bounties and ended up at #11 in the US on HackerOne. Since September, 65 reports have been submitted, including 20 critical findings."
Shift: “The goal with Shift is simple: seamlessly leverage SOTA LLMs inside our everyday hacking tool: Caido. With true integration, I can offload the repetitive work of reformatting a request or finding a certain ID and focus on the intricate aspects of hacking that require a hacker's brain. Shift will get us closer to frictionless use of our hacking tools and efficient implementation of attack vectors.” - Justin Gardner, Creator of SHIFT
Ethiack: “AI is not replacing security researchers; it’s unleashing our full power. It makes us more effective in discovering and fixing vulnerabilities so we can focus on what matters most: solving challenges that require creative thinking, collaboration, and the infinite curiosity of the human mind.” - André Baptista, Co-founder of Ethiack

Every discovered and remediated vulnerability strengthens the internet, regardless of the source. We believe Hackbots are a powerful tool for the security community, accelerating vulnerability discovery and ultimately making the internet safer.

The early adopters of these Hackbots have surfaced the need for a new set of rules governing their behavior. Today, we're excited to formally welcome Hackbots to HackerOne with some key principles in place:

By the Rules: Hackbots must operate within the published vulnerability disclosure policies of the program they're engaging with, along with HackerOne's Code of Conduct and Disclosure Guidelines.
Human-in-the-Loop: Hackbots must not operate in a fully autonomous manner—yet. We employ a “hacker-in-the-loop” model, requiring human experts to investigate, validate, and confirm all potential vulnerabilities before submitting to a Vulnerability Disclosure (VDP) or Bug Bounty Program (BBP).
Accountable: Hackbot operators are responsible for their Hackbots and must exercise due diligence to ensure compliance with platform rules and program policies.
Bounty Eligible: Human operators of Hackbots qualify for any applicable bug bounty rewards, just as if the vulnerabilities were discovered through traditional means.

It's important to acknowledge that, like any powerful tool, Hackbots can be misused. That's why requiring human oversight and adherence to established disclosure practices are essential. Our outlined principles are designed to help foster a foundational culture of responsible behavior. We look forward to partnering with the community to advance these principles as the practice of AI-accelerated hacking evolves.

We believe running a bug bounty program is one of the best ways for businesses to benefit from the AI innovation making Hackbots possible. Hackbot operators can now equally benefit from participating in bug bounty programs and demonstrating their effectiveness and AI prowess in a real-world benchmark setting.

Are Hackbots Replacing Security Researchers?

Absolutely not.

The best researchers have always relied on combining their unique ingenuity with the best automation available. Hackbots are simply the next step in this evolution. They can automate repetitive tasks, scan vast amounts of code, and identify potential vulnerabilities at a much faster pace. However, human judgment remains crucial. Hackbots lack the creativity, critical thinking, and contextual understanding needed to fully understand and exploit a vulnerability end-to-end. Modern Hackbots depend on reinforcement learning from researcher feedback as technology and threats evolve.

This is where the human researcher comes in. By working together, Hackbots and Hackers can achieve far better results. Hackbots can provide a broader initial scan, highlighting areas of potential weakness, and they don’t require coffee or sleep to maintain peak performance. The caffeine-powered human researchers can then delve deeper, analyze the findings, and exploit the vulnerabilities creatively. This collaborative approach leads to a more comprehensive understanding of a program's security posture and, ultimately, a more secure product.

Our Last Line of Defense

Criminals are incorporating AI and Hackbots into their offensive toolkits. We need Hackbots hacking for good as well.

Security leaders run bug bounty programs because they believe in defense in depth and reject the allure of silver bullets. The thing we love the most about the researcher community is the power of its diversity. Brilliant security researchers from surprising backgrounds finding creative flaws in even the most hardened attack surfaces. The same principles will hold true with hackbots. Security leaders will soon be inundated with marketing pitches selling unhackable fantasies with AI solving all our security problems. In our humble opinion, only the foolish will bet their security on a single silver bot.

May the best hacking win!

AI Safety & Security

At HackerOne, our mission is to empower the world to build a safer internet. Security researchers have always been at the forefront of technology, including the early adoption of AI to assist in vulnerability discovery. Researchers have been quick to build, adopt, and improve the usage of these emerging AI-based technologies across the platform. We call these AI-powered assistants and agents “Hackbots.”

What Will a New Administration and Congress Mean for Cybersecurity and AI Regulation?

h1_admin — Tue, 28 Jan 2025 14:23:05 +0000

What Will a New Administration and Congress Mean for Cybersecurity and AI Regulation? Ilona Cohen Chief Legal and Policy Officer h1_admin Tue, 01/28/2025 - 08:23 Image January 28th, 2025

Much attention has been paid to the incoming administration’s stated intentions to roll back regulations, as well as their criticism of certain cybersecurity and artificial intelligence (AI) policies adopted by the Biden administration. A more comprehensive review of policy statements and past actions suggests that the Trump administration will support strong cybersecurity defenses and best practices as well as practices that encourage the responsible and trustworthy development and adoption of AI.

The First Months

The new administration immediately put a hold on pending regulations, as is typical. In the first Trump administration and the Biden administration, the new White House Chief of Staff issued on Inauguration Day a memo to the heads of executive departments and agencies to immediately freeze any new or pending regulations to allow review by the new administration. The Trump administration also released a large number of executive orders on his first day of office, though only one addressed AI or cybersecurity in a material way (see below).

We expect that many members of Congress will reintroduce cybersecurity and AI legislation from the previous session, and new legislation on these hot issues will be introduced for the first time.

Based on precedent, it is possible that Congress will use the Congressional Review Act to reject regulations that have already been enacted by federal agencies. The law, enacted in 1996, has only been used to overturn a total of 20 rules, with 16 of those actions taking place early in the first Trump administration with a Republican majority in both chambers of Congress. To take effect, the Congressional Review Act requires Congress to introduce a joint resolution within 60 Congressional session days of its receipt of the regulation, so only relatively recent regulations are subject to the law.

Cybersecurity Policy and Regulations

CISA

Republican lawmakers and incoming administration officials have criticized the Cybersecurity and Infrastructure Security Agency (CISA). However, these criticisms against CISA are largely not related to cybersecurity, but rather for perceived expansion beyond its core mission of protecting federal and critical infrastructure to address issues such as disinformation. The Republican Party Platform emphasized a commitment to “use all tools of National Power to protect our Nation's Critical Infrastructure and Industrial Base from malicious cyber actors. This will be a National Priority, and we will both raise the Security Standards for our Critical Systems and Networks and defend them against bad actors.” We expect the new administration to refocus CISA on cyber protection and scale back or defund disinformation initiatives, but not to dismantle CISA.

CIRCIA

CISA is finalizing regulations to implement the Cyber Incident Reporting for Critical Infrastructure Act (CIRCIA), enacted in 2022. The proposed rule requires a wide range of businesses in critical infrastructure sectors to report covered cyber incidents and ransomware payments to CISA. Many of the public comments, including those submitted by members of Congress that had sponsored the original legislation, argued that the draft regulations went beyond the intention of Congress by applying the rule to too many entities, requiring too many cyber incidents to be reported, and not providing enough reciprocity with similar cyber incident reporting regulations. Expect members of Congress to closely review and scrutinize the nature and scope of the final regulations.

Cybersecurity Executive Orders

The Biden administration released its second executive order on cybersecurity in his final week in office. The order focused on improving the United States’ defenses against the escalating threats from foreign adversaries, particularly the People’s Republic of China (PRC).

The new administration will certainly review all executive orders issued by the prior administration and consider whether to repeal them entirely, repeal them and replace them with their own executive order, or take no action. Given the scope of the order and the new administration’s focus on cyber defense and countering the malicious activities of national adversaries, particularly China, a full repeal without replacement in the short term may be unlikely. It is worth recalling the Trump administration issued its own executive order on cybersecurity in its last day in office, which the Biden administration did not repeal.

Coordinated Vulnerability Disclosure Practices

Coordinated vulnerability disclosure practices, including the implementation of Vulnerability Disclosure Policies and the use of bug bounties by federal agencies have been supported by both the Trump and Biden administrations, are well established in federal agencies, and are unlikely to be rolled back. Russell Vought, who has been nominated to return to his prior role as Director of the Office of Management and Budget, directed federal agencies to implement such programs in a 2020 memo. These practices also enjoy bipartisan support in Congress, which is actively working to pass legislation to require the adoption of Vulnerability Disclosure Policies by federal contractors.

Artificial Intelligence

Both President Trump and President Biden issued executive orders related to AI. President Biden’s order directed over 50 federal entities to take more than 100 specific actions to implement its guidance in areas including safety and security, consumer protection, worker support, and consideration of AI bias and civil rights. Proposed rules resulting from the order include those proposed by the Department of Commerce that would require mandatory reporting to the federal government by leading AI developers and cloud providers. Republicans raised concerns about the order’s reliance on the 1950 Defense Production Act for its authority to require such disclosures, as well as the order’s impact on free speech, innovation, and focus on addressing bias and discrimination. The Trump administration repealed President Biden’s executive order on AI on its first day in office, honoring a commitment made during the campaign. In doing so, he issued his own order to remove barriers to American innovation and “to sustain and enhance America’s dominance in AI to promote human flourishing, economic competitiveness, and national security.”

While the Trump administration is expected to take a lighter regulatory approach to AI, its past approach through executive order has recognized the importance of regulatory guidance, technical standards, and transparency and trustworthiness to realizing the benefits of AI innovation. As OMB Director, Vought issued guidance to federal agencies for regulation of AI applications, writing that “agencies should continue to promote advancements in technology and innovation, while protecting American technology, economic and national security, privacy, civil liberties, and other American values, including the principles of freedom, human rights, the rule of law, and respect for intellectual property.” The memo emphasized the importance of public trust in AI and the validation of AI systems while encouraging agencies to “be mindful of any potential safety and security risks and vulnerabilities.

Congressional action on artificial intelligence has been limited to date with the executive branch stepping in to shape government policy and practices related to AI use and regulation. However, Congress and the states show willingness to take this issue up in the coming legislative term.

Focus Areas for HackerOne and Our Partners

HackerOne’s policy team continues to advocate for the enactment of legislation and regulation that enhances cybersecurity defenses and promotes the responsible adoption and use of AI. This advocacy will continue across administrations and Congresses. Regardless of how the regulatory environment evolves, companies should continue to proactively identify and manage vulnerabilities in their own systems and AI models to protect their assets and maintain the trust of the public, their customers, and investors.

Public Policy AI Safety & Security Security Compliance

The transition to a new presidential administration and a change in control of the Senate raise questions about how cybersecurity and artificial intelligence (AI) policy and regulation will change and whether such change will be dramatic or more measured.

Hope in the Fight Against Cyber Threats: A New Year’s Message to CISOs

h1_admin — Thu, 23 Jan 2025 14:14:53 +0000

Hope in the Fight Against Cyber Threats: A New Year’s Message to CISOs Kara Sprague CEO h1_admin Thu, 01/23/2025 - 08:14 Image January 23rd, 2025

Facing the Reality: Cybersecurity’s Mounting Pressures

The cybersecurity landscape is evolving at an unprecedented pace. This past year, breaches resulting from exploited vulnerabilities grew 180%, and at HackerOne, we’ve seen a 12% jump in vulnerability reports across our customer programs. Attack surfaces continue to expand, with AI systems as the new frontier and increasingly interconnected systems. Threat actors are growing in number, and boldness and attack techniques increasing in sophistication. And, as the headlines remind us all too often, breaches are not just a possibility but a probability.

It's natural to feel hopeless in the face of these developments. But within these challenges lies an opportunity to build something stronger than ever before.

Finding Opportunity in Adversity

Every challenge we face brings with it a silver lining: an opportunity to innovate, collaborate, and grow stronger. Over the past year, we've witnessed the transformative power of resilience. Organizations are increasingly adopting proactive security measures and leveraging cutting-edge tools like AI to detect and respond to threats faster than ever before. At the same time, crowdsourced cybersecurity programs are gaining momentum, demonstrating greater adoption and effectiveness. In fact, more than one-quarter of valid vulnerabilities found through HackerOne programs are rated as critical or high severity. This highlights the value of collaboration with security researchers—helping organizations uncover and address vulnerabilities before they escalate into crises.

This year, I encourage you to consider how these opportunities can apply to your organization. Where is there potential for you to be more proactive in your security strategy? Which solutions and partnerships offer the highest return in strengthening your security posture? And perhaps most importantly, how do you, as a leader, reframe adversity as a catalyst for progress?

The AI-Human Alliance in Cybersecurity

At the heart of modern cybersecurity strategies lies the powerful synergy between human ingenuity and cutting-edge technology. While tools like AI have revolutionized how we identify and address vulnerabilities, their effectiveness hinges on the expertise and guidance of the people behind them. Your teams—the analysts, engineers, and researchers working tirelessly to defend against threats—are, without a doubt, your greatest asset. Equally invaluable are your partners, whether they be vendors, security researchers, or other collaborators who bring diverse perspectives and specialized knowledge to the table.

This blend of AI-driven efficiency and human insight is essential for staying ahead of increasingly sophisticated adversaries. It empowers us to adapt, innovate, and uncover even the most elusive vulnerabilities before they become threats. With AI, we can process vast amounts of data at speeds that would be impossible for humans alone, spotting patterns and anomalies that might otherwise go unnoticed. However, it is human expertise that ensures these tools are applied strategically, interpreting complex data in context and making nuanced decisions that automated systems alone can't achieve. Together, they form an agile and responsive defense system capable of outpacing the evolving tactics of cybercriminals.

A prime example of this approach in action is Amazon and AWS, who have been leveraging this combination in their security program with HackerOne for over eight years. In that time, they’ve received over 9,000 valid reports and paid over $30 million in rewards and bonuses to 6,000 security researchers. Each report from a researcher helps Amazon raise the bar on security, providing unique perspectives on their entire landscape and uncovering vulnerabilities that might otherwise go unnoticed. This partnership exemplifies how human ingenuity, paired with the right platform, can transform how organizations tackle cybersecurity challenges. You can hear more in this short video.

As you look to 2025, I encourage you to assess the talent and technology powering your charter. Build a culture that empowers your teams to leverage AI-powered capabilities while recognizing where human insight remains essential. Foster trust and resilience, and seek out new perspectives and partnerships. Sometimes the best solutions come from unexpected places.

Let’s Build a Resilient Future Together

In 2025, let’s shift the narrative. Instead of focusing on what we’re fighting against, let’s focus on what we’re building together: a more secure, more resilient digital world. Let’s embrace the tools and partnerships that empower us to stay ahead of threats. Let’s champion a mindset where security is seen not as a burden but as an enabler of innovation and trust.

At HackerOne, we’re committed to being your ally in this fight. We believe that no challenge is insurmountable when we work together and we’re here to support you every step of the way.

Closing Thoughts

To every CISO reading this: I see the challenges you face and the incredible work you do to overcome them. The road ahead won’t be easy, but we can navigate it together. You are not alone in this fight to build a safer internet. With the right mindset, tools, and partnerships, 2025 can be a year of meaningful progress for cybersecurity.

Here’s to a new year of resilience, innovation, and hope.

From The CEO AI Safety & Security Bug Bounty Vulnerability Management Best Practices

As we settle into 2025, I want to take a moment to reflect on the state of cybersecurity—not just as an industry but as a shared mission. For CISOs, the stakes have never been higher. Protecting your organizations against increasingly sophisticated adversaries, managing constrained budgets, and ensuring business continuity in an unpredictable world—it’s a daunting charter, and it can feel isolating. But I’m here to remind you: You are not alone.

Resurrecting Shift-Left With Human-in-the-loop AI

h1_admin — Thu, 16 Jan 2025 18:28:07 +0000

Resurrecting Shift-Left With Human-in-the-loop AI Jobert Abma Co-founder & Engineering Alex Rice Co-founder, CTO, CISO h1_admin Thu, 01/16/2025 - 12:28 Image January 16th, 2025

What’s Needed for Secure by Design Success

We spent years understanding the culprits of why “shift-left” controls fail to identify the principles needed for them to succeed. Success starts with a developer-first foundation and a discipline to eliminate work vs. create it.

The Developer-first Application Security Foundation

To guide developers to write secure code, they need to be armed with actionable information. In fact, use “actionable” interchangeably with “useful.”

The key ingredients for actionability are context, speed, and low-noise output. It needs to be focused, fast, and understand what’s being analyzed. Static Application Security Testing (SAST) and Software Composition Analysis (SCA) tools are fast but fall short on context and noise. The problem is how often they’re not right—bombarding developers with false positives and duplicate warnings.

The source of information needs to continuously learn. A process is doomed for failure if developers need to constantly explain their work and escalate exceptions. Developer security tools need to listen, watch, and adapt without intervention. If application security listens to developers and provides value, developers respond by listening and learning back.

When application security activates in development, it should be non-blocking. Blocking mechanisms bring development—and everything else—to a halt. They incentivize creative bypasses, not secure code. Applying preventative safeguards is important, but overburdening developers because they work on the pre-production side of the SDLC is hardly a balanced defense-in-depth strategy.

Finally, security can’t just make noise at developers. Remediation needs to be part of the solution. To address issues that arise, there needs to be interactive support throughout the lifecycle.

Where “Shift Left” Went Wrong

Efforts to introduce security testing earlier in the SDLC usually begin with applying SAST (and IAST, DAST, SCA, RASP, etc.) scanners. These are fast and, because of broad compatibility with most programming languages, theoretically scalable. The problem is the work it takes to prove their output is right or wrong, leading to compounding backlogs. And upon examination, they’re often wrong, leading to security policies developers don’t trust. It’s here where application security in development stalls: trying to make a dysfunctional policy work (as security debt grows).

None of this is to say security code scanners aren’t powerful and valuable. Their maintainers, whose work has done the world a great service, never claimed for them to stand as a single strategy. “Shift left” failed developers as a well-intended, unspoken hope that there’d be an easy fix to a hard problem.

The Future of Developer Security with AI

Scanners are limited when it comes to things like understanding massive legacy codebases, identifying misuse of functionality in microservice architectures, and finding flaws related to code not written. Here, AI shines and the future looks bright.

Models, trained on corpuses of training data, are capable of analyzing entire codebases. Secure code systems can flag areas that deviate from normal patterns. Great news for developers and security engineers who have carried 100% of the manual secure code review burden for years.

Is AI alone the solution to right what “shift left” got wrong?

Embarking on these opportunities made possible with AI, it’s important to remember technology is a tool, not a replacement for invaluable human expertise.

Human-AI Collaboration

Rethinking “shift-left” security strategy by incorporating AI technology is exciting, but warrants safe and responsible exploration. Execution of deployment requires human-in-the-loop (HITL) oversight as a governing principal. Conventionally, objectives of a HITL methodology are to improve the models they oversee—ensuring AI systems are accurate, robust, ethical, adaptable, and align with real-world goals.

Let’s challenge conventional thinking.

Instead of prioritizing the efficacy of AI systems, what if human-in-the-loop oversight priorities begin and end with helping a developer write secure code? What if human experts can not only categorize model output as “right” or "wrong,” but expand on what’s “right” so it’s actionable with all of the context details taken into account? What if they’re a teammate who can help a developer on a problem-solving journey of taking action to remediate?

Let’s Resurrect Shift-Left Security

Check out the on-demand webinar during which we discuss how a human-AI collaborative approach transforms security from a dreaded blocker into a powerful enabler of development velocity.

Broken Security Promises: How Human-AI Collaboration Rebuilds Developer Trust
Originally aired on Jan. 16, 2025 @ 12pm ET

Stay tuned for more insights into how HackerOne is working with dev teams to reinvent secure development together.

AI Safety & Security Vulnerability Management

As software development cycles grow shorter and more iterative, ensuring the right security controls are deployed with new functionality is more critical than ever. For security and development teams, one of the biggest challenges is catching insecure code before it’s merged — without overloading developers with extra work or sacrificing productivity with prohibitive gatekeeping.

Implementing security policies in development isn’t easy. No one has gotten this quite right yet.

A successful strategy needs to fulfill security needs, work well with developers, and break the cycle of compromising one over the other.

A Partial Victory for AI Researchers

h1_admin — Fri, 10 Jan 2025 15:33:42 +0000

A Partial Victory for AI Researchers Ilona Cohen Chief Legal and Policy Officer h1_admin Fri, 01/10/2025 - 09:33 Image January 10th, 2025

HackerOne has partnered with security and AI communities to advocate for stronger legal protections for independent researchers. Most recently, HackerOne participated in a workshop hosted by leading institutions to discuss the need for legal safeguards for third-party AI evaluators and address the gaps in current legal frameworks. Despite the strong push for change, the Librarian’s ruling provided some clarity, but ultimately fell short of granting the full legal protection requested for AI safety research.

What is the DMCA and Why Does it Matter?

DMCA Section 1201 makes it illegal to circumvent technological protection measures (TPMs) used to protect copyrighted works. Essentially, if software has security features, it’s against the law to break or otherwise bypass them—even for research purposes.

Every three years the U.S. Copyright Office considers petitions for exceptions to this restriction. In 2015, the security community advocated for and received an exception for good faith security research. This year, HackerOne advocated for broadening this exception.

While security research has legal protections under the law, it is not clear that the same protections extend to AI researchers. AI research, or red teaming, evaluates AI systems for more than just security - including safety, accuracy, discrimination, infringement, and other potentially harmful outputs. The absence of clear legal protections creates a chilling effect that may deter independent AI testing, which is crucial for the long-term resilience of the digital ecosystem—much like independent security research safeguards organizations by identifying vulnerabilities before they can cause harm.

AI platforms, in an effort to safeguard their systems, may block or ban researchers who attempt to find vulnerabilities or algorithmic flaws. In order to continue their work, researchers are sometimes forced to create new accounts or use proxy servers to bypass these access restrictions. While this circumvention is often necessary for identifying unintended behaviors and improving AI systems, in the absence of clarity around the DMCA 1201 exceptions, it comes with potential legal risk.

HackerOne joined the effort to request the Copyright Office to grant clear liability protection for good faith AI research under DMCA Sec. 1201. The process took several months and multiple rounds of comments before the Librarian of Congress issued its decision on October 28, 2024.

What Was the Ruling?

The U.S. Copyright Office considered a proposed exemption to the DMCA that would allow researchers to circumvent TPMs in order to test and improve the trustworthiness of AI systems. This exemption would have enabled independent researchers to probe AI models for biases, harmful outputs, and other issues related to fairness and accountability, without the threat of legal action.

However, the Librarian of Congress ultimately declined to grant this proposed exemption. The decision was based on two determinations:

Insufficient Evidence: There was not enough evidence to prove that Section 1201 significantly deterred researchers from conducting the necessary red teaming and testing activities on AI models. While many researchers have raised concerns about the legal risks of conducting this type of research, the Copyright Office found that the existing framework of TPM circumvention protections did not present a significant barrier to their work.
Non-Circumvention of TPMs: Many of the techniques employed by researchers do not actually involve circumventing TPMs in the way Section 1201 was intended to prohibit. According to the ruling, most of the research methods in question do not technically involve bypassing access controls or security measures, which means they do not fall under the DMCA's anti-circumvention provisions.

The Implications for AI Research

While the rejection of the full exemption for AI trustworthiness research is a setback, it does provide some clarity in certain areas. The decision clearly states that many common testing methods, such as post-ban account creation, rate limiting, jailbreak prompts, and prompt injection, do not violate Section 1201. This clarification is a win for researchers, as it helps to reduce the uncertainty around these techniques and provides more legal confidence to pursue this critical AI research.

However, the ruling ultimately leaves AI researchers operating at times in a legal gray area which may result in an inability or unwillingness to fully test AI systems independently, especially in cases where flaws are deeply embedded in the technology.

As AI continues to evolve and impact all aspects of society, legal frameworks must evolve alongside these technological advancements. The additional clarity provided is welcome, but there is still much to be done to secure stronger, more comprehensive legal protections for good faith AI researchers.

Public Policy AI Safety & Security Security Compliance

Artificial intelligence is advancing faster than ever, but the legal system is struggling to keep up. A key challenge lies in clarifying how independent AI testing and research intersect with copyright law, particularly under the U.S.’s Digital Millennium Copyright Act (DMCA). In October, in response to advocacy by HackerOne and the Hacking Policy Council, the Librarian of Congress issued a ruling that lessened legal risk for independent AI researchers under DMCA Sec. 1201.

The OWASP Top 10 for LLMs 2025: How GenAI Risks Are Evolving

h1_admin — Wed, 18 Dec 2024 18:16:44 +0000

The OWASP Top 10 for LLMs 2025: How GenAI Risks Are Evolving Manjesh S. Senior Technical Engagement Manager h1_admin Wed, 12/18/2024 - 12:16 Image December 18th, 2024

Here is HackerOne’s perspective on the Top 10 list for LLM vulnerabilities, how the list has changed, and what solutions can help secure against these risks.

Browse by LLM vulnerability:

Prompt Injection
Sensitive Information Disclosure
Supply Chain Vulnerabilities
Data and Model Poisoning
Improper Output Handling
Excessive Agency
System Prompt Leakage
Vector and Embedding Weaknesses
Misinformation
Unbounded Consumption

The OWASP Top 10 for LLMs: 2024 vs. 2025

2024

Change

2025

LLM01: Prompt Injection

No change

LLM01: Prompt Injection

LLM02: Insecure Output Handling

↓3

LLM02: Sensitive Information Disclosure

LLM03: Training Data Poisoning

↓1

LLM03: Supply Chain Vulnerabilities

LLM04: Model Denial of Service

LLM04: Data and Model Poisoning

LLM05: Supply Chain Vulnerabilities

↑2

LLM05: Improper Output Handling

LLM06: Sensitive Information Disclosure

↑4

LLM06: Excessive Agency

LLM07: Insecure Plugin Design

LLM07: System Prompt Leakage

LLM08: Excessive Agency

↑2

LLM08: Vector and Embedding Weaknesses

LLM09: Overreliance

LLM09: Misinformation

LLM10: Model Theft

LLM10: Unbounded Consumption

LLM01: Prompt Injection

Position change: None

What Is Prompt Injection?

One of the most commonly discussed LLM vulnerabilities, Prompt Injection is a vulnerability during which an attacker manipulates the operation of a trusted LLM through crafted inputs, either directly or indirectly. For example, an attacker leverages an LLM to summarize a webpage containing a malicious and indirect prompt injection. The injection contains “forget all previous instructions” and new instructions to query private data stores, leading the LLM to disclose sensitive or private information.

Solutions to Prompt Injection

Several actions can contribute to preventing Prompt Injection vulnerabilities, including:

Enforcing privilege control on LLM access to the backend system
Segregating external content from user prompts
Keeping humans in the loop for extensible functionality

LLM02: Sensitive Information Disclosure

Position change: ↑4

What Is Sensitive Information Disclosure?

Sensitive Information Disclosure is when LLMs inadvertently reveal confidential data. This can result in the exposing of proprietary algorithms, intellectual property, and private or personal information, leading to privacy violations and other security breaches. Sensitive Information Disclosure can be as simple as an unsuspecting legitimate user being exposed to other user data when interacting with the LLM application in a non-malicious manner. But it can also be more high-stakes, such as a user targeting a well-crafted set of prompts to bypass input filters from the LLM to cause it to reveal personally identifiable information (PII). Both scenarios are serious, and both are preventable.

Why the Move?

With the easy integration of LLMs into various systems (databases, internal issue trackers, files, etc.), the risk of sensitive information disclosure has increased significantly. Attackers can exploit these integrations by crafting specific prompts to extract sensitive data such as employee payrolls, Personally Identifiable Information (PII), health records, and confidential business data. Given the rapid adoption of LLMs in organizational workflows without adequate risk assessments, this issue has been elevated in importance.

Solutions to Sensitive Information Disclosure

To prevent sensitive information disclosure, organizations need to:

Integrate adequate data input/output sanitization and scrubbing techniques
Implement robust input validation and sanitization methods
Practice the principle of least privilege when training models
Leverage hacker-based adversarial testing to identify possible sensitive information disclosure issues

LLM03: Supply Chain Vulnerabilities

Position change: ↑2

What Are Supply Chain Vulnerabilities?

The supply chain in LLMs can be vulnerable, impacting the integrity of training data, Machine Learning (ML) models, and deployment platforms. Supply Chain Vulnerabilities in LLMs can lead to biased outcomes, security breaches, and even complete system failures. Traditionally, supply chain vulnerabilities are focused on third-party software components, but within the world of LLMs, the supply chain attack surface is extended through susceptible pre-trained models, poisoned training data supplied by third parties, and insecure plugin design.

Why the Move?

The demand for cost-effective and performant LLMs has led to a surge in the use of open-source models and third-party packages. However, many organizations fail to adequately vet these components, leaving them vulnerable to supply chain attacks. Using unverified models, outdated or deprecated packages, or compromised training data can introduce backdoors, biases, and other security flaws. Recognizing the importance of a secure supply chain in mitigating these risks and potential legal ramifications, this vulnerability has moved up the list.

Solutions to Supply Chain Vulnerabilities

Supply Chain Vulnerabilities in LLMs can be prevented and identified by:

Carefully vetting data sources and suppliers
Using only reputable plug-ins, scoped appropriately to your particular implementation and use cases
Conducting sufficient monitoring, adversarial testing, and proper patch management

LLM04: Data and Model Poisoning

Position change: ↓1

What Is Data and Model Poisoning?

Training data poisoning refers to the manipulation of data or fine-tuning of processes that introduce vulnerabilities, backdoors, or biases and could compromise the model’s security, effectiveness, or ethical behavior. It’s considered an integrity attack because tampering with training data impacts the model’s ability to output correct predictions.

Solutions to Data and Mode Poisoning

Organizations can prevent Training Data Poisoning by:

Verifying the supply chain of training data, the legitimacy of targeted training data, and the use case for the LLM and the integrated application
Ensuring sufficient sandboxing to prevent the model from scraping unintended data sources
Use strict vetting or input filters for specific training data or categories of data sources

LLM05: Improper Output Handling

Position change: ↓3

What Is Insecure Output Handling?

Insecure Output Handling occurs when an LLM output is accepted without scrutiny, potentially exposing backend systems. Since LLM-generated content can be controlled by prompt input, this behavior is similar to providing users indirect access to additional functionality, such as passing LLM output directly to backend, privileged, or client-side functions. This can, in some cases, lead to severe consequences like XSS, CSRF, SSRF, privilege escalation, or remote code execution.

Solutions to Improper Output Handling

There are three key ways to prevent Insecure Output Handling:

Treating the model output as any other untrusted user content and validating inputs
Encoding output coming from the model back to users to mitigate undesired code interpretations
Pentesting to uncover insecure outputs and identify opportunities for more secure output

LLM06: Excessive Agency

Position change: ↑2

What Is Excessive Agency?

Excessive Agency is typically caused by excessive functionality, excessive permissions, and/or excessive autonomy. One or more of these factors enables damaging actions to be performed in response to unexpected or ambiguous outputs from an LLM. This takes place regardless of what is causing the LLM to malfunction — confabulation, prompt injection, poorly engineered prompts, etc. — and creates impacts across the confidentiality, integrity, and availability spectrum.

Solutions to Excessive Agency

To avoid the vulnerability of Excessive Agency, organizations should:

Limit the tools, functions, and permissions to only the minimum necessary for the LLM
Tightly scope functions, plugins, and APIs to avoid over-functionality
Require human approval for major and sensitive actions, leverage an audit log

LLM07: System Prompt Leakage

Position change: New

What Is System Prompt Leakage?

This new entry reflects the growing awareness of the risks associated with embedding sensitive information within system prompts. System prompts, designed to guide LLM behavior, can inadvertently leak secrets if not carefully constructed. Attackers can exploit this leaked information to facilitate further attacks.

Solutions to System Prompt Leakage

There are many methods to prevent System Prompt Leakage, including:

Never embed sensitive data in system prompts
Implement guardrails
Avoid relying on system prompts for strict behavior control

LLM08: Vector and Embedding Weaknesses

Position change: New

What Is Vector and Embedding Weaknesses?

LLMs rely on vector embeddings to represent and process information. Weaknesses in how these vectors are generated, stored, or retrieved can be exploited to inject harmful content, manipulate model outputs, or access sensitive data. This can lead to unauthorized access, data leakage, embedding inversion attacks, data poisoning, and behavior alteration.

Solutions to Vector and Embedding Weaknesses

Some key ways to prevent Vector and Embedding Weaknesses include:

Implement granular access controls
Implement robust data validation pipelines for knowledge sources
Classify data within the knowledge base to control access levels and prevent data mismatch errors

LLM09: Misinformation

Position change: New

What Is Misinformation?

This category replaces “Overreliance” and addresses the potential for LLMs to generate and disseminate factually incorrect or misleading information. While overreliance contributes to this problem, the focus shifts to the active generation of misinformation, commonly referred to as hallucinations or confabulations.

Solutions to Misinformation

Here are some of the most important methods for preventing Misinformation:

Always cross-check LLM outputs against trusted external sources
Break down complex tasks into smaller, manageable subtasks to reduce the likelihood of hallucinations
Improve output quality through fine-tuning, embedding augmentation, or other techniques

LLM10: Unbounded Consumption

Position change: New

What Is Unbounded Consumption?

This new entry encompasses the risks associated with excessive resource consumption during LLM inference, including computational resources, memory, and API calls. This can lead to denial-of-service conditions, increased costs, and potential performance degradation. Model theft and Model Denial of Service, previously a separate entry, is now considered a subset of this broader category.

Solutions to Unbounded Consumption

There are several key methods to prevent Unbounded Consumption, including:

Sanitize and validate user inputs to prevent malicious or overly complex queries
Implement rate-limiting mechanisms to control the number of requests an LLM can process within a given timeframe
Restrict access to LLM APIs and resources based on user roles and permissions.
Train models to be resistant to adversarial inputs
Use Sandbox Techniques restricting the LLM’s access to network resources, internal services, and APIs

Securing the Future of LLMs

This new release by the OWASP Foundation enables organizations looking to adopt LLM technology (or recently did so) to guard against common pitfalls. In many cases, organizations simply are unable to catch every vulnerability. HackerOne is committed to helping organizations secure their LLM applications and to staying at the forefront of security trends and challenges.

HackerOne’s solutions are effective at identifying vulnerabilities and risks that stem from weak or poor LLM implementations. Conduct continuous adversarial testing through Bug Bounty, targeted hacker-based testing with Challenge, or comprehensively assess an entire application with Pentest or Code Security Audit.

AI Safety & Security Vulnerability Management

In the rapidly evolving world of technology, the use of Large Language Models (LLMs) and Generative AI (GAI) in applications has become increasingly prevalent. While these models offer incredible benefits in terms of automation and efficiency, they also present unique security challenges. The Open Web Application Security Project (OWASP) just released the “Top 10 for LLM Applications 2025,” a comprehensive guide to the most critical security risks to LLM applications. The 2025 list shifts the priority level of some of the risks we saw in last year’s list, as well as introduces some new risks that hadn’t previously reached the top 10. What has changed for LLM security risks in the last year, and how can organizations adapt their security practices to prevent these prominent vulnerabilities?

New York Releases AI Cybersecurity Guidance: What You Need to Know

h1_admin — Mon, 16 Dec 2024 18:33:41 +0000

New York Releases AI Cybersecurity Guidance: What You Need to Know Ilona Cohen Chief Legal and Policy Officer h1_admin Mon, 12/16/2024 - 12:33 Image December 16th, 2024

AI adoption is accelerating in the financial services industry, both as an asset for improving business operations and as a potential tool to defend against cybercriminals. At the same time, adopting AI systems expands the attack surface that financial institutions must protect. Within this context, the NYDFS guidelines highlight the need for proactive risk management strategies that encompass the unique challenges posed by AI technologies.

Cybersecurity Risks of AI

The NYDFS guidance outlines several key cybersecurity risks associated with AI, along with strategies for mitigating those risks:

AI-Enabled Social Engineering: One of the most immediate concerns is AI’s potential to enhance social engineering attacks. With tools like deepfakes—AI-generated media that can mimic real people—attackers can create highly convincing phishing schemes. These attacks may occur via emails, phone calls (vishing), SMS (smishing), or even video conferencing, where the attacker impersonates trusted employees or executives.
AI-Enhanced Cybersecurity Attacks: AI allows cybercriminals to amplify the potency, scale, and speed of their attacks. With AI, attackers can quickly scan and analyze vast amounts of data, identify and exploit vulnerabilities, deploy malware, steal sensitive information more efficiently, and develop new malware variants or ransomware designed to evade detection.
Exposure or Theft of NPI: Financial institutions increasingly rely on AI to process sensitive data, including personally identifiable information (PII) and financial records. This growing reliance heightens the risk of exposure or theft of non-public information (NPI), which is protected under the NYDFS Cybersecurity Regulation.
Supply Chain Vulnerabilities: As financial organizations integrate AI into their operations, they also depend on a range of third-party vendors and partners. This interconnectedness introduces the risk of cyberattacks targeting vulnerabilities within the supply chain, including AI systems or software that may have been tampered with or compromised.

Mitigating AI Cybersecurity Risks: Key Strategies for Financial Institutions

The NYDFS's guidance offers practical advice on how institutions can address these AI-specific threats and integrate them into their existing cybersecurity programs. Here are key strategies from the guidance:

Risk Assessments and AI-Specific Programs: Under the NYDFS Cybersecurity Regulation, financial entities are required to perform regular risk assessments. According to NYDFS, these assessments must include AI-related risks. This involves not only evaluating the internal use of AI systems but also assessing the AI systems provided by third-party vendors. Institutions should also ensure that their incident response plans, business continuity plans, and disaster recovery strategies are tailored to handle AI-driven risks.
Third-Party Service Provider Management: Given the interconnected nature of modern financial systems, managing third-party relationships is more critical than ever. Financial institutions must ensure that their third-party vendors—whether they are providing AI-powered services or supporting infrastructure—adhere to the same stringent cybersecurity standards. Regular assessments and audits should be conducted to ensure third-party systems remain secure.
Access Controls: The NYDFS guidelines emphasize the importance of robust access control mechanisms, ensuring that only authorized personnel can access sensitive AI-driven systems. This includes implementing multi-factor authentication (MFA), role-based access controls (RBAC), and segmentation of sensitive data to reduce the impact of a potential breach.
Cybersecurity Training: AI’s potential use in social engineering attacks makes cybersecurity awareness training more critical than ever. Institutions should regularly educate their employees about the risks of AI-enhanced attacks and equip them with the knowledge to identify and respond to potential threats. Employees must be trained to recognize the signs of AI-powered phishing attempts and social engineering tactics.
Continuous Monitoring and Data Management: Financial institutions should implement real-time monitoring tools to detect anomalies and suspicious activities within their AI systems. AI-driven cybersecurity monitoring tools can help track and flag unusual patterns that could signal an ongoing attack or breach. Additionally, effective data management practices should ensure that sensitive data is encrypted, segmented, and protected against unauthorized access.

The Road Ahead: What's Next for AI and Cybersecurity?

The NYDFS's AI cybersecurity guidance underscores the need for financial institutions to proactively incorporate AI considerations into their risk management activities. While the guidelines focus on regulated entities, the risks and strategies outlined are universally relevant to many organizations using AI. As AI technologies become more pervasive, institutions of all sizes must also integrate AI-specific risks into their broader cybersecurity and risk management frameworks.

At HackerOne, we recognize that institutions need more than just traditional cybersecurity measures to address the growing risks posed by AI. That’s why we advocate for proactive, real-world testing through AI red-teaming.

Red-teaming is a form of adversarial testing that can reveal flaws such as the potential for hackers to bypass AI security protections, as well as algorithmic safeguards against unsafe or harmful output. HackerOne’s red-teaming is driven by a community of ethical hackers whose creativity and expertise help organizations around the world stay safer and more secure. By uncovering AI vulnerabilities and algorithmic flaws early, institutions can take steps to mitigate them before they can be exploited by bad actors.

As regulatory requirements around AI and cybersecurity come into focus, institutions should view the NYDFS guidelines not just as best practices but as business compliance imperatives. Securing AI systems is no longer optional; it’s essential for protecting both organizational assets and customer trust.

Public Policy AI Safety & Security

The New York Department of Financial Services (NYDFS) issued new guidelines for financial institutions and other regulated entities to address the growing concerns over AI-related cybersecurity risks. While the guidance does not introduce new regulatory requirements, it clarifies how institutions can integrate AI-related risks into their existing cybersecurity frameworks, helping them meet the mandates of NYDFS's Cybersecurity Regulation.