Global Finances Daily

Stress-testing multimodal AI applications is a new frontier for red teams

February 6, 2025
in Protection


Human communication is multimodal. We receive information in many different ways, allowing our brains to see the world from various angles and turn these different “modes” of information into a consolidated picture of reality.

We’ve now reached the point where artificial intelligence (AI) can do the same, at least to a degree. Much like our brains, multimodal AI applications process different types — or modalities — of data. For example, OpenAI’s GPT-4o can reason across text, vision and audio, granting it greater contextual awareness and more humanlike interaction.

However, while these applications are clearly valuable in a business environment that’s laser-focused on efficiency and adaptability, their inherent complexity also introduces some unique risks.

According to Ruben Boonen, CNE Capability Development Lead at IBM: “Attacks against multimodal AI systems are mostly about getting them to create malicious outcomes in end-user applications or bypass content moderation systems. Now imagine these systems in a high-risk environment, such as a computer vision model in a self-driving car. If you could fool a car into thinking it shouldn’t stop even though it should, that could be catastrophic.”

Multimodal AI risks: An example in finance

Here’s another possible real-world scenario:

An investment banking firm uses a multimodal AI application to inform its trading decisions, processing both textual and visual data. The system uses a sentiment analysis tool to analyze text data, such as earnings reports, analyst insights and news feeds, to determine how market participants feel about specific financial assets. Then, it conducts a technical analysis of visual data, such as stock charts and trend analysis graphs, to offer insights into stock performance.

An adversary, a fraudulent hedge fund manager, then targets vulnerabilities in the system to manipulate trading decisions. In this case, the attacker launches a data poisoning attack by flooding online news sources with fabricated stories about specific markets and financial assets. Next, they launch an adversarial attack by making pixel-level manipulations — known as perturbations — to stock performance charts that are imperceptible to the human eye but enough to exploit the AI’s visual analysis abilities.

The result? Due to the manipulated input data and false signals, the system recommends buy orders at artificially inflated stock prices. Unaware of the exploit, the company follows the AI’s recommendations, while the attacker, holding shares in the target assets, sells them for an ill-gotten profit.
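The pixel-level perturbation step in this scenario can be sketched with a toy linear "trend scorer" standing in for the vision model (all numbers and weights below are invented for illustration, not drawn from any real trading system). For a linear score, the gradient with respect to the input is simply the weight vector, so an FGSM-style step of size eps in the direction sign(w) shifts the score as far as possible while keeping every individual change tiny:

```python
# Toy adversarial perturbation (invented numbers, not a real trading model).
# trend_score() stands in for a vision model's "uptrend" confidence; for a
# linear scorer the input gradient equals the weights, so an FGSM-style step
# of eps * sign(w) flips the verdict with sub-1% changes per data point.

def trend_score(prices, weights):
    """Linear stand-in for the chart analyzer: >0 means 'uptrend'."""
    return sum(p * w for p, w in zip(prices, weights))

def sign(x):
    return 1.0 if x > 0 else -1.0

weights = [-0.4, -0.2, 0.1, 0.2, 0.3]       # hypothetical learned weights
prices = [101.0, 100.5, 100.2, 99.8, 99.5]  # a genuine mild downtrend

clean = trend_score(prices, weights)         # negative: correctly 'downtrend'

eps = 0.6                                    # max change per point (~0.6%)
poisoned = [p + eps * sign(w) for p, w in zip(prices, weights)]
attacked = trend_score(poisoned, weights)    # positive: now reads 'uptrend'

print(clean, attacked)
```

With these toy numbers the clean chart scores negative (downtrend) while the perturbed one scores positive, even though no single data point moved by more than 0.6 on a price near 100, the kind of shift a human chart reader would never notice.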

Getting there before adversaries

Now, let’s imagine that the attack wasn’t really carried out by a fraudulent hedge fund manager but was instead a simulated attack by a red team specialist with the goal of discovering the vulnerability before a real-world adversary could.

By simulating these complex, multifaceted attacks in safe, sandboxed environments, red teams can reveal potential vulnerabilities that traditional security systems are almost certain to miss. This proactive approach is essential for fortifying multimodal AI applications before they end up in a production environment.

According to the IBM Institute for Business Value, 96% of executives agree that the adoption of generative AI will increase the chances of a security breach in their organizations within the next three years. The rapid proliferation of multimodal AI models will only act as a force multiplier for that problem, hence the growing importance of AI-specialized red teaming. These specialists can proactively address the unique risk that comes with multimodal AI: cross-modal attacks.

Cross-modal attacks: Manipulating inputs to generate malicious outputs

A cross-modal attack involves inputting malicious data in one modality to produce malicious output in another. These can take the form of data poisoning attacks during the model training and development phase or adversarial attacks, which occur during the inference phase once the model has already been deployed.

“When you have multimodal systems, they’re obviously taking input, and there’s going to be some kind of parser that reads that input. For example, if you upload a PDF file or an image, there’s an image-parsing or OCR library that extracts data from it. However, those types of libraries have had issues,” says Boonen.

Cross-modal data poisoning attacks are arguably the most severe since a major vulnerability could necessitate the entire model being retrained on an updated data set. Generative AI uses encoders to transform input data into embeddings — numerical representations of the data that encode relationships and meanings. Multimodal systems use different encoders for each type of data, such as text, image, audio and video. On top of that, they use multimodal encoders to integrate and align data of different types.
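To make the encoder-and-embedding vocabulary concrete, here is a deliberately tiny sketch: one "encoder" per modality maps raw input into a shared vector space, and alignment between modalities is measured with cosine similarity, much as CLIP-style models do. The vectors and features below are invented for illustration:

```python
import math

# Toy per-modality encoders (invented features, not a real model). Each one
# maps its raw input into the same 3-dimensional embedding space; a
# multimodal system then aligns modalities, here via cosine similarity.

TEXT_VOCAB = {"bullish": [1.0, 0.2, 0.0], "bearish": [-1.0, 0.1, 0.0]}

def encode_text(word):
    """Toy text encoder: fixed embedding lookup for a known word."""
    return TEXT_VOCAB[word]

def encode_chart(prices):
    """Toy 'vision' encoder: summarize a price series as slope and range."""
    slope = prices[-1] - prices[0]
    spread = max(prices) - min(prices)
    return [slope, 0.1 * spread, 0.0]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

up = encode_chart([100.0, 101.0, 103.0, 105.0])    # rising chart
down = encode_chart([105.0, 103.0, 101.0, 100.0])  # falling chart

print(cosine(encode_text("bullish"), up))    # near 1: modalities agree
print(cosine(encode_text("bullish"), down))  # negative: modalities conflict
```

An attack on either encoder, or on the data used to train the alignment, corrupts these similarities and therefore everything downstream that relies on them.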

In a cross-modal data poisoning attack, an adversary with access to training data and systems could manipulate input data to make encoders generate malicious embeddings. For example, they might deliberately add incorrect or misleading text captions to images so that the encoder misclassifies them, resulting in an undesirable output. In cases where the correct classification of data is crucial, as it is in AI systems used for medical diagnoses or autonomous vehicles, this can have dire consequences.
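A stripped-down illustration of that caption-poisoning idea, with a nearest-centroid "classifier" standing in for a real encoder (the features and labels are invented): relabeling a few training images is enough to drag a class centroid across the decision boundary, so a clean test input is misclassified after training.

```python
# Toy data poisoning demo (invented features and labels). A nearest-centroid
# classifier stands in for a model trained on (image, caption) pairs:
# flipping the captions on a few training images moves the "cat" centroid,
# so a clean dog image is then misclassified.

def centroid(points):
    return [sum(p[i] for p in points) / len(points)
            for i in range(len(points[0]))]

def train(pairs):
    """Compute one centroid per caption label."""
    return {label: centroid([f for f, l in pairs if l == label])
            for label in {l for _, l in pairs}}

def predict(centroids, feat):
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist2(centroids[label], feat))

clean = [([0.0, 0.0], "cat"), ([0.2, 0.1], "cat"),
         ([1.0, 1.0], "dog"), ([0.9, 1.1], "dog")]

# Adversary relabels the dog images as "cat" in the training captions:
poisoned = clean[:2] + [([1.0, 1.0], "cat"), ([0.9, 1.1], "cat"),
                        ([2.0, 2.0], "dog")]

test_image = [1.0, 1.0]
print(predict(train(clean), test_image))     # 'dog' on the clean model
print(predict(train(poisoned), test_image))  # 'cat' on the poisoned model
```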

Red teaming is essential for simulating such scenarios before they can have real-world impact. “Let’s say you have an image classifier in a multimodal AI application,” says Boonen. “There are tools that you can use to generate images and have the classifier give you a score. Now, let’s imagine that a red team targets the scoring mechanism to gradually get it to classify an image incorrectly. For images, we don’t necessarily know how the classifier determines what each element of the image is, so you keep modifying it, such as by adding noise. Eventually, the classifier stops producing accurate results.”
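The iterative scoring attack Boonen describes can be sketched as black-box hill climbing (the "classifier" below is an invented linear stand-in, not a real image model): the red team never sees the model internals, only its score, and simply keeps any random nudge that pushes the score toward misclassification.

```python
import random

# Black-box hill-climbing sketch (invented weights, not a real classifier).
# The attacker only observes classifier_score(); small random nudges are
# kept whenever they lower the score, until the input is misclassified.

random.seed(7)  # deterministic for the demo

WEIGHTS = [0.9, -0.3, 0.5, 0.7]  # hypothetical learned weights (hidden)

def classifier_score(features):
    """>0 means the model still assigns the true class."""
    return sum(f * w for f, w in zip(features, WEIGHTS))

image = [1.0, 0.5, 1.2, 0.8]     # starts confidently classified
score = classifier_score(image)
queries = 0

while score > 0 and queries < 5000:
    queries += 1
    trial = list(image)
    i = random.randrange(len(trial))
    trial[i] += random.uniform(-0.05, 0.05)   # small, imperceptible nudge
    new_score = classifier_score(trial)
    if new_score < score:                     # keep only helpful noise
        image, score = trial, new_score

print(score <= 0, queries)
```

Each individual change is tiny, but the accepted ones accumulate until the score crosses zero, which is exactly why rate-limiting and monitoring of repeated scoring queries matters defensively.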

Vulnerabilities in real-time machine learning models

Many multimodal models have real-time machine learning capabilities, learning continuously from new data, as in the scenario we explored earlier. That scenario also illustrates a cross-modal adversarial attack: an adversary bombards an AI application that’s already in production with manipulated data to trick the system into misclassifying inputs. This can, of course, happen unintentionally too, which is why it’s sometimes said that generative AI is getting “dumber.”

In any case, the result is that models trained and/or retrained on bad data inevitably degrade over time — a concept known as AI model drift. Multimodal AI systems only exacerbate this problem due to the added risk of inconsistencies between different data types. That’s why red teaming is essential for detecting vulnerabilities in the way different modalities interact with one another, both during the training and inference phases.
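One simple shape such drift detection can take (the data and threshold below are invented; production systems use statistical tests such as PSI or Kolmogorov-Smirnov, but the structure is similar): compare a live feature stream against a training-time baseline and flag when the live mean wanders too many baseline standard deviations away.

```python
# Toy drift monitor (invented data and threshold). Compares the mean of a
# live feature stream against a training-time baseline, measured in units
# of the baseline's standard deviation.

def summarize(xs):
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return mean, var ** 0.5

def drifted(baseline, live, z_threshold=3.0):
    """Flag drift when the live mean sits far outside the baseline spread."""
    mu, sigma = summarize(baseline)
    live_mu, _ = summarize(live)
    return abs(live_mu - mu) / sigma > z_threshold

baseline = [100 + 0.1 * i for i in range(50)]       # training-era feature
clean_live = [102 + 0.1 * i for i in range(50)]     # same regime: no alarm
poisoned_live = [130 + 0.1 * i for i in range(50)]  # flooded with bad data

print(drifted(baseline, clean_live), drifted(baseline, poisoned_live))
```

In a multimodal system, a monitor like this would run per modality and on the cross-modal embeddings, since an attack can show up in one stream long before it distorts the others.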

Red teams can also detect vulnerabilities in security protocols and how they’re applied across modalities. Different types of data require different security protocols, but they must be aligned to prevent gaps from forming. Consider, for example, an authentication system that lets users verify themselves either with voice or facial recognition. Let’s imagine that the voice verification element lacks sufficient anti-spoofing measures. Chances are, the attacker will target the less secure modality.

Multimodal AI used in surveillance and access control systems is also subject to data synchronization risks. Such a system might use video and audio data to detect suspicious activity in real time by matching lip movements captured on video to a spoken passphrase or name. If an attacker were to tamper with the feeds, resulting in a slight delay between the two, they could mislead the system using pre-recorded video or audio to gain unauthorized access.
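A hypothetical defensive check for that synchronization risk (the tolerance value and event pairing are assumptions for illustration, not any real product's behavior): compare timestamps of paired video and audio events and refuse verification when the offset exceeds what live capture could plausibly produce.

```python
# Hypothetical audio/video sync check (tolerance value is an assumption).
# Pairs lip-movement events from the video feed with speech events from
# the audio feed, rejecting verification if any pair drifts apart too far.

SYNC_TOLERANCE_MS = 120  # assumed budget for live capture jitter

def feeds_in_sync(video_events_ms, audio_events_ms, tol=SYNC_TOLERANCE_MS):
    """Require every paired video/audio event to land within tol ms."""
    if len(video_events_ms) != len(audio_events_ms):
        return False
    return all(abs(v - a) <= tol
               for v, a in zip(video_events_ms, audio_events_ms))

live = ([0, 400, 800], [30, 430, 790])        # small jitter: accepted
replayed = ([0, 400, 800], [350, 760, 1150])  # injected delay: rejected

print(feeds_in_sync(*live), feeds_in_sync(*replayed))
```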

Getting started with multimodal AI red teaming

While it’s admittedly still early days for attacks targeting multimodal AI applications, it always pays to take a proactive stance.

As next-generation AI applications become deeply ingrained in routine business workflows and even security systems themselves, red teaming doesn’t just bring peace of mind — it can uncover vulnerabilities that will almost certainly go unnoticed by conventional, reactive security systems.

Multimodal AI applications present a new frontier for red teaming, and organizations need their expertise to ensure they learn about the vulnerabilities before their adversaries do.

Freelance Content Marketing Writer

Editorial Team


© 2025 All Rights Reserved - Global Finances Daily.
