Training data poisoning
As you learned in previous chapters, generative AI can be grounded and trained to achieve results specific to you or your organization's objectives. But what happens when an LLM is trained toward objectives that are not aligned with your needs, producing completions that are misleading, false, factually incorrect, irrelevant, or insecure? The output is only as good as the input, which means the output is only as good as the data the LLM was trained on.
Training data poisoning occurs when the training data itself contains incorrect, harmful, or biased information. In this way, the training data have been "poisoned" and, thus, produce bad results.
Important note
Some platforms provide crowd-sourced LLMs/models and datasets, and many of them allow any user to upload their own. To safeguard your organization against training data poisoning, you should only use training data obtained from trusted, well-known, or highly rated sources. For example, the Hugging Face repositories use a rating system with feedback provided by the community. Moreover, Hugging Face provides an LLM "leaderboard" that identifies which LLMs are popular and widely used, and the Hugging Face "Hub" is home to a collection of community-curated and popular datasets. Hugging Face is also SOC 2 Type 2-certified, meaning it can provide security attestation to its users and actively monitors and patches security weaknesses. Of course, always confirm and verify the integrity of any community datasets you use to ensure that the training data have not been poisoned or tampered with.
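The following is a minimal sketch of two simple integrity checks you might run before training on a community dataset: pinning an exact repository revision and comparing a file's hash against a value recorded when you first reviewed the data. It assumes the `datasets` and `huggingface_hub` libraries are installed; the repository ID, file name, commit SHA, and expected digest are all placeholders you would replace with your own vetted values.

```python
import hashlib

from datasets import load_dataset
from huggingface_hub import hf_hub_download

REPO_ID = "some-org/some-dataset"      # hypothetical dataset repository
REVISION = "abc123def4567890"          # hypothetical commit SHA you have vetted

# Pin the dataset to a specific commit so a later (possibly poisoned) update
# cannot silently change what you train on.
dataset = load_dataset(REPO_ID, revision=REVISION)

# Download one of the raw files and compare its SHA-256 digest against the
# value you recorded when you originally reviewed the data.
file_path = hf_hub_download(
    repo_id=REPO_ID,
    filename="train.jsonl",            # hypothetical file name
    repo_type="dataset",
    revision=REVISION,
)

EXPECTED_SHA256 = "<digest recorded at review time>"  # placeholder
with open(file_path, "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

if digest != EXPECTED_SHA256:
    raise RuntimeError("Dataset file hash mismatch -- possible tampering.")
```

Pinning a revision and verifying a hash does not prove the data is clean, but it does ensure that what you train on is exactly what you reviewed.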
Insecure plugin (assistant) design
Plugins enhance the capabilities of LLMs by completing various steps or tasks, making the models more versatile. The names of plugins have changed a few times over their brief existence; depending on which vendor you are working with, they are sometimes known as connectors, tools, or, more recently, "assistants." We will use the word "plugins" to refer to the ways LLMs can be extended programmatically, as covered in earlier chapters.
As a refresher, the following list provides a few examples of how plugins can extend LLM capabilities and how this can open the door to malicious activity, posing another security threat and potential attack vector:
- Plugins can execute code. LLMs themselves only handle prompt/completion sequences; it is plugins that extend these capabilities by executing code. Say you want to update a data record in a database based on interactions with the LLM: a plugin can reference the database record and modify or even delete it, depending on how the plugin is written. Any such code execution should have guardrails and protection in place to ensure the plugin does what it is designed to do and nothing more, as the sketch following this list illustrates.
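To make the guardrail idea concrete, here is a minimal sketch, not any vendor's plugin API, of a database-update plugin that only exposes a narrow, pre-approved operation and validates every argument before touching the database. The table names, column names, and database file are hypothetical.

```python
import sqlite3

ALLOWED_TABLES = {"customers"}        # explicit allowlist -- nothing else is reachable
ALLOWED_COLUMNS = {"email", "phone"}  # only these fields may be modified


def update_record(table: str, column: str, record_id: int, new_value: str) -> None:
    """Update a single field on a single record, and nothing more."""
    # Reject anything outside the plugin's declared scope.
    if table not in ALLOWED_TABLES or column not in ALLOWED_COLUMNS:
        raise PermissionError("Operation not permitted by plugin policy.")
    if not isinstance(record_id, int):
        raise ValueError("record_id must be an integer.")

    conn = sqlite3.connect("example.db")  # hypothetical database
    try:
        # Parameterized query: the model-supplied value can never become SQL.
        # The table and column names are safe to interpolate because they were
        # checked against the allowlists above.
        conn.execute(
            f"UPDATE {table} SET {column} = ? WHERE id = ?",
            (new_value, record_id),
        )
        conn.commit()
    finally:
        conn.close()
```

Because the LLM's tool call is reduced to structured, validated arguments, deletes, schema changes, and arbitrary SQL are simply not expressible through this interface.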