Training Data Poisoning Lab
Solution overview
Training Data Poisoning is a critical concern and refers to attackers deliberately corrupting the training data that LLMs learn from or utilize in the instances of retrieval augmented generation (RAG). This manipulation aims to introduce vulnerabilities, backdoors and/or biases to the models compromising not only the security but also the effectiveness of a model.
This lab will showcase an online forum that has decided to get ahead of the competition and implement a state-of-the-art AI that will function like a RAG using recipes on the website to provide users with a better experience. However, a disgruntled recipe author doesn't like this idea and decides to conduct some training data poisoning. The lab will walk through poisoning the RAG training data and take a look at potential security techniques to eliminate the risk of training data poisoning.