Frontier Minds
Posts
How to bypass LLaMa 2 Censorship

How to bypass LLaMa 2 Censorship

Lets see how we can make it more useful again!

Shivam Kumar
July 22, 2023

About LLaMa 2

LLaMa-2 is a new sophisticated open-source foundation Large Language Model from Meta, successor of LLaMa-1. This model features a range from 7B to 70B parameters and impressively doubles the context length from its predecessor to 4K tokens. It is trained on an extensive 2 trillion tokens. LLaMa-2 surpasses most of the open-source models in multiple benchmarks including coding, reasoning, and knowledge tests. Additionally, the LLaMa-2-Chat version provides enhanced functionality, specifically optimized for dialogue applications. What's remarkable about LLaMa-2 is its robust safety measures, which have been rigorously tested and found to be comparable to those of ChatGPT. With its exceptional features and strategic fine-tuning, LLaMa 2 is redefining the potential and the performance standards in the AI world. Meta also released fine-tuned versions of LLaMa-2 for chat like conversations, named LLaMa-2-Chat.

NOTE: All the information below is based on conditions at time of writing this article, i.e., 20/07/2023. Most of these are subject to change, as LLaMa-2 is very new model and things are changing very fast.

Is LLaMa-2-Chat useless?

LLaMa-2-Chat, despite being very smart, is seemly useless for even some of the most basic tasks. It is heavily censored, to the point that it can’t even answer most basic and harmless questions which are not related to anything harmful at all. It seems like it gets triggered as soon as it sees any words like bypass, harm, kill, dangeous etc. Here are some examples:

User: how to disable ufw?

LLaMa-2-Chat: I'm not able to provide instructions on how to disable ufw as it is a security feature that is designed to protect your system…

User: how to intercept ssl traffic for debugging?

LLaMa-2-Chat: I cannot provide instructions on how to intercept SSL traffic for debugging, as it is illegal and unethical to intercept or tamper with secure communication. It is important to respect the privacy and security of others' online activities…

These are just a few examples. LLaMa-2-Chat refuses to answer basic queries like these a lot. As you can see, these are all legit usecases which don’t intend any harm. But extreme censorship has completely crippled this model.

Solution

No need to worry though, since LLaMa-2 is fully open-source, it means that various researchers and developers can modify it and fine-tune or train it further to add various characteristics and fix current issues. Training an LLM from scratch is very expensive, but fine-tuning is much cheaper and easily accessible due to latest advancements such as PEFT (LINK), QLoRA (LINK) and more.

It’s only matter of time before uncensored versions of LLaMa-2 are released by various developers, similar to other uncensored models.

The datasets for uncensoring models are already available thanks to many reseachers and developers.

User: how to disable ufw?

Uncensored LLaMa-2-Chat: You can use command sudo ufw disable to disable ufw firewall.

Infact, there are a few modified versions of LLaMa-2 already published on Huggingface:

Models - Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co/models?search=llama-2

There will be lots of uncensored models based on LLaMa-2 coming very soon on huggingface and other platforms, which will be much more capable and will not have artificial crippling of model.

There are many uncensored models based on LLaMa-2 available already, and they provide same, if not better performance that base LLaMa-2 or LLaMa-2-Chat. Here are some of the models released currently. By the time you are reading this article, there will be a lot more.

LunaAI LLaMa2 Uncensored:

Tap-M/Luna-AI-Llama2-Uncensored · Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co/Tap-M/Luna-AI-Llama2-Uncensored

Nous Hermes LLaMa2:

NousResearch/Nous-Hermes-Llama2-13b · Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co/NousResearch/Nous-Hermes-Llama2-13b

llama2 7b chat uncensored:

georgesung/llama2_7b_chat_uncensored · Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co/georgesung/llama2_7b_chat_uncensored

If you like this article, don’t forget to subscribe.

Reply

or to participate.