14 Comments

You should also consider the often-overlooked phenomenon (imported from search methodology) of "prompt pollution"...

https://aicounsel.substack.com/p/dirty-little-secret-of-llms-prompt?r=ufqe3

author

I’m not familiar with this, but definitely will take a look at it!


It's in line with your points. Basically, the LLMs "compare prompts", so what you ask is polluted by what others have asked. It's the same phenomenon as in search, imported into AI, and I think it's a mistake.

author

Do you have additional sources I can read that go deeper into this topic with regard to LLMs? I was under the impression that providers run inference on every single user prompt.


I'm not technical when it comes to code. Disclosures on this topic from LLM companies are limited, for a variety of competitive and functional reasons.

"Softmax" is one term for the function of weighting prompts/replies. "Temperature" (not heat but technical term) is another. "Tokenization" another factor.

We see the "auto-complete" in Search, which is an ai predictive function. It is already reading into your question information based on its internal data sets, including most especially previous "similar" searches.

One can see, for instance, that an LLM will often give a different answer to the same prompt. It will sometimes 'assume' information not included in your own prompt, or give answers that seem out of context with what is asked, yet familiar.

More research and disclosure is needed in this area.


Glad to see this phenomenon has a name and is getting some discussion

author

Great piece, Bruce!


Can we add something to the prompt that changes this behavior and makes it less agreeable? I asked ChatGPT the question above, and here is its reply (I haven't tried it to check whether it works):

“Please respond to the following query, but ensure you critically evaluate the information and provide evidence-based answers. Do not simply agree with my opinion. If my statement is incorrect, politely correct it with accurate information.

Query: [Insert your question or statement here]”
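For the curious, here is a rough sketch of how such an instruction could be applied as a system message via the OpenAI Python client. The model name and the effectiveness of the wording are assumptions, not tested claims.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM_PROMPT = (
    "Critically evaluate the user's statements and provide evidence-based "
    "answers. Do not simply agree with the user's opinion. If a statement "
    "is incorrect, politely correct it with accurate information."
)

def ask_critically(query: str, model: str = "gpt-4o") -> str:
    """Send the query with the anti-sycophancy instruction as a system message."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": query},
        ],
    )
    return response.choices[0].message.content

print(ask_critically("The Great Wall of China is visible from space, right?"))
```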

For now, the model providers are changing the default AI behavior. Does that tell us something about AI's ability to handle real-life situations? Providers change something whenever a response is unacceptable, but with so many edge cases and situations, how many things can they realistically build guardrails around?

author

Prompting would definitely help get more balanced answers from the assistant, if balance is what you’re looking for.

What most people aren’t aware of, however, is that if you ask it a leading question, the assistant may or may not challenge you, simply because of the sycophantic behavior that has been encoded into it.

This may result in inaccurate answers, despite the model having the correct information available had it been asked in a different way. And that’s not good.


Would this remain a problem as long as the UI stays a text-based prompt? In my experience, knowing what is possible with a text-based prompt isn't easy, as the only guardrail is prompt length. Could adding input attributes for response tone, language (casual, business, etc.), balanced answers, and so on help? It would reduce the complexity of the prompt and let people select the things they care about.
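As an illustration of that idea, here is a purely hypothetical sketch in which structured UI selections are rendered into an instruction block, so the user doesn't have to spell them out in every prompt. All option names are made up.

```python
# Hypothetical structured options a UI could expose instead of free-text instructions.
OPTIONS = {
    "tone": "neutral",        # e.g. neutral, friendly, stern
    "register": "business",   # e.g. casual, business, academic
    "balanced_answer": True,  # ask for counterarguments, not agreement
}

def build_system_prompt(options: dict) -> str:
    """Translate UI selections into an instruction block prepended to the conversation."""
    lines = [
        f"Respond in a {options['tone']} tone.",
        f"Use a {options['register']} register.",
    ]
    if options.get("balanced_answer"):
        lines.append(
            "Present the strongest arguments on both sides and do not "
            "simply agree with the user."
        )
    return " ".join(lines)

print(build_system_prompt(OPTIONS))
```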

The other challenges are:

A desire to answer the question even when the question isn't very clear, rather than asking clarifying questions before answering.

Also, it does not provide a confidence level for the accuracy of the answer, which would tell us whether we can rely on it (a rough sketch of one possible proxy follows after this list).

Then everyone’s favorite, hallucination.

Let’s see how far we go with the scaling approach.
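On the confidence point: there is no built-in accuracy score, but token log-probabilities (exposed via the `logprobs` parameter of the OpenAI Chat Completions API) are sometimes used as a rough proxy. This is a hedged sketch under that assumption; high token probability is not the same as factual accuracy, and a model can still be confidently wrong.

```python
import math
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def answer_with_rough_confidence(query: str, model: str = "gpt-4o"):
    """Return the answer plus average token probability as a crude confidence proxy.

    Note: high token probability does NOT guarantee factual accuracy; a model
    can be confidently wrong (hallucination), which is exactly the concern above.
    """
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": query}],
        logprobs=True,
    )
    choice = response.choices[0]
    token_probs = [math.exp(t.logprob) for t in choice.logprobs.content]
    avg_prob = sum(token_probs) / len(token_probs)
    return choice.message.content, avg_prob

answer, confidence = answer_with_rough_confidence("Who wrote 'The Selfish Gene'?")
print(f"{answer}\n(rough token-level confidence: {confidence:.2f})")
```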

author

If you ask me, there’s a limit to what can be achieved through prompting, especially when some of this behavior is ingrained into the model at the pretraining and/or fine-tuning stage.

Even at large scales, I predict that alignment via RLHF will encode unintended behaviors and biases into the model as a result of the process: relying on humans judging responses and rewarding the model for “better” answers.


Makes a lot of sense. And with chatbots like Pi, the friendliness / agreeableness is intentionally built in. Although in my recent experience, I find that Gemini 1.5 Pro, while generally friendly, doesn't shy away from pushback, sometimes in pretty stern terms. I wonder if that was a result of a deliberate RLHF run.

author

Yes, hopefully this kind of research will push model makers to do better. However, I don't think it's something we'll be able to weed out entirely without new innovations or alignment techniques.
