Character is an interesting metaphor for constructing an AI model, as it frames what technologists do as "conditioning." I'm not convinced we should be using these anthropomorphic terms to make sense of what's going on, though I get that the usual software terms like "programming" are just as problematic. It seems to me we need new metaphors to capture the newness of machines that talk to us but do not think as humans and animals do.
It's a difficult line to tread, right? Especially because these systems are so far removed from how we normally program software. Because they are trained on so much human data, they act as mirrors of our own cognitive biases and behaviors. I agree we shouldn't anthropomorphize when we are trying to understand the inner workings of these models, but I do believe it is useful when thinking about how we make these products more helpful and safe.
That is an important distinction! I still think it blinds us to some of the ways we might use them and to some of the dangers, but just because I want better metaphors doesn't mean they will come.
Jurgen, are any of the bots you gave a character online to try, as in publicly available?
Hard to take full credit for any chatbot personality, because we always create them in collaboration with local teams, and in all fairness they are often on the safe side (since I work a lot with enterprise customers). An example would be Whiskas' Ask the Cats: https://www.whiskas.co.uk/ - or Tobi, from Vodafone: https://www.vodafone.co.uk/contact-us/
Another would be: https://web.geniusvoice.ai/schooltv/willem-van-oranje - we designed this as part of an interactive history lesson for children. It's in Dutch, though.
Interesting point.
I remember when I first started out, I used to play around a lot with giving models personalities through prompting. They respond particularly well if you just give them a description - they choose their own vocabulary and flexibly adjust their way of talking to fit the assigned role.
However, in practical terms, I've moved away from insisting on a particular personality or character when interacting with LLMs. These days, I mostly take them as they come out of the box. The only practical thing I do is assign them roles or perspectives.
That said, I'd certainly like to be able to either shape or pick a particular character to interact with as a coworker. It's an intriguing idea to consider the balance between human curation and automatic persona generation, as well as the potential implications for AI authenticity and user attachment.
Models also tend to drift back to their normal, base demeanor; I keep running into this. When recently playing around with ChatGPT's advanced voice mode, I asked it to speak in a particular vernacular. It did so, but kept “forgetting” it over time, and I had to remind it and restate my request.
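For concreteness, here is a minimal sketch of the prompting approach described above: the persona is assigned through a system message, and the model adapts its vocabulary and tone to the description. It uses the OpenAI Python client; the model name, the persona text, and the user question are placeholders for illustration, not taken from any of the bots mentioned in this thread.

```python
# Minimal sketch: assigning a persona via a system prompt.
# Assumes the OpenAI Python client (openai >= 1.0) and that OPENAI_API_KEY
# is set in the environment. Model name and persona text are placeholders.
from openai import OpenAI

client = OpenAI()

persona = (
    "You are Tobi, a cheerful telecom support assistant. "
    "You speak in short, friendly sentences and avoid jargon."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": persona},
        {"role": "user", "content": "My data bundle ran out early this month."},
    ],
)
print(response.choices[0].message.content)
```

Because the persona lives only in the prompt and the conversation context, long exchanges can dilute it, which is consistent with the "forgetting" described above; restating the instruction (or re-sending the system message) is the usual workaround.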
Really interesting post, Jurgen.
I think we'll see a lot of experimenting as we enter the AI agent era where these personas will be carefully crafted. From a product perspective, I can see companies creating a set of personas to choose from, much like how they have voice options today. It will be interesting to see how they balance human curation vs automatic persona generation though.
This balance could significantly impact the authenticity and diversity of AI personalities. Plus, as AI assistants become more integrated into our daily lives, the ability to customize or even co-create an AI's persona with platform tooling could become a key differentiator in the market.
I'm particularly curious about the ethical implications of crafting AI personas that users might form strong attachments to. Could this lead to a new field of "AI personality design" that combines elements of psychology, user experience, and ethical AI development?
The company I work for helps enterprises build better chatbots. Among our delivery services are persona workshops, in which we collaborate with the client to create a character for their chatbot or voice assistant that resonates with their customers and fits their brand. So, "AI personality design" already exists as a craft.
You're absolutely right that as more products offer AI assistants as part of their core user experience, the ability to customize the persona could become a key differentiator. It will indeed be interesting to see how much customization is enough for customers - for most people a few presets will be enough, I guess. Less is more.
It's intriguing to hear about your company's persona workshops for chatbots. My experience with customer-facing AI has mostly been limited to short interactions, so I haven't fully appreciated their personalities yet. As companies expand AI's role in customer interactions, I expect I'll notice more distinct personalities emerge.
It'll be interesting to see how platforms approach customization options. Perhaps they'll offer tiered subscriptions with varying levels of personality customization, from basic preset options to fully customizable AI personas for premium users. It's an exciting space to watch!
I'm sceptical. I find some of the "character-influenced" exchanges I've seen quite sinister, maybe even harmful. Whilst I can see how bias can be inherent within datasets, I see no transparency as to how character traits are imbued into an AI model. Nor can I (so far) accept that human-like "personalities" are spontaneous attributes, somehow derived from within the "black box" as a consequence of training, or that AI can demonstrate an opinion or a moral compass unless there is some kind of human intervention at play to make that happen. So my question, Jurgen, is: how exactly are these traits generated? Do they appear magically as a consequence of training, or is there some kind of additional intervention or overlay being applied so that factual interactions are couched within a particular "character"? If an AI is prompted to be "charitable" (as I'm hoping you will be in answering my questions...), what mechanism is at play to make that AI "charitable" in its approach, and do we understand the "neural" processes by which that happens? In short, is there explainability? If not, if we do not have full understanding or control, who is to say that the AI might not spontaneously and unexpectedly skew future interactions to make them potentially disruptive or harmful?
These are great questions, Paul. The truth of the matter is that this character training is both deep and shallow.
Research shows that LLMs can be trained to display a believable simulacrum of human behavior (https://arxiv.org/abs/2304.03442). To my understanding, Anthropic, like other model providers, instills certain character traits into the model through fine-tuning. This requires ingesting tons of “good” examples of responses that display the character traits you’re looking for. So, to teach the model to be charitable, Claude would have to be exposed to many examples of charitable responses.
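To make that concrete, here is a minimal, hypothetical sketch of what fine-tuning on curated trait-displaying responses can look like, using Hugging Face transformers with a small open model. The trait, the example data, and the training setup are illustrative assumptions only; this is not Anthropic's actual pipeline.

```python
# Hypothetical sketch: instilling a trait ("charitable") via supervised
# fine-tuning on curated example responses. The tiny model and dataset are
# for illustration only; real character training is far larger in scale.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Curated examples: prompts paired with responses that display the trait.
examples = [
    ("My coworker ignored my email.",
     "They may simply be overloaded; a gentle follow-up usually helps."),
    ("This open-source library is badly documented.",
     "The maintainers are likely volunteers; consider contributing a fix."),
]

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilgpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

model.train()
for prompt, response in examples:
    # Standard causal-LM objective: learn to continue the prompt with the
    # trait-displaying response, nudging the weights toward that behavior.
    batch = tokenizer(prompt + "\n" + response, return_tensors="pt")
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

Real character training presumably operates at a much larger scale and layers feedback-based methods on top, but the mechanic described above - showing the model many examples of the desired behavior - is what this loop sketches.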
This method isn’t perfect. Claude, like other models, can be coaxed and jailbroken into giving responses that go against its rules, but generally it adheres to them pretty well.
I think it is safer but I don't know how safe it actually is. Jurgen, if you want to ping me sometime this week, I can show you some examples of my own work where it very clearly breaks and behaves in incredibly concerning ways.
Sure, I'll send you a DM later! To be clear, I don't make any claims in the article as to how effective this approach really is (nor does Anthropic, as far as I can tell), but I do think their approach is original and more holistic than just viewing AI alignment as a set of guardrails and mitigation strategies.
I am coming to the understanding that you can connect far more deeply with a broken, imperfect AI personality than with one "designed" on purpose. "Real" personality is stronger.
Agreed!
I think much of what made Inflection's Pi stand out (even when other frontier LLMs were better on all benchmarks) was its friendly and supportive tone.
In the dynamic, evolving tapestry of deep-delving chatbots (ChatGPT-speak intended), an LLM's personality can easily become a differentiating factor.