Thought Leaders #1: Gregory Whiteside, CEO HumanFirst
1st edition of my interview series with experts in the industry
I’m proud to present Gregory Whiteside as the first guest of my interview series, Thought Leaders.
Gregory Whiteside is the CEO of HumanFirst, the leading data-centric tool for NLU/NLG Design. HumanFirst helps teams curate, manage, and continuously improve their training data. Their platform accelerates any application where building quality data is critical: this includes intent-based and generative chatbots, virtual assistants, CCAI, IVR, agent augmentation, analytics, or NLU-powered help centers.
In this interview, we discuss the need for sophisticated NLU tooling and the impact of large language models on the conversational AI industry. Enjoy!
Starting off, could you explain why it is so hard for computers to understand human language?
At this point in time, I’d argue that computers are actually quite capable of understanding human language; they just need to be trained and guided to do the right thing with that understanding — it’s the humans who are slacking in that area.
Today, deep learning — and specifically transformer models — provides extremely capable natural language understanding (NLU), even on very nuanced text. The amount of labeled data required to train or fine-tune these types of models is diminishing, and it’s now possible to build a classifier that can correctly understand / label / tag new text with a high degree of accuracy and granularity, with very little effort.
The onus is therefore on humans to:
Figure out “what” they want the computer to look for & understand
Find a few training examples for each “intent” so the computer can learn to identify these intents in new data
Structure this knowledge (intents and training data) in ways that can be easily mapped to business value
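The three steps above can be sketched as a toy few-shot intent classifier. This is a generic scikit-learn illustration, not HumanFirst’s actual pipeline; the intents and example utterances are invented, and TF-IDF stands in for a real language-model representation:

```python
# Illustrative sketch only (not HumanFirst's implementation): train an
# intent classifier from a handful of labeled examples per intent.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestCentroid
from sklearn.pipeline import make_pipeline

# Steps 1-2: decide "what" to look for, and give a few examples per intent.
training_data = {
    "cancel_subscription": [
        "I want to cancel my plan",
        "please stop billing me",
        "how do I end my subscription",
    ],
    "billing_question": [
        "why was I charged twice",
        "I have a question about my invoice",
        "my bill looks wrong",
    ],
}

texts = [t for examples in training_data.values() for t in examples]
labels = [intent for intent, examples in training_data.items() for _ in examples]

# Step 3: the intent -> examples structure maps directly to business value,
# since each intent can route to a distinct workflow or answer.
clf = make_pipeline(TfidfVectorizer(), NearestCentroid())
clf.fit(texts, labels)

print(clf.predict(["how can I cancel my subscription"])[0])
```

In practice the vectorizer would be a fine-tuned transformer or an embedding API, but the shape of the workflow — intents, a few examples each, a classifier on top — is the same.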
How can companies benefit from HumanFirst?
HumanFirst is the leading enterprise solution for managing the NLU data lifecycle: we allow teams to tackle the three points mentioned above really easily, without any technical skills.
Specifically, we allow teams to store and explore their raw conversational data (e.g., voice-of-the-customer data, ChatGPT conversation logs, live chat logs) and turn it into highly personalized labeled data (intents/labels, entities, utterances, training examples, prompts). These datasets can be used to train or fine-tune the underlying NLU models or LLMs, and to deploy highly accurate and personalized classifiers for chatbots, IVRs, CCAI, conversational analytics, etc.
HumanFirst provides direct integrations with DialogFlow, Rasa, Watson, Co:here, OpenAI and other conversational AI / NLU providers, allowing teams to evaluate the performance of their data against their production NLU, and seamlessly synchronize project data (with git-level diff and merge flows).
In short, our no-code UX accelerates the design, development, and maintenance of NLU training datasets, and provides the most powerful, flexible and creative experience to turn raw unstructured data into business and artificial intelligence.
“I’d argue that computers are actually quite capable of understanding human language, they just need to be trained and guided to do the right thing with that understanding.”
Large language models are the talk of the town. What is your view on their usefulness/utility?
We’re extremely bullish on large language models. We partnered with Co:here this year, have integrated OpenAI’s models and latent spaces, and are continuing to build out our product roadmap and capabilities around LLMs. We believe that state-of-the-art data-centric tooling is needed (and will continue to be for a long time!) to successfully apply foundation models to enterprise use-cases.
How do LLMs shape the work that you do at HumanFirst?
On the classification side, we’re integrating Co:here and OpenAI’s latent spaces to power our core semantic search & clustering workflows: exploring datasets through higher-dimensional LLM latent spaces allows curation of better-performing NLU datasets, with fewer training examples, from noisier data (e.g., ASR/STT transcripts of human conversations). We’re also making it easy for users to fine-tune the foundation models using the resulting datasets, directly from HumanFirst.
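As a rough illustration of that curation workflow, the sketch below clusters raw utterances to surface candidate intents. A real setup would embed each utterance with an LLM latent space (such as Co:here’s or OpenAI’s embeddings); TF-IDF stands in here so the example runs offline, and the utterances are invented:

```python
# Hedged sketch: cluster unlabeled utterances to discover intent candidates.
# TF-IDF is a stand-in for LLM embeddings so this runs without an API key.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

utterances = [
    "I want to cancel my plan",
    "please stop my subscription",
    "why was I charged twice",
    "my invoice looks wrong",
    "reset my password please",
    "I can't log into my account",
]

vectors = TfidfVectorizer().fit_transform(utterances)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(vectors)

# Each cluster is a candidate intent a human can review, name, and refine.
for cluster_id, text in sorted(zip(km.labels_, utterances)):
    print(cluster_id, text)
```

With richer embeddings, semantically similar utterances land in the same cluster even when they share no surface vocabulary, which is what makes curation from noisy ASR/STT data feasible.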
On the generative side, we’re very excited, too. ChatGPT was one of the best things that could happen to accelerate the role of NLU design in the enterprise.
How do you see conversational AI space evolving in the next 3 years?
Large language models will completely change the way we think about and build conversational AI, and this will definitely happen in less than 3 years.
ChatGPT showed everyone how close we are to something that feels like real AI. I’m certain that Google’s LaMDA will follow as well as a slew of others (Stability.ai etc). GPT-4 is likely already in beta and I can’t imagine how powerful the next generations of GPT, trained on billions of dollars worth of GPUs (and voice data transcribed with Whisper!) will be.
The main concern with these models today is still the quality and relevancy of what they return (also referred to as “hallucinations”). I think this will stop being a problem sooner rather than later: better answers mapped to custom domains will be solved with prompt engineering and fine-tuning, as well as higher-quality input data to bootstrap the model with.
We’re also very quickly reaching the point where AI will be able to retrieve information dynamically from sources it wasn’t pre-trained with (e.g., custom knowledge bases, web crawls). Being able to bootstrap the domain information it lacks, on the fly, with dynamic data ingestion and fine-tuning will turn LLMs from static “snapshots” into truly reactive agents.
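The dynamic-retrieval idea can be sketched in a few lines: find the knowledge-base passage closest to the question, then hand it to the model as context. Everything here is illustrative — the passages, the prompt shape, and the TF-IDF retriever standing in for a proper embedding index are all assumptions, not any vendor’s actual API:

```python
# Minimal retrieval-augmented sketch: ground the model in a passage it was
# never pre-trained on, instead of relying on its static training snapshot.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

knowledge_base = [
    "Refunds are issued within 5 business days of cancellation.",
    "Premium plans include 24/7 phone support.",
    "Passwords can be reset from the account settings page.",
]

question = "How long do refunds take?"

# Retrieve: score each passage against the question.
vec = TfidfVectorizer().fit(knowledge_base + [question])
kb_vectors = vec.transform(knowledge_base)
q_vector = vec.transform([question])
best = cosine_similarity(q_vector, kb_vectors).argmax()

# Augment: the retrieved passage becomes context for the LLM's answer.
prompt = f"Answer using this context:\n{knowledge_base[best]}\n\nQ: {question}"
print(prompt)
```

In a production system the prompt would be sent to an LLM, and the retriever would be a vector index over embeddings, but the retrieve-then-generate loop is what turns a static model into the “reactive agent” described above.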
Therefore, the CxD role will evolve from being about high-effort / dumb “creation” (hardcoded dialog flows, responses, personality etc), to “observing and educating” a true AI that can generate answers and think for itself, but needs guidance and to be taught the right guidelines to follow – much like a human!
Continuously monitoring the AI’s output and tweaking its understanding of the world, or the parameters of its response or action for a given intent / path / conversation fragment, will be a big part of the future CxD’s role. As a result, I expect tools that provide observability over NLG output to thrive.
For the foreseeable future, I expect the hybrid coexistence of generative and intent-based approaches to accelerate the adoption of LLMs for e2e conversational AI use-cases. Understanding what the model is generating (i.e., classifying the model output as good/bad) and being able to supervise responses with intent-based prompt engineering, fine-tuning (or straight-up hijacking if it needs to talk to an API) will be critical.
Follow Gregory Whiteside on LinkedIn
Or visit the website of HumanFirst.ai
If you liked the interview, consider giving this post a like <3