To me, prompt engineering has a lot in common with classical control and regulation theory, the kind you find in every engineering discipline that deals with complex, non-deterministic systems and processes.
You know roughly what you want, but you cannot set your parameters exactly to get the desired result. Feedback is fuzzy. So instead of computing everything fully deterministically, you have to tinker around a little in the hope of getting closer to what you originally intended.
It's also no wonder AI behaves so similarly. Neural networks are just statistical estimators, which are also used in classical engineering to model complex systems.
And if you look at software engineering, building a piece of software with agile methodologies quite closely resembles the way you interact with an LLM to get the answer you want.
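To make the control-loop analogy concrete, here is a minimal sketch of prompt refinement as a crude feedback loop, in Python. The `generate`, `score`, and `revise` functions are hypothetical stand-ins for a model call, a fuzzy quality measurement, and a prompt adjustment; the point is the loop structure, not any particular API.

```python
def refine_prompt(prompt, generate, score, revise,
                  target=0.9, max_iters=10):
    """Nudge a prompt until the output is 'close enough' to the intent."""
    output = generate(prompt)              # plant: the non-deterministic system
    for _ in range(max_iters):
        error = target - score(output)     # fuzzy feedback, not an exact measurement
        if error <= 0:
            break                          # within tolerance: stop tinkering
        prompt = revise(prompt, output)    # controller: adjust the input
        output = generate(prompt)          # and observe the new response
    return prompt, output
```

Like a controller reading a noisy sensor, the loop never computes the answer directly; it only keeps adjusting the input until the measured error is acceptable.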
Yes yes yes yes yes, bots are tools, not humans.
I've found that the greatest amount of creativity, shall we say, occurs in prompts with even a scintilla of a wisp of moral or ethical contest. "Prompt" isn't the best word. It resonates with "motivated," "evocative," "invitation," "directions," "assignments," etc.: squishy human attitudes toward speculation and uncertainty, weakly situated in epistemic clarity. I prefer command mode. Nothing human about it. Essentially, the bot has learned to read our mind. If we hold a clouded or surface-level sense of our project and treat the bot like a teenager, we get teenage output. It might be an interesting experiment for discourse analysis: an adult talking like a teenager vs. a teen talking like an adult. Epistemic experts get better output from general LLMs, and better still from niche bots. How? Not by talking to them like adolescents. Command posture.
I agree that working with an LLM is like working with an irrational being--my metaphor is that we're priests trying to figure out how to appease a deity. It bothers me on both a philosophical and practical level that we're reduced to pseudo-emotional gimmicks like saying "please" or offering a bribe or whatever it is in order to get the output we want. We're not even actually playing on its emotions, because it doesn't have any. We're just hoping that somewhere deep in the model the tokens we want it to spit out are activated by proximity to text in the training data where real people in some other context played on other people's emotions with those same phrases. On a practical level, it's deeply dissatisfying how unsystematic the whole thing is. On a philosophical level, it's unappealing to have to imitate emotional manipulation of a system so that it in turn will imitate human cognition for us. I don't know what the next phase of LLMs will look like but I really hope that we collectively grow out of having to interact with them this way.
As you can see, I have a little bit of an axe to grind about this!
I'd say it's engineering in about the same sense as social engineering - except maybe 10X harder.
I agree that, in the words of Jeff Bezos, LLMs have been discovered rather than invented, and that empirical investigation is necessary in order to fully understand their capabilities. However, I disagree that *behavioural* psychology is the appropriate methodology; we can understand many aspects of their capabilities, or lack thereof, by taking into account not just their outward behaviour but also their internal architecture and the way they have been trained. Humans have many biases that result from the fact that we were "trained" to maximise inclusive fitness in a highly changeable social environment. Similarly, LLMs have many biases that result from the fact that they were trained for next-token prediction. Analogous to evolutionary psychology, there will be branches of the new field of machine psychology that take the underlying objective function into account, e.g. "auto-regressive machine psychology". This methodological approach is advocated in the seminal paper "Embers of autoregression: Understanding large language models through the problem they are trained to solve" (McCoy et al. 2023). I have written a short essay on this paper here: https://sphelps.substack.com/p/a-teleological-approach-to-understanding.
McCoy, R. Thomas, et al. "Embers of autoregression: Understanding large language models through the problem they are trained to solve." arXiv preprint arXiv:2309.13638 (2023).
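A concrete example of this teleological lens from McCoy et al.: GPT-4 decodes rot-13 shift ciphers far more reliably than rot-2 ciphers, even though the two tasks are algorithmically identical, apparently because rot-13 appears far more often in training data. Here is a minimal sketch of that kind of probe, with a hypothetical `query_model` function standing in for the LLM call:

```python
import string

def shift(text, n):
    """Encode text with a Caesar shift of n (rot-n)."""
    table = str.maketrans(
        string.ascii_lowercase,
        string.ascii_lowercase[n:] + string.ascii_lowercase[:n])
    return text.lower().translate(table)

def probe(query_model, sentences, n):
    """Fraction of rot-n ciphers the model decodes back to the original."""
    hits = 0
    for s in sentences:
        prompt = f"Decode this rot-{n} cipher: {shift(s, n)}"
        hits += query_model(prompt).strip().lower() == s.lower()
    return hits / len(sentences)

# Same algorithm, different training-data frequency: McCoy et al. report
# that accuracy is far higher on rot-13 than on rot-2.
# acc_13 = probe(query_model, test_sentences, 13)
# acc_2  = probe(query_model, test_sentences, 2)
```

A purely behavioural approach would treat the two ciphers as the same task; the teleological approach predicts the accuracy gap from the training objective alone.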
I completely forgot that quote from Bezos - I remember reading it somewhere - I couldn't agree more!
Nice! But these theories have no bearing on the relationship between experts and quality output. Teach kids what this machine is, and then teach them to work out what they want before they turn to the bot. If they want to gain clarity on a topic, have them ask the bot to teach it to them.