2 Comments

This should not be an impossible problem to solve if you add a reversibility check during the fine-tuning (or RLHF) stages. Even in a reinforcement learning from human feedback pipeline, it should be possible to treat reversibility as an additional attribute of a prompt: in addition to "how correct is this output", ask something like "is this an appropriately posed reversed question, and is the reversed instance reversible relative to the initial prompt". You would then generate the reversed instance, evaluate whether it is true (by finding the inverse question and answering it or something, as said before), train on and reward the reversed instance when it is true, and reward a bit more when the model correctly identifies reversible instances, or something along those lines.
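To make the idea a bit more concrete, here is a minimal sketch of what such a reward term might look like. Everything in it is an assumption for illustration: the `Sample` type, the `reversal_aware_reward` function, the `make_inverse` and `answer_fn` callables, and the size of the bonus are all hypothetical names and choices, not part of any real RLHF library. The "model" is just a lookup table standing in for a real policy.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Sample:
    prompt: str
    answer: str


def reversal_aware_reward(
    sample: Sample,
    base_reward: Callable[[Sample], float],
    make_inverse: Callable[[Sample], Optional[Sample]],
    answer_fn: Callable[[str], str],
    bonus: float = 0.2,  # hypothetical size of the extra reward
) -> float:
    """Base correctness reward plus a bonus when the model also answers
    the reversed question correctly. Non-reversible prompts get no bonus."""
    reward = base_reward(sample)
    inverse = make_inverse(sample)
    if inverse is None:  # the prompt has no well-posed reversal
        return reward
    predicted = answer_fn(inverse.prompt).strip().lower()
    if predicted == inverse.answer.strip().lower():
        reward += bonus
    return reward


# Toy stand-ins (assumptions): one question template and a lookup-table model.
PREFIX = "Who is the parent of "


def toy_make_inverse(sample: Sample) -> Optional[Sample]:
    # Only "Who is the parent of X?" is treated as reversible in this toy.
    if not (sample.prompt.startswith(PREFIX) and sample.prompt.endswith("?")):
        return None
    child = sample.prompt[len(PREFIX):-1]
    return Sample(prompt=f"Who is the child of {sample.answer}?", answer=child)


def table_model(table: Dict[str, str]) -> Callable[[str], str]:
    # Returns an answer function that looks prompts up, or says "unknown".
    return lambda prompt: table.get(prompt, "unknown")


def toy_base_reward(sample: Sample) -> float:
    return 1.0 if sample.answer == "Bob" else 0.0


forward_only = table_model({"Who is the parent of Alice?": "Bob"})
both_ways = table_model({
    "Who is the parent of Alice?": "Bob",
    "Who is the child of Bob?": "Alice",
})

sample = Sample("Who is the parent of Alice?", "Bob")
r_forward = reversal_aware_reward(sample, toy_base_reward, toy_make_inverse, forward_only)
r_both = reversal_aware_reward(sample, toy_base_reward, toy_make_inverse, both_ways)
```

Here the model that only knows the forward fact gets the base reward, while the one that also answers the reversed question gets the bonus on top. In a real pipeline the hard parts are exactly the ones this toy skips: deciding whether a prompt is reversible at all, and generating the inverse question reliably.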

I wonder if solving these kinds of problems (of compositionality, more generally) will help with generalization, and if so, kudos to you for writing such a timely post.

Maybe even algorithms from fields like robotics, if properly implemented and integrated into these LLM domains (including NLP and MLM), could be useful in solving this problem.

You might want to think about something along these lines: https://arxiv.org/pdf/2106.04480v2


It's almost as if these are next-token prediction models that are great at mimicry but have no concept of truth!

I feel a little sad that I don't get to appreciate LLMs for the amazing technical achievements that they are, because right from the off their creators and owners and cheer teams have been overhyping them to absurd levels. How I wish for a prominent figure in tech to come out and say "machine learning is great but there's no reason to think any AI is going to develop any kind of human-like cognitive ability because brains aren't statistical models and it would be best for all of us to just stop comparing AIs to minds".
