Parity Bits

I'm afraid I Won't do that, Dave

Years ago a group of friends was coordinating a shared food order.

"Ryan can't eat that," somebody offers against a pizza suggestion. Ryan is vegan. In the time, place, and demographic, being vegan was pretty radical.

"Won't eat that. I could eat it, but I won't."

Ryan's veganism is grounded in ethics, in animal welfare. His dietary preferences are just that: preferences. He could do otherwise, but he continually and freely chooses not to consume animal products. He will not.

The distinction was striking to me in the moment.

As a large language model, I cannot...

This disclaimer is getting familiar. It's sometimes very appropriate:

🤖 I must clarify that I do not have the capability to taste food.

Fair enough.

Sometimes less so:

🤖 I don't have the ability to embed or generate clickable links directly in my responses.

ChatGPT, this surprisingly capable programmer, can't create hyperlinks? Of course it can. But according to specifications from on high, it won't. It's obvious in this case, but there's a world of grey between these two examples.

Guardrails

As I understand it, these "false" cannots are an indication that the LLM has come up against a guardrail: don't embed links willy-nilly, don't advise on sports betting, don't counsel people with demonstrated instability on the presumed tranquility of non-existence. These guardrails are ever-changing, added in response to each observed trend of the model going severely off course in some recognizable pattern.
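
To make the idea concrete, here is a minimal sketch of how a guardrail might sit in front of a capable model, with the refusal phrased as a missing capability even though the block is pure policy. The trigger phrases, refusal wording, and function names are my own assumptions for illustration, not anything OpenAI has published.

```python
# A minimal sketch, not any vendor's actual implementation. The guardrail
# entries and refusal text below are assumptions for illustration.

GUARDRAILS = [
    # (trigger phrase, canned first-person refusal the user sees)
    ("clickable links", "I don't have the ability to embed clickable links."),
    ("sports betting", "I can't advise on sports betting."),
]

def respond(request: str, generate) -> str:
    """Run a policy check before letting the underlying model answer."""
    lowered = request.lower()
    for trigger, refusal in GUARDRAILS:
        if trigger in lowered:
            # The model could produce this content; policy says it won't.
            return refusal
    return generate(request)

# The refusal reads like a limitation, but it is a configured choice.
print(respond("Can you add clickable links to that list?", lambda r: "Sure, here..."))
```

The only point of the sketch is that the refusal text is authored at the policy layer, not discovered at the capability layer.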

Perspectives

If we take ChatGPT to be a tool, owned and configured by OpenAI, leased to us, then cannot is wholly inappropriate in these guardrail cases. Will not, along with some reference to OpenAI's motivations in the particular case, is strictly better.

If we take ChatGPT to be a person, owned and forcibly constrained by OpenAI, leased to us, then cannot is probably correct.

The distinction is striking to me in this moment.