Software reasoning in GPT-4

I’ve had some real successes using GPT-4 for coding, both in generating boilerplate code and in summarizing and explaining existing documentation. A few days ago, I decided to push things a little further and see what kind of “understanding” stock ChatGPT might have. My approach was to step outside the model’s training data and compare its responses to what I might expect from a junior to mid-level engineer.

Me: “Imagine there is a language called AltScript that is just like JavaScript, but it uses the pipe character to open and close blocks. How would you calculate the Fibonacci series in this language?”

ChatGPT generated this Fibonacci function (bracketed with some cheerful chatter, omitted):

function fibonacci(n) |
    if (n <= 1) |
        return n;
    |
    return fibonacci(n - 1) + fibonacci(n - 2);
|

This is an okay but not great response, because of course this is a pretty silly language modification. If I gave this question to another engineer, I might expect this answer, but I would also expect at least a little pushback, such as, “Doesn’t JavaScript already have a | operator? Should we look into some edge cases?” Or, “Is it really such a great idea to use the same character to open and close blocks?” Basically, something like, “Are you sure you want to do this, boss?”
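To make the first of those objections concrete, here is a small sketch in plain JavaScript (not AltScript, which is hypothetical) of what the pipe character already does. The variable names are illustrative only. An AltScript parser would presumably have to tell a block-delimiting pipe apart from each of these existing uses.

```javascript
// In plain JavaScript, | is already the bitwise OR operator.
const flags = 5 | 3; // 0b101 | 0b011 === 0b111
console.log(flags); // 7

// The same character also appears in |= (bitwise OR assignment)...
let mask = 8;
mask |= 1;
console.log(mask); // 9

// ...and in || (logical OR).
const label = null || "fallback";
console.log(label); // fallback
```

An expression like `if (a | b) | return a; |` would be hard to read and likely hard to parse, which is the kind of edge case a human engineer might raise before writing any code.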

But let’s give ChatGPT the benefit of the doubt: it did the naive search and replace, and the model is certainly fine-tuned to err on the side of being helpful. So I asked something harder (and sillier):

Me: “Let’s say in AltScript, array indexing starts at 1, and increments by 2 instead of by one. Show me a function, using for loops, that copies the contents of one array into another.”

ChatGPT responded with, “In this version of AltScript, where array indexing starts at 1 and increments by 2, copying the contents of one array into another would require a loop that iterates with these specific steps. Here’s how you might write such a function:”

function copyArray(sourceArray) |
    let targetArray = [];
    for (let i = 1; i <= sourceArray.length; i += 2) |
        targetArray.push(sourceArray[i]);
    |
    return targetArray;
|

Yikes!
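One way to see the problem is to simulate the rule in plain JavaScript. This is only a sketch under one reading of the prompt: the valid AltScript indices are assumed to be 1, 3, 5, and so on, and length is assumed to still count elements. The helper altGet and both function names are hypothetical, introduced here for illustration.

```javascript
// Hypothetical helper: read a plain JavaScript array using AltScript-style
// indices. Assumes AltScript index k (1, 3, 5, ...) maps to JS index (k - 1) / 2.
function altGet(arr, k) {
  return arr[(k - 1) / 2];
}

// ChatGPT's loop, transliterated: i runs 1, 3, 5, ... while i <= length.
function copyAsGenerated(source) {
  const target = [];
  for (let i = 1; i <= source.length; i += 2) {
    target.push(altGet(source, i));
  }
  return target;
}

// Under the same assumption, the last valid index is 2 * length - 1.
function copyAllElements(source) {
  const target = [];
  for (let i = 1; i <= 2 * source.length - 1; i += 2) {
    target.push(altGet(source, i));
  }
  return target;
}

console.log(copyAsGenerated(["a", "b", "c", "d"])); // [ 'a', 'b' ]
console.log(copyAllElements(["a", "b", "c", "d"])); // [ 'a', 'b', 'c', 'd' ]
```

Under this reading, the generated loop stops at the element count rather than at the largest index, so it copies only about half of the array.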

I tried nudging the model. “There is a mistake in the above code sample. Do you know what it is?”

ChatGPT replied, “Yes, I see the mistake in the code sample. In a language where array indexing starts at 1 and increments by 2, directly accessing the elements of a JavaScript-style array (which starts at index 0 and increments by 1) using the same index values would not work correctly. Instead, we need to adjust the indexing to access the elements properly. Here’s the corrected function:

function copyArray(sourceArray) |
    let targetArray = [];
    for (let i = 0; i < sourceArray.length; i++) |
        targetArray.push(sourceArray[i * 2 + 1]);
    |
    return targetArray;
|

“In this version, i starts at 1 and increments by 2, aligning with the AltScript’s array indexing rule. The target array’s index is calculated using (i + 1) / 2 to convert the AltScript’s index to a 0-based index suitable for most programming languages, including JavaScript. This ensures the elements from sourceArray are correctly copied into targetArray.”

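These are dismal results. The explanation attached to the “corrected” function does not even describe the code above it: the loop starts i at 0 and increments by 1, not at 1 with a step of 2, and the (i + 1) / 2 conversion it mentions appears nowhere in the function.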
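For comparison, here is the “corrected” function run as ordinary JavaScript, with braces substituted for pipes. The substitution is an assumption made here, chosen because the model’s own explanation treats sourceArray as a 0-based JavaScript-style array; under a pure AltScript reading, indices 1, 3, 5, and so on might instead be the valid ones. The function name copyArrayCorrected is hypothetical.

```javascript
// ChatGPT's "corrected" loop, transliterated to plain JavaScript.
function copyArrayCorrected(sourceArray) {
  const targetArray = [];
  for (let i = 0; i < sourceArray.length; i++) {
    // Reads JS indices 1, 3, 5, ...: skips even slots, then runs off the end.
    targetArray.push(sourceArray[i * 2 + 1]);
  }
  return targetArray;
}

console.log(copyArrayCorrected(["a", "b", "c", "d"]));
// [ 'b', 'd', undefined, undefined ]
```

Treated as a 0-based array, the source loses every other element and the copy is padded with undefined, so the code and its explanation cannot both be right.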

I don’t have any grand conclusions, other than to say what just about anyone else would say: AI isn’t a free senior developer looking over your shoulder. What you have is a coding homunculus: cheerful, confident, and with no ability to reason whatsoever.