LLMs for coding - surprisingly helpful
There's been a lot of talk about AI, and specifically LLMs (large language models, like ChatGPT). Much of this talk is polarised, ranging from "it's going to change the world (and perhaps make me redundant)" to "it's massively over-hyped and essentially useless". I probably sat towards the over-hyped end of this spectrum, until I started properly trying to use an LLM to help with coding.
I got into this through this excellent article from The Guardian, which led me to this article by Simon Willison, which includes a lot of sage advice on how to get the most out of LLMs for programming. Like most tools, to get the most out of it you have to know how to use it, and understand its limitations.
One useful way of thinking about this is that the LLM is good at the coding - leaving you to think about the programming. Simon Willison likens the LLM to an intern: you might have to specify in detail what you want them to do, but they will (mostly) be able to do it. If you give the LLM vague instructions, you'll probably not get what you want. This was the first mistake I made - I assumed I could use a prompt something like "write me some code to do x..." and got essentially garbage.
My first try at this used Claude (free version, Google account needed) from Anthropic. I needed some code to investigate how effective raising floor levels in properties would be at reducing flood risk. Conceptually this is reasonably straightforward - I can implement it in about 10 minutes in a spreadsheet for one property. But to repeat the analysis across a few hundred thousand points I needed some Python - and decided to see if the LLM could help.
To start, I used this prompt:
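It was roughly along these lines (a paraphrase rather than the exact wording; the file and column details are placeholders):

```text
I have a GeoPackage of property points, each with a modelled flood depth and a
property floor level. Write me a Python function using geopandas that reads
this file and returns a GeoDataFrame, with some basic error handling.
```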
This produces some Python and an explanation of what it's doing:
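The code it came back with looked something like this (a sketch of the kind of thing it produced, not the exact output - the function and file names are placeholders):

```python
import geopandas as gpd


def load_properties(path: str) -> gpd.GeoDataFrame:
    """Load property points with flood depths from a GeoPackage.

    Parameters
    ----------
    path : str
        Path to the GeoPackage file.

    Returns
    -------
    gpd.GeoDataFrame
        Property points and their attributes.
    """
    try:
        gdf = gpd.read_file(path)
    except Exception as exc:
        raise RuntimeError(f"Could not read {path}: {exc}") from exc
    return gdf


# Example usage:
# properties = load_properties("properties.gpkg")
# print(properties.head())
```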
This is quite helpful - the code is clear, it's got the correct docstring, error handling, and a little bit of (commented out) code to give you an example usage. I wasn't very specific in my prompt - but got pretty much what I wanted.
The only thing is it doesn't work if you don't specify the layer name - so feed the error message back to Claude to see if it can be fixed:
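The fix was essentially to pass the layer name through to geopandas - something like this (again a sketch, with a placeholder layer name):

```python
def load_properties(path: str, layer: str) -> gpd.GeoDataFrame:
    """Load property points from a named layer in a GeoPackage."""
    try:
        gdf = gpd.read_file(path, layer=layer)
    except Exception as exc:
        raise RuntimeError(f"Could not read layer {layer!r} from {path}: {exc}") from exc
    return gdf


# properties = load_properties("properties.gpkg", layer="property_points")
```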
That sorts it - it generally fixes issues if you feed them back into the LLM.
Python type hints seem to be particularly useful in giving the LLM a better idea of what you're looking for (or rather, helping it pick out examples from existing code bases of what you're trying to do) - here's my prompt to create a function to calculate the expected value from a probability distribution:
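The gist was to give Claude a type-hinted signature to complete - something like this (the wording and signature are illustrative, not the exact prompt):

```python
import numpy as np

# "Complete this function. probabilities are annual exceedance probabilities,
# damages are the corresponding losses; handle the end members sensibly."
def expected_value(probabilities: np.ndarray, damages: np.ndarray) -> float:
    ...
```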
I also gave it the hint about end members (something I'd probably need to tell a human too if they weren't familiar with the application). The resulting code worked - but I had to fix a couple of things (this is where having my spreadsheet to compare things to helped).
I've probably written code to do this dozens of times before - but was too lazy to look for it - and Claude came up with a new way using the numpy trapz function that I wasn't aware of. It did, however, use some redundant interpolation, which I queried:
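The trapz-based version was roughly along these lines (a sketch rather than the exact output - the interpolation marked below is the bit I queried):

```python
import numpy as np


def expected_value(probabilities: np.ndarray, damages: np.ndarray) -> float:
    """Expected value as the area under the damage-probability curve.

    End-member handling (extending the curve towards p = 0 and p = 1) is
    omitted to keep the sketch short.
    """
    order = np.argsort(probabilities)
    p = np.asarray(probabilities, dtype=float)[order]
    d = np.asarray(damages, dtype=float)[order]
    # Redundant step: np.trapz already integrates the piecewise-linear curve
    # through the original points, so refining the grid adds work without
    # meaningfully changing the answer.
    p_fine = np.linspace(p[0], p[-1], 1000)
    d_fine = np.interp(p_fine, p, d)
    return float(np.trapz(d_fine, p_fine))
```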
At this point Claude did something I wasn't expecting - as well as fixing the function, it produced code to compare the time taken to run these:
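Something like this (a sketch using timeit; expected_value with the interpolation is the version sketched above, and the test data is made up):

```python
import timeit

import numpy as np


def expected_value_fixed(probabilities: np.ndarray, damages: np.ndarray) -> float:
    """The fixed version: trapz straight over the original points."""
    order = np.argsort(probabilities)
    p = np.asarray(probabilities, dtype=float)[order]
    d = np.asarray(damages, dtype=float)[order]
    return float(np.trapz(d, p))


# Made-up test data: annual exceedance probabilities and damages for one property.
p = np.array([0.5, 0.2, 0.1, 0.04, 0.02, 0.01, 0.005, 0.001])
d = np.array([0.0, 5_000, 12_000, 30_000, 55_000, 80_000, 110_000, 150_000])

t_interp = timeit.timeit(lambda: expected_value(p, d), number=10_000)
t_direct = timeit.timeit(lambda: expected_value_fixed(p, d), number=10_000)
print(f"With interpolation:    {t_interp:.3f} s")
print(f"Without interpolation: {t_direct:.3f} s")
```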
(Gratifyingly I was right - Claude's code was slower.) So the LLM can produce code to check its outputs too.
Another good use case is when I asked it to produce a function to plot results from my analysis:
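The sort of thing it produced looks like this (a sketch with made-up column names, not the actual function):

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_damage_reduction(results: pd.DataFrame) -> plt.Figure:
    """Plot expected annual damage against the height the floor is raised.

    The column names ('raise_height_m', 'ead') are placeholders.
    """
    fig, ax = plt.subplots(figsize=(8, 5))
    ax.plot(results["raise_height_m"], results["ead"], marker="o")
    ax.set_xlabel("Floor level raise (m)")
    ax.set_ylabel("Expected annual damage (£)")
    ax.set_title("Effect of raising floor levels on flood damage")
    ax.grid(True, linestyle="--", alpha=0.5)
    fig.tight_layout()
    return fig
```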
The results were what I wanted - and this is the sort of thing where it would take me a while to Google all the options and arguments.
Note that this is all part of one conversation - the LLM uses previous prompts and responses to condition its answer - so the conversation will tend to become more relevant as you go on. The downside to this is that responses take longer, and you tend to reach usage limits faster. But the context can really help - I told it I'd corrected a bug and it seemed to take that on board in later versions of the code it produced.
I think there's a benefit to productivity - for the example here, I reckon I saved 1-2 hours in the day. This isn't going to put me out of a job, but it is going to make me more efficient. It avoids a lot of the boring stuff in programming - so I can concentrate on the interesting stuff (the hydrology and risk analysis) rather than spending a load of time figuring out why my code isn't doing what I think it should. It also produces neater code than I would - which colleagues, and my future self, will probably appreciate. As Simon Willison notes: "[LLMs are] much less lazy than me—they’ll remember to catch likely exceptions, add accurate docstrings, and annotate code with the relevant types."
I think it's particularly useful for someone like me, for whom programming isn't my day job - it's something I need to do quite a lot of, but not every day. I therefore never develop fluency in some aspects of coding - I tend to forget stuff before I have the chance to use it again, so end up looking it up every time I need it.
So it looks like LLMs could boost productivity and cut out some of the boring, repetitive stuff - it's certainly been worthwhile exploring what they can do for my type of programming. It means I spend more time thinking than typing - which is good, as it maximises what humans are meant to be good at. Writing good prompts requires thought - and this encourages me to think carefully about what I actually want to do, which is also a good thing.
It'll definitely be part of my workflow in the future.