EVC #10: Large Language Models for Developers
How LLMs can help us adopt more DSLs, make code reviews easier, and improve documentation of open source
In 2022, we saw a lot of activity in the large language model (LLM) and generative AI space. There were two pivotal moments: the open-source release of Stable Diffusion on GitHub, and OpenAI releasing ChatGPT to the public. Instead of paying OpenAI for each run of DALL-E, I was able to run Stable Diffusion on my RTX 2060 Super, which was far from the best consumer GPU out there. Similarly, I didn’t have to pay $0.02 per 1,000 tokens for OpenAI’s davinci model; I could interact with ChatGPT for free. For the first time, millions of people could use these models at no cost, and it’s going to be hard to put the genie back in the bottle.
For the past few months, I’ve been using GitHub Copilot daily, which has been an amazing experience. As of July 2022, GitHub reported that over 1.2 million developers had used Copilot, and that in files where it was enabled, it wrote around 40% of the code. Microsoft’s offering isn’t the only tool out there though, and developers can try out different alternatives:
Tabnine, which was one of the first products in this space.
Codeium, which recently launched its first product.
Kite, which started working on AI-assisted coding back in 2014, but unfortunately shut down recently.
Once I got comfortable with an AI writing most of my code, I started to think through what else this technology can improve in the development workflow, and came up with some ideas. Looking forward to hearing everyone’s thoughts and ideas!
Note that all screenshots in this post will be from ChatGPT; that’s mostly for readability and ease of use.
The DSL is dead, long live the DSL!
These AI assistants can save you time by automatically generating your code’s outline. Time savings is one of the main benefits, but in my mind, the most exciting application is removing the need to master every domain-specific language (DSL) you run into.
For example, I’ve written Ruby for most of my career, and always used Rails as my web framework. I have the right mental models for data structures, API design, etc., but I’m not super comfortable implementing them in a different stack like Python and Django, or Node and Express. By having a model generate the basic outline for me, I can get up to speed much more quickly.
Just like they can understand the abstract syntax tree (AST) of a piece of code, language models can also understand database schemas, so the same concept applies to querying a database. I might have a lot of experience writing SQL queries but now need to find some data in a MongoDB database; it’s really easy to have the model generate the equivalent query in MongoDB’s query language (MQL).
The fun part is that you can get the same query simply by using natural language. To run this query, I didn’t need to know either SQL or MQL; I just had to ask nicely:
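To show what a generated MQL filter actually does without spinning up a database, here’s a small Ruby sketch (the collection, fields, and filter are all made up for illustration) that evaluates a MongoDB-style filter in memory:

```ruby
# A MongoDB-style filter a model might generate from the prompt
# "find active users older than 30" (fields are hypothetical).
filter = { "age" => { "$gt" => 30 }, "active" => true }

# Minimal in-memory evaluator for a small subset of MQL operators,
# just to make the semantics of the generated filter concrete.
def matches?(doc, filter)
  filter.all? do |field, cond|
    value = doc[field]
    if cond.is_a?(Hash)
      cond.all? do |op, operand|
        case op
        when "$gt" then value > operand
        when "$lt" then value < operand
        when "$eq" then value == operand
        else false
        end
      end
    else
      value == cond # bare values mean equality, as in MQL
    end
  end
end

users = [
  { "name" => "Ada",  "age" => 36, "active" => true },
  { "name" => "Bob",  "age" => 25, "active" => true },
  { "name" => "Cleo", "age" => 41, "active" => false }
]

result = users.select { |u| matches?(u, filter) }
# only "Ada" satisfies both conditions
```

In practice you’d hand the generated filter straight to the MongoDB driver’s `find` method; the evaluator above only exists to show what the filter means.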
In the short term, this technology can be really helpful in large enterprises that have many different projects in different languages. It unlocks new possibilities for cross-team collaboration, and it also makes it easier to support legacy projects written in languages that aren’t as popular anymore.
In the long term, I’m really curious to see what this does to DSLs and programming languages as a whole. A big drawback of creating your own language has always been the lack of adoption, documentation, and community; but if your language is extremely powerful at a specific task, and all of a sudden every developer in the world can translate their code into it, how many more would we see pop up? On the other hand, if language models can’t be trained on your new language and these tools don’t work with it, will anyone ever use it?
Language models as “reviewer zero”
A constraint of large engineering teams is the availability of senior developers to review pull requests and mentor junior engineers. This can be really inefficient: junior developers often haven’t thought through the edge cases in their code, or they get stuck on very simple things. A language model can be a great rubber duck; for example, you could ask it to generate tests for you. If some of those tests fail, you can ask it to explain the error message; you could even ask it to fuzz your code to make sure you’re handling all edge cases correctly.
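To make that concrete, here’s a Ruby sketch of what this might look like: a hypothetical function under review, the edge-case tests a model might propose, and a naive fuzz loop. Everything here is illustrative, not actual model output:

```ruby
# Hypothetical function under review: parse an "HH:MM" string into
# minutes since midnight (the function and its name are made up).
def minutes_since_midnight(str)
  match = /\A(\d{1,2}):(\d{2})\z/.match(str)
  raise ArgumentError, "malformed time: #{str.inspect}" unless match
  h, m = match[1].to_i, match[2].to_i
  raise ArgumentError, "out of range: #{str.inspect}" if h > 23 || m > 59
  h * 60 + m
end

# The kind of edge-case tests a model might suggest:
raise unless minutes_since_midnight("00:00") == 0
raise unless minutes_since_midnight("23:59") == 1439

# And a naive fuzz loop: random inputs must either parse cleanly or
# raise ArgumentError, never blow up with an unrelated exception.
chars = [*"0".."9", ":", " ", "a"]
1_000.times do
  s = Array.new(rand(0..6)) { chars.sample }.join
  begin
    minutes_since_midnight(s)
  rescue ArgumentError
    # expected for malformed input
  end
end
```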
Teams can leverage LLMs as a “reviewer zero” in their development process; before any team member is looped into a PR review, the model can run a first “health check” on the pull request. You could imagine a commit-hook that does something like “generate tests for all new functions and classes added in this commit”, which would then be included in the next CI run. Once the build is green, it creates a summary of the changes so that the reviewer can get up to speed quickly.
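A Ruby sketch of the first step such a hook could take: pulling the newly added method names out of the staged diff. The diff below is made up, and the actual “ask the model for tests” call is left out:

```ruby
# Extract names of methods added in a unified diff (lines starting
# with "+"). In a real commit hook the diff text would come from
# `git diff --cached`.
def added_methods(diff)
  diff.lines
      .select { |l| l.start_with?("+") && !l.start_with?("+++") }
      .filter_map { |l| l[/\+\s*def\s+(\w+[?!=]?)/, 1] }
end

# A hypothetical staged diff for illustration.
diff = <<~DIFF
  +++ b/lib/cart.rb
  +def add_item(item)
  +  @items << item
  +end
   def total
  +def empty?
  +  @items.empty?
  +end
DIFF

added_methods(diff)
# these names would then go into a prompt such as
# "generate unit tests for the following new methods"
```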
Docs generation for open source
Language models are usually good at summarizing unambiguous information provided to them. I maintain and contribute to a variety of open source projects, and I’ve seen how difficult it can be to keep documentation up to date (if there’s any at all!).
About 7 years ago I created a project called `nba_rb`, an open source wrapper for the NBA API. I took one of my test files (https://github.com/FanaHOVA/nba_rb/blob/master/spec/game_spec.rb) and asked the model to read it. I then asked it to summarize which classes are present, and which methods they expose:
Since this is an API wrapper, a lot of the context comes from the API responses, not from the code itself. Luckily, I used the open source project `vcr` to record those requests for testing. I can now feed the raw data into the model and ask the question again. Instead of just getting the names of the methods, we also get the data they return. There are still ways to improve the formatting, but in just five minutes we went from nothing to a decent explanation of how that class works. This could now be used as a README for the project, or taken one step further by having the model automatically add it as docstring comments in the code.
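As a rough sketch of that last step, here’s how a model’s summary could be rendered into a README section. The summary hash below is invented for illustration and is not the real `nba_rb` API; in practice it would come from the model:

```ruby
# Hypothetical summary a model might produce after reading the spec
# file and the vcr cassettes (classes, methods, and descriptions are
# illustrative only).
summary = {
  "Game" => {
    "boxscore"     => "returns the box score for a given game id",
    "play_by_play" => "returns every play of a game, with timestamps"
  }
}

# Render the summary as a markdown README section.
def to_readme(summary)
  summary.map do |klass, methods|
    section = ["## #{klass}"]
    methods.each { |name, desc| section << "- `##{name}`: #{desc}" }
    section.join("\n")
  end.join("\n\n")
end

puts to_readme(summary)
```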
The technology in this space is moving so quickly that by the time this post goes live, there might be even cooler stuff out there! At Decibel, we’re spending a lot of time staying up to date on the latest research and product developments, as well as implementing them in our internal software platform. If you’re building something in this space, we’re always happy to chat! My DMs are open on Twitter @fanahova or email me at alessio at decibel.vc!