Code is written by humans for both humans and machines. By learning from the human-oriented components of code, recent research has produced models that begin to “understand” some aspects of source code. This opens up the exciting possibility of using machine learning to assist developers in everyday tasks such as writing new code and finding bugs.
In this talk, I will give a brief tour of our lab’s recent explorations in this area. I will then focus on a specific kind of neural network, the graph neural network (GNN). GNNs let us learn from the rich semantic relationships within code; trained on a self-supervised task, they have allowed us to find bugs in open-source projects. I will conclude with a brief discussion of the practical challenges of applying machine learning to source code.
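
As a rough illustration of what learning from relationships within code can look like, the sketch below represents a two-statement snippet as a graph and runs one round of message passing over it. Everything here is an assumption made for illustration and is not taken from the talk: the edge names (`NextToken`, `ComputedFrom`, `LastWrite`), the one-hot initial states, and the `propagate` helper are hypothetical, and the unweighted averaging stands in for the learned, edge-type-specific updates a real GNN would use.

```python
# Hypothetical toy program graph for the snippet:  x = a + b ; return x
# Node order follows the token order; index 0 is the write of x, index 6 its later use.
nodes = ["x", "=", "a", "+", "b", "return", "x"]

# Illustrative edge types (assumed names): syntactic order plus simple data flow.
edges = {
    "NextToken": [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6)],
    "ComputedFrom": [(0, 2), (0, 4)],  # x is computed from a and b
    "LastWrite": [(6, 0)],             # the use of x was last written at node 0
}


def one_hot(index, size):
    """Return a one-hot vector of the given size."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def propagate(states, edges):
    """One round of message passing.

    Each node averages its own state with the states of its neighbours.
    Edges are treated as undirected and all edge types are weighted equally,
    which is a simplification: a trained GNN would learn these weights.
    """
    size = len(states[0])
    inbox = [[state] for state in states]  # every node keeps its own state
    for edge_list in edges.values():
        for src, dst in edge_list:
            inbox[dst].append(states[src])
            inbox[src].append(states[dst])
    return [
        [sum(message[k] for message in messages) / len(messages) for k in range(size)]
        for messages in inbox
    ]


# Initial node states: one-hot encoding of each token.
vocab = sorted(set(nodes))
states = [one_hot(vocab.index(token), len(vocab)) for token in nodes]

# After one round, each node's state mixes in information from its neighbours.
new_states = propagate(states, edges)
```

After this step, the state of the final `x` blends its own token with `return` (its syntactic neighbour) and with the earlier write of `x` (its data-flow neighbour). A self-supervised setup could, for instance, hide a token and ask the model to recover it from such neighbourhood information, though the specific task used in the talk is not stated here.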