Practical adoption of generative AI as part of your software development lifecycle

As the pace at which new technology ships increases, so do the chances that businesses will opt out of or disable new capabilities because they don't feel they have time to adequately assess the impact before people have access. The downside to this approach is that businesses with slow or no adoption of new technology will quickly be outpaced by competitors (emerging and existing) that are better able to assess and adopt.

Over the past decade I have experienced this first hand with cloud adoption, where many large organisations now restrict access to new services and features as standard operating procedure. More mature businesses put a structured assessment process in place, but it still feels like a barrier to innovation from potentially beneficial new technology.

More recently I am seeing the same behaviour with the introduction of generative AI, so in this post I am going to concentrate on adopting this emerging technology as part of the software delivery process. Examples of generative AI services you have access to today include GitHub Copilot, Amazon CodeWhisperer and Tabnine.

TL;DR - here's a basic scaffold GitHub Project to help with your adoption process.

The more things change, the more they stay the same

This famous quote ("Plus ça change, plus c'est la même chose") from the 19th-century French journalist Jean-Baptiste Alphonse Karr is quite apt for the situation in which we find ourselves.

Developers have been using assistive technology for many years to help reduce time to ship software or to remove the tedium of repetitive work. Code generation has been around since at least the late 1970s with the likes of MATLAB plotting the course we find ourselves on today.

Intelligent code completion has also played a big part in the modern developer experience, with most Integrated Development Environments (IDEs) supporting some form of code completion. Code completion is also not a new idea and traces its history all the way back to the 1950s.

Clearly the idea that every single line of code in any application codebase is 100% produced through direct human endeavour has been outdated for a while!

Protecting your IP from generative AI

Many businesses have invested substantial amounts in building software that solves their business problems or those of their customers. It's understandable that businesses want to protect their Intellectual Property (IP).

However, the reality is that the IP is not just the source code; it's in the business logic, the data, and the unique aspects of the user experience the software delivers. The code is just a means to an end. While it's true that some companies such as Microsoft (amongst others) have built their business on IP that lives in code, the majority of businesses are not in this position.

I know this is perhaps a controversial opinion (and likely your lawyers will disagree), but if your only reason for not adopting new technology is the perception you are protecting IP in your source code, you might be focusing on the wrong things. You might also want to stop reading here! πŸ™‚

Still with me? Good, let's look at how you can manage your adoption of generative AI services as part of your software delivery lifecycle.

You should sit down with the appropriate members of your legal department or an external advisor and formally review the terms of service and privacy policy of the generative AI service(s) you are considering. This way you have a clear understanding of what data is being collected and how it is used by the service provider.

If you don't have a legal capability then you need to either engage one, or make a best effort to decide on your interpretation of the terms of service and privacy policy. Make sure to note down the process used to decide, and the justifications, for future reference!

For example, GitHub Copilot for Business clearly states it doesn't retain any code snippets sent to the service as a prompt. If the services you are considering have no clear declaration then you should ask the service provider for one. Depending on their answer you might not want to adopt their service!

This is also not a one-time process. It should be regularly reviewed to ensure the service provider has not changed their policy in a way that is not acceptable to your business. Watch out for "we've changed our terms of service" or "we've updated our Privacy Policy" notifications!

Limit use of generative AI to specific codebases, tasks or teams

If you have a large codebase, you may want to limit the use of the generative AI service to a specific part of a codebase or a specific team. This way you can control the amount of data being sent to the generative AI service and also the amount of generated code being included in your solution. You can also have confidence that certain parts of your codebase, potentially those containing what you consider your actual IP, are not exposed to the generative AI service and will not contain generated code.

If you are unsure where in your codebase this thinking applies, I think we can agree that the Data Access Layer (DAL) of your solution is likely not to be where your IP lives and it's also not where you want your teams to be spending their time writing code. πŸ˜‰

There is a swirling debate around whether or not the outputs of generative AI models or services can be copyrighted, and if they can, who holds that copyright? If you want to ensure you hold copyright, limiting the places where this technology is used can be a good start.
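To make this concrete, here is a minimal sketch of how a pre-merge check could enforce such a boundary. The `ALLOWED_AI_PATHS` list and the file names are hypothetical examples; a real pipeline would read the changed files from something like `git diff --name-only`:

```python
# Hypothetical policy: directories where AI-assisted changes are permitted.
ALLOWED_AI_PATHS = ("src/dal/", "tests/")

def ai_change_allowed(changed_file: str) -> bool:
    """Return True if the changed file lives under an approved path."""
    return changed_file.startswith(ALLOWED_AI_PATHS)

# Example: a commit flagged as containing generated code touches two files.
changed = ["src/dal/customer_repository.py", "src/pricing/engine.py"]
blocked = [f for f in changed if not ai_change_allowed(f)]
print(blocked)  # ['src/pricing/engine.py'] - fails the gate, needs human attention
```

A simple prefix check like this is deliberately blunt; the point is that the policy lives in version control alongside the code it governs, so changing where generated code is allowed becomes a reviewable change in its own right.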

Require human explainability and review

This might be a tough ask depending on how much code generation you are prepared to allow, but ultimately your business will be responsible for any experiences resulting from this code. If your code generation service produces code nobody can explain then you bear the risk of something unexpected happening.

If you consider the previous point (limited use) then you can ring fence certain key parts of your business logic and require human explainability and review of generated code in these areas if that helps limit the scope.

It's important to be aware of what is happening in this space of human explainability from a regulatory standpoint too. The European Union's General Data Protection Regulation (GDPR) includes mentions of automated decision making and the need for human explainability, covered most clearly in Article 22 and Recital 71. It's safe to say other jurisdictions will likely follow suit in due course and you need to be ready!

Leaning on your DevOps practices

If you are still not using DevOps practices widely, then now is the time to start! If you have quality gates in place for human-created code and you run your machine-generated code through the same process, you can have a degree of confidence around the quality and functionality of the code. There are a few aspects to this suggestion:

  • Small commits containing only generated code: avoid mixing human and machine-generated code in commits. Ideally these commits are associated with a specific task or user story so you already have that context. Any Pull Request resulting from these commits must be human reviewed.

  • Only generate what you need: I can foresee a situation where a developer is using a code generation service to generate a complete API or library for, say, database access. They will generate all the normal Create, Read, Update and Delete (CRUD) methods and add the code to a commit. Perhaps they just needed a Read operation? That is the only code that should have been kept and committed to the repository. If we require humans to review code changes, then we want to limit the volume of code to review and reduce the chances that problems arise later when unreviewed code is used in the application.

  • Static code analysis or linting: just because code compiles doesn't mean it is correct! You should be running a static code analysis tool as part of your CI/CD pipeline. This will help identify any issues with the generated code. If you are using a generative AI service that is integrated with your IDE then you should be running the same analysis tool or linter locally. This will help you identify issues before you commit the code.

  • Clearly identify code as generated: today there is no automated way to provide a clear indication that code is machine-generated. Depending on how widely you decide to use code generation, you can add a developer standard that requires developers to clearly mark code as machine-generated. This could be as simple as a comment at the top of the file or around a section of a file. This will help you identify generated code and avoid the risk of human-written code being overwritten or vice versa. If the code is manually modified, that should be noted in the comment. I asked the team at GitHub if Copilot could add comments around generated code like the sample below (today it's not possible was the answer).

//GHCP: Start
//Model release:
//Model date: 2023-03-28
//Prompt: a function to calculate the sum of two numbers
public int Sum(int a, int b)
{
    return a + b;
}
//GHCP: End

  • Test coverage: ensuring all key parts of your codebase have test coverage will provide confidence that machine-generated changes are not introducing any regressions. This is a good practice to have in place for any code change regardless! I know you are probably thinking about machine-generated unit tests at this point... I'll refer you to the human explainability and review point above. If you are using code generation to create unit tests simply to hit code coverage targets you are looking at this the wrong way!

  • Revisit generated code as part of BAU or updates: don't assume that today's best suggestion will be tomorrow's - just because it's generated doesn't mean it should be frozen in time. It is worth revisiting generated code over time to see if there are better ways to implement the logic. This is no different to adopting new language features or libraries as they become available and can replace bespoke code you have developed.
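If you do adopt a marker convention like the `//GHCP: Start` / `//GHCP: End` comments shown earlier, a small script in your pipeline can locate marked regions and fail the build when markers are unbalanced, which is exactly the kind of quality gate described above. This is a sketch only; the marker format and function name are my own assumptions, not an official Copilot or GitHub feature:

```python
import re

START = re.compile(r"//\s*GHCP:\s*Start")
END = re.compile(r"//\s*GHCP:\s*End")

def generated_regions(source: str):
    """Return (start_line, end_line) pairs for marked generated code.

    Raises ValueError on unbalanced markers so a CI step fails loudly.
    """
    open_line = None
    regions = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if START.search(line):
            if open_line is not None:
                raise ValueError(f"nested GHCP Start at line {lineno}")
            open_line = lineno
        elif END.search(line):
            if open_line is None:
                raise ValueError(f"GHCP End without Start at line {lineno}")
            regions.append((open_line, lineno))
            open_line = None
    if open_line is not None:
        raise ValueError(f"GHCP Start at line {open_line} never closed")
    return regions

sample = """//GHCP: Start
public int Sum(int a, int b)
{
    return a + b;
}
//GHCP: End"""
print(generated_regions(sample))  # [(1, 6)]
```

Once you can enumerate the marked regions, other checks fall out naturally: you could require that pull requests touching those regions carry a specific review label, or cross-reference the regions against your coverage report.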

The bottom line is, if you are already allowing changes into production without the necessary quality gates, then adding generative AI to the mix will only increase the volume of unvetted change flowing into your CI/CD pipeline or requiring human review!

Generative AI doesn't mean fewer people

Let's not overlook the obvious concern many people have around the introduction of generative AI technology into the workplace - job security. I think it's important to understand that generative AI is not a replacement for humans. It's a tool that can help developers be more productive and deliver more value to their business while at the same time helping them to learn and grow their skills.

If you are a manager or business owner and you are thinking you can replace your developers with generative AI services like Copilot, you are going to suffer in the long term. You will notice that a lot of what I've written about so far requires human oversight or intervention. Smart people want interesting and challenging work, and if you can use generative AI to remove the busy work from their lives they will be more productive and happy. This also provides them with space to help grow new team members who you should absolutely be hiring!

It's worth the time and the (perceived) risk

I know there is a lot of hype (and a fair amount of concern) floating about on the impacts of AI at the moment, but I honestly believe the benefits of generative AI will far outweigh the risks.

Like any new technology it's worth taking a considered approach to adoption. To this end I created a basic scaffold GitHub Project which you can copy to your organisation and use to get started with adoption of generative AI in your environment. Make sure you select "Draft issues will be copied if selected" so you get the baseline tasks I created for you to use. You can read about copying projects in GitHub's documentation.

I recently recorded a session with Damian Brady where we discussed risks associated with pushing Copilot code to production in more detail. You can watch the discussion below.

I will leave you with this thought...

Let's be honest, the only time you review generated SQL from an Object Relational Mapper (ORM) is when there are performance issues, right? And that code is typically generated by a machine at runtime! 😜


P.S. - off the back of this post I was invited by a few different groups to do a talk on this subject. I've created the "Can you become a 10x developer with generative AI?" talk which I've done a few times.

You can find one of these sessions, recorded at the Coding Nights NZ meetup in August 2023, below.