---
title: "Automated Changelogs on GitLab"
date: 2023-05-15T22:38:55-03:00
draft: false
summary: "Changelog automation on GitLab CI"
---
Changelogs are useful, especially when you need to keep track of what changed in a release. But they can be a pain to write, particularly if you have a lot of commits, several people working on the same project, lots of tasks, and so on. That makes them a good spot for some **automation**.

There are a couple of ways to build an automated changelog system. We will focus on one that uses GitLab CI and the project's commit messages. We will also assume that *releases are made through git tags*.

For this, we will start with a few requirements:

* Agree on a commit message pattern, for example: "[TASK-200] Fixing something for a task on Jira";
* Generate the release notes/changelog at a specific point of the pipeline (for example, the production release);
* Generate the release notes when a tag is created.

We will take advantage of these two commands:

1. `git log --pretty=format:"%s" --no-merges <tag>..HEAD` - This gives us the commit messages from the given tag up to HEAD;
2. `git describe --abbrev=0 --tags` - This gives us the most recent tag reachable from the current commit.
## Creating a basic pipeline
Let's start by creating a basic pipeline that will run on the production release.
```yaml
run:
  script:
    - echo "Running the pipeline"

.generateChangelog:
  image: python:latest
  stage: test
  script:
    - echo "Generating changelog..."
    # Generate changelog here
  artifacts:
    name: changelog.txt
    paths:
      - changelog.txt
    when: always
    expire_in: 1 week

deploy:
  stage: deploy
  extends:
    - .generateChangelog
  rules:
    - if: $CI_COMMIT_TAG
      when: manual
  environment: production
```
We will output the changelog into a file named `changelog.txt` and then we will use the `artifacts` keyword to save it.
## Generating the changelog
Note that we set the image to `python:latest` on the `.generateChangelog` job. This is because we will use a Python script to generate the changelog. In the script we will define two functions: one that returns the latest tag, and another that gets the commits between the latest tag and HEAD.

To run commands on the OS we will use the `subprocess` module, and to read a command's output we will use the `communicate()` method of `Popen`. Error handling can be added on top of this (more on that later).
```python
import subprocess as sp


def get_last_tag():
    pipe = sp.Popen('git describe --abbrev=0 --tags', shell=True, stdout=sp.PIPE, stderr=sp.PIPE)
    prev_tag, err = pipe.communicate()
    # A return code of 0 means the command was successful
    if pipe.returncode == 0:
        return prev_tag.strip()
    # The failure case is not handled yet, so None is returned implicitly


def get_commits():
    prev_tag = get_last_tag().decode('utf-8')
    print('Previous tag: ' + prev_tag)
    # rev-list prints a "commit <hash>" line before each subject;
    # those lines are discarded later by the pattern filter
    pipe = sp.Popen('git rev-list ' + prev_tag + '..HEAD --format=%s', shell=True, stdout=sp.PIPE, stderr=sp.PIPE)
    commits, err = pipe.communicate()
    # Only dealing with 0 for now; anything else yields an empty list
    if pipe.returncode == 0:
        return commits.strip().decode('utf-8').split('\n')
    return []
```
Calling `get_commits()` now gives us a list of strings with the commit subjects since the last tag (plus the `commit <hash>` lines that `git rev-list` prints). Some of those entries don't belong in the changelog, for example `Merge branch 'master' into 'develop'`. **This is where having a pattern will help.**
```python
def get_formatted_commits():
    commits = get_commits()
    formatted_commits = []
    for commit in commits:
        if commit.startswith('[TASK-') or commit.startswith('[BUG-'):
            formatted_commits.append(commit)
    return formatted_commits
```
This will give us only the commit messages that follow the pattern we want. We can further improve this by using a regex, turning `formatted_commits` into a `set` of task numbers, doing some parsing or API calls, whatever we want. For now, we will keep it simple and stick to the basics.
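
As a rough illustration of the regex idea, the sketch below collects the unique task identifiers from a list of commit subjects. The `TASK` and `BUG` prefixes come from the filter above; the helper name `extract_task_ids` and the exact pattern are assumptions, not part of the original script.

```python
import re

# Assumed pattern: an opening bracket, a TASK or BUG prefix, a dash, digits, a closing bracket
TASK_ID_PATTERN = re.compile(r'^\[((?:TASK|BUG)-\d+)\]')


def extract_task_ids(commits):
    """Return the set of task identifiers found at the start of each commit subject."""
    task_ids = set()
    for commit in commits:
        match = TASK_ID_PATTERN.match(commit)
        if match:
            task_ids.add(match.group(1))
    return task_ids


if __name__ == '__main__':
    sample = [
        '[TASK-200] Fixing something for a task on Jira',
        '[TASK-200] Follow-up fix',
        '[BUG-17] Handle empty input',
        "Merge branch 'master' into 'develop'",
    ]
    print(sorted(extract_task_ids(sample)))
```

Because the result is a `set`, several commits for the same task collapse into a single entry, which is handy if the identifiers are later used for lookups in an issue tracker.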
## Writing the changelog
Now that we have the commits that we want, we can write them to a file. We will use the `open` function to open the file and write the commits to it.
```python
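def write_changelog():
    commits = get_formatted_commits()
    with open('changelog.txt', 'w') as f:
        for commit in commits:
            f.write(commit + '\n')


# Entry point, so that `python changelog.py` actually writes the file
if __name__ == '__main__':
    write_changelog()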
```
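If the file should also say which release it belongs to, one option is to build the text first and write it in one go. The sketch below assumes the tag name is read from GitLab's predefined `CI_COMMIT_TAG` variable (the same one used in the pipeline rules); the `render_changelog` helper, the `unreleased` fallback and the output layout are hypothetical.

```python
import os


def render_changelog(tag, commits):
    """Build the changelog text: a header line followed by one bullet per commit."""
    lines = ['Changelog for ' + tag, '']
    for commit in commits:
        lines.append('* ' + commit)
    return '\n'.join(lines) + '\n'


if __name__ == '__main__':
    # CI_COMMIT_TAG is only set in tag pipelines; 'unreleased' is an assumed fallback
    tag = os.environ.get('CI_COMMIT_TAG', 'unreleased')
    # Stand-in for the list returned by get_formatted_commits()
    commits = ['[TASK-200] Fixing something for a task on Jira']
    print(render_changelog(tag, commits), end='')
```

Keeping the rendering separate from the file write also makes the format easy to check without touching the file system.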
## Putting it all together on the pipeline yaml file
Now that we have everything we need, we can put it all together in the pipeline YAML file. The pipeline expects the Python code above to be saved as `changelog.py` at the root of the repository.
```yaml
run:
  script:
    - echo "Running the pipeline"

.generateChangelog:
  image: python:latest
  stage: test
  script:
    - echo "Generating changelog..."
    - git tag -d $(git describe --abbrev=0 --tags) || true
    - python changelog.py
  artifacts:
    name: changelog.txt
    paths:
      - changelog.txt
    when: always
    expire_in: 1 week

deploy:
  stage: deploy
  extends:
    - .generateChangelog
  rules:
    - if: $CI_COMMIT_TAG
      when: manual
  environment: production
```
Note that we had to add the `git tag -d $(git describe --abbrev=0 --tags)` command to delete the latest tag from the job's local clone. The job runs on the commit that was just tagged, so `git describe` would return that new tag and the range `<tag>..HEAD` would contain no commits, leaving the changelog empty. With the new tag deleted locally, `git describe` returns the previous tag instead. The `|| true` makes sure the pipeline doesn't fail if no tag exists.
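
An alternative that avoids deleting the tag is to ask `git describe` to start from the parent of the tagged commit, so a tag on `HEAD` itself is never considered. This is only a sketch, assuming the job runs on the tagged commit and the clone contains the earlier tags; `previous_tag_command` and `get_previous_tag` are hypothetical names.

```python
import subprocess as sp


def previous_tag_command(ref='HEAD'):
    """Build the git command that finds the closest tag reachable from the parent of ref."""
    # 'HEAD^' is the first parent of HEAD, so a tag pointing at HEAD itself is skipped
    return ['git', 'describe', '--abbrev=0', '--tags', ref + '^']


def get_previous_tag(ref='HEAD'):
    """Return the previous tag name, or None if git reports an error (for example, no earlier tag)."""
    result = sp.run(previous_tag_command(ref), stdout=sp.PIPE, stderr=sp.PIPE)
    if result.returncode == 0:
        return result.stdout.strip().decode('utf-8')
    return None
```

With something like this in place of `get_last_tag()`, the `git tag -d` line in the pipeline would presumably no longer be needed.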
## Error handling
We can further improve this by adding some error handling. For example, if the repository has no tags yet, we can fall back to the hash of the first commit (the start of the git history).
```python
import subprocess as sp
import sys


def get_last_tag():
    pipe = sp.Popen('git describe --abbrev=0 --tags', shell=True, stdout=sp.PIPE, stderr=sp.PIPE)
    prev_tag, err = pipe.communicate()
    # If it's successful, we return the tag name
    if pipe.returncode == 0:
        return prev_tag.strip()

    # Otherwise (for example, when there is no tag yet) we fall back to the first commit hash
    pipe = sp.Popen('git rev-list --max-parents=0 HEAD', shell=True, stdout=sp.PIPE, stderr=sp.PIPE)
    first_commit, err = pipe.communicate()
    if pipe.returncode == 0:
        return first_commit.strip()

    # If that fails too, something else is wrong: print the error and exit
    print('Error: Could not get the first commit hash')
    print(err.strip().decode('utf-8'))
    sys.exit(1)
```
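Since the same `Popen`/`returncode` check is repeated for every command, it could be pulled into one helper. The sketch below is one possible shape, not the original script: `run_command` and `CommandError` are hypothetical names, and the helper takes an argument list instead of a shell string.

```python
import subprocess as sp


class CommandError(Exception):
    """Raised when a command exits with a non-zero return code."""


def run_command(args):
    """Run a command given as a list of arguments and return its stripped stdout as text."""
    result = sp.run(args, stdout=sp.PIPE, stderr=sp.PIPE)
    if result.returncode != 0:
        # Include stderr in the exception so the CI log shows why the command failed
        message = result.stderr.decode('utf-8', errors='replace').strip()
        raise CommandError('Command failed: ' + ' '.join(args) + '\n' + message)
    return result.stdout.decode('utf-8', errors='replace').strip()


# Example usage (inside a git repository):
#     last_tag = run_command(['git', 'describe', '--abbrev=0', '--tags'])
```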
Further error handling and improvements are possible; this is just a proof of concept. Note that the code hasn't been tested *as is*, so there might be some errors.