OpenGrok is a code search engine. I have been using it for a few years, and my dependence on it has only increased over time.
Setting up OpenGrok has been a pain point for me in the past. Even recently, when I upgraded from version 1.4 to 1.7, a number of environment variables and configuration options had changed. I had to go through a lot of documentation and issues on the OpenGrok repository to figure out workarounds. In this post, I will share some simple setup details that will hopefully make it easier for you to set up OpenGrok.
Docker compose file
version: "3"
# More info at https://github.com/oracle/opengrok/docker/
services:
  opengrok-1-7-11:
    container_name: opengrok-1.7.11
    image: opengrok/docker:1.7.11
    restart: on-failure
    volumes:
      - 'opengrok_data1_7_11:/opengrok/data'
      - './src/:/opengrok/src/' # source code
      - './etc/:/opengrok/etc/' # folder contains configuration.xml
      - '/etc/localtime:/etc/localtime:ro'
    ports:
      - "9090:8080/tcp"
      - "5001:5000/tcp"
    environment:
      SYNC_PERIOD_MINUTES: '30'
      INDEXER_OPT: '-H -P -G -R /opengrok/etc/read_only.xml'
# Volumes store your data between container upgrades
volumes:
  opengrok_data1_7_11:
networks:
  opengrok-1-7-11:
Let’s go over the above docker-compose file
Volumes section
- restart: on-failure – From the Docker documentation: "on-failure – Restart the container if it exits due to an error, which manifests as a non-zero exit code."
- opengrok_data1_7_11:/opengrok/data – I used a named volume since I want the indexed data to be saved between container restarts.
- ./src/:/opengrok/src/ – src contains the source code that I want to be indexed.
- ./etc/:/opengrok/etc/ – etc contains configuration-related files, e.g. read_only.xml, configuration.xml, mirror.yml.
- /etc/localtime:/etc/localtime:ro – This makes the container's local time match the host's.
- Additional optional volume configurations – ./ssh/:/root/.ssh/ and .gitconfig:/root/.gitconfig. Use these options carefully: make sure you understand the consequences of exposing SSH keys to the container, such as which permissions and access those keys grant.
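As a rough sketch, the optional SSH and Git mounts could be added to the service's volumes list as shown below. The host paths come from the bullet above. The `:ro` (read-only) suffix and writing the host file as `./.gitconfig` are my own assumptions, not something taken from the OpenGrok documentation.

```yaml
services:
  opengrok-1-7-11:
    volumes:
      # Assumption: mounted read-only so the container cannot modify the keys.
      - './ssh/:/root/.ssh/:ro'
      # Assumption: host path written as ./.gitconfig so it is clearly a bind mount.
      - './.gitconfig:/root/.gitconfig:ro'
```

With a read-only mount, ssh inside the container cannot append to known_hosts, so the entries for your Git hosts would need to be in ./ssh/known_hosts already.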
Environment section
- SYNC_PERIOD_MINUTES – How often, in minutes, the sync (mirroring) and reindexing cycle is triggered.
- INDEXER_OPT – Passes configuration flags to the indexer. The flags and their descriptions are listed in OpenGrok's documentation.
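For reference, here is the environment block from the compose file again, annotated with what each setting does. The flag descriptions are my paraphrase of the indexer's help text as I understand it for the 1.7 line, so treat them as a guide and check the help output for the version you run.

```yaml
environment:
  # Run mirroring + reindexing every 30 minutes.
  SYNC_PERIOD_MINUTES: '30'
  # -H  enable history
  # -P  generate a project for each top-level directory under the source root
  # -G  assign commit tags to history entries
  # -R  read additional configuration from the given file
  INDEXER_OPT: '-H -P -G -R /opengrok/etc/read_only.xml'
```

Because ./etc/ on the host is mounted at /opengrok/etc/, the read_only.xml passed to -R lives at ./etc/read_only.xml next to the compose file.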
Challenges faced
Problem
If the sync step failed for a project, that project would no longer get indexed.
Solution
Add a mirror.yml file with the ignore_errors option set to true in the /opengrok/etc directory (this is the ./etc/ folder on the host; see the volumes entry under the services section of the compose file).
# Config file for opengrok-mirror
ignore_errors: true
With this configuration present, repo sync errors are ignored and the source code is still indexed.
With these steps, I was able to set up my OpenGrok instance.
I can be reached at @simranzchawla
