OpenGrok is a code search engine. I have been using this tool for a few years, and my dependency on it has only increased over time.
Setting up OpenGrok has been a pain point for me in the past. Even recently, when I tried upgrading from version 1.4 to 1.7, a number of environment variables and configuration options had changed. I had to go through a lot of documentation and issues on the OpenGrok repository to figure out workarounds. In this post, I will share some simple setup details which will hopefully make it easier for you to set up OpenGrok.
Docker compose file
version: "3"
# More info at https://github.com/oracle/opengrok/docker/
services:
  opengrok-1-7-11:
    container_name: opengrok-1.7.11
    image: opengrok/docker:1.7.11
    restart: on-failure
    volumes:
      - 'opengrok_data1_7_11:/opengrok/data'
      - './src/:/opengrok/src/'  # source code
      - './etc/:/opengrok/etc/'  # folder contains configuration.xml
      - '/etc/localtime:/etc/localtime:ro'
    ports:
      - "9090:8080/tcp"
      - "5001:5000/tcp"
    environment:
      SYNC_PERIOD_MINUTES: '30'
      INDEXER_OPT: '-H -P -G -R /opengrok/etc/read_only.xml'
# Volumes store your data between container upgrades
volumes:
  opengrok_data1_7_11:
networks:
  opengrok-1-7-11:
Let’s go over the above docker-compose file.
Volumes section
- restart: on-failure – From the Docker documentation: "on-failure – Restart the container if it exits due to an error, which manifests as a non-zero exit code."
- opengrok_data1_7_11:/opengrok/data – I used a named volume since I want the indexed data to be saved between container restarts.
- ./src/:/opengrok/src/ – src contains the source code that I want to be indexed.
- ./etc/:/opengrok/etc/ – etc contains configuration-related files, e.g. read_only.xml, configuration.xml, mirror.yml.
- /etc/localtime:/etc/localtime:ro – This matches the container's local time with the host's.
- Additional optional volume configurations – ./ssh/:/root/.ssh/ and .gitconfig:/root/.gitconfig. These options need to be used carefully. You must understand the consequences of mounting SSH keys, such as which permissions and access those keys allow.
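As a rough sketch, the optional mounts could be added to the service's volumes list as shown below. This is only a fragment of the service definition, not a complete compose file. The `./.gitconfig` path and the `:ro` (read-only) flag on it are my assumptions and not part of the original setup; adjust them to wherever your files actually live.

```yaml
# Fragment of the service definition, shown for illustration only.
volumes:
  - 'opengrok_data1_7_11:/opengrok/data'
  - './src/:/opengrok/src/'
  - './etc/:/opengrok/etc/'
  - '/etc/localtime:/etc/localtime:ro'
  # Optional: credentials used when syncing private repositories.
  # Anything in this directory is readable by processes in the container.
  - './ssh/:/root/.ssh/'
  # Assumption: .gitconfig sits next to the compose file; ':ro' is an
  # added precaution so the container cannot modify it.
  - './.gitconfig:/root/.gitconfig:ro'
```

One way to limit exposure is to mount a dedicated key that only has read access to the repositories being indexed, rather than your personal key.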
Environment section
- SYNC_PERIOD_MINUTES – This option dictates how often sync + reindexing is triggered.
- INDEXER_OPT – This can be used to pass configuration flags to the indexer. The flags and their descriptions are listed in OpenGrok’s documentation.
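For reference, here is the same environment block annotated with what each indexer flag does. The descriptions are my reading of the indexer's usage text for the 1.7 series, so treat them as a guide and check the help output of the version you run.

```yaml
# Fragment of the service definition, shown for illustration only.
environment:
  # Mirror and reindex every 30 minutes.
  SYNC_PERIOD_MINUTES: '30'
  # Flags passed through to the indexer:
  #   -H  enable history
  #   -P  generate a project for each top-level directory under the source root
  #   -G  assign commit tags to history entries
  #   -R  read additional configuration from the given file
  INDEXER_OPT: '-H -P -G -R /opengrok/etc/read_only.xml'
```

Because /opengrok/etc is bind-mounted from ./etc/ on the host, the read_only.xml file referenced by -R has to exist in that host directory.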
Challenges faced
Problem
If sync broke for any project, that project would no longer be indexed.
Solution
Add mirror.yml with the ignore_errors option set to true in the /opengrok/etc directory (refer to volumes under the services section of the compose file).
# Config file for opengrok-mirror
ignore_errors: true
With this configuration present, any repo sync errors are ignored and the source code is still indexed.
With these steps, I was able to set up my OpenGrok instance.
I can be reached at @simranzchawla