Getting Started
Testing and development have so far only been done on native Linux. macOS is supported by InvenioRDM but has not been tested with this project. Development in WSL may be possible but working natively in Windows is not supported.
Requirements
The requirements for working with InvenioRDM are laid out in detail in the
InvenioRDM System Requirements Docs. You're ready to go once you have cloned the code
repository and can run `invenio-cli check-requirements --development` in the project
directory and all requirements are met. Below are some tips and specifics for this
project:
- Start by installing `invenio-cli` and run the requirements check above to see what's missing.
- Both `pipenv` and `invenio-cli` are best installed with pipx. These need to be discoverable on your path.
- We are currently pinning to Python 3.9 for compatibility with the deployment base image, so you'll need this available. `invenio-cli` will be satisfied with anything 3.9 or newer but you need 3.9.
- Cairo and DejaVu are listed in the InvenioRDM Docs but are not checked for by `invenio-cli`. The direct impact of not having these is unclear but you'd probably get by.
- ImageMagick is checked for by `invenio-cli` but similarly you'd probably get by without it.
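Before running the full requirements check, a quick look at what is discoverable on your path can save a round trip. The following is a minimal sketch rather than part of the project: `missing_tools` is a hypothetical helper, and the executable name `python3.9` is an assumption that depends on how Python was installed.

```shell
#!/bin/sh
# Print the name of every given command that is not discoverable on PATH.
# missing_tools is a hypothetical helper, not something shipped with the project.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name.
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

Calling `missing_tools invenio-cli pipenv python3.9 docker node npm` prints one name per missing tool and nothing when all of them are found. It only checks discoverability, so `invenio-cli check-requirements --development` remains the authoritative check.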
Tooling Overview
A combination of tools is used to manage the project. Their different roles are
summarised below but most operations use `invenio-cli`, which wraps the other tools as
required and is covered in more detail below.
- `pipenv` is used to manage Python dependencies and the virtual environment used for development.
- `node` and `npm` are used to manage JavaScript dependencies and the build process for the frontend.
- Docker and Docker Compose are used to manage the services required to run the application, namely the database, OpenSearch, Redis and RabbitMQ.
- `invenio` is a command line tool that can be used to interact with some Invenio components. It is installed within the virtual environment managed by `pipenv` so must be invoked via `pipenv run invenio`.
invenio-cli
As mentioned above, `invenio-cli` is the primary tool for managing the project and most
operations are performed by invoking it. Its main subcommands are summarised below:
- `invenio-cli install` - Installs the project and its dependencies. Creates the virtual environment if necessary, syncs the dependencies with Pipfile.lock, builds the frontend and copies/symlinks the assets to the correct location in the virtual environment.
- `invenio-cli services` - Manages the Docker services required to run the application. Can be used to setup, start, stop and teardown the services.
- `invenio-cli run` - Starts the Flask development server and a set of Celery workers.
- `invenio-cli packages` - Wraps `pipenv` to manage Python dependencies. Can be used to install, uninstall and update packages.
- `invenio-cli pyshell` - Starts a shell in the virtual environment with an initialised Flask app.
- `invenio-cli assets` - Manages static files and frontend assets. Can be used to build the frontend, watch for changes and clean up.
Local Installation
Initial setup of the project can be done with the following commands:
This will:
- Create a virtual environment and install the Python dependencies. `site/ic_data_repo` is installed in editable mode so changes to the source code are immediately available.
- Install the JavaScript dependencies and build the frontend assets.
- Copy/symlink the static files and JavaScript assets to the correct location in the virtual environment.
- Start the Docker services required to run the application and ensure they are healthy. This includes the database, OpenSearch, Redis and RabbitMQ.
- Create the database schema, initialise the OpenSearch indices and perform various other one-off setup tasks.
- Populate the database with some default data e.g. default user roles and permissions. The `--no-demo-data` flag is used to prevent the creation of demo data records. Remove it if you want the instance to be populated with example deposit data.
- Create a number of Celery tasks to populate the database with controlled vocabulary data. Note that there are no Celery workers running yet to process these tasks so they are just waiting in a queue.
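The steps listed above match what `invenio-cli install` followed by `invenio-cli services setup` would do, so the setup sequence can be sketched roughly as below. This is an assumption rather than the project's confirmed command listing: in particular, attaching `--no-demo-data` to `services setup` is inferred, and the `run` and `setup_project` helpers are hypothetical. Check `invenio-cli services setup --help` before relying on it.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1 (handy for previewing).
# Both helpers here are hypothetical and not part of the project.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Assumes invenio-cli is on PATH and this is called from the project directory.
setup_project() {
    # Virtual environment, Python and JavaScript dependencies, frontend assets.
    run invenio-cli install || return 1
    # Docker services, database schema, indices and default data.
    # Assumption: --no-demo-data is an option of `services setup`.
    run invenio-cli services setup --no-demo-data
}
```

After sourcing this, `setup_project` runs both commands in order and stops if the install step fails. Setting `DRY_RUN=1` first prints the commands without executing them.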
Note that the above leaves the services running. You can stop them with `invenio-cli services stop`. Either way you can then start the Flask server with `invenio-cli run`.
This runs the Flask development server and creates a number of Celery workers in the background. If the services are not already running then they will be started. The first time this is run after setup there will be a backlog of Celery tasks that starts executing. This can be a bit resource intensive and make things a bit sluggish.
Once the Flask server has started, visit https://127.0.0.1:5000 in your browser. The
development setup uses a self-signed TLS certificate so you may need to bypass a security
warning. Once finished, stop the running Flask server and use
`invenio-cli services stop` to bring down the running services.
If you want to restart the setup process from scratch you can use
`invenio-cli services destroy` to remove all the services and data.
Logging In
In order to log in to the application you will need to create a user account:
You can also optionally make this user an admin with:
Development
QA
It is strongly recommended to use pre-commit to check that your individual commits meet the
QA standards of the project. These are enforced via GitHub Actions and it's easiest to
make sure you're compliant as you go along. Details of the QA tools can be found in
`.pre-commit-config.yaml`.
Continuous Integration
A simple Continuous Integration setup is provided via GitHub Actions. This checks the target commit against the project QA tooling and, for commits to the main branch, builds and pushes Docker images for the web application and frontend.
Tests
A test suite is provided in the `tests` directory. Assuming services have already been
set up, tests can be run with:
All development work should be supported by an appropriate set of tests. Best practices around testing are expected to evolve as the project develops.
The pytest-invenio plugin is provided to support test development. This extends pytest-flask to provide fixtures and support for testing Invenio.
Backend Development
Using `invenio-cli run` will start the Flask development server and a set of Celery
workers. Debugging is enabled and the server will automatically reload when changes
are made to the source.
Frontend Development
The frontend is built with Webpack and the assets are managed by `invenio-cli`. Any
changes made to the CSS or JavaScript assets will require a rebuild of the assets. As a
one-off operation this can be done with `invenio-cli assets build`. To watch for changes
and rebuild automatically use `invenio-cli assets watch`.
Note that the above is not required for changes to the HTML templates, which are processed by the backend.
Troubleshooting
InvenioRDM is a sophisticated application with many moving parts. If you encounter issues the below information may help with troubleshooting:
- `invenio-cli` stores some state about the project (e.g. whether setup has been performed for the services) in the file `.invenio.private`. The file is gitignored but avoid deleting it. If you're worried it has gotten out of sync then run `invenio-cli destroy` to completely remove all services, data and resources.
- If you encounter errors about missing indexes (for OpenSearch) or database tables (for Postgres) then setup may not have completed successfully. You can try `invenio-cli services destroy` to do a complete teardown then set up the services again.
- You can check the status of the services with `invenio-cli services status`. This will show which services are running and whether they are healthy. If a service is having issues you can use Docker Compose to check the logs e.g. `docker compose logs opensearch`.
- The Celery workers started by `invenio-cli run` can be a bit verbose and pollute the logs in the console. You can redirect the Celery logs to a file with `invenio-cli run --celery-log-file /path/to/logfile`.
- `invenio-cli pyshell` can be used to start a shell in the virtual environment with an initialised Flask app. This can be useful for debugging issues with the application code or inspecting config.
Configuration
This project extends the configuration approach used by InvenioRDM.
Inspired by Django, the following changes have been made:
- Configuration is stored in the module `ic_data_repo.config`.
- The module to use as settings can be specified at runtime via the environment variable `INVENIO_SETTINGS_MODULE`. This defaults to `ic_data_repo.config`.
- The standard InvenioRDM config file (`invenio.cfg`) now contains only the necessary import machinery to facilitate the above.
Note that overriding settings by environment variable still works.
The default configuration is suitable for development. A production-oriented settings
file is also provided in `ic_data_repo.config.production`.
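As an illustration of selecting a settings module at runtime, the sketch below exports `INVENIO_SETTINGS_MODULE` for the current shell and shows the documented fallback to `ic_data_repo.config`. The `settings_module` helper is hypothetical and only mirrors the default described above; how the project itself treats an empty value is not covered here.

```shell
#!/bin/sh
# Print the settings module that would be selected: the value of
# INVENIO_SETTINGS_MODULE if set, otherwise the documented default.
# settings_module is a hypothetical helper, not part of the project.
settings_module() {
    echo "${INVENIO_SETTINGS_MODULE:-ic_data_repo.config}"
}

# Select the production settings for commands started from this shell.
export INVENIO_SETTINGS_MODULE="ic_data_repo.config.production"
settings_module
```

Commands started from the same shell afterwards, such as `invenio-cli run`, inherit the exported variable. Running `unset INVENIO_SETTINGS_MODULE` returns to the development default.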
Test Data
Note
This functionality is not currently working.
Instructions for accessing and working with realistic test data records are provided in the test_data directory.