repo_man - Repo Tools Framework

repo_man (or “Repo Man”) is a framework for building and running command-line tools for code repositories (e.g. build, format, package).

Documentation

Getting Started: Using repo tools

The main goal is to provide a single entry point for any repository: repo.

A user clones any repository and runs a single command: repo.bat (Windows) or ./repo.sh (Linux). repo_man bootstraps itself (users don’t need to install anything) and lists all available tools (subcommands), each with a description.

Example of repo output:

λ repo
usage: repo [-h] [-v] [-p] [-pr] [-tb] [TOOL] ...

Repo Tool (repoman):
    One entry point for all repo tools. Pass one of the tools and -h to get help.

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         Increase verbosity of logging. Pass -v for
                        info, -vv for verbose.
  -p, --print-config-file
                        Output tool default config file and exit.
  -pr, --print-resolved-config
                        Output tool resolved config and exit.
  -tb, --tracebacks     Enable Python traceback logging to console
                        + TeamCity buildProblem reporting.

Found tools:
    update           Tool to update packman project files with newer versions of packages. By default major part of version is kept the same when looking for a newer version.
    publish          Publish archives (packages) and labels to packman remote.
    packman          Shortcut to packman.
    build            Build system main command.
    format           Format all C++ code (with clang-format) and all python code (with black).
    lint             Run python linters.
    ...

Run as 'repo [TOOL] -h' for more information on a specific tool.

A user can then run repo [tool_name] to run a particular tool, e.g. repo build.

Users can learn more about any tool by running repo [tool_name] -h.
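The single-entry-point pattern described above can be sketched with Python's standard argparse module. This is only an illustration of the idea, not how repo_man is implemented: the build_parser function and the hard-coded tool dictionary are hypothetical, and only the -v flag from the help output is reproduced.

```python
import argparse


def build_parser(tools: dict[str, str]) -> argparse.ArgumentParser:
    """Create one top-level parser with a subcommand per tool (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="repo",
        description="One entry point for all repo tools.",
    )
    # -v can be repeated: -v for info, -vv for verbose.
    parser.add_argument("-v", "--verbose", action="count", default=0)
    subparsers = parser.add_subparsers(dest="tool", metavar="TOOL")
    for name, description in tools.items():
        # Each tool gets its own sub-parser, so 'repo TOOL -h' prints tool-specific help.
        subparsers.add_parser(name, help=description, description=description)
    return parser


# Hypothetical tool registry; a real setup would discover tools rather than hard-code them.
parser = build_parser({
    "build": "Build system main command.",
    "lint": "Run python linters.",
})
args = parser.parse_args(["-vv", "build"])
print(args.tool, args.verbose)
```

Registering every tool as a subcommand of one parser is what makes the tools discoverable: the top-level help lists them all, and each tool still owns its own help text.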

Getting Started: Configuration

Another important goal of repo_man is to provide a common configuration system for all repo tools.

Each repo has a repo.toml file in its root, where any setting for any repo tool can be overridden.

Each tool (or collection of tools) has a repo_tools.toml file with its default settings.

Each tool has a namespace in the config. For instance, the settings for repo build live in the [repo_build] section. Usually, all tools in use are linked into the _repo/deps/[tool_name] folder, so the configuration file for each tool can be found at _repo/deps/[tool_name]/repo_tools.toml. Any setting can then be copied into repo.toml and changed.

Design Goals: Code Sharing (A Story)

Providing a single entry point for all repos greatly improves discoverability and the new-user experience. However, that was not the only motivation for repo_man. Another important goal of all repo tools is to encourage code reuse across many repositories, their tooling, and their scripts. The following story illustrates this. It is a simplified history of “Repo Tools” that explains the motivation for each feature:

It all starts with one repo, where repo [tool_name] just calls into tools/tool_name.py. Everyone loves that: Python is easy to change and cross-platform. Any feature or new tool can quickly be added right there. Writing more Python is always preferred over adding configuration because it is easier for developers. The code quickly becomes messy, hard to read, and non-reusable.

Then a few new repos are created. From a tooling perspective they do the same things (use the same build system, format code the same way, etc.), but each now has its own copy of that code, and the copies quickly start to diverge. At this point, the only option is to move all tools into packages shared by all repos. All the custom code is replaced with configuration (such as a list of folders to package instead of nested for-loops). If a bug is found, it is fixed once. If a new feature is requested, it is implemented once. Code is reused; other repos just update tool versions.

Then one repo diverges and needs slightly different behavior. This is where the configuration system comes in. Each new feature can be hidden behind a setting. All repos can safely update, and this one repo can toggle the new feature in repo.toml. Configuration works well here: everyone can opt in to or out of new features while most code is reused.
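Hiding a feature behind a setting might look like the following sketch. The repo_package namespace, the folders and include_docs settings, and the package_folders function are all hypothetical names chosen for illustration; they are not taken from any real repo tool.

```python
def package_folders(config: dict) -> list[str]:
    """Return the folders to package, honoring an opt-in setting (hypothetical tool)."""
    settings = config.get("repo_package", {})
    folders = list(settings.get("folders", []))
    # The new behavior is off by default, so repos that do not set it are unaffected
    # when they update the tool.
    if settings.get("include_docs", False):
        folders.append("docs")
    return folders


# A repo that has not opted in keeps the old behavior.
print(package_folders({"repo_package": {"folders": ["bin"]}}))

# A repo that sets the (hypothetical) toggle in its repo.toml gets the new behavior.
print(package_folders({"repo_package": {"folders": ["bin"], "include_docs": True}}))
```

Defaulting the toggle to off is the design choice that lets every repo update the shared tool safely.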

Then one repo wants something completely different and custom, for example some code generation during the build. It doesn’t make sense to put that into shared code and pollute the shared tools. Creating a new tool in a new repo might be an option, but developers want to iterate on that code quickly, and other parts of the build tool are still reused. In this situation, repo_man provides a way to override any tool and write custom Python code in the repo. Configuration alone is not enough here; custom code is needed.

Then a new repo is created. It is written in a new language and uses a different build system. In that case, repo_man provides a way to just call into an existing tool. If a repository is written in Rust and is built with cargo build, it should be possible to run repo build and have it run cargo build under the hood.
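Delegating repo build to an external command can be sketched as a thin wrapper around subprocess. This is an assumption-laden illustration: the command setting name and both helper functions are hypothetical, and the source does not describe how repo_man actually configures or invokes the external tool.

```python
import shlex
import subprocess


def build_command_line(config: dict, extra_args: list[str]) -> list[str]:
    """Build the argument list for the external build command named in the config."""
    command = config["repo_build"]["command"]  # hypothetical setting name
    # shlex.split uses POSIX quoting rules; Windows paths may need different handling.
    return shlex.split(command) + list(extra_args)


def run_build(config: dict, extra_args: list[str]) -> int:
    """Run the external command and return its exit code."""
    return subprocess.run(build_command_line(config, extra_args)).returncode


# Hypothetical resolved config for a Rust repository.
config = {"repo_build": {"command": "cargo build"}}
print(build_command_line(config, ["--release"]))
```

Keeping argument construction separate from execution makes the wrapper easy to test without having cargo installed.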

Finally, we have 100 repositories, and we notice that 50 of them set exactly the same settings in repo.toml while the others don’t. When one of those settings is changed or added, updating tools is not enough: the settings themselves need to be shared, not the tools. repo_man provides a way to import shared configuration into repo.toml. That configuration comes from one of the tools that those 50 repos use and the others don’t.

The bottom line is that repo_man nudges developers toward more code sharing. Reusing tools is preferred over committing them to each repo. Configuration is preferred over code. However, developers can always step outside those defaults when needed and do something completely custom.