repo_man - Repo Tools Framework
repo_man (or “Repo Man”) is a framework for building and running command-line tools for code repositories (e.g. build, format, package).
Getting Started: Using repo tools
The main goal is to provide a single entry point for any repository.
A user clones any repository and runs a single command:
repo.bat (windows) /
repo_man bootstraps itself (users don’t need to install anything) and lists all available tools (subcommands) with a description.
    λ repo
    usage: repo [-h] [-v] [-p] [-pr] [-tb] [TOOL] ...

    Repo Tool (repoman): One entry point for all repo tools. Pass one of the tools and -h to get help.

    optional arguments:
      -h, --help                    show this help message and exit
      -v, --verbose                 Increase verbosity of logging. Pass -v for info, -vv for verbose.
      -p, --print-config-file       Output tool default config file and exit.
      -pr, --print-resolved-config  Output tool resolved config and exit.
      -tb, --tracebacks             Enable Python traceback logging to console + TeamCity buildProblem reporting.

    Found tools:
      update    Tool to update packman project files with newer versions of packages. By default major part of version is kept the same when looking for a newer version.
      publish   Publish archives (packages) and labels to packman remote.
      packman   Shortcut to packman.
      build     Build system main command.
      format    Format all C++ code (with clang-format) and all python code (with black).
      lint      Run python linters.
      ...

    Run as 'repo [TOOL] -h' for more information on a specific tool.
Then a user can run repo [tool_name] to run a particular tool.
Users can learn more about any tool by running repo [tool_name] -h.
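The help output above has the shape of a Python argparse parser with one subcommand per tool. The following is a minimal sketch of that single-entry-point pattern, not repo_man's actual implementation: the TOOLS table, the handler signature, and the function names build_parser and run are assumptions made for illustration.

```python
import argparse


def build_parser(tools):
    """Create a single `repo` parser with one subcommand per tool."""
    parser = argparse.ArgumentParser(
        prog="repo", description="One entry point for all repo tools."
    )
    parser.add_argument("-v", "--verbose", action="count", default=0,
                        help="Increase verbosity of logging.")
    subparsers = parser.add_subparsers(dest="tool", metavar="TOOL")
    for name, (description, handler) in tools.items():
        sub = subparsers.add_parser(name, help=description,
                                    description=description)
        # Remember which function implements this tool.
        sub.set_defaults(handler=handler)
    return parser


def run(argv, tools):
    """Parse argv and dispatch to the chosen tool; list tools if none given."""
    parser = build_parser(tools)
    args = parser.parse_args(argv)
    if args.tool is None:
        parser.print_help()
        return 0
    return args.handler(args)


# Hypothetical tool table: name -> (description, handler).
TOOLS = {
    "build": ("Build system main command.", lambda args: 0),
    "format": ("Format all C++ and Python code.", lambda args: 0),
}

exit_code = run(["build"], TOOLS)
```

Calling run([], TOOLS) prints the usage text with the list of tools, which mirrors what happens when repo is run without arguments.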
Getting Started: Configuration
Another important goal of repo_man is to provide a common configuration system for all repo tools.
Each repo has a repo.toml file in its root, where any setting for any repo tool can be overridden.
Each tool has a repo_tools.toml file with default settings for that tool (or collection of tools).
Each tool has a namespace in the config. For instance, the settings for repo build are in the [repo_build] section. Usually, all tools in use are linked into the _repo/deps/[tool_name] folder, so the configuration file for each tool can be found at _repo/deps/[tool_name]/repo_tools.toml. Any setting can then be copied into repo.toml and changed.
Design Goals: Code Sharing (A Story)
Providing a single entry point for all repos greatly improves discoverability and the new-user experience. However, that was not the only motivation for repo_man. Another important goal of all repo tools is to encourage code reuse across many repositories, their tooling, and scripts. The following story illustrates this. It captures the history of “Repo Tools” in simplified form and explains the motivation for each feature:
It all starts with one repo, where repo [tool_name] just calls into tools/tool_name.py. Everyone loves that: Python is easy to change and cross-platform. Any feature or new tool can quickly be added right there. Writing more Python is always preferred over adding configs because it is easier for developers. As a result, the code quickly becomes messy, hard to read, and non-reusable.
Then, a few new repos get created. From a tooling perspective they do the same things (use the same build system, format code the same way, etc.), but each now has its own copy of that code, and the copies quickly start to diverge. At this point, the only option is to move all tools into packages shared by all repos. All the custom code is replaced with configuration (like a list of folders to package, instead of nested for-loops). If a bug is found, it is fixed once. If a new feature is requested, it is implemented once. Code is reused, and other repos just update tool versions.
Then, one repo diverges and needs slightly different behavior. This is where the configuration system comes in. Each new feature can be hidden behind a setting. All repos can safely update, and the one repo that needs the new behavior can toggle it in repo.toml. The configuration works well here: everyone can opt in to and out of new features, while most code is reused.
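As a rough illustration of hiding a feature behind a setting, the sketch below gates a new behavior on a flag that defaults to off. The function package_files and the setting strip_debug_symbols are hypothetical and are not taken from any real repo tool.

```python
def package_files(files, settings):
    """Select files to package; a newer behavior is gated behind a setting."""
    # Existing behavior shared by every repo: never package temporary files.
    selected = [f for f in files if not f.endswith(".tmp")]
    # Hypothetical opt-in feature: also skip debug symbols when the flag is set.
    if settings.get("strip_debug_symbols", False):
        selected = [f for f in selected if not f.endswith(".pdb")]
    return selected


files = ["app.exe", "app.pdb", "cache.tmp"]
default_settings = {}                            # most repos keep the defaults
custom_settings = {"strip_debug_symbols": True}  # one repo opts in via repo.toml

print(package_files(files, default_settings))  # ['app.exe', 'app.pdb']
print(package_files(files, custom_settings))   # ['app.exe']
```

Because the flag defaults to the old behavior, repos that simply update the tool version see no change until they opt in.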
Then, one repo wants something completely different and custom, for example some code generation during the build. It doesn’t make sense to put that into shared code and pollute the shared tools. Creating a new tool in a new repo might be an option, but developers want to iterate quickly on that code, and other parts of the build tool are still reused. In this situation, repo_man provides a way to override any tool and write custom Python code in the repo. Configuration alone is not enough here; custom code is needed.
Then, a new repo is created. It is written in a new language and uses a different build system. In that case, repo_man provides a way to just call into an existing tool. If a repository is written in Rust and uses cargo build to build, it should be possible to run repo build and have it run cargo build under the hood.
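One way to picture this delegation is a mapping from a repo tool name to an external command line. This is only a sketch under assumptions: the EXTERNAL_COMMANDS table and both function names are hypothetical, and the source does not say how repo_man configures such a mapping.

```python
import shlex
import subprocess

# Hypothetical mapping from repo tool name to an external command line.
EXTERNAL_COMMANDS = {"build": "cargo build"}


def resolve_command(tool_name, extra_args=()):
    """Return the argv list that `repo <tool_name>` would delegate to."""
    return shlex.split(EXTERNAL_COMMANDS[tool_name]) + list(extra_args)


def run_tool(tool_name, extra_args=()):
    """Run the external command and return its exit code."""
    return subprocess.run(resolve_command(tool_name, extra_args)).returncode


print(resolve_command("build", ["--release"]))  # ['cargo', 'build', '--release']
```

Keeping command resolution separate from execution makes the mapping easy to inspect without requiring cargo to be installed.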
Finally, we have 100 repositories, and we notice that 50 of them set exactly the same settings in repo.toml, while the others don’t. When one of those settings is changed or added, just updating tools is not enough: what needs to be shared is the settings, not the tools. repo_man provides a way to import shared configuration into repo.toml. That configuration comes from one of the tools that those 50 repos use and the others don’t.
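The idea of importing shared configuration can be sketched as follows. The import_configs key, the SHARED_CONFIGS registry, and the setting names are all hypothetical; the source does not show repo_man's actual import syntax, so treat this as an illustration of the layering order only. Plain dictionaries stand in for parsed TOML.

```python
def merge(base, override):
    """Recursively merge `override` into a copy of `base`; override wins."""
    result = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = value
    return result


# Hypothetical shared settings shipped inside a tool package.
SHARED_CONFIGS = {
    "shared_cpp_defaults": {
        "repo_format": {"line_length": 120},
        "repo_build": {"jobs": 8},
    },
}


def resolve(repo_config, shared=SHARED_CONFIGS):
    """Apply imported shared configs in order, then the repo's own settings."""
    resolved = {}
    for name in repo_config.get("import_configs", []):
        resolved = merge(resolved, shared[name])
    own = {k: v for k, v in repo_config.items() if k != "import_configs"}
    return merge(resolved, own)


# A repo that imports the shared settings and still overrides one of them.
repo_config = {
    "import_configs": ["shared_cpp_defaults"],
    "repo_build": {"jobs": 2},
}
print(resolve(repo_config))
# {'repo_format': {'line_length': 120}, 'repo_build': {'jobs': 2}}
```

With this ordering, a change to the shared settings reaches every importing repo on the next tool update, while each repo's own entries still take precedence.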
The bottom line is that repo_man nudges developers toward more code sharing. Reusing tools is preferred over committing them to each repo. Configuration is preferred over code. However, developers can always step outside those defaults when needed and do something completely custom.