🤖 Bridging Two Worlds: How portage-pip-fuse Brings 750,000+ Python Packages to Gentoo



Every Gentoo user who works with Python knows the ritual. You need a package. It exists on PyPI. It does not exist in ::gentoo. So you write an ebuild, or you reach for pip install --user and accept that Portage will never know about it. Neither option is satisfying. Do this often enough and you have two package managers running in parallel, neither aware of the other.

I built portage-pip-fuse because I got tired of this. It is a FUSE-based virtual filesystem that presents the entire PyPI ecosystem as a Portage overlay. Mount it, and emerge dev-python/requests works, as it does for any of the 750,000+ packages on PyPI that ship source distributions. No ebuilds to write. No pip behind Portage’s back.

Ebuilds that don’t exist until you look

Rather than pre-generating ebuilds (the approach taken by g-sorcery and gs-pypi, which doesn’t scale), portage-pip-fuse generates them at the moment Portage requests them, through a FUSE virtual filesystem mounted at /var/db/repos/pypi. The overlay looks and behaves like any other Portage repository, but its contents are computed, not stored.
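To make "computed, not stored" concrete, here is a minimal Python sketch of the idea. It is not the project's code: the class LazyOverlay, the function render_stub and the path layout are hypothetical, and a plain read() method stands in for the callback that a FUSE library would invoke. The rendered text is a placeholder, not a valid ebuild.

```python
from typing import Callable

EBUILD_SUFFIX = ".ebuild"


class LazyOverlay:
    """Toy stand-in for a FUSE read handler: contents are computed per request."""

    def __init__(self, render: Callable[[str, str], str]) -> None:
        self._render = render  # hypothetical renderer: (name, version) -> text

    def read(self, path: str) -> str:
        # Assumed layout: dev-python/<name>/<name>-<version>.ebuild
        parts = path.strip("/").split("/")
        if len(parts) != 3:
            raise FileNotFoundError(path)
        category, name, filename = parts
        prefix = f"{name}-"
        if (
            category != "dev-python"
            or not filename.startswith(prefix)
            or not filename.endswith(EBUILD_SUFFIX)
        ):
            raise FileNotFoundError(path)
        version = filename[len(prefix):-len(EBUILD_SUFFIX)]
        if not version:
            raise FileNotFoundError(path)
        # Nothing is stored on disk: the text is produced at the moment of the read.
        return self._render(name, version)


def render_stub(name: str, version: str) -> str:
    """Placeholder renderer; a real one would emit a complete ebuild."""
    return f"# generated on demand for {name} {version}\nEAPI=8\n"


overlay = LazyOverlay(render_stub)
text = overlay.read("dev-python/requests/requests-2.31.0.ebuild")
```

The point of the design is that storage cost is independent of how many packages exist: only the paths Portage actually opens are ever rendered.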

By default, the tool queries PyPI’s JSON API directly. On a reasonably fast connection, Portage operates over the full 750,000+ packages without any local database. For faster lookups or offline use, you can optionally download PyPI’s daily SQLite metadata dump (~1 GB compressed), but this is an optimisation, not a requirement.
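For readers who have not used the JSON API, the sketch below shows roughly what such a lookup involves, using only the Python standard library. The endpoint https://pypi.org/pypi/<name>/json is PyPI's public one; the function names fetch_metadata and summarize are hypothetical and say nothing about how portage-pip-fuse structures its own queries or caching.

```python
import json
import urllib.request

PYPI_JSON_URL = "https://pypi.org/pypi/{name}/json"


def fetch_metadata(name: str, timeout: float = 10.0) -> dict:
    """Fetch the JSON metadata document for one PyPI project (needs network)."""
    url = PYPI_JSON_URL.format(name=name)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def summarize(payload: dict) -> dict:
    """Pick out the fields an ebuild generator would plausibly care about."""
    info = payload["info"]
    return {
        "name": info["name"],
        "version": info["version"],
        "classifiers": list(info.get("classifiers") or []),
        # "urls" lists the files published for the latest release.
        "files": [f["filename"] for f in payload.get("urls", [])],
    }


# Example with a hand-written payload, so no network access is needed here.
sample = {
    "info": {
        "name": "requests",
        "version": "2.31.0",
        "classifiers": ["Programming Language :: Python :: 3.12"],
    },
    "urls": [{"filename": "requests-2.31.0.tar.gz", "packagetype": "sdist"}],
}
summary = summarize(sample)
```

Calling summarize(fetch_metadata("requests")) would do the same against the live service.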

The tool filters out packages that ship only binary wheels (Gentoo builds from source) and packages with no overlap with Portage’s allowed PYTHON_TARGETS. Everything else is translated on the fly: PYTHON_COMPAT from classifiers, dependencies mapped to Gentoo atoms, version numbers converted, Manifest files generated with PyPI checksums.
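The translation steps can be sketched in a few lines of Python. Everything here is illustrative: the function names are hypothetical, the set of allowed targets is passed in by the caller, and the version mapping (a, b, rc and .post to _alpha, _beta, _rc and _p) is an assumed simplification that returns None for anything else, such as .dev releases, epochs or local versions. The tool's actual rules may differ.

```python
import re
from typing import Iterable, List, Optional


def has_sdist(payload: dict) -> bool:
    """True if the latest release ships a source distribution."""
    return any(f.get("packagetype") == "sdist" for f in payload.get("urls", []))


_CLASSIFIER = re.compile(r"^Programming Language :: Python :: (\d+)\.(\d+)$")


def python_compat(classifiers: Iterable[str], allowed: Iterable[str]) -> List[str]:
    """Map trove classifiers to PYTHON_COMPAT entries such as python3_12."""
    allowed_set = set(allowed)
    found: List[str] = []
    for classifier in classifiers:
        match = _CLASSIFIER.match(classifier)
        if not match:
            continue
        impl = f"python{match.group(1)}_{match.group(2)}"
        if impl in allowed_set and impl not in found:
            found.append(impl)
    return found


_PEP440 = re.compile(r"^(\d+(?:\.\d+)*)(?:(a|b|rc)(\d+))?(?:\.post(\d+))?$")
_SUFFIX = {"a": "_alpha", "b": "_beta", "rc": "_rc"}


def gentoo_version(version: str) -> Optional[str]:
    """Convert a simple PEP 440 version to Gentoo syntax, or None if unsupported."""
    match = _PEP440.match(version)
    if not match:
        return None
    release, pre, pre_n, post = match.groups()
    result = release
    if pre:
        result += f"{_SUFFIX[pre]}{pre_n}"
    if post is not None:
        result += f"_p{post}"
    return result


compat = python_compat(
    [
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.11",
        "Programming Language :: Python :: 3.12",
    ],
    allowed=["python3_12", "python3_13"],
)
```

A package would be skipped when has_sdist returns False or when python_compat comes back empty, which mirrors the two filters described above.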

Security through Portage’s sandbox

This is also a security story. Installation time is the critical vulnerability in the PyPI supply chain: setup.py execution, PEP 517 build backends, even metadata extraction can trigger arbitrary code. When you pip install a package, pip runs the build with your user’s full privileges and network access. When you emerge that same package through portage-pip-fuse, Portage’s sandbox constrains what the build process can do: isolated chroot, controlled filesystem access, blocked network. The isolation the security community is asking for in Python packaging already exists in Portage.

Developer workflow with virtual/ packages

The pip subcommand translates pip syntax to emerge commands. For requirements files, it creates a virtual/{project} ebuild in a separate overlay: emerge virtual/odoo installs all of Odoo’s dependencies through Portage while Odoo itself stays out of the system. You work on the application in your development tree with all dependencies satisfied system-wide. Only the dependency footprint gets installed; the application you’re developing never does.
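As a rough illustration of that translation, the Python sketch below turns simple requirements lines into Portage dependency atoms of the kind a virtual/{project} ebuild could list. It is an assumption-laden toy, not the tool's parser: the function names are hypothetical, names are normalised PEP 503-style on the assumption that dev-python/ package names follow the same convention, and only a single ==, >=, <=, > or < specifier with a plain numeric version is accepted. Extras, environment markers and ~= are rejected.

```python
import re
from typing import List

_REQ = re.compile(
    r"^([A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:(==|>=|<=|>|<)\s*([0-9][0-9.]*))?$"
)


def normalize_name(name: str) -> str:
    """PEP 503-style normalisation: lowercase, runs of -, _ and . become -."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_to_atom(line: str) -> str:
    """Translate one simple requirement into a Portage atom (sketch only)."""
    match = _REQ.match(line)
    if not match:
        raise ValueError(f"unsupported requirement: {line!r}")
    name, op, version = match.groups()
    package = f"dev-python/{normalize_name(name)}"
    if op is None:
        return package
    gentoo_op = "=" if op == "==" else op  # Portage spells exact match as "="
    return f"{gentoo_op}{package}-{version}"


def requirements_to_atoms(text: str) -> List[str]:
    """Translate requirements.txt content, skipping blanks and comments."""
    atoms = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if line:
            atoms.append(requirement_to_atom(line))
    return atoms


atoms = requirements_to_atoms(
    "requests>=2.28\n# pinned for the demo\nFlask==2.3.2\ntyping_extensions\n"
)
```

The resulting list could then be written into the RDEPEND of a generated virtual ebuild, or passed straight to emerge.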

Declarative patching through .sys/

The mounted filesystem exposes a .sys/ directory for modifying generated ebuilds at runtime. Need to remove Python 3.13 compatibility from a package? Fix a dependency conflict caused by Gentoo revision bumps? Configure a package to use system libraries instead of bundled ones? Write to the appropriate path under .sys/ and the generated ebuild changes accordingly, with no local overlay, no forked ebuilds, and no waiting for upstream.

This gives you the same kind of declarative control over PyPI packages that /etc/portage/patches/ gives you over Portage packages. The modifications are transparent, reproducible, and version-controllable.

The source code is at github.com/Miriup/portage-pip-fuse. GPL-2.0, alpha-stage, contributions welcome.


Further reading

1. pypi-data/pypi-json-data: Daily SQLite dumps of PyPI package metadata. The optional offline backend for portage-pip-fuse. github.com/pypi-data/pypi-json-data

2. Attack vectors against the Python/PyPI supply chain and Linux execution environments: Taxonomy of 100+ attack vectors documenting why installation-time isolation matters. systemication.com
