My poor Portage deconstruction

Here's what I have so far in my "walk" through the Portage code (specifically, portage-2.0.49-r1). Right now I've enumerated ebuild, the effects of importing portage, emerge, the Portage portage.doebuild() function, and the ebuild.sh script.

ebuild

The ebuild script is a simple interface to portage.py. It's a python script.

The only option is --debug, which sets debug=1 if set (else debug=0).

  • Script imports portage (and doing all of the setup that entails).
  • Script sets an environment variable: PORTAGE_CALLER="ebuild".
  • Script reads environment variable ROOT (uses "/" if ROOT not set in env)
Example: ebuild --debug /path/to/foo.ebuild fetch unpack
Script loops over arguments after the ebuild name:
  • execute a=portage.doebuild(ebuild name, argument, ROOT, debug=debug)
  • exit if a

importing portage

Quite a few things happen when portage is imported. Most of portage.py goes about defining a boatload of classes and functions, but there's also a fair amount of information that is assembled and made available by the import process. Here's a (probably incomplete) list of those items:

  • Identify ostype from uname, set some os-related defaults, export USERLAND to the environment
  • Check to make sure user is in the portage group (or root)
  • List of "incrementals" and "stickies" keys
  • List possible ebuild version suffixes ("pre", "alpha", ...)
  • List ebuild keys ("DEPEND", "LICENSE", ...)
  • Check environment for ROOT, use "/" if not found
  • Create tmp and var/tmp directories (below ROOT)
  • Read use.defaults and set variables appropriately
  • Check for selinux
  • Set up initial sandbox vars
  • mtime db stuff???
  • Get FEATURES
  • Get OVERLAY
  • Set up porttree and bintree dbs
  • Get thirdpartymirrors
  • Check for prelink
  • Perform PORTAGE_CACHEDIR and PORTAGE_TMPDIR sanity checks
  • Get category list from file
  • Get package.mask
  • Get packages file
  • Get unmask list, mask list, "revmask"?
  • Get ACCEPT_KEYWORDS

emerge

This script is much more complicated than the ebuild script (portage.py is barely 2x as many lines as this script).

  • Set environment variable PORTAGE_CALLER="emerge"

  • Import portage <-- portage.settings and other vars now exposed

  • Check if output should be colored (if not, call nocolor() ).

  • Check portage.settings for PORTAGE_NICENESS; if so, nice portage

  • Freeze portdb api (don't know what this does just yet)

  • Turn off "noauto" if set in FEATURES

  • Set # of merged ebuilds = 0

  • Set up lists of valid emerge parameters (params), actions, and options (all hard-coded lists); also map short-form of options to long-form versions.

  • Process command line (ugly!)
    • process short-form to convert to long-form
    • Process long-form arguments
    • Create list of options (myopts), actions (myactions), & files (myfiles)
    • minor fixes ( -U implies -u, etcetera)
    • handle --noconfmem and --debug flags
  • Define log, exit, countdown, signal handling, help, and "emerge info" helper functions.

  • Check permissions; exit if invalid.

  • Start logging (unless --pretend was set)

  • Decide what type of dependencies to use (self, selective, recurse, deep, or empty)

  • Define spinner func

  • Define the search class

  • Build package digraph
    • Define getlist function for world or system
    • Define genericdict function
    • Define depgraph class (huge)
  • Define unmerge function

  • Define post_emerge function (finish logging, handle info stuff, config protection)

  • If --debug set, edebug=1 (not sure why two different debug flags set)

  • Handle emerge sync (big if, elif, else construction)

  • Handle emerge regen

  • Handle emerge config

  • Handle emerge info

  • Handle emerge search

  • Handle emerge inject

  • Handle emerge unmerge, emerge prune, and emerge clean
    • Call unmerge(myaction, myfiles)
  • Handle emerge depclean

  • Else: Handle update, system, or just process files
    • Do some setup stuff if --pretend set

    • If --resume and portage.mtimedb has a "resume" key, then update opts, create a "resume" depgraph

    • Else, create depgraph, error out if something is wrong

    • If --pretend, show depgraph

    • Else
      • If --buildpkgonly, check that deps are satisfied
      • If --resume, do merge using mydepgraph.merge(portage.mtimedb["resume"]["mergelist"])
      • Else, handle --digest or perform merge, the latter with mydepgraph.merge(mydepgraph.altlist())
    • Run autoclean

    • Run post_emerge()

portage.doebuild()

Much of what happens during the actual merge of a package happens in portage's doebuild function. The interface is:

doebuild(myebuild, mydo, myroot, debug=0, listonly=0, fetchonly=0)

where myebuild is the path to the ebuild to be merged, mydo is the action to be performed (one of the commands that follows ebuild foo.ebuild ...), and debug, listonly, and fetchonly are flags that can be true (1) or false (0).

Here's what happens:

  • Check to make sure the input is valid

  • Set a boatload of keys in the settings instance of the config class
    • PORTAGE_DEBUG, ROOT, STARTDIR, EBUILD, O, CATEGORY, FILESDIR, PF, ECLASSDIR, SANDBOX_LOG, P, PN, PV, PR, PVR, SLOT, BUILD_PREFIX, PKG_TMPDIR, BUILDDIR, KV, KVERS
  • If the action is depend, then return dependency info (eclasses?). If --debug was also set, then the deps are printed on the cmd line.

  • If necessary (meaning not fetch, digest, or manifest), create the build directory (/var/tmp/portage/whatever) and the temp directory (and add T to settings). Also create a ccache directory, if necessary, along with WORKDIR and D.

  • If unmerge, call unmerge() and return.

  • Setup logging if requested

  • If help, clean, setup, prerm, postrm, preinst, postinst, or config, then pass the action to ebuild.sh (using "ebuild.sh mydo,debug,free=1") and return.

  • Generate A from SRC_URI

  • Ensure that DISTDIR exists; change permissions for ${DISTDIR}/cvs-src if necessary.

  • Run fetch(); return error on failure.

  • Generate digest if digest in FEATURES or called explicitly (return in the latter case).

  • If manifest then generate manifest and return.

  • Check manifest and die on failure if strict in FEATURES

  • If fetch, return (since already done above, and it must have worked if we're here)

  • Set up an "actionmap" of dependencies (i.e., unpack requires dep and setup).

  • If package, ensure that PKGDIR is created, call spawnebuild with the current action and the actionmap, and return.

  • If qmerge, call merge w/o checking for dependencies, and return.

  • If merge, call spawnebuild with "install" and the actionmap, and then call merge and return.

  • Else, die because the action didn't match any of the above. Oops!

ebuild.sh

The ebuild.sh bash script predominantly gets called by the portage.doebuild function to handle the bash functions in an ebuild. Following the bouncing ball....

The first block of code is initialization stuff:

  • If not depend or clean, then
    • clean the "successful" file from ${T} if it exists
    • Set up logging if desired (including punching a hole in sandbox for the log file)
    • If ${T}/environment exists, source it
  • unalias everything

  • Set "expand_aliases" and set up die and assert aliaes

  • Source /etc/profile on GNU userland systems

  • Export proper CC and CXX variables.

  • Set up the a hardcoded PATH, tack ${ROOTPATH} to the end of it, and add ${PREROOTPATH} to the beginning of the PATH if it exists.

  • Source helper functions files (functions.sh and extra_functions.sh).

Then a great number of functions are defined:

  • Define a do-nothing esyslog() override function.
  • Define a variety of USE flag functions: use, has, has_version, best_version, use_with, and use_enable functions.
  • Define diefunc function.
  • Set umask and a number of installation defaults.
  • Define check_KV function.
  • Define the keepdir function to put ".keep" files in directories that portage should not automatically remove if empty.
  • Define a series of sandbox helper functions.
  • Define the unpack function.
  • Define the econf convenience function. Notice that econf automatically dies on error(s).
  • Define the einstall convenience function.
  • Define pkg_setup as a placebo function to be overridden.
  • Define pkg_nofetch.
  • Define minimal, but functional, src_unpack and src_compile functions.
  • Define src_install, pkg_preinst, pkg_postinst, pkg_prerm, and pkg_postrm skeleton functions that, when needed, should be overridden.
  • Define the deprecated try function.
  • Define the gen_wrapper function to generate lib/cpp and /usr/bin/cc wrappers.
  • Define some of the dyn-* functions: dyn_setup, dyn_unpack, dyn_clean.
  • Define some install helper functions: into, insinto, exeinto, docinto, insopts, diropts, exeopts, and libopts. Note that a number of additional helper functions, such as the do* functions, are actual executable scripts in /usr/lib/portage/bin.
  • Define a number of abort handler functions: abort_handler, abort_compile, abort_unpack, abort_package, and abort_install.
  • Define the rest of the dyn_* functions: dyn_compile, dyn_package, dyn_install, dyn_spec, dyn_rpm, and dyn_help.
  • Define some eclass debugging functions: debug-print, debug-print-function, and debug-print-section.
  • Define the eclass inherit function.
  • Define some eclass helper functions: EXPORT_FUNCTIONS, newdepend, newrdepend, newcdepend, newpdepend, and do_newdepend.

Finally, the main part of the script is reached:

  • If not depend or clean, then
    • cd into the /var/tmp/portage/whatever directory
    • export USER=portage if the effective user ID is "portage"
    • Set up distcc and ccache PATHs, environment variables, and sandbox holes
  • Turn on sandbox flag

  • Set the S default and unset a handful of variables

  • Source the ebuild

  • Set the S default if S not defined in the ebuild. (redundant?)

  • Set TMP and TMPDIR to be ${T} (otherwise sandbox might error out)

  • Set RDEPEND is not set, set it to be DEPEND.

  • Add eclass deps to {R,C,P,}DEPEND.

  • Loop over arguments passed to ebuild.sh, using a big case statement
    • nofetch -- call pkg_nofetch
    • prerm, postrm, preinst, postinst, config -- turn off sandbox, handle debugging, and call the eponymous pkg_* function.
    • unpack, compile, clean, install -- handle sandbox and debugging, then call the eponymous dyn_* function.
    • help, clean, setup -- Turn off sandbox, handle debugging, and call the appropriate dyn_* function.
    • package, rpm -- Turn off sandbox, handle debugging, and call the eponymous dyn_* function.
    • depend -- Turn off sandbox and write dependency info to /var/edb/cache/dep/CATEGORY/PF.
    • Else, error that the command wasn't valid.
  • If clean, clean that package's temp directory

  • Finish by touching "successful" in the package's temp directory.