Supporting Arbitrary Modules Order In SCons

By Itamar Ostricher, Thursday, December 18, 2014. Category: Software Engineering. Tags: build, howto, scons.

This is the eighth post in my SCons series. In this post, I describe how to support arbitrary modules order.

In an earlier episode, I presented the multi-module C++ SCons project. In that episode, I explained that modules need to be specified in order of dependence.

This restriction can be annoying, and painfully limiting, once your project “gets serious”.

I promised a better solution, and now I provide one 🙂 . In a nutshell, the solution is based on a two-pass approach. In the first pass, all library-targets are processed and collected across all modules. In the second pass, all program-targets are processed, using the libraries already collected.

The rest of the post goes into further detail about the solution. The result, as usual, is available on my GitHub scons-series repository. It builds on top of the SCons shortcuts enhancement, in case you need to refresh your memory 🙂 .


Understanding the original restriction

The site_config.modules() function generates the modules to process:

def modules():
    yield 'AddressBook'
    yield 'Writer'

The Writer.writer program uses the addressbook library defined in the AddressBook module:

Prog('writer', 'writer.cc', with_libs='AddressBook::addressbook')

To figure out how to build writer, my SCons shortcuts look up AddressBook::addressbook in the internal libs dictionary. This works only if the Lib(…) target was processed first and added to that dictionary.

If the order of modules is reversed, the library will not be found.

def modules():
    yield 'Writer'
    yield 'AddressBook'

itamar@legolas sconseries (episodes/07-weakorder) $ scons
scons: Reading SConscript files ...
scons: + Processing flavor debug ...
scons: |- Reading module Writer ...
scons: *** Library identifier "AddressBook::addressbook" didn't match any library. Is it a typo?  Stop.
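
The same failure can be reproduced outside SCons with a toy registry. This is only a sketch, not the code from site_init.py: the names `libs`, `lib`, `prog` and `process` are hypothetical stand-ins, assuming the shortcuts boil down to a dictionary keyed by `Module::name` that is filled and queried in a single pass.

```python
# Toy model of a single-pass shortcut registry.
# All names are hypothetical; this is not the real site_init.py.
libs = {}  # maps "Module::name" -> library node (a plain string here)


def lib(module, name):
    """Register a library target under its fully qualified identifier."""
    libs['%s::%s' % (module, name)] = 'lib%s.a' % name


def prog(name, with_libs):
    """Resolve a program's library at call time; fail if not registered yet."""
    if with_libs not in libs:
        raise KeyError('Library identifier "%s" didn\'t match any library' % with_libs)
    return (name, libs[with_libs])


def process(order):
    """Process modules in the given order, like a single-pass build would."""
    libs.clear()
    steps = {
        'AddressBook': lambda: lib('AddressBook', 'addressbook'),
        'Writer': lambda: prog('writer', 'AddressBook::addressbook'),
    }
    return [steps[module]() for module in order]


process(['AddressBook', 'Writer'])      # works: the library is registered first
try:
    process(['Writer', 'AddressBook'])  # fails: the lookup happens too early
    reversed_order_ok = True
except KeyError:
    reversed_order_ok = False
```

The lookup happens while the module is being read, so the result depends entirely on the order in which modules are listed.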

It makes sense that targets that depend on other targets cannot be built before all their dependencies are built. But this restriction is stronger than that. It applies at the module-level, instead of the target-level.

In small projects, like my running address book example project, it’s almost equivalent. Once the project becomes more complex, it is not uncommon to see structures like this:

Module common with logging library that uses io::fs.
Module io with fs library.
Module io with png_parser library / program that uses common::logging.

This structure breaks my assumption, because it’s not possible to process common before io (for common::logging) and also process io before common (for io::fs). It’s not reasonable to make the project change its design and move libraries around just to satisfy my order requirement, is it?
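
To make the conflict concrete, here is a small sketch that checks both views of the structure above for a valid processing order. It uses `graphlib` from the Python standard library (Python 3.9 or later), which is unrelated to SCons; the dictionary layout and the helper `has_valid_order` are my own illustration.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+

# target -> targets it uses, taken from the structure listed above
target_deps = {
    'common::logging': ['io::fs'],
    'io::fs': [],
    'io::png_parser': ['common::logging'],
}

# Collapse each target to its module to get the module-level graph
module_deps = {}
for target, deps in target_deps.items():
    module = target.split('::')[0]
    module_deps.setdefault(module, set())
    for dep in deps:
        dep_module = dep.split('::')[0]
        if dep_module != module:
            module_deps[module].add(dep_module)


def has_valid_order(graph):
    """Return True if the dependency graph can be topologically sorted."""
    try:
        list(TopologicalSorter(graph).static_order())
        return True
    except CycleError:
        return False


targets_ok = has_valid_order(target_deps)  # no cycle between targets
modules_ok = has_valid_order(module_deps)  # common <-> io is a cycle
```

The targets can be ordered, but the modules cannot, which is exactly the gap between a target-level and a module-level restriction.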

Desired result of weakened restriction

The basic requirement of this enhancement is to weaken the restriction to the bare minimum. I want to move from a module-level restriction to a target-level one. As long as targets don’t create cyclic dependencies, I want to be able to process the project, regardless of the order in which the modules were specified in the config.

The example I used above, with Writer and AddressBook reversed, is a good test case. It should work!

New assumptions: Programs and Libraries are essentially different!

Program targets in SCons eventually generate linker commands. Linking requires all the relevant objects and libraries. Program targets do not depend on other program targets; such a dependency doesn’t mean anything.

Library targets, on the other hand, generate only compilation commands, transforming source files into object files (and packing them into library files). Essentially, it’s meaningless to say that library A depends on library B. Even if library A uses symbols defined and implemented in library B, the actual library file for B is not needed to compile library A. Library A can be compiled and packed before library B, as long as A has access to the header files it needs from B.

When a program that uses A is linked, then the linker will need the library files of A and B. Even then, the linker will need B only if the program actually calls something from A that uses something from B…

It is useful for humans to say that A depends on B. As far as the build-chain is concerned, this dependency is “virtual”, and has no meaning until a binary is linked.

Summing up, the assumptions I will be using in my enhancement are based on the observation described above. The assumptions are:

All library targets can be processed in any order, across all modules.
Once a dictionary of all libraries is complete, then all program targets can be processed in any order, across all modules.

The two-pass solution

Given the assumptions and observations described above, the solution is apparent:

Go over all modules, processing only library targets. Populate the internal libraries dictionary.
Go over all modules again, processing program targets, using the complete libraries dictionary.

This solution is implemented exactly as described here, in the build() method in site_init.py (with minor changes):

    def build(self):
        """Build flavor using two-pass strategy."""
        # First pass over all modules - process and collect library targets
        for module in modules():
            print('|- First pass: Reading module %s ...' % (module))
            shortcuts = dict(
                Lib       = self._lib_wrapper(self._env.Library, module),
                StaticLib = self._lib_wrapper(self._env.StaticLibrary, module),
                SharedLib = self._lib_wrapper(self._env.SharedLibrary, module),
                Prog      = nop,
            )
            self._env.SConscript(os.path.join(module, 'SConscript'), variant_dir=os.path.join('$BUILDROOT', module), exports=shortcuts)
        # Second pass over all modules - process program targets
        shortcuts = dict()
        for nop_shortcut in ('Lib', 'StaticLib', 'SharedLib'):
            shortcuts[nop_shortcut] = nop
        for module in modules():
            print('|- Second pass: Reading module %s ...' % (module))
            shortcuts['Prog'] = self._prog_wrapper(module)
            self._env.SConscript(os.path.join(module, 'SConscript'), variant_dir=os.path.join('$BUILDROOT', module), exports=shortcuts)
        # Add install targets for programs from all modules
        for module, prog_nodes in self._progs.items():
            for prog in prog_nodes:
                assert isinstance(prog, Node.FS.File)
                # If module is hierarchical, replace pathseps with periods
                bin_name = path_to_key('%s.%s' % (module, prog.name))
                self._env.InstallAs(os.path.join('$BINDIR', bin_name), prog)
        # Support using the flavor name as target name for its related targets
        self._env.Alias(self._flavor, '$BUILDROOT')

The module-level SConscripts are not aware of this two-pass strategy. As far as they are concerned, every target they define is processed whenever they are executed. To implement the two-pass strategy I described, I need to execute each SConscript twice, each time “enabling” only some of the shortcuts it may use.

I did this by manipulating the shortcuts dictionary I export in each pass. During the first pass, the library-related shortcuts are “on”, as they were in the SCons shortcuts episode, but the Prog shortcut is “off”. I implement this “off” behavior by using the nop function:

def nop(*args, **kwargs):
    """Take arbitrary args and kwargs and do absolutely nothing!"""
    pass

This way, the Prog shortcut is actually executed during the first pass in each SConscript, but it doesn’t do anything!

In a similar fashion, during the second pass, the Prog shortcut is “on”, and the library-related shortcuts are “off”.
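
The on/off mechanics can be tried without SCons at all. The following is a simplified simulation, not the code from site_init.py: each "SConscript" is modeled as a plain Python function that receives the shortcuts, and the `Lib` call signature, the two script functions, and the `build` helper are assumptions made for the sake of the example.

```python
# SCons-free simulation of the two-pass idea (all names are illustrative).
def nop(*args, **kwargs):
    """Disabled shortcut: accept anything, do nothing."""


def writer_sconscript(Lib, Prog):
    Prog('writer', 'writer.cc', with_libs='AddressBook::addressbook')


def addressbook_sconscript(Lib, Prog):
    Lib('addressbook', 'addressbook.cc')  # assumed signature


# Deliberately listed in the "wrong" order
MODULES = [('Writer', writer_sconscript), ('AddressBook', addressbook_sconscript)]


def build(modules):
    libs, progs = {}, {}

    def lib_wrapper(module):
        def lib(name, sources):
            libs['%s::%s' % (module, name)] = sources
        return lib

    def prog_wrapper(module):
        def prog(name, sources, with_libs):
            # Raises KeyError if the library was never defined in any module
            progs['%s.%s' % (module, name)] = (sources, libs[with_libs])
        return prog

    # First pass: only Lib is "on"
    for module, script in modules:
        script(Lib=lib_wrapper(module), Prog=nop)
    # Second pass: only Prog is "on"; libs is complete by now
    for module, script in modules:
        script(Lib=nop, Prog=prog_wrapper(module))
    return libs, progs


libs, progs = build(MODULES)
```

Each script runs twice and never knows which pass it is in; only the functions bound to the `Lib` and `Prog` names change between passes.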

With the two-pass strategy implemented, the “wrong-order” example works as expected:

itamar@legolas sconseries (episodes/07-weakorder) $ scons
scons: Reading SConscript files ...
scons: + Processing flavor debug ...
|- First pass: Reading module Writer ...
|- First pass: Reading module AddressBook ...
|- Second pass: Reading module Writer ...
|- Second pass: Reading module AddressBook ...
scons: + Processing flavor release ...
|- First pass: Reading module Writer ...
|- First pass: Reading module AddressBook ...
|- Second pass: Reading module Writer ...
|- Second pass: Reading module AddressBook ...
scons: done reading SConscript files.
scons: Building targets ...
...... (snipped) ........
Install file: "build/release/Writer/writer" as "build/release/bin/Writer.writer"
scons: done building targets.

Summary

Once again, this episode brings no change in functionality, but makes the build framework more flexible and developer-friendly.

The two-pass strategy, as described and implemented, removes the module-dependency-order restriction. The solution replaces the old restrictions with much more reasonable and natural assumptions about the nature of libraries and binary programs.

The final result is available on my GitHub scons-series repository. Feel free to use / fork / modify. If you do, I’d appreciate it if you share back improvements.


See the scons tag for more in my SCons series. This episode opens the door for automated module discovery. Another interesting future episode will deal with propagating required libraries.

2 Comments

Harti Brandt
January 8, 2015
Your series on scons is really helpful. With the automatic module dependencies however I see a problem when the libraries are shared libraries (which is the common case today). For shared libraries one creates linker commands to create the library and shared libraries may depend on other shared libraries (built in the same project). Do you have any idea how to handle that?

The dependency information is already there in principle because when linking a library you need to specify the libraries it depends on. It should be just a matter of collecting this information in the first pass.

- Itamar Ostricher
  January 8, 2015
  Thanks for the feedback!
  
  I didn’t get a chance to work with shared libraries, so didn’t have this use-case in mind.
  
  One approach would be to apply the same strategy, using two different methods to process shared library targets. The first one would collect dependency information (like with static libs), and the second one would generate linker commands using that dependency information (like with program targets).
  
  Hope this helps! 🙂