
Should an application or service in production aim to keep its library dependencies as current as possible? Or should it go to the other extreme, and update only as a last resort? Which tactic yields the best result?

These questions turn out to be surprisingly central to DevOps practice in 2020. Resolution of dependency conflicts is one of the most frequent stumbles I see. Even those living in a continuously integrated, fully containerized environment with modern package management and Everything-as-a-Service at least occasionally fight off the latter-day infections of what the 1990s called “DLL Hell.”

Image source: https://www.netlify.com/blog/2018/08/23/how-to-easily-visualize-a-projects-dependency-graph-with-dependency-cruiser/

Software has so many dependencies, and our requirements on them are so complex, that conflicts at least occasionally require expert intervention.

A first step to any resolution is a systematic approach to dependencies. Consider these three broad strategies:

  • Import or include or reference libraries without qualification. Pick up whatever the operating environment thinks is best, or at least most recent. Leave details of release management to operating-system specialists who can best handle them.
  • “Pin” dependencies to specific releases that are “known good,” and change that configuration only when security patches for those releases become unavailable. Programmers concentrate on their own code and accept that, when a migration to new versions is necessary, it is likely to involve shutting down all other progress, for anything from a weekend to a month.
  • “Pin” dependencies, but update them on a short cycle. Refresh dependencies every week or even daily.

Each of these strategies has its place. Let’s take a deeper look.

Comparing strategies for updating

First, understand that one team might even juggle distinct strategies at different levels. It can manage its operating system conservatively, expecting to minimize and even eliminate updates during the lifetime of a server used largely to run containers, at the same time as programmers aggressively require the latest versions of the packages for their chosen languages. Or the rhythms can go the other way: Every weekend patches for the operating system are checked and installed, while programmers code against language-specific modules from months or even years earlier.

The computing world is large and complex, and it’s difficult to generalize about all the ways security, compliance, marketing, operations, engineering culture, quality assurance and other dimensions come together to decide these choices.

To help appreciate the possibilities, consider a few concrete instances. A high-volume web application balances load across a fleet of servers, each running a special production container. The attack surface of the container runners is minimal; they’re automatically provisioned and deployed, with rare need to update or maintain them.

The containers themselves execute large Java-based monoliths. Gradle manages packages the application requires. This particular application relies on a combination of public and private repositories. The repositories are generally fixed for three months at a time. What happens if an error turns up in a library on which the application depends?

One possibility is that the error’s resolution might be scheduled for a release a few quarters in the future. More severe defects can be fixed while maintaining the rigor of the system by coding around the approved library rather than upgrading it. Suppose, for instance, that an open-source cryptographic function is discovered to be flawed. Rather than update the dependency directly and risk a cascade of transitive dependency changes, the application might supply its own copy of the function with the fix applied. A few release cycles later, the corrected public version of the function presumably will be approved for use in the application’s configuration, and the local copy can be discarded as redundant.
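The workaround described above can be sketched as a small shim. The scenario in the text is Java and Gradle; this sketch uses Python only to keep the idea compact. Every name in it is hypothetical: the function names, the version numbers, and the flaw itself (a digest comparison that is not constant-time) are stand-ins, not a description of any real library.

```python
import hmac

def parse_version(text):
    """Turn '2.3.1' into (2, 3, 1). Assumes numeric versions of equal length."""
    return tuple(int(part) for part in text.split("."))

def library_digests_match(a, b):
    # Hypothetical stand-in for the flawed function in the approved library.
    return a == b

def patched_digests_match(a, b):
    # Local copy carrying the fix: a constant-time comparison from the stdlib.
    return hmac.compare_digest(a, b)

def select_digests_match(approved_version, fixed_in):
    """Use the local patch only while the approved library predates the fix."""
    if parse_version(approved_version) < parse_version(fixed_in):
        return patched_digests_match
    return library_digests_match

# Hypothetical versions: the approved release is 2.3.1, the fix landed in 2.4.0.
digests_match = select_digests_match(approved_version="2.3.1", fixed_in="2.4.0")
print("using:", digests_match.__name__)
```

Once the approved version reaches the release that contains the fix, the selector returns the library function, which is a reasonable signal that the local copy can be deleted.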

An internally used extract-transform-load (ETL) tool for business intelligence reporting coded in Python might be handled quite differently. In this case, the source code barely exceeds 10,000 lines, and it emphasizes use of the latest data science libraries. Maintainers keep a requirements.txt with specific version numbers, to help ensure results are reproducible. At the same time, the tool’s unit tests and other tooling are strong enough to allow its programming team to advance those versions routinely and aggressively once a week. When new functionality becomes available in an external data-science library, that functionality can be available for production-level experiments just a day or two later.
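A team working this way might want a quick check that every entry in requirements.txt really is pinned. The following is a minimal sketch under simplifying assumptions: it recognizes only plain `name==version` lines, ignores extras, environment markers, and hashes, and treats anything else as unpinned. The sample file contents and the function name `split_pinned` are invented for illustration.

```python
import re

# Matches an exactly pinned requirement such as "somepackage==1.0.3".
# Deliberately simplified: wildcards such as "==1.*" do not count as pinned.
PINNED = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*([A-Za-z0-9.!+-]+)$")

def split_pinned(requirements_text):
    """Return (pinned, unpinned): a {name: version} dict and a list of lines."""
    pinned, unpinned = {}, []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = PINNED.match(line)
        if match:
            pinned[match.group(1).lower()] = match.group(2)
        else:
            unpinned.append(line)
    return pinned, unpinned

# Hypothetical requirements.txt contents for the ETL tool.
SAMPLE = """\
# data libraries
pandas==1.0.3
numpy>=1.18
requests
scikit-learn==0.22.2  # pinned
"""

pinned, unpinned = split_pinned(SAMPLE)
print("pinned:", pinned)
print("needs a pin:", unpinned)
```

A check like this could run alongside the unit tests, so that a weekly version bump never quietly turns into an unpinned dependency.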

Advice

The first architectural goal for dependency management should be clarity. No one configuration is best for all situations, and effort spent optimizing toward an abstract ideal is nearly always misplaced. Instead, concentrate on documenting existing practices and ensuring their consistency and sustainability. Expose assumptions with explicit language and even tooling, so they become more manageable.

Do developers program in the same operating system that runs in the production data center? Is a dedicated staging environment available to the quality assurance team? What mechanisms keep dependencies uniform across environments? Do requirements issued for security or business continuity reasons respect that the mainstream cultures around different programming languages often set different defaults for package maintenance? Does someone on the team understand how PHP packages are managed differently from Rust ones?

Once accurate information of all these sorts is available, it becomes easier and more effective to decide between different alternatives.

Whenever possible, use computers to help with these tasks and constraints. Write simple tools to scan configurations and report on:

  • Libraries or versions that appear no longer to be actively maintained
  • Releases that are near an end-of-life maintenance deadline
  • Packages that are in conflict
  • The distance between current package releases and those in use
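Reports like these need not be elaborate. The sketch below covers three of the four items (end-of-life deadlines, conflicts, and release distance) against a hard-coded inventory. It assumes purely numeric, dot-separated version strings, and every package name, version, and date in it is invented; a real tool would read these from lock files and a release feed.

```python
from datetime import date

def parse_version(text):
    """Turn '1.18.2' into (1, 18, 2). Assumes numeric, dot-separated versions."""
    return tuple(int(part) for part in text.split("."))

def release_distance(in_use, latest):
    """Name the first version component that differs and by how much."""
    old, new = parse_version(in_use), parse_version(latest)
    width = max(len(old), len(new))
    old += (0,) * (width - len(old))  # pad so '2.4' compares like '2.4.0'
    new += (0,) * (width - len(new))
    labels = ("major", "minor", "patch")
    for index, (a, b) in enumerate(zip(old, new)):
        if a != b:
            label = labels[index] if index < len(labels) else "other"
            return label, b - a
    return "current", 0

def find_conflicts(pins_by_component):
    """Return {package: {version: [components]}} where pins disagree."""
    seen = {}
    for component, pins in pins_by_component.items():
        for package, version in pins.items():
            seen.setdefault(package, {}).setdefault(version, []).append(component)
    return {pkg: vers for pkg, vers in seen.items() if len(vers) > 1}

def near_end_of_life(eol_dates, today, warn_days=90):
    """List packages whose end-of-life date falls within warn_days of today."""
    return sorted(n for n, eol in eol_dates.items() if (eol - today).days <= warn_days)

# Hypothetical inventory; every name, version, and date below is invented.
PINS = {
    "billing": {"libalpha": "1.0.3", "libbeta": "2.4"},
    "reports": {"libalpha": "1.2.0", "libgamma": "0.9.1"},
}
LATEST = {"libalpha": "1.2.0", "libbeta": "3.0.1", "libgamma": "0.9.1"}
EOL = {"libbeta": date(2020, 7, 15), "libgamma": date(2021, 6, 1)}

for component, pins in PINS.items():
    for package, version in pins.items():
        print(component, package, version, release_distance(version, LATEST[package]))
print("conflicts:", find_conflicts(PINS))
print("near end of life:", near_end_of_life(EOL, today=date(2020, 6, 1)))
```

Passing `today` explicitly keeps the end-of-life check repeatable, which matters if the report runs in continuous integration.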

This kind of reporting helps inform higher-level planning, as well. Your programmers might be eager for a new release of JavaScript that will help them write functionality crucial for a new marketing campaign. If a targeted browser won’t implement that JavaScript version for another 14 months, though, or if the new JavaScript obsoletes an implementation of an external dependency … well, it’s far better to recognize such conflicts before a major project starts.

Modern programming depends on a dense forest of dependencies across languages, operating systems and other technologies. Manage those dependencies with care and precision to minimize unpleasant surprises. Take initiative to control dependencies, rather than have dependencies control your programming.


About the Author

Cameron Laird is an award-winning software developer and author. Cameron participates in several industry support and standards organizations, including voting membership in the Python Software Foundation. A long-time resident of the Texas Gulf Coast, Cameron's favorite applications are for farm automation.
