Linux Software-Installation Basics

SOLUTIONS SNAPSHOT

PROBLEM:
It's no longer unusual to find Linux systems in Windows environments, so you should know the basics of installing Linux applications, a somewhat different process than installing Windows apps.
SOLUTION:
Learn the three main ways to install Linux applications.
WHAT YOU NEED:
A system with Linux installed, the Linux application you want to install, and any or all of the following: a source-code-configuration utility, a package manager, and a binary-only release of a Linux application.
DIFFICULTY:
2 1/2 out of 5

SOLUTION STEPS:

1. Obtain the Linux application you want to install.
2. If installing by source, download a source-code-configuration utility, such as GNU Autoconf.
3. Run commands to extract, configure, and install source code.
4. If installing by package manager, run the rpm command, query the package, and install it.
5. If installing by binary-only method, obtain a binary-only release of the Linux application, run the tar command to extract the application, and install it.

In "A Linux Primer for Windows Administrators" (November 2004, InstantDoc ID 44104), I discussed basic information that Windows administrators need to know about Linux. Because Linux is everywhere these days, you need to keep up with current Linux management trends and techniques, especially when it comes to how Linux works in your Windows environment. In this article, I consider an important part of administering Linux: application-installation management (i.e., how to install, remove, and clean up Linux software). I discuss installing by source, installing by using a package manager, and binary-only installation, including the advantages and disadvantages of each method. I use Red Hat Linux as the example system.

Installing by Source
You install the majority of your Windows applications, both free and commercial, by using Windows Installer or InstallShield (or a similar installation package). These applications let you automatically install, remove, and clean up applications. Linux offers automated installers as well, but not for all applications. Many Linux applications, particularly open-source applications, rely on systems administrators to compile source code and install the resulting binary files. This installation technique is known as installing by source. Although you won't find it particularly difficult to install software by source, you'll find it more difficult to remove the installed software, which I address later.

Why is installing by source popular? First, it lets developers focus on the software they're writing rather than the minutiae of installation differences between different versions of Linux and UNIX. Second, it lets systems administrators highly tune software compilation for their systems. When you compile the code, you can customize various options for your needs. In a high-performance environment, this capability can dramatically affect the usefulness of a given software package. Third, having the source code for an application can be important to security, again because you can optimize features to meet your needs. Being able to review source code before compilation gives systems administrators a measure of control over what's actually installed. (The majority of systems administrators, however, even those who strongly push open-source software, don't actively review source code before compiling and installing it.) Finally, installing by source lets end users have access to a huge range of software because most available Linux software is available as source code. To find out about the many open-source applications for Linux, see the sidebar

Be aware that when you install by source, you essentially copy source code to your computer, then run commands to compile it, typically by using the C or C++ compiler from the Linux distribution. However, you'll usually find it easier to use a precompiled package for installation, which I address in the "Installing by Using a Package Manager" section.

Installing source-based software. In the past, it was often difficult to compile and install software from source code because Linux systems differ greatly. Despite the fact that Linux systems share the same core programs and features (i.e., the kernel and system libraries), distributions (e.g., Red Hat, Debian) vary so much that source-code compilations would often work on one system but not on another.

The most popular solution to this problem is the GNU Autoconf system. Developers use Autoconf, a source-code-configuration utility, to smooth out the differences between Linux distributions so that you can easily compile source code on almost any system (including other types of UNIX). In general, expect to use Autoconf when you install by source, although small software projects might not require it.

When you compile source code, first read the included documentation. It's pretty standard to have a text file named Install included with any source code. The Install file contains instructions about how to compile the program on your system. Another common file is Readme, which often contains additional instructions, usually for configuring the software after installation.

Developers shoulder most of the work of using Autoconf. As a systems administrator, your responsibility is to read the Install document, then run the configure command, which is the end-user portion of the Autoconf system.

Before you run the configure command, first extract the software source code by using the tar command, which is much like unzipping a .zip file by using WinZip:

# tar -xzf software.tgz

Typically, when you extract software source from a tar file, a subdirectory is created with the software name and version. You need to change into that subdirectory to actually compile the software, as follows:

# cd ./software-1.0

(For software, type the name of the software as it's shown in the directory.) Now, you're ready to run the configure command:

# ./configure

The configure script performs a series of tests against the system and decides how best to compile the software. If the script determines that the source code won't compile properly against the system, it will abort with an error message. The most common error is that the server doesn't have required software installed (e.g., the GNU Compiler Collection, or GCC). After you install the necessary software, rerun the configure script to let it finish.
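Because missing build tools are the most common reason for a failed configure run, a quick check beforehand can save a round trip. The following is a minimal sketch, not part of Autoconf: the function name missing_tools is hypothetical, and the tool list in the usage comment (gcc, make) is an assumption, so consult the package's Install file for what it really needs.

```shell
#!/bin/sh
# Hypothetical helper: print the name of each listed tool that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        # command -v exits nonzero when the shell cannot find the tool.
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example use (the tool list is an assumption; adjust it per the Install file):
#   missing_tools gcc make
```

If the function prints nothing, every listed tool was found; otherwise, install the named tools before running the configure script.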

After the configure script finishes, issue the make command. The make command reads the file Makefile, which the configure script generated. This file contains a set of instructions that dictate how the source code will be compiled.

New Linux administrators often make the mistake of issuing the make command even if the configure script fails to finish. The result is that the make command either fails or compiles the source code with a default setting the administrator didn't intend to use. To prevent this situation, always use the command-line And operation, which ensures that each command runs only if the preceding command succeeded:

# ./configure && make

The make command will compile the software but won't install it. After the software is compiled, run the make install command to install the software files, including libraries and documentation, to the required location:

# ./configure && make && make install
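The whole sequence, from extraction through installation, can be sketched as a single chained script. This is an illustration under stated assumptions rather than a standard tool: the function names tarball_top_dirs and build_from_source are hypothetical, the archive is assumed to be a gzipped tarball that unpacks into one top-level directory, and the package is assumed to ship a configure script.

```shell
#!/bin/sh
# Hypothetical helper: print the top-level directory names found in a gzipped tarball.
tarball_top_dirs() {
    # tar -tzf lists member names without extracting anything.
    # Strip a leading "./", drop empty names, keep the first path component, de-duplicate.
    tar -tzf "$1" | sed -e 's|^\./||' -e '/^$/d' | cut -d/ -f1 | sort -u
}

# Hypothetical helper: extract, configure, compile, and install in one chain.
# The body runs in a subshell, so the caller's working directory is unchanged.
build_from_source() (
    tarball=$1
    # Assumption: the archive has a single top-level directory; take the first one listed.
    dir=$(tarball_top_dirs "$tarball" | head -n 1)
    [ -n "$dir" ] || exit 1
    # Each step runs only if the step before it succeeded.
    tar -xzf "$tarball" &&
        cd "./$dir" &&
        ./configure &&
        make &&
        make install
)

# Example use (the filename is an assumption):
#   build_from_source software.tgz
```

Listing the archive first also reveals tarballs that unpack straight into the current directory instead of into their own subdirectory.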

Uninstalling source-based software. The major caveat to managing software installed by source is that you'll find it laborious if not impossible to remove the software. Most software installed by source is installed to various locations under /usr/local. It can be difficult to determine where files were installed and whether newer versions were installed over older versions. (Package managers, which I address later, effectively address the challenge of removing software.)

If you're lucky, the developer will include an uninstall option in the Makefile you originally used to install the software. The following example removes the installed software by using the uninstall directive with the make command:

# make uninstall

Some developers use a deinstall directive instead of an uninstall directive. If the uninstall directive fails, try

# make deinstall

If both fail, you'll probably have to delete each installed file manually or leave the software on the system. To determine which files were installed, you can examine Makefile by using a text editor, which is tedious at best.
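The try-one-then-the-other approach can be sketched as a small function. The name try_uninstall is hypothetical, and the sketch assumes you run it from the directory that holds the Makefile used for the original installation.

```shell
#!/bin/sh
# Hypothetical helper: try each removal target in turn, stopping at the first that works.
try_uninstall() {
    for target in uninstall deinstall; do
        if make "$target"; then
            echo "removed with: make $target"
            return 0
        fi
    done
    echo "no removal target worked; files must be deleted manually" >&2
    return 1
}

# Example use, from the source directory:
#   try_uninstall
```

To preview what a target would do, make's -n option prints the commands without executing them (e.g., make -n uninstall).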

You can track which files are installed by source or altered by a source-based installation in one of two ways. The first is to use a file-integrity tool, such as Tripwire's Tripwire for Servers (http://www.tripwire.com), to track file changes on the server. In any case, you should deploy file-integrity tools enterprisewide to increase your ability to monitor file systems for unauthorized changes. The second is to use a software manager, such as GNU Stow (http://www.gnu.org/software/stow) or Depot (http://asg.web.cmu.edu/depot). These tools work like package managers to monitor which files are changed by source-based installations and to help automate software removal.
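If none of those tools is available, a rough record of what an installation added can be built from a before-and-after file listing. The following is a simplified sketch, not a substitute for Tripwire or Stow: the function names snapshot_files and new_files are hypothetical, the paths in the usage comment are assumptions, and the approach detects only newly created files, not files that were overwritten or modified.

```shell
#!/bin/sh
# Hypothetical helper: write a sorted list of every regular file under a directory tree.
snapshot_files() {
    # $1 = directory to scan (e.g., /usr/local), $2 = file that receives the list
    find "$1" -type f | sort > "$2"
}

# Hypothetical helper: print files present in the second snapshot but not in the first.
new_files() {
    # comm -13 suppresses lines unique to the first list and lines common to both.
    comm -13 "$1" "$2"
}

# Example use (all paths are assumptions):
#   snapshot_files /usr/local /tmp/before.list
#   make install
#   snapshot_files /usr/local /tmp/after.list
#   new_files /tmp/before.list /tmp/after.list > /tmp/software-1.0.files
```

Keeping the resulting list alongside the source directory gives you a starting point for manual removal later.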

Installing by Using a Package Manager
Although installing by source permits a lot of customization, package managers offer a more administrator-friendly installation method. Linux distribution vendors provide package managers that, like Windows Installer, automate most of the software installation management task.

I'll discuss the venerable Red Hat Package Manager (RPM) because of its widespread use, because both Red Hat Enterprise Linux and Novell's SUSE Linux use RPM, and because most package managers behave similarly. Keep in mind, however, that you can find other excellent package managers for Debian-based Linux (e.g., Advanced Package Tool, or APT).

The RPM system relies on three components: an installation file known as an RPM package (in Windows, comparable entities are Windows Installer and InstallShield files); the rpm command; and the RPM database. Figure 1 shows the RPM installation process. The administrator runs the rpm command to read the RPM package. The RPM package installs files on the system and possibly modifies existing files. The rpm command then updates the RPM database with information about the installed RPM package.

When you use any package manager (e.g., RPM), you typically have three major operations available: queries, installation, and removal.

Querying packages. Windows administrators will envy one feature of most Linux package managers: the ability to query a package. That is, package managers such as RPM let you query a package or the RPM database to see which files are being installed and which scripts are being executed to install or remove the software.

Let's take as an example the rcs-5.7-860.i586.rpm RPM file, which installs several binaries and documentation files. (Software developers use the Revision Control System, or RCS, to control revisions of their code.) To determine exactly which files are contained in the RPM package, you can run the following rpm command:

# rpm -qlp rcs-5.7-860.i586.rpm
/usr/bin/ci
/usr/bin/co
/usr/bin/ident
...

Be aware that RPM files have the format application-version-release.architecture.rpm (here, version 5.7, release 860, built for the i586 architecture). When you work with RPM files, you specify the full filename, but when you work with installed RPMs, you need to specify only application (you'll see this feature in action in subsequent examples).

Before you install RCS, you can see that the RPM will install /usr/bin/ci, /usr/bin/co, and /usr/bin/ident, along with several other files that the sample command doesn't show. You perform the query operation by specifying the -q (query), -l (list files), and -p (specify a package file to examine) options.
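RPM filenames commonly follow the layout name-version-release.architecture.rpm, so in the example filename 5.7 is the version and 860 is the release. The following sketch splits a filename into those parts by using only shell parameter expansion. The function name parse_rpm_filename is hypothetical, and the sketch assumes the filename follows that common layout.

```shell
#!/bin/sh
# Hypothetical helper: split an RPM filename into its parts.
# Assumes the common name-version-release.architecture.rpm layout.
# The body runs in a subshell so its variables don't leak into the caller.
parse_rpm_filename() (
    base=${1##*/}          # drop any leading directory
    base=${base%.rpm}      # drop the .rpm extension
    arch=${base##*.}       # text after the last dot
    base=${base%.*}
    release=${base##*-}    # text after the last hyphen
    base=${base%-*}
    version=${base##*-}    # text after the next-to-last hyphen
    name=${base%-*}        # everything before it (may itself contain hyphens)
    echo "name=$name version=$version release=$release arch=$arch"
)

# Example use:
#   parse_rpm_filename rcs-5.7-860.i586.rpm
```

The authoritative values are the ones stored inside the package itself, which rpm -qip reports; the filename is only a convention.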

You can also use RPM to determine which package owns a given file. For example, you might perform a file audit to determine whether a given file is on your server. To determine which package owns a file, specify the options -q and -f (from file), as follows:

# rpm -qf /usr/bin/ci
rcs-5.7-860

Other query options are available (e.g., you can review which scripts a package will run to complete an installation). To learn more about the rpm command and various query options, read the rpm manpage.
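The query options discussed above can be collected into two small helpers. This is a sketch, assuming the rpm command is installed; the function names inspect_rpm and owner_of are hypothetical, whereas the options themselves (-i, -l, -p, -f, and --scripts) are standard rpm query options.

```shell
#!/bin/sh
# Hypothetical helper: summarize a package file before installing it.
inspect_rpm() {
    pkg=$1
    rpm -qip "$pkg"            # -i: name, version, description, and other header fields
    rpm -qlp "$pkg"            # -l: files the package will install
    rpm -qp --scripts "$pkg"   # --scripts: install and uninstall scriptlets
}

# Hypothetical helper: report which installed package owns a file.
owner_of() {
    if owner=$(rpm -qf "$1" 2>/dev/null); then
        echo "$1 belongs to $owner"
    else
        echo "$1 is not owned by any installed package"
    fi
}

# Example use:
#   inspect_rpm rcs-5.7-860.i586.rpm
#   owner_of /usr/bin/ci
```

During a file audit, a file that no package owns was probably installed by source, by a binary-only installer, or by hand.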

Installing packages. With the knowledge you've gained from querying a package, you can now move forward to installing the package on the server. RPM makes this process simple—you simply specify the -i (install) option of the rpm command, as follows:

# rpm -i rcs-5.7-860.i586.rpm

If all goes well, no output occurs. (This approach is typical of Linux tools: only errors are printed.) Many administrators prefer to see the status of RPM installs, especially if the RPM is large and will take several minutes or more. To watch the progress of an installation, use the -h (hash) and -v (verbose) options:

# rpm -ihv rcs-5.7-860.i586.rpm

You'll see on-screen output similar to this:

Preparing...   ################################# [100%]
   1:rcs       ################################# [100%]

A major problem that package managers help solve is the dependency problem. As a Windows administrator, you deal with dependencies every day. Certain applications won't work with a given version of a DLL, or you must have one application installed before another will work. Dependency problems also occur in Linux.

RPM alone doesn't totally solve the problem; it provides the information you need to solve the problem. With RPMs, if you try to install software that has a failed dependency (i.e., you don't have the required software installed), the rpm command will fail, and it will list which dependencies caused it to fail. You must then determine what to install to solve the dependency, which isn't necessarily a simple task. Fortunately, most package managers provide a way to discover dependencies. For example, various add-ons to RPM (e.g., Novell's ZENworks Linux Management) can address dependencies, as can the APT system (which originated in Debian). APT tells you which dependencies you need to address and automatically installs the software needed to address them. The APT HOWTO (http://www.debian.org/doc/manuals/apt-howto/index.en.html) offers an excellent way to learn about dependencies.
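One way to find dependency problems before anything changes on the system is to do a dry run first. The following is a minimal sketch, assuming the rpm command is installed; the function name checked_install is hypothetical, whereas --test and -R (--requires) are standard rpm options.

```shell
#!/bin/sh
# Hypothetical helper: install a package file only if a dry run reports no problems.
checked_install() {
    pkg=$1
    # --test checks for conflicts and unmet dependencies without installing anything.
    if rpm -i --test "$pkg"; then
        rpm -ihv "$pkg"
    else
        echo "not installing $pkg; capabilities it requires:" >&2
        # -R (--requires) lists the capabilities the package depends on.
        rpm -qpR "$pkg" >&2
        return 1
    fi
}

# Example use:
#   checked_install rcs-5.7-860.i586.rpm
```

The sketch only reports what is missing; finding and installing the packages that provide those capabilities is still up to you or to a tool such as APT.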

Removing packages. When you use a package manager to install software, always use the package manager to remove the software. Doing so removes the software cleanly and restores any original settings. (This practice is akin to running Uninstall for a Windows program rather than merely deleting the application directory under \Program Files.)

With RPM, you use the -e (erase) option to remove a package, as follows:

# rpm -e rcs

Dependencies might again come into play. Suppose you have tools installed that rely on the RCS package (e.g., a graphical RCS front end). If you used RPM to install the graphical front end, it will have registered RCS as a dependency. RPM will abort if you try to remove RCS while the graphical front end is installed, as the following error message shows:

# rpm -e rcs
error: Failed dependencies:
  /usr/bin/ci is needed by (installed) graphic-rcs-1.1.0

When you encounter this situation, you have two options. The first is to remove graphic-rcs, then remove RCS. This option ensures that you don't have orphaned applications installed that won't run correctly because they don't have required software installed. The second option is to force rpm to remove RCS despite the dependency error, as follows:

# rpm -e --nodeps rcs

Note that the use of the --nodeps option, which tells rpm to skip the dependency check, is typically not recommended. The only time this option might be desirable is when you need to remove the existing RPM installation to install an application from source.
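A gentler alternative to overriding the dependency check is to ask rpm whether a removal would succeed before attempting it. The following is a sketch, assuming the rpm command is installed; the function name checked_remove is hypothetical, and --test is a standard option of rpm's erase mode.

```shell
#!/bin/sh
# Hypothetical helper: remove a package only when a dry run reports no dependents.
checked_remove() {
    pkg=$1
    # --test performs the dependency check but leaves the package installed.
    if rpm -e --test "$pkg"; then
        rpm -e "$pkg"
    else
        echo "$pkg is still required; remove the packages listed above first" >&2
        return 1
    fi
}

# Example use:
#   checked_remove rcs
```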

The kinds of dependencies I've described rarely occur in Windows. Most Windows applications include all the tools they need to run, although you might experience problems with DLL versions. Linux and UNIX, however, can have a much larger set of dependencies between software packages. Keep this difference in mind when you install and remove Linux applications. Also, note that RPMs can conflict with one another (i.e., different versions of software ask for different versions of RPMs).

Binary-Only Installation
Finally, commercial software vendors often offer binary-only releases for Linux systems. That is, they offer an installation package that doesn't use a package manager but also doesn't include the source code to compile.

By using a binary-only installation method, vendors don't have to release several versions of the software to support different Linux distributions. Instead, they compile the application so that it doesn't depend heavily on the underlying system. They often use static linking of necessary libraries instead of relying on the underlying dynamic library loader (dynamically loaded shared libraries play much the same role in Linux that DLLs play in Windows). Still, some vendors support binary-only releases only on specific versions of Linux (e.g., Red Hat, SUSE).

Most binary-only installations are contained in tarballs, archives of files created with the UNIX tar utility. A tarball is a gzipped (compressed) tar package, with an extension of .tgz or .tar.gz. Binary-only installations typically contain an install script that installs the package, as follows:

# tar xfz application.tgz 
# cd application/ 
# ./install

where application is the name of the application file that you're installing. The actual details for an application can differ. Read the accompanying Install file or Readme file as well as any other documentation the application vendor supplies.

You'll rarely find an easy uninstall option. Some binary-only installation packages include an uninstall script, but most require that you manually delete files, as I mentioned previously.
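Before you extract a binary-only tarball, you can list its contents to see whether it ships an install script, an uninstall script, or documentation. The following is a minimal sketch: the function name list_install_files is hypothetical, and the filename pattern is an assumption because vendors name these files in different ways.

```shell
#!/bin/sh
# Hypothetical helper: list install scripts, uninstall scripts, and Readme files
# inside a gzipped tarball without extracting it.
list_install_files() {
    # Match on the last path component only, ignoring case.
    tar -tzf "$1" | grep -i -E '(^|/)(install|uninstall|readme)[^/]*$'
}

# Example use (the filename is an assumption):
#   list_install_files application.tgz
```

If no uninstall script appears in the listing, save the full output of tar -tzf so that you have a record of the files the archive unpacks.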

Installation Made Easy
By using package managers to automate much of the software-installation management task, you'll find managing software installation in Linux somewhat similar to managing that process in Windows. And with knowledge of source installs, package-manager installs, and binary-only installs, you can manage even the thorniest of software installations.
