Macro photography under Linux

Stanislav Bokser, 123RF

Sharp Contrast

In macro photography, focus stacking merges multiple images into a single, uniformly sharp picture. Under Linux, the enfuse tool performs this task. Alternatively, you can merge the images manually in GIMP, which opens up additional possibilities.

Macro photography fascinates many people because it reveals details that would otherwise be overlooked. However, it is usually impossible to capture an entire object sharply in a single macro photo, because the depth of field does not cover the object's full depth. This is mostly due to the optical properties inherent in the combination of camera and lens. The larger the sensor and the longer the focal length, the shallower the depth of field. Beyond a certain point, an open aperture captures a sharp region that is just a few millimeters deep. As an aside, these optical properties also explain why compact cameras with small sensors and lenses frequently do a surprisingly good job of producing photographs with a large depth of field. Even so, the quality of the sensors and lenses in these cameras is normally not as good as it is in larger cameras.

Stopping down strongly improves the depth of field (see the "Stopping Down" box). However, this approach extends the sharp region only by a limited amount. The f-number is the ratio of the focal length to the diameter of the entrance pupil, and contemporary lenses typically stop down no further than f/22. Older lenses often allowed larger f-numbers, but such small apertures quickly lead to diffraction and, in turn, to blurring. As a result, stopping down may not be a good solution. Problems that occur with this approach include:

Stopping Down

Stopping down is a technique in which the aperture of the lens's iris is reduced with the aim of increasing the depth of field (i.e., a longer range is in focus). The more you close the iris, the greater the depth of field, but the less light reaches the image sensor of a digital camera or the film of a film camera, resulting in darker pictures. You can compensate for this by lengthening the exposure time (that is, keeping the shutter open longer), which is inconvenient without a tripod or with moving objects; by increasing the ISO on a digital camera; or by using faster film in an analog camera.

  • The sharpness is limited to a relatively narrow region even with a large stop down.
  • The prolonged exposure times mean that even slow moving objects are often impossible to photograph.
  • Sharpness is not optimal with a small aperture opening, and it is often not possible to manually adjust for optimization.
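
To put rough numbers on the trade-off described in the "Stopping Down" box: with the f-number N defined as the focal length f divided by the entrance pupil diameter D, the light reaching the sensor scales with 1/N², so the exposure time t needed for the same brightness scales with N². The following is a worked illustration that assumes constant ISO and constant scene brightness; the chosen f-numbers are examples, not values for a particular lens.

```latex
N = \frac{f}{D}, \qquad t \propto N^{2}
\quad\Longrightarrow\quad
\frac{t_{f/16}}{t_{f/8}} = \left(\frac{16}{8}\right)^{2} = 4
```

Closing the aperture from f/8 to f/16 therefore costs two stops of light: an exposure of 1/100 s becomes 1/25 s.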

Photographers have long thought about solving the depth of field problem by combining multiple images with different focus points. This technique takes the sharp regions from multiple, nearly identical images, all shot from the same angle with lighting that is as similar as possible, and uses them to cover the regions that are out of focus. This approach assumes that you have the necessary images and also the means for cutting small parts out of them. Gimp's layer masks provide the tool you need for this type of digital image processing. The term "mask" refers to the regions of an image that are displayed or hidden on a layer depending on the color used in the mask: white on a layer mask makes the layer's contents visible, and black makes them invisible. All of the regions from the various images have to be put together as layers in order to generate a single image that is sharp throughout.

Modern RAW converters can help to detect the sharp regions of an image in digital negatives. Programs like darktable and RawTherapee have indicators for marking the regions that are in focus (Figure 1).

Figure 1: RawTherapee has a function that shows where an image's sharp regions can be found in a digital negative. Activate this function using the button shown by the yellow arrow.

Creating Images

There are some things to watch out for when taking the images to ensure that you get good results. For one, you definitely need a tripod for focus stacking. The algorithms in the programs are not good enough to hide larger variations between the partial images. Artifacts remain and mar the results.

Many focus stacking images are generated in studios and laboratories (see the "Focus Stacking and Microscopy" box), because conditions there can be controlled. Even the smallest movement of air can have an impact on an extremely lightweight object. Likewise, issues can arise with changes in lighting. Consequently, it makes sense to place both the camera and the light sources on tripods and then make sure they don't interfere with each other. Camera manufacturers recommend that you switch off the camera's image stabilizer when using a tripod.

Focus Stacking and Microscopy

The focus stacking method was originally developed for microscopy, where the depth of field is ordinarily just a few micrometers. Given this limitation, it is practically impossible to generate a sharp image in microscopy without focus stacking. It should not come as a surprise that some of the best instructions and information come from related Internet sites. One such site [1] provides a short and easily understood set of instructions, which are also based on enfuse and other tools.

You should use a uniform black background in order to achieve good contours and contrasts. You should not point the lighting directly at the background. A side light avoids hard shadows and garish colors; it also helps emphasize the three-dimensional quality of an object. Moreover, it is a very good idea to set the exposure compensation to at least minus one stop. This darkens the background accordingly without making the object itself dark.

A number of different factors influence the sharpness or depth of field. In order to generate suitable images, you should become familiar with the following:

  • Distance between lens and object
  • Focal length
  • Sensor size
  • F-number
  • ISO value

Currently, there are at least three methods for creating images for use with the focus stacking technique:

  • Manual focus: Focus on the closest part of the object, take an image, then focus on a part that is further away, and so on. The advantage of this method is that you can see right away where the focus lies and how sharp it is. The disadvantage is that a region cannot be made sharp later if you failed to take the image that would have been needed for it, which can render the whole series useless.
  • Touchscreen: Many modern cameras have a touchscreen for setting the focus point and releasing the shutter. This simplifies the creation of images when compared to manual procedures. The touchscreen method also lowers the risk that you will overlook a region. A word of advice: Take more images than you need rather than too few. A possible downside is that this method offers only limited precision when selecting regions.
  • Camera focus stacking functions: Higher-end cameras today come with special focus stacking functions as part of their menu. These functions vary from manufacturer to manufacturer and from model to model. One camera from Olympus works as follows: You specify a focus step and the number of steps, both as absolute values. You then focus on the closest point of the object. When you release the shutter, the camera takes the specified number of images at the specified step size. The disadvantage is that it takes some experience to figure out suitable parameters. The advantage is that the function is easy and fast to use.

It is a good idea to pay careful attention to the first image, the one focused closest to the lens. It is easy to make mistakes with this image, because it often contains regions in the foreground that are not readily visible. A good habit is to take additional images of this region for the stacking series. These prove helpful if you want to correct something manually (Figure 2).

Figure 2: Four layers and some edge definition were sufficient for this image. This is due to the relatively small lens in comparison to the size of the image.

Another piece of advice is that you should take lots of photographs. You should try out different variations instead of sticking with just one. The best thing to do is to make 12 photos, or better yet 20. It takes some time for you to develop an eye for the correct angle, lighting, etc. Practice makes perfect.

Combining Images

The partial images have to be aligned on top of one another as precisely as possible even if you want to process the images automatically or just very quickly. Alignment can prove to be more difficult than it might at first seem even if you have used a tripod when making the images. Figure 4 shows that the smallest of flaws stands out on a macro photo.

Figure 4: Even small flaws stand out on composites that have been generated automatically. Large flaws require manual corrections.

The task of alignment can often be accomplished using simple tools like the command-line program align_image_stack from the Hugin suite of programs. This program has a simple syntax:

align_image_stack <options> <input files>

The input files are read and analyzed and then written to the current directory as numbered output files, potentially with a prefix. In the process, the command considers movement and distortions existing between the input files, and then if possible, it generates the output files so that they are congruent.

Table 1 shows the most important options of the command.

Table 1

align_image_stack Options

Option | Function
-o <output file> | Merges the files into one high dynamic range (HDR) image. This option is NOT used for focus stacking.
-a <prefix> | Produces aligned image files whose names start with the prefix. Typically, "align" or something similar is used. If you specify a directory/ as part of the prefix, align_image_stack sets it up and writes the output files there.
-v | Verbose: triggers detailed messages, which can be important when problems arise.
-e | Interprets the input files as "full frame fisheye" images, for images made with a corresponding lens. The default interpretation is for rectilinear images.
-l | Interprets the input files as linear.
-t <pixel> | Sets an error limit in pixels for control points. Control points that deviate by more than this limit are ignored, and the program looks for other control points.
-c <number> | Defines the number of control points that are used; the default is 8.
-m | Optimizes the field of view for all images except the first. It is used during focus stacking and helps to even out small deviations between the images.
-d | Distortion: compensates for minor deviations caused by lens distortion in rotated and distorted images.
-i | Compensates for minor shifts between the input files.
--distortion | Takes the information from the Lensfun database into account in order to remove distortions. This option is useful if the lens you are using is included in this database.
--gpu | Uses the GPU for computing. Depending on your hardware, this may or may not deliver results much more quickly.
--use-given-order | By default, align_image_stack arranges the input files according to the exposure values stored in the EXIF data. With this option, the files are processed in the sequence given on the command line, which is potentially helpful when dealing with alignment problems.

A practical tip: Apart from -m, -a, and possibly -v, you will not need any additional options to achieve sufficiently good alignment if you have good input files. However, you should try out --gpu to see whether it increases the processing speed significantly. Should problems occur, it is a good idea to turn to -d, --distortion, -i, and -t, as well as --use-given-order.

A typical call, run in the directory containing the input files, looks as follows:

align_image_stack -v -m -a align *jpg

This generates the aligned files align0000.tif, align0001.tif, align0002.tif, etc. TIFF is a good choice of output format here: it supports an alpha channel, which enfuse in turn uses if it is available. If you use this command frequently, you should define an alias:

alias ALIGN='align_image_stack -v -m -a align'
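
For repeated use, the same call can also be wrapped in a small shell function instead of an alias. The following is only a sketch of one possible arrangement: the function name align_stack and the ALIGN_TOOL variable are assumptions made for this example, whereas the options -v, -m, and -a are the ones listed in Table 1.

```shell
#!/bin/sh
# Hypothetical helper for aligning a focus stacking series.
# ALIGN_TOOL may be overridden (e.g. ALIGN_TOOL=echo for a dry run).
ALIGN_TOOL=${ALIGN_TOOL:-align_image_stack}

# Usage: align_stack <prefix> <image1> <image2> [...]
align_stack() {
    if [ "$#" -lt 3 ]; then
        echo "usage: align_stack <prefix> <image1> <image2> [...]" >&2
        return 1
    fi
    prefix=$1
    shift
    # -v: verbose, -m: optimize field of view, -a: write aligned TIFFs
    "$ALIGN_TOOL" -v -m -a "$prefix" "$@"
}
```

Calling align_stack align *jpg in the image directory then corresponds to the command shown above, and the function refuses to run if fewer than two images are given.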

If align_image_stack does not produce good results, you can still put the partial images in place manually. This is not a problem with Gimp. First, load one partial image and then the next one as a new layer, using Open as Layers … in the File menu. Now change the layer Mode of the uppermost layer to Grain extract. Image parts that are identical in both layers then appear in gray (Figure 5).

Figure 5: Gimp lets you align layers manually. You should use the Grain extract layer mode for the top layer.

You can use the Move tool and the arrow keys to put the upper layer in place as best you can, as shown in Figure 5. If the layers lie exactly on top of one another, the contours disappear into a uniform gray tone. However, it is often not possible to place the layers perfectly. In that case, match the layers as closely as possible in the regions whose sharp parts you want to use. Then change the layer mode back to Normal, right-click on the topmost layer, and select Add alpha channel from the pop-up menu. Then erase the image regions that are unsuitable or not sharp. Figure 6 shows the result.

Figure 6: The result of a focus stacking series that underwent a moderate amount of manual processing.

Merging with enfuse

The standard tool for actually merging partial images into a composite is a free tool called enfuse [2]. This program is unique in that it uses a special algorithm, the Burt-Adelson algorithm [3], which can detect regions that are in focus and also create seamless transitions in brightness across multiple images. You need both of these capabilities when focus stacking.

The enfuse program is available as command-line software for Linux, Windows, and the Mac. Using it is easy:

enfuse <options>... <input files>

The input files must be aligned exactly. Otherwise, you will get ghost images like the ones shown in Figure 4. The TIFF files generated with align_image_stack are a good choice. Normally, enfuse processes the images in the sequence you have specified on the command line. If that sequence does not make sense, you can define the desired sequence in a response file. A response file is used in place of the input files and is prefixed with an @, as in:

enfuse ... @<response-file>
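
A response file is a plain text file with one input file name per line. As a sketch, the function below writes the files passed to it, in the order given, into such a file. The function name write_response_file and the file name stack.list used in the note after the code are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: write a response file with one image name per line.
# Usage: write_response_file <response-file> <image>...
write_response_file() {
    list=$1
    shift
    # Start with an empty file, then append the names in the given order.
    : > "$list"
    for img in "$@"; do
        printf '%s\n' "$img" >> "$list"
    done
}
```

For instance, write_response_file stack.list align0002.tif align0000.tif align0001.tif followed by enfuse <options> @stack.list would hand the images to enfuse in exactly that order.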

enfuse was developed as a general-purpose program for merging images. It is also used for HDR images. Consequently, numerous options come into play for optimizing outcomes. Many of the program options have default settings that are unsuitable for focus stacking. Table 2 shows enfuse's most important options.

Table 2

enfuse Options

Option | Function
-l <level> | Defines the number of levels used for the analysis. By default, enfuse uses automatic recognition ("auto") of the required level. Typically, you do not have to adjust the level manually; valid values lie in the range between 1 and 29. Larger values do not automatically produce better results, but they always use much more computing time. Sometimes, though, it is very helpful to merge using more levels.
-o <output file> | Sets the name and the format of the output file. The default format is TIFF. With the --compression option, you can also generate compressed TIFFs (--compression=deflate, lzw, packbits, or none). For JPEG output, --compression accepts a quality level from 0 to 100.
-d <depth> | Defines the color depth for TIFF output. Possible values include 8, 16, and 32, as well as the floating point variants r32 and r64.
--show-image-formats | Displays the supported image formats.
-v [<level>] | Verbose: causes enfuse to report on its work in more detail. This is often quite useful.

The options shown in Table 3 control the visual results. Many of them take a floating point number between 0 and 1 as an argument. For focus stacking, you should always specify the options --hard-mask and --contrast-weight=1. The options --exposure-weight=0 and --saturation-weight=0 support these. Some of the options presented here can only be used on the command line, as they are not currently supported by GUIs.

Table 3

enfuse Options (II)

Option | Function
--exposure-weight=<floating point number> | Determines how much the exposure is considered. The default value is 1.0; use 0 for focus stacking.
--saturation-weight=<floating point number> | Controls the influence of saturation on the algorithm. The default value is 0.2; use 0 for focus stacking.
--contrast-weight=<floating point number> | Defines the influence of local contrast. The default value is 0; use 1 for focus stacking.
--contrast-edge-scale=<floating point number> | Controls edge recognition. The floating point number should normally fall in the range from 0.1 to 0.5. Each unit represents a pixel. A starting value of 0.3 is recommended for experiments. Under certain circumstances, this parameter leads to improved sharpness. Some users recommend not specifying --contrast-window-size= together with --contrast-edge-scale=. This parameter cannot be used with the enfuse GUI, because only whole numbers can be entered in this field. This restriction is a well-known bug.
--save-masks | Prevents enfuse from automatically deleting the masks it has used. You can analyze, manipulate, and reuse these masks. The program stores them in <type>-<number>.tif format. You can pass a different naming scheme as an argument to this option (a scheme, not the file names of the masks), which then also has to be passed to --load-masks.
--hard-mask | Uses "hard masks" as needed for focus stacking.
--load-masks | Loads masks instead of recomputing them. This means you can use masks that have been manipulated.
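
Put together, the focus stacking settings recommended above might look like the following sketch. The function name fuse_stack and the ENFUSE variable are assumptions made for illustration; the options themselves are the ones from Tables 2 and 3.

```shell
#!/bin/sh
# ENFUSE may be overridden (e.g. ENFUSE=echo for a dry run).
ENFUSE=${ENFUSE:-enfuse}

# Hypothetical helper.
# Usage: fuse_stack <output file> <aligned image>...
fuse_stack() {
    out=$1
    shift
    # Weight only local contrast, ignore exposure and saturation,
    # and use hard masks as recommended for focus stacking.
    "$ENFUSE" \
        --exposure-weight=0 \
        --saturation-weight=0 \
        --contrast-weight=1 \
        --hard-mask \
        -o "$out" \
        "$@"
}
```

A call such as fuse_stack result.tif align*.tif would then merge the aligned TIFFs from align_image_stack into result.tif.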

Apart from the options shown in Tables 2 and 3, there are a series of additional and very special options that can improve results:

  • --contrast-window-size=<size> defines the size of the "local contrast analysis" windows in pixels. An increase of 5 can sometimes improve the results, and it helps to prevent halos.
  • --gray-projector=<type> controls the way that the relevant regions of the images are computed during the contrast analysis. The default setting is average . Other possibilities include anti-value , l-star , lightness , luminance , pl-star , value , or channel-mixer . The last alternative is supposed to be the best, but it requires advanced knowledge of the channel mixer, which is controlled using three parameters. Otherwise, the l-star alternative produces fairly good results.
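
Because the best window size depends on the images, it can help to render several variants and compare them side by side. The loop below is only a sketch: the function name try_window_sizes, the sizes 5, 7, and 9, and the output naming are assumptions for this example, and l-star is used as the gray projector because it is reported above to give fairly good results.

```shell
#!/bin/sh
# ENFUSE may be overridden (e.g. ENFUSE=echo for a dry run).
ENFUSE=${ENFUSE:-enfuse}

# Hypothetical helper.
# Usage: try_window_sizes <aligned image>...
# Writes one result per window size: stack_w5.tif, stack_w7.tif, stack_w9.tif
try_window_sizes() {
    for size in 5 7 9; do
        "$ENFUSE" \
            --exposure-weight=0 --saturation-weight=0 \
            --contrast-weight=1 --hard-mask \
            --contrast-window-size="$size" \
            --gray-projector=l-star \
            -o "stack_w${size}.tif" \
            "$@"
    done
}
```

Comparing the resulting files at full magnification shows quickly whether a larger window reduces halos for a given series.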

There are many other options that are only used in isolated cases. The original documentation [4] describes these in detail. Some tips on these parameters are also found in Pat David's blog [5].

MacroFusion

Since most users do not get overly excited about using a command-line program, a graphical interface like MacroFusion has been available for some time for align_image_stack and enfuse . This Python program is a variation of the enfuse GUI, and it should run on all distributions. However, it offers only simplified options and possibilities. In particular, it lacks a capability for saving masks, and it also cannot load masks that have been saved. You will find more than enough instructions online, some of them excellent, for how to use this tool. Figure 7 shows how the interface looks when it starts.

Figure 7: The MacroFusion interface consists of multiple parts. The input files are above, the options are to the left, and the preview plus other functions are at the bottom.

MacroFusion combines align_image_stack and enfuse in a single interface, but it has only some of the functions you will need and others that you won't. First, you should load the desired input files using the Add button in the upper-left window. You can decide later which images you want to individually deactivate in order to test for effects. MacroFusion deposits the temporary files generated in this way in the ~/.config/mfusion/ directory. You should make sure that the directory has enough free space.

Next, you will need to select options for enfuse . Before doing so, you should activate the automatic alignment option using the Align checkbox, which is found under Align images . This section contains the options for align_image_stack . Automatic alignment is not always a good idea when you are dealing with high resolution images. See the "High Resolution Images" box for more information on this topic. Autocrop crops the output image so that all of the edges are clean. The other two options complement this feature.

High Resolution Images

Modern cameras like the Olympus OM-D E-M5 Mark II are able to generate images with a resolution of more than 7000x5000 pixels. The RAW files for this type of image are about 100MB in size; the JPEGs are at least 20MB. However, some restrictions accompany images made with this camera. These include:

  • It is absolutely necessary to use a tripod, because the images are generated out of eight different parts.
  • Currently, you have to go through an additional step and manually activate the generation of RAW files.
  • You will need several seconds of time for each image. These images are therefore only suitable for non-moving objects.
  • The automatic focus stacking function can no longer be used. You will have to release exposures of the partial images manually.
  • The alignment (align_image_stack) is still unable to handle these images properly. Therefore, you should try to achieve the best possible alignment when shooting.
  • The processing times for larger images of this type are significantly longer. Batch jobs may be a solution.
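
A batch job of the kind mentioned in the last item could be sketched as follows. The layout is an assumption made for this example: each directory passed to the hypothetical function fuse_all_series holds the already aligned TIFFs of exactly one series, and each result is written next to that directory.

```shell
#!/bin/sh
# ENFUSE may be overridden (e.g. ENFUSE=echo for a dry run).
ENFUSE=${ENFUSE:-enfuse}

# Hypothetical helper.
# Usage: fuse_all_series <directory>...
fuse_all_series() {
    for dir in "$@"; do
        name=$(basename "$dir")
        # Subshell: the cd stays local to this one series.
        (
            cd "$dir" || exit 1
            "$ENFUSE" --exposure-weight=0 --saturation-weight=0 \
                --contrast-weight=1 --hard-mask \
                -o "../${name}_fused.tif" ./*.tif
        ) || echo "fusing $dir failed" >&2
    done
}
```

Started with all series directories as arguments, the function works through them one after another without supervision and reports any series that failed.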

Figure 3 shows that it pays to work with high resolution images.

Figure 3: High resolution images made with macro photography are impressive. This image was constructed from four layers and then manually reworked. However, you can clearly see that more layers should have been used.

There are three tabs in the middle of the window where you set the options for enfuse . These are the Fusion , Expert , and Configuration tabs.

Basic settings can be specified under Fusion . Figure 8 shows the settings that are important for focus stacking. You should refer to the information above to figure out whether you need other settings.

Figure 8: The basic settings for focus stacking are found in the first tab.

The Force HardMask option in the Expert tab is essential to achieving good results. Contrast window corresponds to the --contrast-window-size= and Gray Projector to the --gray-projector= options (Figure 9).

Figure 9: You should remember to activate the Force HardMask (--hard-mask) option in the second tab in order to get good outcomes.

Normally, there is not much you will need to modify under Configuration unless you have a large display and lots of RAM. In that case, you can increase the preview size, because the default size causes details to get lost, which means that problems might become visible only once the image is finished. The output options for JPEG and TIFF images are also found here. You might want to modify these. Copy exif info incorporates the EXIF data of the partial images in the newly generated all-in-one image.

You can generate a preview with Preview using the current settings. The Before/After buttons are there for you to figure out whether these settings are better than ones used before. One click brings up the result achieved with the previous settings in the preview window.

Images that have been manually generated with align_image_stack and enfuse often look better than those prepared with MacroFusion. This illustrates the fact that some of the necessary options are still not available in MacroFusion and therefore cannot be activated. You can exclude individual images from the processing by deactivating them via their checkmarks in the Photos window. In many cases, though, this does not prove to be sufficient.

These types of mistakes are often found in high resolution images (see again the "High Resolution Images" box). Such images have to be aligned and merged more precisely in order to get good results. To do so, it makes sense to use the -d (distortion) option and the ability to process masks manually, both of which are missing from MacroFusion.

Another peculiarity involves the sequence of the images. Normally cameras generate images in the correct sequence. If you generate the images manually, then the files have to be arranged accordingly. This is not possible with MacroFusion. Instead, you have to use the command line. However, it is sufficient to create the correct sequence for align_image_stack . The TIFFs that have been aligned can then be processed with MacroFusion.
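
If the file names do not reflect the shooting order, the order can be passed explicitly on the command line. The following sketch assumes a hypothetical text file that lists the input files one per line, from the nearest to the farthest focus point, with file names that contain no spaces. The function name align_in_order and the ALIGN_TOOL variable are likewise made up for this example.

```shell
#!/bin/sh
# ALIGN_TOOL may be overridden (e.g. ALIGN_TOOL=echo for a dry run).
ALIGN_TOOL=${ALIGN_TOOL:-align_image_stack}

# Hypothetical helper.
# Usage: align_in_order <order file>
align_in_order() {
    # The file contents are split into words on purpose, which is why
    # the listed file names must not contain spaces.
    # shellcheck disable=SC2046
    "$ALIGN_TOOL" -v -m --use-given-order -a align $(cat "$1")
}
```

The aligned TIFFs written by this call can then be loaded into MacroFusion as described above.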

Masked Ball

Users experienced in image processing will probably like to use the --save-masks and --load-masks options in enfuse. The first of these saves two masks in the work directory for every image file that has been read in and processed: hardmask-<Number> and softmask-<Number>.

The soft masks are generated automatically. The hard masks appear with the --hard-mask option. They are constructed with the algorithm referred to above and mark the sharp regions of an image. --hard-mask also tells enfuse to use the hard masks for merging images. Figure 10 shows a mask like this. During merging, the white regions are taken from the underlying image; the black regions are discarded.

Figure 10: This is how the masks generated in enfuse using the --save-masks option look. They can be loaded in Gimp and manually processed.

Soft masks are always generated by default in enfuse as softmask*.tif. They contain soft transitions between individual images and are therefore grayscale images, in contrast to the black-and-white hard masks.

These masks are intended for overlaying high resolution HDR images and are only suitable for focus stacking after thorough processing, for example with the curve tool. Usually, you will get better results more quickly with the hard masks than with the soft ones.

There are two possibilities for using these masks. The first is to load them in Gimp and modify them with a paint tool (for example, a soft pencil or a hard brush). You could even use the calligraphy tool, which lets you generate very hard edges in the mask. Next, overwrite the original mask file with the modified mask and use it with --load-masks.
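
The first possibility amounts to two enfuse runs with an editing step in between. A minimal sketch follows, in which the function names, the output file names, and the ENFUSE variable are assumptions for illustration and the default mask naming scheme is left unchanged.

```shell
#!/bin/sh
# ENFUSE may be overridden (e.g. ENFUSE=echo for a dry run).
ENFUSE=${ENFUSE:-enfuse}

# Focus stacking options shared by both passes.
FOCUS_OPTS="--exposure-weight=0 --saturation-weight=0 --contrast-weight=1 --hard-mask"

# Pass 1: merge and keep the masks in the current directory.
merge_saving_masks() {
    # FOCUS_OPTS is split into separate words on purpose.
    # shellcheck disable=SC2086
    "$ENFUSE" $FOCUS_OPTS --save-masks -o first_pass.tif "$@"
}

# Pass 2: after editing the mask files in Gimp, merge again with them.
merge_loading_masks() {
    # shellcheck disable=SC2086
    "$ENFUSE" $FOCUS_OPTS --load-masks -o final.tif "$@"
}
```

Both functions must be called with the same input files in the same order, so that each edited mask is applied to the image it was generated for.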

Alternatively, you could use the masks directly in Gimp. This involves a somewhat advanced technique that also offers interesting possibilities. To get started, you should load the image belonging to the mask, for example align0001.tif . You should prepare this image by adding a layer mask by clicking on the Layer menu and then Mask and Add layer mask .

Next, load the mask as an additional layer using Open as Layers … in the File menu. Then, using Ctrl+X followed by Ctrl+V, cut this layer to the clipboard and paste it into the image as a floating selection. You then anchor it to the layer mask with Anchor layer in the Layer menu (Ctrl+H).

Now you will see the effect that the mask has (Figure 11). You can modify this mask manually and immediately observe the effect. One thing to note here is that you might want to see exactly one layer at a time during processing. Clicking on the eye of the layer in the layer dock while holding the Shift key temporarily turns off visibility for all other layers.

Figure 11: You can use hard masks in Gimp as layer masks.

With hard masks, it can be essential to create certain transitions in order to achieve realistic results. To do this in Gimp, first load the mask as a standard image. Then select all of the white regions with the Select by Color tool and extend the selection via Grow … and Feather … from the Select menu. Next, fill the new selection with white. The outcome is a somewhat larger white region with slightly softer edges, which softens the transitions.

Reworking an Image

The Processing button in MacroFusion starts the image processor you have specified in the Configuration tab under Edit with. Most of the time, this will be Gimp. Frequently, it will be necessary to rework the images that have been generated. Figure 12 illustrates a typical problem. The very good output images and the use of the layer mode "hard edges" for combining the layers are responsible for the excellent sharpness of this image. However, this leads to colored artifacts that subsequently need to be edited, for example by removing the colors or adjusting the color temperature.

Figure 12: Some typical problems are shown here. The artifacts at the lower edge of the screen need to be removed manually. These are not present in all of the input files. The color errors are also in need of correction.

Conclusion

You can generate very good macro photos with focus stacking under Linux. However, there are some issues involved. You will need to practice with lots of sample images before you can generate truly good images and understand how many levels you should use. Complex objects like those in the previous figure are difficult to capture and require a lot of effort.

MacroFusion serves as the interface for the programs used for this technique, and it produces results quickly. However, the command-line applications often yield better results. In particular, using your own masks (--save-masks, --load-masks) expands the possibilities greatly.

And finally, it doesn't take much more to begin manual image compositing. If you make do without automatic alignment via align_image_stack, you can create images of optimal quality that you can then merge at high color depth with Gimp starting with version 2.9. This method is not practical for moving objects.