Hi Joan,

Thanks for the contribution, and sorry for the delayed response. I've only now found time to digest everything. If you're interested in the image_pipeline, you should consider joining the Perception Pipelines SIG, or at least its mailing list.

I created a ticket to track your code and suggestions: https://code.ros.org/trac/ros-pkg/ticket/5201. Further comments below.

On Thu, Sep 29, 2011 at 10:05 AM, Joan Pau Beltran <joanpau.beltran@uib.cat> wrote:
Electric Emys includes a nodelet to perform cropping and binning of images.
In our research group we find this very useful and had been wondering why it was missing.
So it is very good news to see it there. Congratulations! We hope to see this node documented on the image_proc wiki soon.

Thanks for the reminder :). I've updated the reference docs.

If the input image encoding is bayer, this nodelet performs crop/binning and debayering.
We would like to discuss with the ROS community whether it would be more convenient to perform only the crop/binning and NOT the debayering. The debayering, along with possibly other image_proc tasks, could then be performed by another image_proc node in the subsampled image topic namespace. From our point of view, this would have two main advantages:
  1. It would allow more complex debayering methods to be applied to the scaled images by image_proc, with zero maintenance cost for the crop/binning nodelet.
I have problems with this. If you're doing 2x2 decimation of a Bayer image, you have all the information to transform a 2x2 Bayer cell into one RGB pixel right there. The "debayering" is trivial, no interpolation needed whatsoever, and you lose very little information (just from averaging the 2 green pixels in each cell).

With your approach, you throw away 3/4ths of the information up front, and hope that a complex (hence expensive) debayering algorithm can fill some of that back in. You burn a lot more CPU and, no matter how sophisticated your algorithm, you won't get results as good as the trivial downsample-to-RGB approach.
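
To make that concrete, here's roughly what I mean by the trivial downsample-to-RGB, sketched for an 8-bit RGGB pattern (other Bayer orderings just permute which corner of the 2x2 cell holds R and B; this is a sketch, not the nodelet code):

#include <opencv2/core/core.hpp>

// Downsample a Bayer image by 2 while converting to RGB. Each 2x2 Bayer
// cell already contains one R, two G and one B sample, so no interpolation
// is needed -- only the two greens get averaged.
cv::Mat bayerCellsToRgb(const cv::Mat& bayer)  // CV_8UC1, RGGB
{
  cv::Mat rgb(bayer.rows / 2, bayer.cols / 2, CV_8UC3);
  for (int y = 0; y < rgb.rows; ++y)
  {
    const uchar* row0 = bayer.ptr<uchar>(2 * y);
    const uchar* row1 = bayer.ptr<uchar>(2 * y + 1);
    cv::Vec3b* out = rgb.ptr<cv::Vec3b>(y);
    for (int x = 0; x < rgb.cols; ++x)
    {
      const uchar r  = row0[2 * x];
      const uchar g0 = row0[2 * x + 1];
      const uchar g1 = row1[2 * x];
      const uchar b  = row1[2 * x + 1];
      out[x] = cv::Vec3b(r, (g0 + g1) / 2, b);  // RGB channel order
    }
  }
  return rgb;
}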
  2. It would allow subsampling Bayer images to reduce their size and send them over the network without losing the color information (as when sent in mono) or tripling the required bandwidth (as when sent in color).
True, full RGB takes 3x the bandwidth of Bayer. Then again, imposing a Bayer pattern is a pretty simplistic compression method. We have image_transport for applications where bandwidth is limited.
The subsampling is quite easy to implement; it just changes depending on the encoding. Subsampling of mono and color encoded images is quite intuitive. For Bayer encodings, the idea is to consider each set of 2x2 pixels as a color cell, and these color cells are the subsampling units. The resulting image has the same encoding as the original one.

As implemented using decimation, you'll get bad aliasing artifacts when downsampling Bayer images, even in the 2x2 case where you have the information to do it perfectly. For each set of 4 rows/columns you throw out 2 adjacent ones.

Cameras that support "color binning" in hardware do something like this, but using binning/subsampling (averaging blocks of Bayer cells) instead of decimation (throwing away all but one). In that case the aliasing is mitigated.
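
To illustrate the difference, a quick sketch of both (again assuming 8-bit RGGB and a factor-of-2 reduction of the Bayer cells; illustration only, not the nodelet code):

#include <opencv2/core/core.hpp>

// Decimation: keep the top-left Bayer cell of each 2x2-cell block and drop
// the rest. Fast, but aliases.
cv::Mat decimateBayer2(const cv::Mat& bayer)  // CV_8UC1, RGGB
{
  cv::Mat out(bayer.rows / 2, bayer.cols / 2, CV_8UC1);
  for (int y = 0; y < out.rows; ++y)
    for (int x = 0; x < out.cols; ++x)
      out.at<uchar>(y, x) =
          bayer.at<uchar>((y / 2) * 4 + (y % 2), (x / 2) * 4 + (x % 2));
  return out;
}

// Binning: average the matching samples of all four cells in each block,
// which is roughly what hardware color binning does. The aliasing is
// mitigated because no sample is simply thrown away.
cv::Mat binBayer2(const cv::Mat& bayer)  // CV_8UC1, RGGB
{
  cv::Mat out(bayer.rows / 2, bayer.cols / 2, CV_8UC1);
  for (int y = 0; y < out.rows; ++y)
    for (int x = 0; x < out.cols; ++x)
    {
      const int y0 = (y / 2) * 4 + (y % 2);
      const int x0 = (x / 2) * 4 + (x % 2);
      int sum = 0;
      for (int dy = 0; dy < 2; ++dy)
        for (int dx = 0; dx < 2; ++dx)
          sum += bayer.at<uchar>(y0 + 2 * dy, x0 + 2 * dx);
      out.at<uchar>(y, x) = sum / 4;
    }
  return out;
}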

The attached file crop_decimate.cpp includes the modifications needed to implement the functionality explained above, and also solves some issues and TODOs in the current implementation:
  1. It should correctly handle 16-bit encodings (and any other bit depth), since in this implementation different bit depths do not need specific methods. I say it should because I could not test it; I do not have any camera or bagfile with 16-bit images.
Nice. Jack O'Quin has cameras that produce 16-bit Bayer, so we can get some test data for this.
  2. Offsets are checked to be smaller than the image size, and an error is reported if that is not the case. In contrast, if the requested ROI is bigger than the image (width or height too large given the offset and image size), it is SILENTLY adjusted. Should this adjustment trigger a warning?
I figured users might be changing the ROI through reconfigure_gui, in which case warnings would just be annoying. WARN_ONCE might be appropriate. What the nodelet should do is publish the ROI actually used back through dynamic_reconfigure, so that the user will at least get GUI feedback.
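
Roughly like this (the config field names and the hard-coded image size are just for illustration, not the actual nodelet code; if I remember right, changes made to the config inside the callback are published back to the clients):

#include <algorithm>
#include <dynamic_reconfigure/server.h>
#include <image_proc/CropDecimateConfig.h>

void configCb(image_proc::CropDecimateConfig& config, uint32_t level)
{
  // Suppose the last received image was 640x480.
  const int image_width = 640, image_height = 480;

  // Clamp the requested ROI so that it fits inside the image...
  config.x_offset = std::min(config.x_offset, image_width - 1);
  config.y_offset = std::min(config.y_offset, image_height - 1);
  config.width  = std::min(config.width,  image_width  - config.x_offset);
  config.height = std::min(config.height, image_height - config.y_offset);

  // ...and rely on the server sending the modified config back to the
  // clients, so reconfigure_gui displays the ROI actually used.
}

// In the nodelet's onInit():
//   dynamic_reconfigure::Server<image_proc::CropDecimateConfig> server(nh);
//   server.setCallback(boost::bind(&configCb, _1, _2));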
  3. On Bayer images, not only the offsets but also the resulting image size are checked to be even. This is needed to end up with a well-formed Bayer image with full color cells.
Good point.
The file crop_decimate2.cpp is a slightly different implementation of the same idea, but it performs the subsampling in just one function for all the encodings (given the right parameters). The function takes as an argument the cell size (2x2 for Bayer, 1x1 for other encodings, though any other size could be used).

That's pretty neat. Your approach does look like an improvement over how I handled mono and RGB-type encodings.

The code is probably a little more elegant, but it may have some overhead because of the inner loop (probably insignificant).

Probably it's fine. If the overhead actually is significant we could template subsample() on bytes_per_pixel to generate optimized versions.
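
Something along these lines is what I have in mind (the signature is invented for illustration, not your actual crop_decimate2 code):

#include <cstring>

// One subsampling loop for every encoding: the cell size is 2x2 for Bayer
// and 1x1 for everything else, and the pixel size in bytes is a template
// parameter so the compiler can turn the inner copy into a fixed-size move.
template <int BytesPerPixel>
void subsample(const unsigned char* src, int src_step,
               unsigned char* dst, int dst_step,
               int dst_cells_x, int dst_cells_y,   // output size, in cells
               int cell_w, int cell_h,             // 2x2 for Bayer, 1x1 otherwise
               int decimation_x, int decimation_y) // keep every Nth cell
{
  for (int cy = 0; cy < dst_cells_y; ++cy)
    for (int cx = 0; cx < dst_cells_x; ++cx)
    {
      const unsigned char* in = src
          + (cy * decimation_y * cell_h) * src_step
          + (cx * decimation_x * cell_w) * BytesPerPixel;
      unsigned char* out = dst
          + (cy * cell_h) * dst_step
          + (cx * cell_w) * BytesPerPixel;
      // Copy one whole cell; every encoding goes through the same loop.
      for (int row = 0; row < cell_h; ++row)
        std::memcpy(out + row * dst_step, in + row * src_step,
                    cell_w * BytesPerPixel);
    }
}

subsample<1>() would then cover mono8 and the Bayer encodings, subsample<2>() mono16, subsample<3>() rgb8/bgr8, and so on.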

Apart from that, and to present an alternative approach, it also resolves inconsistencies between the image size and the ROI offsets/size in a different way. The output image size is always the same as the given ROI once subsampled, and the region that falls outside the image is zeroed. This is similar to what happens to the rectified image when a zero alpha value is set in the calibration.

I prefer the first approach, changing the ROI to fit. Using zero to denote invalid or out-of-range pixels is ambiguous, since zero is a valid pixel value. This is also a problem with how we do rectified images with alpha < 1.

It should be said that this subsampling is the crudest possible, and it can introduce artifacts in the resulting image. A completely different approach would be to use the OpenCV resizing functions to do the job. In that case, as with the debayering, a set of methods could be supported through an extra dynamic_reconfigure parameter.

That's true. The current decimation approach is basically equivalent to cv::resize using the INTER_NEAREST (nearest-neighbor) method, which is fast but indeed crude. We could get much better output from INTER_AREA, which gives Moire-free results. A parameter to switch between at least those two options would be useful.
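
Something like this is all it would take (the parameter plumbing is invented; only the cv::resize call matters):

#include <opencv2/imgproc/imgproc.hpp>

cv::Mat downsample(const cv::Mat& src, int decimation_x, int decimation_y,
                   bool area_averaging)
{
  cv::Mat dst;
  cv::Size dst_size(src.cols / decimation_x, src.rows / decimation_y);
  // INTER_NEAREST matches the current decimation behavior; INTER_AREA
  // averages the source pixels covering each output pixel, avoiding moire.
  cv::resize(src, dst, dst_size, 0, 0,
             area_averaging ? cv::INTER_AREA : cv::INTER_NEAREST);
  return dst;
}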

I think that recent OpenCV versions only understand images as bgr8-encoded cv::Mats (even mono images), so the approach would be:
  1. subscribe to the color image
  2. subsample it to a color scaled image
  3. from this color scaled image build the mono scaled image if necessary
And it won't be possible to scale a Bayer image; debayering it first will be mandatory. Is there any OpenCV expert who can confirm or refute these statements? Any information about the resizing functions would be appreciated, since the documentation is quite poor.

The very latest documentation for cv::resize is here and seems adequate to me. I believe it handles one-channel and three-channel 8- or 16-bit images just fine.

For Bayer, I would still do the 2x2 decimation to RGB (almost lossless), and then use cv::resize for any further decimation.

Cheers,
Patrick