
Conversation

@eigendude

Description

This PR updates camera_ros to use the image_transport API, currently tested with ROS 2 Kilted.

The image_transport package takes care of image compression, allowing this PR to be net-negative in lines of code. It also offers an expanded range of alternative transports, such as zstd compression and ffmpeg encoding.

New in the Kilted API are combined "camera" publishers/subscribers, which allow both the image and the camera info to be processed at the same time in the same function. The new camera API is used in this PR.
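For illustration, here is roughly what the combined API looks like (a minimal sketch, not the exact camera_ros code; `publish_frame` is just an example name):

```cpp
#include <image_transport/image_transport.hpp>
#include <rclcpp/rclcpp.hpp>
#include <sensor_msgs/msg/camera_info.hpp>
#include <sensor_msgs/msg/image.hpp>

// One handle covers image_raw, camera_info and every image_transport
// plugin topic (compressed, theora, zstd, ffmpeg, ...).
// Created once, e.g. in the node constructor:
//   pub = image_transport::create_camera_publisher(this, "image_raw");
void publish_frame(rclcpp::Node *node, image_transport::CameraPublisher &pub)
{
  sensor_msgs::msg::Image img;        // filled from the camera buffer
  sensor_msgs::msg::CameraInfo info;  // filled from the calibration data
  img.header.stamp = node->now();
  info.header = img.header;           // the pair shares one header

  // A single call publishes both messages; transport plugins compress
  // lazily, only when a subscriber is connected to that transport.
  pub.publish(img, info);
}
```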

Motivation and context

I recently had the need to test custom QoS profiles. I noticed that v4l2_camera recently exposed a way to control the QoS of publishers. It would be valuable if camera_ros could do the same.

v4l2_camera uses the new camera API, so I updated camera_ros likewise, resulting in the diff here. However, I finished my testing before fully exposing the publisher's QoS. Still, this change allows v4l2_camera-style QoS configuration in the future.
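For reference, the QoS hook already exists in image_transport's API; a hedged sketch of what forwarding a profile could look like (the `qos_reliability` parameter name is hypothetical, not existing camera_ros API):

```cpp
// Inside node setup, with `node` an rclcpp::Node*.
// create_camera_publisher has an rmw_qos_profile_t overload.
rmw_qos_profile_t qos = rmw_qos_profile_sensor_data;  // best-effort baseline
if (node->declare_parameter<std::string>("qos_reliability", "best_effort") ==
    "reliable")
{
  qos.reliability = RMW_QOS_POLICY_RELIABILITY_RELIABLE;
}
auto pub_camera = image_transport::create_camera_publisher(node, "image_raw", qos);
```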

How has this been tested?

I'm currently running this patch on my robotic Millennium Falcon, along with image rectification. The new transport topics successfully show up:

# ros2 topic list
/oasis/falcon/camera_info
/oasis/falcon/image_raw
/oasis/falcon/image_raw/compressed
/oasis/falcon/image_raw/compressedDepth
/oasis/falcon/image_raw/ffmpeg
/oasis/falcon/image_raw/theora
/oasis/falcon/image_raw/zstd
/oasis/falcon/image_rect
/oasis/falcon/image_rect/compressed
/oasis/falcon/image_rect/compressedDepth
/oasis/falcon/image_rect/ffmpeg
/oasis/falcon/image_rect/theora
/oasis/falcon/image_rect/zstd

Downstream, I'm using ORB_SLAM3 on the image_raw topic and apriltag_ros on the image_rect topic for robot pose estimation, control and visualization.

@christianrauch (Owner) left a comment

See also the CI issues.

Comment on lines -704 to -705
pub_image->publish(std::move(msg_img));
pub_image_compressed->publish(std::move(msg_img_compressed));
@christianrauch (Owner):

Who is publishing the msg_img_compressed now?

@eigendude (Author):

The image_transport package is handling compression, as it supports a wide variety of compressed formats beyond JPEG.

I guess there's the edge case where you want to capture a hardware-encoded JPEG using libcamera and pass it through unmodified to ROS. But the extra configuration, extra code and limited flexibility don't seem worth it as Raspberry Pis get faster and can do JPEG in software just fine (and only when actually wanted by the subscriber).

@christianrauch (Owner):

> I guess there's the edge case where you want to capture a hardware-encoded JPEG using libcamera and pass it through unmodified to ROS. But the extra configuration, extra code and limited flexibility don't seem worth it as Raspberry Pis get faster and can do JPEG in software just fine (and only when actually wanted by the subscriber).

But that is the whole point of this code. It efficiently forwards the raw compressed data without recompression. There is absolutely no point in decompressing and recompressing the data. It wastes resources and reduces the image quality.

@eigendude (Author):

I pushed an update. I rebased and it now builds against ROS 2 Rolling.

@christianrauch (Owner) left a comment

What is the motivation for this PR? Is this only about reducing the code? There is a reason why the compressed and raw data is handled this way: we want to prevent unnecessary image conversions. If this is just about QoS settings, you should be able to adjust the settings for the topics.

This also has to support older ROS 2 distros and cannot just use API from ROS 2 Kilted onwards.


sensor_msgs::msg::CompressedImage &destination,
const std::vector<int> &params = std::vector<int>())
image_transport::CameraPublisher
create_camera_publisher(rclcpp::Node *node, const std::string &topic)
@christianrauch (Owner):

The pub_camera function is quite trivial and only called once. Do we really need a dedicated function to create the camera publisher?

@eigendude (Author) commented Nov 28, 2025

> There is absolutely no point in decompressing and recompressing the data. It wastes resources and reduces the image quality.

I guess that was my point. Capturing JPEG directly degrades image quality regardless of how it's consumed. If the resource savings aren't worth it, then images shouldn't be captured as JPEG.

But this point is only valid for today's Pis with high CPU speeds. The point is invalidated if the resource savings are worth it, such as on low-powered devices.

In that case, there are two paths forward: keep the JPEG codepath and publish in parallel to e.g. image_raw/libcamera_jpeg, or modify image_transport to accept pre-compressed images for efficiency. With the first option, we lose one advantage of the camera API (publishing both messages at once). With the second option, we've expanded the scope a bunch by working with an upstream project.
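A rough sketch of the first path (the image_raw/libcamera_jpeg topic name is just the hypothetical from above):

```cpp
// Keep a plain publisher for the camera's pre-compressed JPEG next to
// the CameraPublisher, so the hardware-encoded bytes pass through untouched.
auto pub_jpeg = node->create_publisher<sensor_msgs::msg::CompressedImage>(
    "image_raw/libcamera_jpeg", rclcpp::SensorDataQoS());

// msg_img_compressed: a std::unique_ptr<sensor_msgs::msg::CompressedImage>
// wrapping the buffer as delivered by libcamera.
pub_jpeg->publish(std::move(msg_img_compressed));
```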

> This also has to support older ROS 2 distros and cannot just use API from ROS 2 Kilted onwards.

This is another downside of the change: it only builds against Kilted onward. This also has a solution, used by a large part of the ROS ecosystem: branching. So not unsolvable, but maybe not ideal.

I think we've covered enough downsides that the benefits of using the image_transport package aren't really worth it. The only argument left is for modernization, as someday someone might want better quality with zstd images or an H.264 stream over a limited WiFi connection, or a future unseen use case, and you don't want to have to write code for that.

@christianrauch (Owner):

> There is absolutely no point in decompressing and recompressing the data. It wastes resources and reduces the image quality.

> I guess that was my point. Capturing JPEG directly degrades image quality regardless of how it's consumed. If the resource savings aren't worth it, then images shouldn't be captured as JPEG.

But this impact on quality is up to the user to decide. And it is also a question of bandwidth. If you capture a raw data stream on a low-cost webcam, you may see rolling shutter effects due to the delay in transporting the raw uncompressed stream. This can be avoided by selecting a compressed data stream. Some cameras provide this compressed data stream, and who are we to tell users not to use the compressed data stream coming directly from the camera?

> But this point is only valid for today's Pis with high CPU speeds. The point is invalidated if the resource savings are worth it, such as on low-powered devices.

It is never worth re-compressing a JPEG image. Even if you have ample resources, decompressing the compressed data stream from the camera to a raw buffer and then compressing it back to a JPEG image will still reduce the quality.

> In that case, there are two paths forward: keep the JPEG codepath and publish in parallel to e.g. image_raw/libcamera_jpeg, or modify image_transport to accept pre-compressed images for efficiency. With the first option, we lose one advantage of the camera API (publishing both messages at once). With the second option, we've expanded the scope a bunch by working with an upstream project.

The CameraPublisher only hides the fact that the raw and compressed images are published on separate topics behind a convenient API. The single API call makes it look like they are published at once, but there is no ROS API to publish on multiple topics at once. It is still publishing messages on topics using individual publishers.
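Conceptually it boils down to this (simplified, not the actual image_transport source):

```cpp
// CameraPublisher::publish is essentially two ordinary publishes on two
// topics, wrapped in a single call.
void publish(const sensor_msgs::msg::Image &image,
             const sensor_msgs::msg::CameraInfo &info)
{
  image_pub_.publish(image);  // image_raw + its transport plugin topics
  info_pub_->publish(info);   // camera_info
}
```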

If it were possible to interface with the CameraPublisher to make it reuse our raw and compressed image buffers, it would save us a lot of custom code. But this is not possible as far as I can see, and this is the sole reason why the raw and compressed image data is directly reused for the ROS image messages without additional conversion or other preprocessing.

> This also has to support older ROS 2 distros and cannot just use API from ROS 2 Kilted onwards.

> This is another downside of the change: it only builds against Kilted onward. This also has a solution, used by a large part of the ROS ecosystem: branching. So not unsolvable, but maybe not ideal.

But actually, the CI passes for older distros. So I am not sure whether there is a backward-compatibility function or similar.

> I think we've covered enough downsides that the benefits of using the image_transport package aren't really worth it. The only argument left is for modernization, as someday someone might want better quality with zstd images or an H.264 stream over a limited WiFi connection, or a future unseen use case, and you don't want to have to write code for that.

If this is just about compressing the data in another way, then you can also just use any composable node that compresses or decompresses the message via intra-process communication within the same node manager. This provides the most flexibility and will allow you to use GPU-accelerated de-/compression that is not part of the image_transport package.

Example composable nodes:

image_transport::Republisher

Coming back to your PR: What user-facing features exactly are you missing? Is this just about the QoS configuration? If so, can you just replicate the parameters for the topics that we already have?

@eigendude (Author):

> you may see rolling shutter effects due to the delay in transporting the raw uncompressed stream

Good point, I hadn't considered that

> but there is no ROS API to publish on multiple topics at once. It is still publishing messages on topics using individual publishers.

Right, the image_transport camera API is just a message_filters wrapper around the two topics. This is very valuable for subscribers, because they don't need their own message_filters setup or a hacky alternative.
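For example, on the subscriber side the whole synchronization collapses into one call (a sketch using the public create_camera_subscription function; the lambda body is a placeholder):

```cpp
// Both messages arrive together, already matched by timestamp, with no
// hand-rolled message_filters::TimeSynchronizer in the subscriber.
auto sub = image_transport::create_camera_subscription(
    node, "image_raw",
    [](const sensor_msgs::msg::Image::ConstSharedPtr &img,
       const sensor_msgs::msg::CameraInfo::ConstSharedPtr &info) {
      // process the synchronized image + camera info pair here
    },
    "raw");
```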

But for publishers? There's no difference under the hood, and the only gain is a little code elegance.

> If it were possible to interface with the CameraPublisher to make it reuse our raw and compressed image buffers, it would save us a lot of custom code. But this is not possible as far as I can see, and this is the sole reason why the raw and compressed image data is directly reused for the ROS image messages without additional conversion or other preprocessing.

Right, this feature would have to be added to image_transport for us to use it.

> composable node that compresses or decompresses the message via intra-process communication within the same node manager

Good point, this solves the problem, and with zero-copy.

> image_transport::Republisher

And my solution for zstd and ffmpeg transports already exists! Thanks for pointing this out.


Conclusion: the code doesn't change unless image_transport gets such a feature.

Moreover, we would be worse off with this PR. If the JPEG stream were mapped to another topic, it wouldn't be seen by image_transport. My motivation for noticing this: I need to reduce my WiFi bandwidth to handle "Netflix Hour". Instead of configuring image_transport, I'll use the JPEG feature and crank the quality way down.

The RPi 5 running the camera node at 1080p/15 fps stabilizes around 12% CPU. I'm interested to see how low libcamera JPEG can take this, especially because increased CPU use reduces my battery's runtime.

@christianrauch (Owner):

> My motivation for noticing this: I need to reduce my WiFi bandwidth to handle "Netflix Hour".

What is a "Netflix Hour"?

> Instead of configuring image_transport, I'll use the JPEG feature and crank the quality way down.

The compressImageMsg in the camera node is essentially the same as the one in cv_bridge, just with a compression quality parameter. In the end, all the compressing functions use OpenCV's cv::imencode. So you can control the JPEG compression with the same parameters, and functionally there is no benefit to using the compressed image transport with JPEG over compressImageMsg in the node.
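For reference, all of these paths bottom out in something like this (a sketch, not the exact compressImageMsg code; `encode_jpeg` is just an example name):

```cpp
#include <opencv2/imgcodecs.hpp>
#include <vector>

// JPEG-encode a frame with an explicit quality; this is what a
// "quality" parameter ultimately maps to in cv::imencode.
std::vector<uchar> encode_jpeg(const cv::Mat &frame, int quality)
{
  const std::vector<int> params = {cv::IMWRITE_JPEG_QUALITY, quality};  // 0..100
  std::vector<uchar> buffer;  // becomes sensor_msgs::msg::CompressedImage::data
  cv::imencode(".jpg", frame, buffer, params);
  return buffer;
}
```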

> The RPi 5 running the camera node at 1080p/15 fps stabilizes around 12% CPU. I'm interested to see how low libcamera JPEG can take this, especially because increased CPU use reduces my battery's runtime.

Well, as mentioned above, all those compression functions use cv::imencode and the compressed JPEG stream from the camera cannot be controlled.

If you want to efficiently compress the images, you probably have to use the Raspberry Pi's GPU or some hardware encoder to encode the image data to JPEG or, better, video.

@eigendude (Author) commented Dec 4, 2025

> What is a "Netflix Hour"?

Lol, this is my idiom for that time in the late afternoon when everyone is home from work and streaming. There's only one air; WiFi is a shared medium, and only one device per group of frequency channels can send data at a time. So when everyone in a residential setting streams at the same time, WiFi slows down for everyone, including my robot's HD camera stream. I use "Netflix Hour" to refer to the busiest time for WiFi, when there's the most competition for traffic and you see lots of fragmented, dropped and retransmitted packets.

If you can solve for Netflix Hour, you can handle the most adversarial WiFi environment you can imagine 😉

> In the end, all the compressing functions use OpenCV's cv::imencode

I discovered this yesterday while reading the code. I saw the MJPEG format and mistakenly assumed the RPi camera could compress. But no, it's OpenCV turtles all the way down.

> just with a compression quality parameter

I'm sure image_transport plugins have parameters too, but quick searching didn't uncover them, so I'm happy the parameter is exposed here and glad to make use of it now.

I'm currently setting the quality to 60. It cut bandwidth from 10 MB/s to 2.5 MB/s. And I still have room to downscale from 1080p if I lose my lust for Full HD. With a publishing queue size of 5 instead of 1, about half the frames get through Netflix Hour WiFi, with only occasional half-second freezes. Fine for my nav algos.

Speaking of queue size, I did a lot of optimizing for bad-WiFi applications. I'll share the research, my results, and ideally a feature request to expose a queue_size parameter for publisher history depth.

> If you want to efficiently compress the images, you probably have to use the Raspberry Pi's GPU or some hardware encoder to encode the image data to JPEG or, better, video.

The idea behind my RPi 5 was to do all the CV on-robot, but that cut the battery life from 5 hrs to 3 hrs. So now I have so much compute sitting unused, and still enough runtime, that I'm not concerned about a little software image compression, and I get to avoid the hardware complexity.

@eigendude (Author) commented Dec 4, 2025

> With a publishing queue size of 5 instead of 1

Before I close the PR, I'd like to plug my feature request: exposing a queue_size parameter.

I'm running 1080p @ 15 fps over noisy apartment WiFi. With the default depth=1, even brief packet loss or scheduling hiccups caused frames to be dropped before downstream nodes (SLAM, rectification) could see them or sync them with CameraInfo.

At 15 fps, one frame is ~67 ms, so depth 5 buys ~330 ms of slack. Depth 10 would stretch that to ~670 ms. In my tests, 5 was enough to ride out the 100–200 ms WiFi hiccups I see at Netflix Hour without making nav feel laggy, while 10 started to feel like needless extra delay. The resulting visualization of the stream is still quite choppy, but at least I don't get desync between the image and camera info.

Bumping the history depth from 1 to 5 on image and CameraInfo made a measurable difference: bandwidth held around ~2.5 MiB/s (JPEG quality=60), SLAM stayed stable, and latency was capped at only a few frames, which is acceptable for navigation.
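A sketch of what I have in mind (the queue_size parameter name is the request, not existing camera_ros API):

```cpp
// Hypothetical: map a declared queue_size parameter onto the history
// depth of the combined camera publisher.
const int queue_size = node->declare_parameter<int>("queue_size", 1);
rmw_qos_profile_t qos = rmw_qos_profile_sensor_data;
qos.history = RMW_QOS_POLICY_HISTORY_KEEP_LAST;
qos.depth = static_cast<size_t>(queue_size);
auto pub_camera = image_transport::create_camera_publisher(node, "image_raw", qos);
```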

Would you be able to add this parameter? Tuning for robustness vs latency is a feature many can use.

@christianrauch (Owner):

ROS 2 exposes a couple of QoS parameters that control the message transport, and then you also have different RMW implementations (DDS, Zenoh, ...) that make a difference in communication performance. You will likely get better results by optimising the QoS parameters and by switching to rmw_zenoh_cpp.

Exposing these parameters makes sense and is useful for different communication needs.
