Dear all,

I am trying to build a node that reproduces the basics of the camera plugin of rviz.
The difference is that I'd like to be able to correlate a point clicked by the user on the camera image (in pixels) with a point from a 3D point cloud (in meters).

In other words, I have a stereo camera that generates a 3D point cloud, and I would like to let the user select one of these 3D points using the 2D image of the left or right eye.
If the user clicks a pixel where no 3D information is available, I'd like to pick the closest available 3D point.

So I need a way to correlate (x, y) coordinates in pixels with (x, y, z) coordinates in meters.
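
If I understand correctly, the relation between the two is the pinhole projection of a calibrated, rectified camera (please correct me if I am wrong):

    u = fx * X / Z + cx
    v = fy * Y / Z + cy

where (u, v) is the pixel, (X, Y, Z) is the 3D point in the camera's optical frame, and fx, fy, cx, cy come from the calibration (fx and fy being the focal length divided by the pixel pitch, 6 um in my case).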

I am thinking of 2 main alternatives:

A) project the 3D points back into a stereo_sensor frame representing the camera's sensor, and from there convert my meters into pixels (I know 1 pixel is 6 um on my sensor; see the sketch after this list)
B) cast a ray from the clicked pixel and see whether it collides with a 3D point (I have no idea how to do this, maybe Bullet? this is also covered in the sketch below)
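
To make both alternatives concrete, here is a rough C++ sketch of what I have in mind, using image_geometry's PinholeCameraModel (initialized elsewhere from the camera's CameraInfo) and a PCL cloud assumed to be already expressed in the camera's optical frame. The helper names are mine and the loops are brute force, so please treat this as a sketch, not working code:

    #include <cmath>
    #include <limits>
    #include <image_geometry/pinhole_camera_model.h>
    #include <opencv2/core/core.hpp>
    #include <pcl/point_cloud.h>
    #include <pcl/point_types.h>

    // A) Project every 3D point into the image and keep the one whose
    //    pixel lands closest to the clicked pixel. Assumes a non-empty
    //    cloud in the camera's optical frame.
    pcl::PointXYZ closestByProjection(const pcl::PointCloud<pcl::PointXYZ>& cloud,
                                      const image_geometry::PinholeCameraModel& model,
                                      const cv::Point2d& click)
    {
      pcl::PointXYZ best;
      double best_d2 = std::numeric_limits<double>::max();
      for (size_t i = 0; i < cloud.points.size(); ++i)
      {
        const pcl::PointXYZ& p = cloud.points[i];
        if (p.z <= 0.0) continue;  // behind the camera
        cv::Point2d uv = model.project3dToPixel(cv::Point3d(p.x, p.y, p.z));
        double d2 = (uv.x - click.x) * (uv.x - click.x)
                  + (uv.y - click.y) * (uv.y - click.y);
        if (d2 < best_d2) { best_d2 = d2; best = p; }
      }
      return best;
    }

    // B) Cast a ray through the clicked pixel and keep the 3D point with
    //    the smallest perpendicular distance to that ray.
    pcl::PointXYZ closestToRay(const pcl::PointCloud<pcl::PointXYZ>& cloud,
                               const image_geometry::PinholeCameraModel& model,
                               const cv::Point2d& click)
    {
      cv::Point3d ray = model.projectPixelTo3dRay(click);
      double n = std::sqrt(ray.x * ray.x + ray.y * ray.y + ray.z * ray.z);
      cv::Point3d dir(ray.x / n, ray.y / n, ray.z / n);  // unit ray direction

      pcl::PointXYZ best;
      double best_d2 = std::numeric_limits<double>::max();
      for (size_t i = 0; i < cloud.points.size(); ++i)
      {
        const pcl::PointXYZ& p = cloud.points[i];
        double t = p.x * dir.x + p.y * dir.y + p.z * dir.z;  // along-ray distance
        if (t <= 0.0) continue;  // behind the camera
        double dx = p.x - t * dir.x;
        double dy = p.y - t * dir.y;
        double dz = p.z - t * dir.z;
        double d2 = dx * dx + dy * dy + dz * dz;  // squared distance to the ray
        if (d2 < best_d2) { best_d2 = d2; best = p; }
      }
      return best;
    }

If that is roughly right, then B would not need Bullet at all: the closest-point-to-ray search replaces the collision test, and it also handles clicks where no 3D data exists, since the nearest point always wins.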

I have the feeling tf can help me with the first alternative.
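
For instance, to bring the cloud into the camera's optical frame before projecting, something like this might work (a minimal sketch, assuming pcl_ros is available; the frame id is a placeholder):

    #include <pcl/point_cloud.h>
    #include <pcl/point_types.h>
    #include <pcl_ros/transforms.h>
    #include <tf/transform_listener.h>

    // Transform the cloud into the left camera's optical frame so the
    // projection above can be applied. The real frame id would come
    // from the image/CameraInfo header.
    bool cloudToCameraFrame(const pcl::PointCloud<pcl::PointXYZ>& cloud_in,
                            pcl::PointCloud<pcl::PointXYZ>& cloud_out,
                            const tf::TransformListener& listener)
    {
      // Uses cloud_in.header (frame id and stamp) to look up the transform.
      return pcl_ros::transformPointCloud("left_camera_optical_frame",
                                          cloud_in, cloud_out, listener);
    }
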
Still, would anyone know precisely how to achieve this, and which packages would help (ROS is really broad and I am new to it)?
Also, am I taking the right approach, or is there a far simpler way to do this?

Thanks for your help

Raphael