Ok, I'll reply to myself. I think I'm going in the right direction now.
I added a robot_state_publisher node to the setup I described earlier. This node reads the robot description from the URDF file, receives JointState messages on the joint_states topic, and publishes the resulting tf frames.
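To make the idea concrete, here is a rough sketch in plain Python (no ROS needed) of the kind of computation robot_state_publisher does: take one set of joint names and positions and turn it into a pose for every frame. The two-joint planar arm, the 0.5 m link length, and the names CHAIN and frames_from_joint_state are made up for illustration; the real node handles full 3D transforms from the URDF.

```python
import math

# Hypothetical planar arm: (joint name, child frame, x offset of the joint
# origin in the parent frame, in metres). Every joint rotates about z.
CHAIN = [
    ("joint1", "link1", 0.0),
    ("joint2", "link2", 0.5),
]


def frames_from_joint_state(names, positions):
    """Return {frame: (x, y, yaw)} in base_link for one JointState-like sample."""
    angles = dict(zip(names, positions))
    x, y, yaw = 0.0, 0.0, 0.0
    frames = {"base_link": (x, y, yaw)}
    for joint, child, offset in CHAIN:
        # Move along the parent's x axis to the joint origin, then rotate.
        x += offset * math.cos(yaw)
        y += offset * math.sin(yaw)
        yaw += angles[joint]
        frames[child] = (x, y, yaw)
    return frames


if __name__ == "__main__":
    sample = frames_from_joint_state(["joint1", "joint2"], [math.pi / 2, 0.0])
    for frame, (x, y, yaw) in sample.items():
        print(f"{frame}: x={x:.3f} y={y:.3f} yaw={yaw:.3f}")
```

The name and position lists mirror the name and position fields of a JointState message, which is why the function takes them as two parallel lists.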
Now everything seems to be working fine when I open rviz (no errors or warnings). However, I have to rely on the frame position and orientation printed in the display properties panel, since the 3D view itself doesn't work on the computer I'm working on (damn Intel integrated graphics...).
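Since I'm reading the numbers by hand, a small helper makes the orientation easier to sanity-check. This is only a sketch, and it assumes the panel shows the orientation as a unit quaternion in (x, y, z, w) order; quaternion_to_rpy is my own name, and the result uses the usual roll/pitch/yaw convention (rotations about x, y, z).

```python
import math


def quaternion_to_rpy(x, y, z, w):
    """Convert a unit quaternion (x, y, z, w) to roll, pitch, yaw in radians."""
    roll = math.atan2(2.0 * (w * x + y * z), 1.0 - 2.0 * (x * x + y * y))
    # Clamp to guard against rounding pushing the value just outside [-1, 1].
    sin_pitch = max(-1.0, min(1.0, 2.0 * (w * y - z * x)))
    pitch = math.asin(sin_pitch)
    yaw = math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))
    return roll, pitch, yaw


if __name__ == "__main__":
    # A quarter turn about z, typed in as it might appear in the panel.
    roll, pitch, yaw = quaternion_to_rpy(0.0, 0.0, 0.70710678, 0.70710678)
    print(f"roll={math.degrees(roll):.1f} pitch={math.degrees(pitch):.1f} "
          f"yaw={math.degrees(yaw):.1f}")
```

Comparing the yaw of each frame against the joint angle I sent is a quick way to check that the published frames follow the joint states.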
Does this sound better?