I would think the most direct approach to exploring other animals' "experience of vision" would be to take the machine-learning decoders we've trained to extract sense data from the human optic nerve/neocortex, run them on recordings from other animals' brains, and see what we get.
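A minimal sketch of that idea, using synthetic data as a stand-in for real recordings (the linear "human code", the ridge decoder, and the crude alignment step are all hypothetical placeholders, not anyone's actual pipeline):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_trials, n_human_units, n_pixels = 500, 200, 16 * 16

# Synthetic stand-ins: stimuli plus a fake linear "human visual code"
stimuli = rng.normal(size=(n_trials, n_pixels))
human_code = rng.normal(size=(n_pixels, n_human_units))
human_responses = stimuli @ human_code + 0.1 * rng.normal(size=(n_trials, n_human_units))

# Train the decoder on human data: neural activity -> reconstructed image
decoder = Ridge(alpha=1.0).fit(human_responses, stimuli)

# Now "run it on another animal": same stimuli, but encoded by a different,
# unknown code (different receptive fields, cell types, wiring, etc.)
n_animal_units = 150
animal_code = rng.normal(size=(n_pixels, n_animal_units))
animal_responses = stimuli @ animal_code

# The decoder expects human-sized, human-organized input, so at minimum some
# alignment step is needed (here just naive zero-padding/truncation)
aligned = np.zeros((n_trials, n_human_units))
aligned[:, :min(n_human_units, n_animal_units)] = animal_responses[:, :n_human_units]

reconstruction = decoder.predict(aligned)
print("correlation with true stimuli:",
      np.corrcoef(reconstruction.ravel(), stimuli.ravel())[0, 1])
```

With a naive transfer like this the reconstruction is typically near chance, which is more or less the objection raised below.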
Unfortunately, the raw data coming from the eye is very much just that: raw. The experience of vision is only completed after a great deal of processing in the sensory cortex. There's a huge amount of metadata that gets mixed into the stream - things like edge detection, attention, motion prediction, and so on. And even that processing is going to differ from person to person.
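As a toy illustration of that point, here a Sobel edge map stands in for just one of the many feature maps computed downstream of the raw signal; attention and motion prediction would add further, individual-specific layers on top (purely illustrative, not a model of cortex):

```python
import numpy as np
from scipy.ndimage import sobel

rng = np.random.default_rng(2)
raw_frame = rng.random((64, 64))       # stand-in for a raw retinal image

edges_x = sobel(raw_frame, axis=0)     # horizontal gradients
edges_y = sobel(raw_frame, axis=1)     # vertical gradients
edge_map = np.hypot(edges_x, edges_y)  # one derived "feature map" among many

print(raw_frame.shape, edge_map.shape)
```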
It would be very interesting, though, if we could hijack an optic-nerve stream, calibrate it against known test images, and then attempt to recreate the downstream visual processing. Unfortunately this falls prey to the same fundamental problem in some ways: we can only interpret the signal of a mouse or cat in a way that we ourselves understand. That is, we would end up recreating a signal that is coherent to us and makes sense to our visual system, but that isn't really representative of the animal's actual visual experience.
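A rough sketch of what that calibration step might look like, again with synthetic data and a hypothetical linear encoding standing in for a real optic nerve:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n_images, n_fibers, n_pixels = 300, 120, 16 * 16

# Known test images shown to this particular animal
test_images = rng.normal(size=(n_images, n_pixels))
# Unknown per-animal optic-nerve encoding (stand-in: a random linear map)
animal_encoding = rng.normal(size=(n_pixels, n_fibers))
responses = test_images @ animal_encoding + 0.1 * rng.normal(size=(n_images, n_fibers))

X_train, X_test, y_train, y_test = train_test_split(responses, test_images, random_state=0)

# Calibration: learn a response -> image mapping from the known test set
per_animal_decoder = Ridge(alpha=1.0).fit(X_train, y_train)

# Note the catch described above: the reconstruction target is a pixel image,
# i.e. something *our* visual system finds coherent, so whatever we recover is
# expressed in our representational terms, not the animal's.
print("held-out R^2:", per_animal_decoder.score(X_test, y_test))
```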