For an autonomously controlled arm, this absolutely requires some kind of object-tracking vision. Object tracking with vision is much easier to do (although still insanely hard) than mapping out a room with vision. Luckily, university students who have done such projects are greedy enough to market them at hobbyist prices after they have built them!
Look up the CMU Cam and CMU Cam 2 available for under $200 from various online robotics shops. It should let you track coloured objects and the like to help you with your arm. Another one of interest is the POB-EYE. It is more flexible than the CMU cams, but costs twice as much.
https://www-2.cs.cmu.edu/~cmucam/index.html
https://www.pob-technology.com/ (If you can read French then that is best. If not, you can still hunt for the demo videos in the download section. Very neat.)
Did you want to use vision for obstacle avoidance (still different from mapping out a room)? It is somewhat possible with simple vision processing using subtractive techniques, which I can help you tinker with if you go with the POB-EYE. It's possible with the CMU cam too, but you can't program the onboard processor, so you have to use your own, which is less convenient. I'd still use a sonar as backup. It's just neat to try to use vision if you already have it.
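To give a rough idea of what I mean by a subtractive technique, here is a minimal sketch of frame differencing in plain Python. It assumes greyscale frames stored as lists of rows of 0-255 values. The names frame_difference and changed_columns, and the threshold values, are made up for illustration; this is not code for the CMU cam or the POB-EYE.

```python
def frame_difference(prev, curr, threshold=30):
    """Return a 0/1 mask marking pixels whose brightness changed by more than threshold.

    prev and curr are greyscale frames of the same size (lists of rows, values 0-255).
    """
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def changed_columns(mask, min_pixels=2):
    """Return the image columns that contain at least min_pixels changed pixels."""
    width = len(mask[0])
    return [x for x in range(width) if sum(row[x] for row in mask) >= min_pixels]


# Reference frame: an empty, uniformly dark scene (4 rows by 5 columns).
reference = [[10] * 5 for _ in range(4)]

# Current frame: something bright has appeared on the right-hand side.
current = [row[:] for row in reference]
for y in (1, 2, 3):
    current[y][3] = 200
    current[y][4] = 200

mask = frame_difference(reference, current)
print(changed_columns(mask))  # -> [3, 4]
```

The changed columns tell you roughly which side of the view something appeared on, which is about as much as you can expect from this. On a moving robot the whole picture changes between frames, so treat it as something to tinker with, not as a replacement for the sonar.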
These are basically your only affordable options:
CMU Cam:
-Cheap
-Ready to use for tracking coloured objects
-Several precoded vision functions. The most useful for you will be frame differencing and colour tracking.
-Code and custom functions are not so readily available without some tinkering and using an off-board processor.
POB-EYE:
-Costs 2x the CMU Cam
-More precoded vision functions than the CMU cam, like IDENTIFYING OBJECTS based on a bitmap library uploaded from the PC (can anyone say objects of interest?). The operational manual doesn't seem to be posted, so you would have to ask for it to find the other functions. It is also a colour camera, BTW.
-Has an LCD slave and a PIC-carrier board slave designed for use with the POB-EYE. Presumably this means that precoded functions exist to easily display the images and such on the LCD. If you watch the demo videos you'll see what I mean. I'm pretty sure most of those are already coded for you.
-Comes with good GUI software to write your own functions, as well as to compile and upload your own bitmap image libraries to the chip
As usual, the thing that costs more does more.
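If you are wondering what the colour tracking mentioned for the CMU Cam boils down to, here is a minimal sketch in plain Python. It assumes an image stored as rows of (r, g, b) tuples. The function name track_colour and the colour bounds are made up for illustration, and this is not the camera's actual firmware or command set.

```python
def track_colour(image, lo, hi):
    """Find the centre of all pixels whose colour lies between lo and hi.

    image is a list of rows of (r, g, b) tuples; lo and hi are (r, g, b) bounds.
    Returns (centre_x, centre_y, pixel_count), or None if nothing matched.
    """
    sum_x = sum_y = count = 0
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            if lo[0] <= r <= hi[0] and lo[1] <= g <= hi[1] and lo[2] <= b <= hi[2]:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count, count)


RED = (220, 30, 30)
DARK = (20, 20, 20)

# A tiny 4x3 test image with a red blob in the top-right corner.
image = [
    [DARK, DARK, RED, RED],
    [DARK, DARK, RED, RED],
    [DARK, DARK, DARK, DARK],
]

print(track_colour(image, lo=(180, 0, 0), hi=(255, 80, 80)))  # -> (2.5, 0.5, 4)
```

For the arm, you would steer toward the returned centre and use the pixel count as a rough guess of how close the object is.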
HEADCRACKER:
Your best way is to use traditional sonar or IR distance detection. Totally doable.