Yes, the audio is done separately, usually using phono sockets.
S-Video keeps the colour and monochrome information separate. That overcomes many of the limitations caused by colour encoding, so it gives a better quality picture.
This simple circuit adds the colour and B/W back together, giving a composite signal, but only of composite quality! It reintroduces the limitations of encoded colour.
The capacitor can be any type; it's not at all critical!
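
If it helps to picture what "adding the colour and B/W back together" means, here is a rough sketch in Python. It is only an illustration, not a model of the actual circuit: the function names, the flat grey luma level and the chroma amplitude are all made up for the example, and the subcarrier frequency assumes PAL (NTSC uses about 3.58 MHz).

```python
import math

# Assumption: PAL colour subcarrier frequency. NTSC would be about 3.579545e6.
SUBCARRIER_HZ = 4.43361875e6


def luma(t):
    """Hypothetical B/W (Y) signal: a flat mid-grey, 0.5 on a 0..1 scale."""
    return 0.5


def chroma(t, amplitude=0.2, phase=0.0):
    """Hypothetical colour (C) signal: a sine wave at the subcarrier frequency."""
    return amplitude * math.sin(2 * math.pi * SUBCARRIER_HZ * t + phase)


def composite(t):
    """The adapter's job in one line: Y and C summed onto a single wire."""
    return luma(t) + chroma(t)


if __name__ == "__main__":
    quarter_cycle = 1 / (4 * SUBCARRIER_HZ)
    print(composite(0.0))            # chroma is at a zero crossing here
    print(composite(quarter_cycle))  # chroma is at its positive peak here
```

Once the two signals share one wire, the TV has to separate them again, which is where the usual composite artefacts come back in.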