Look into the Winbond voice chips called ISD ChipCorder. They are available with serial or parallel interfaces. You can record short sound clips and store each one at an address of your choice. Then, when a particular number is received, play back the sound for that number from its predefined address.
I suggest starting with a 10- or 20-second device with a serial interface.
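
To make the addressing idea concrete, here is a rough sketch of the lookup side only. It assumes one clip per digit stored in equal-sized slots; BASE_ADDRESS, SLOT_SIZE, clip_address, speak_number and the play callback are all made-up names for illustration, not anything from the chip's command set. The real address units and the playback command sequence have to come from the datasheet of the part you pick.

```python
# Hypothetical layout: ten clips ("zero" .. "nine") recorded back to back,
# each in a fixed-size slot. These values are assumptions, not datasheet facts.
BASE_ADDRESS = 0x000   # assumed start address of the clip for digit 0
SLOT_SIZE = 0x010      # assumed number of address units reserved per clip


def clip_address(digit):
    """Return the start address of the clip recorded for one digit (0-9)."""
    if not 0 <= digit <= 9:
        raise ValueError("digit must be in 0..9")
    return BASE_ADDRESS + digit * SLOT_SIZE


def speak_number(number, play):
    """Play one clip per decimal digit of a non-negative integer.

    `play` is a caller-supplied function that takes a start address. It stands
    in for whatever serial command sequence the chosen chip actually needs.
    """
    if number < 0:
        raise ValueError("number must be non-negative")
    for ch in str(number):
        play(clip_address(int(ch)))


if __name__ == "__main__":
    # Stand-in for the hardware: just print the address that would be played.
    speak_number(205, lambda addr: print(hex(addr)))
```

On real hardware you would replace the lambda with a function that sends the chip's play command for that address over the serial interface.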