may i offer something else to think about?
i think defining what you want and need is a good starting place - let's see what is reasonable.
first of all, no dog hears 50kHz or above. university research puts the upper frequency limit any dog can possibly hear at roughly 46kHz:
https://www.lsu.edu/deafness/HearingRange.html
next, if you are generating a square wave with 50% duty cycle (it does not get simpler than this), the even harmonics cancel out.
all you get is the fundamental frequency (1x) and the odd harmonics (3x, 5x, 7x, 9x, 11x...).
their magnitude drops with harmonic number (1/3 of the fundamental at 3x, 1/5 at 5x and so on), so even if the dogs could hear them,
we would care very little about anything above 5x or 7x.
but since the frequency range you are interested in is 20...40kHz, and the lowest harmonic is 3x, which is
60...120kHz, that is already outside the hearing range of any dog.
if they cannot hear 3x, then they certainly cannot hear 5x, 7x etc. of your frequencies.
in other words - you can filter or not, they won't notice the difference.
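to put numbers on that, here is a small sketch (python, just for the arithmetic). it assumes an ideal 50% square wave, where harmonic n has amplitude 1/n relative to the fundamental, and it uses the ~46kHz limit quoted above. the function names are made up for illustration.

```python
# Odd harmonics of an ideal 50% duty-cycle square wave.
# Assumption: relative amplitude of harmonic n is 1/n (ideal Fourier series).
DOG_LIMIT_HZ = 46_000  # approximate upper hearing limit quoted above


def odd_harmonics(fundamental_hz, count=4):
    """Return (n, frequency_hz, relative_amplitude) for n = 1, 3, 5, ..."""
    return [(n, n * fundamental_hz, 1.0 / n) for n in range(1, 2 * count, 2)]


def audible(fundamental_hz, limit_hz=DOG_LIMIT_HZ):
    """Harmonic numbers whose frequency is at or below the hearing limit."""
    return [n for n, f, _ in odd_harmonics(fundamental_hz) if f <= limit_hz]


for f0 in (20_000, 40_000):
    for n, f, a in odd_harmonics(f0):
        print(f"{f0 / 1000:.0f} kHz x{n}: {f / 1000:.0f} kHz, amplitude {a:.2f}")
    print("harmonics within hearing range:", audible(f0))
```

at both ends of the 20...40kHz range only the fundamental (n = 1) lands below the limit.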
next thing: what is the frequency response of your speakers?
even IF you were to drive them with a perfect square wave (with steep slopes), they will NOT be able to reproduce the signal exactly.
if the speakers cannot reproduce the signal, there is no point in making a pure sine signal, using a hifi amp etc.
the speaker is one of the natural filters in the "osc-amp-speaker-ear-brain" chain.
the weakest link (the one with the lowest upper frequency limit) is what determines the bandwidth of the entire chain.
the ear is one such link, as mentioned before. the amp is another component with limited frequency response,
so it will also act as a filter (although it is easier to build an amp with enough bandwidth that it is not the weak link).
filtering before the amp simply affects the efficiency of the circuit: if we remove the higher harmonics,
which are not even used, your amp does not need to amplify those components.
so if we are generating some signal, even a square wave is acceptable. suppose we are not happy with that and would rather have something that resembles a sine wave to some degree, to attenuate most of the harmonics.
with 4 bits, you can have 16 levels. so your sine would look something like
https://en.wikipedia.org/wiki/File:Pcm.svg
given all the restrictions in hearing, speaker response etc., this is not only good enough, it is well beyond what the job needs, so no filtering is required. and for 4 bits or less, a simple resistor DAC is more than acceptable.
https://en.wikibooks.org/wiki/Circuit_Idea/Parallel_Voltage_Summer
how about 1k, 2k, 4k, 8k? (or, from the common E12 series, 1k, 2.2k, 3.9k, 8.2k)
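if you want to see how much the standard values cost you, here is a rough sketch of the summer's output for all 16 codes. assumptions (mine, not measured): the smallest resistor sits on the MSB, each pin is exactly 0 V or vcc with zero output impedance, the output is unloaded, and vcc = 3.3 V is just a placeholder. function names are made up.

```python
def dac_output(code, resistors, vcc=3.3):
    """Unloaded output voltage of a 4-bit weighted-resistor summer.

    resistors[0] is on the MSB pin, resistors[3] on the LSB pin.
    Each pin is assumed to sit at exactly 0 V or vcc.
    """
    conductances = [1.0 / r for r in resistors]
    bits = [(code >> (3 - i)) & 1 for i in range(4)]
    driven = sum(b * g for b, g in zip(bits, conductances))
    return vcc * driven / sum(conductances)


def worst_error_lsb(resistors, vcc=3.3):
    """Largest deviation from the ideal code * vcc / 15 staircase, in LSBs."""
    lsb = vcc / 15
    return max(abs(dac_output(c, resistors, vcc) - c * lsb)
               for c in range(16)) / lsb


for name, values in (("exact", [1000, 2000, 4000, 8000]),
                     ("E12", [1000, 2200, 3900, 8200])):
    print(name, f"worst-case error: {worst_error_lsb(values):.3f} LSB")
```

with either set the staircase stays monotonic, which is all a rough 16-level sine needs.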
so can you think of any MCU that can pump out 4 bits to one of its ports at 40kHz? i cannot think of any that can't...
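one thing to keep in mind: if you step through a sine table, the port has to be updated several times per period, not once. a quick sketch of the table and the timing budget follows. 16 samples per period and a 16 MHz clock are my own placeholder numbers, not anything from your design, and the function names are made up.

```python
import math


def sine_table(samples_per_period=16, bits=4):
    """One period of a sine, quantised to unsigned codes 0 .. 2**bits - 1."""
    top = (1 << bits) - 1
    return [round((math.sin(2 * math.pi * i / samples_per_period) + 1) / 2 * top)
            for i in range(samples_per_period)]


def update_rate_hz(tone_hz, samples_per_period=16):
    """How often the port must be written to play tone_hz from the table."""
    return tone_hz * samples_per_period


def cycles_per_sample(clock_hz, tone_hz, samples_per_period=16):
    """CPU clock cycles available between two port writes."""
    return clock_hz / update_rate_hz(tone_hz, samples_per_period)


print(sine_table())
print(update_rate_hz(40_000), "port updates per second at 40 kHz")
print(cycles_per_sample(16_000_000, 40_000), "cycles per sample at 16 MHz")
```

in a real MCU the table would sit in flash and a timer interrupt (or DMA, if the part has it) would copy the next entry to the port. fewer samples per period relaxes the timing at the cost of a rougher staircase.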
the next question is how small does this have to be, and how much can it cost? you can get MCU kits for $5
https://processors.wiki.ti.com/index.php/MSP430_LaunchPad_(MSP-EXP430G2)
or you can get a board with a cortex processor (32-bit, 64k flash, 8k RAM, and TONS of IO) for a mere $8
https://www.digikey.com/product-sea...-and-kits-mcu-dsp-fpga-cpld/2621773?k=stm32f0