
Emerging Audio Terminals

Discussion in 'Members Lounge' started by Val Gretchev, Nov 24, 2017.

  1. Val Gretchev

    Val Gretchev Member Forum Supporter

    Joined:
    Mar 14, 2009
    Messages:
    137
    Likes:
    16
    Location:
    Whitby, ON, Canada
    Hi Guys


    I am surprised that there are no threads discussing the “Audio Terminal” craze introduced recently by several large corporations. I am, of course, referring to:

    1. Google with Google Home Speaker: https://store.google.com/product/google_home

    2. Amazon Echo with Alexa: https://www.amazon.com/Amazon-Echo-Bluetooth-Speaker-with-Alexa-Black/dp/B00X4WHP5E

    3. Apple HomePod: https://www.apple.com/homepod/


    I just bought the Google Home Speaker from BestBuy Canada for $99.99, a big discount due to Black Friday. I am anxious to try it out and see how good it is in its voice recognition abilities. It will be arriving next week.


    I was wondering if anyone here knows how the device works with the remote server.

    1. Does it connect to the server app at Google over a plain TCP/IP socket with its own protocol, or does it use HTTP on top of TCP/IP?

    2. Is the voice recognition in the device firmware? If so, what is in the data packets that it sends to the server? Is it simple text or is it encoded somehow?

    3. Does it convert received text to audio on the way back?

    4. Or, does it send VOIP to the server where the voice recognition takes place?


    If it is a simple, known interface, wouldn’t it be nice to redirect its input and output to a private server running a different app than the Google search engine? I can think of several applications for an audio terminal that works well in sending and receiving voice messages. I believe that the novelty of the applications these 3 companies are currently promoting (music, switching electrical devices on and off) will wear off quickly. There are many more serious applications to which a device like this could be put.


    So, if you have any information on the communications aspects of these devices, please let me know.
     
  2. Val Gretchev

    Val Gretchev Member Forum Supporter

    Joined:
    Mar 14, 2009
    Messages:
    137
    Likes:
    16
    Location:
    Whitby, ON, Canada
    Wow! Not a single reply in over a week. I guess I stumped everyone here with my little question or perhaps there are no electronics enthusiasts here any more and this is just another social website.

    I received my Google Home Speaker in the mail and set it up close to my PC. It works quite well and the voice recognition is very good. It can hear me even when the music is playing loudly.

    I then started Wireshark on my PC and set the filter for the IP address assigned to the speaker by my router. I can see some messages from the speaker to destination IP address 224.0.0.251 on port 5353 using the mDNS protocol. I looked this up: it is not a Google server but the standard multicast DNS address that devices use to announce and discover services on the local network. However, I can’t see any transactions when I make a request to the speaker nor when it responds. I assume that traffic is unicast between the speaker and the router, so it is not echoed to all connected nodes on the router.
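
    A minimal sketch of why the mDNS packets show up in the capture while the request traffic does not: 224.0.0.251 falls in the multicast range, so every node on the local network receives it, whereas an ordinary unicast address does not behave that way. The sketch also builds the kind of PTR question an mDNS query carries. The service name `_googlecast._tcp.local` and the address 192.168.1.20 are assumptions for illustration only; nothing in the capture above confirms them.

```python
import ipaddress
import struct

MDNS_GROUP = "224.0.0.251"   # multicast group seen in the capture
MDNS_PORT = 5353             # UDP port seen in the capture


def is_multicast(addr: str) -> bool:
    """Return True if addr is a multicast address (delivered to all listeners)."""
    return ipaddress.ip_address(addr).is_multicast


def build_ptr_query(service: str) -> bytes:
    """Build a DNS query asking for PTR records of a service name.

    Layout: 12-byte header (one question, nothing else), then the name as
    length-prefixed labels, then QTYPE=PTR (12) and QCLASS=IN (1).
    """
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in service.strip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!2H", 12, 1)


if __name__ == "__main__":
    print(MDNS_GROUP, "multicast?", is_multicast(MDNS_GROUP))
    # Hypothetical unicast address of the speaker on a home LAN.
    print("192.168.1.20 multicast?", is_multicast("192.168.1.20"))
    # Assumed service name, used only as an example.
    print(build_ptr_query("_googlecast._tcp.local").hex())
```

    Sending the query bytes to the group address over UDP would be the next step, but that part depends on the local network and is left out here.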

    I looked up a Wireshark help file at:
    https://wiki.wireshark.org/CaptureSetup/WLAN
    and see that I need to turn on “Monitor Mode”. However, it also says that “Windows is very limited here”. Sure enough, there is no monitor mode checkbox anywhere to be found. It looks like I may have to get a PC running Linux to complete my investigation.

    Any thoughts?
     
  3. Nigel Goodwin

    Nigel Goodwin Super Moderator Most Helpful Member

    Joined:
    Nov 17, 2003
    Messages:
    39,739
    Likes:
    713
    Location:
    Derbyshire, UK
    It's not an electronics question, it's a computer/networking one.
     
  5. Val Gretchev

    Val Gretchev Member Forum Supporter

    Joined:
    Mar 14, 2009
    Messages:
    137
    Likes:
    16
    Location:
    Whitby, ON, Canada
    I guess you are making it abundantly clear that I am posting on the wrong website. I don't see any computer/networking category on the main page. Do you know of any website where I could better fit in?

    However, I must point out that in today's world, to be successful, one must have several skills. Electronics knowledge must be augmented with software/firmware know-how and perhaps a bit of mechanical know-how. And you certainly can't function without a working knowledge of network connections and protocols.
     
  6. JimB

    JimB Super Moderator Most Helpful Member

    Joined:
    Sep 11, 2004
    Messages:
    6,737
    Likes:
    662
    Location:
    Peterhead, Scotland
    None whatsoever, no interest in such things.

    If I had such a device, little Basil would have great fun with it.

    Basil.JPG

    As the postman struggles up to the front door with an enormous sack of assorted bird seed, I wonder what else has been ordered by online voice while I am out of the house.

    JimB
     
  7. Val Gretchev

    Val Gretchev Member Forum Supporter

    Joined:
    Mar 14, 2009
    Messages:
    137
    Likes:
    16
    Location:
    Whitby, ON, Canada
    Thanks for your answer. At least I know where I stand with my little project. Tell me, is everyone here a "Super Moderator" or am I inviting censure by voicing some negative opinion?
     
  9. crutschow

    crutschow Well-Known Member Most Helpful Member

    Joined:
    Mar 14, 2008
    Messages:
    10,897
    Likes:
    523
    Location:
    L.A., USA Zulu -8
    Nope, only those that say "Super Moderator" under their moniker.
    Not likely, unless you get personal or nasty.
    But negative comments seldom lead to fruitful discussions. :rolleyes:
     
  10. Val Gretchev

    Val Gretchev Member Forum Supporter

    Joined:
    Mar 14, 2009
    Messages:
    137
    Likes:
    16
    Location:
    Whitby, ON, Canada
    That's OK. I haven't had any fruitful discussion on my post before the negative comment anyway.
     
  11. JimB

    JimB Super Moderator Most Helpful Member

    Joined:
    Sep 11, 2004
    Messages:
    6,737
    Likes:
    662
    Location:
    Peterhead, Scotland
    No problem, most of us here are grown up and can stand a bit of negative opinion.

    You asked "Any Thoughts"
    I replied "No Interest"
    Just indicating my view on the situation.

    Whitby, Ontario, I have never been there.
    Whitby, North Yorkshire, I have been there many times, a favourite place for holidays in days gone by.

    JimB
     
  12. Val Gretchev

    Val Gretchev Member Forum Supporter

    Joined:
    Mar 14, 2009
    Messages:
    137
    Likes:
    16
    Location:
    Whitby, ON, Canada
    I don’t think you got the gist of what I am thinking about regarding the speakers that I re-named “Audio Terminal”. I have an application in mind that involves medical assistance in a patient’s home which requires a “terminal”. The standard model of screen, keyboard, and mouse is just not in the cards as many older people are not familiar with computer usage and would not take kindly to such an intrusion into their lives. An audio terminal that talks to them in their native language might be the answer. Take a look at the following article that may shed some light on the subject:
    https://www.cnet.com/news/ces-2017-siri-alexa-future-health-and-emotional-support-stanford/
     
  13. cowboybob

    cowboybob Well-Known Member Most Helpful Member

    Joined:
    Oct 22, 2011
    Messages:
    3,144
    Likes:
    493
    Location:
    James Island, SC
    Thanks, VG.

    Very interesting use of the device(s) I hadn't considered.

    I think your first post must have arrived on a busy day and slid off the bottom before anyone noticed/bothered to respond.
    Not really, but I suspect that the device is "hard coded" to reach out (via WiFi) to a specific IP address (as you noted) and that may not be something you can alter.

    And, again from your description, VOIP appears to be the "data" that is sent and received from/to the AT, hence no discernible data. See this description. Note the highlighted "Google Talk" - just the reference indicates the kind of digital muscle supporting the concept.

    Going no deeper than that though, it strikes me that interfacing to one of the current ATs appears to be doable, but rather complex (setting aside the voice recognition encoding and database manipulation at a receiver computer for recognition and response).

    I would be very interested in following your progress in this endeavor.
     
  14. Val Gretchev

    Val Gretchev Member Forum Supporter

    Joined:
    Mar 14, 2009
    Messages:
    137
    Likes:
    16
    Location:
    Whitby, ON, Canada
    Cowboybob, thanks for your reply.

    Yes, the IP address is most likely hard coded and would be difficult to change. The Google Home Speaker has a grill on the bottom that is held to the main body by magnets. One tug on it exposes the speakers inside (the power cord must be removed first). At the back of the speaker assembly is a USB-mini B jack. This may be a way to alter the programming but I haven’t dared to plug anything into it until I have more information.

    You are quite right about the voice protocol to and from the speaker. It is most certainly VoIP since it wouldn’t make much sense to put voice recognition into every unit. It’s best to keep the voice recognition at the server where a more powerful system can perform the analysis much better and faster. Also, a single app at the server is easier to maintain and upgrade rather than having to upload changes to all the units out there. And there are going to be millions.

    The big clue to me was when I read the following article that allows a Google Home user to telephone any land or mobile phone:
    http://www.pocket-lint.com/news/141...ng-how-does-it-work-and-where-is-it-available
    I haven’t tried this yet, but it’s next on my agenda.

    One way to get familiar with Google Home is to join the Google Developer Community and write some apps.
    https://developers.google.com/s/results/?q=google+Home&p=/

    One thing the Google Home speaker does not have is a camera. That is unfortunate, since a camera would add an extra dimension to human interaction by analyzing facial expressions and body language. This is absolutely essential in artificial intelligence apps that attempt to act as a psychologist. A camera can take a picture of a user’s dinner plate, identify the food groups, measure/estimate the volume of each, and calculate the caloric content of the food. This would be a Dietician app.
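
    The arithmetic behind such a Dietician app could be sketched as below, assuming the image analysis has already produced a volume estimate for each food item. The class and function names are hypothetical, and the numbers in the example are placeholders chosen to make the sum easy to follow, not real nutrition data.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FoodPortion:
    name: str
    volume_ml: float          # volume estimated from the picture
    density_g_per_ml: float   # converts volume to mass
    kcal_per_gram: float      # energy density of the food


def portion_kcal(portion: FoodPortion) -> float:
    """Energy of one portion: volume -> mass -> kilocalories."""
    return portion.volume_ml * portion.density_g_per_ml * portion.kcal_per_gram


def plate_kcal(portions: Iterable[FoodPortion]) -> float:
    """Total energy of everything identified on the plate."""
    return sum(portion_kcal(p) for p in portions)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    plate = [
        FoodPortion("food A", volume_ml=200.0, density_g_per_ml=1.0, kcal_per_gram=1.5),
        FoodPortion("food B", volume_ml=100.0, density_g_per_ml=0.5, kcal_per_gram=4.0),
    ]
    print(plate_kcal(plate))
```

    The hard part, identifying the food and estimating its volume from a picture, is not covered by this sketch.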

    Although a Google Home device would be a cheap hardware solution (leveraging their huge manufacturing volume), I am not averse to developing my own hardware which fits my application better.
     
  15. cowboybob

    cowboybob Well-Known Member Most Helpful Member

    Joined:
    Oct 22, 2011
    Messages:
    3,144
    Likes:
    493
    Location:
    James Island, SC
    You're welcome, VG.

    When you can, please update us with your progress.
     
  16. Val Gretchev

    Val Gretchev Member Forum Supporter

    Joined:
    Mar 14, 2009
    Messages:
    137
    Likes:
    16
    Location:
    Whitby, ON, Canada
    Are you old enough to remember when the movie 2001: A Space Odyssey first came out in 1968? The epic film by Stanley Kubrick featured HAL 9000, a sentient artificial intelligence that eventually went berserk, but that was a Hollywood necessity in order to put some meat on a rather dry plot. Later, in 1984, HAL was reprised in the movie 2010.

    If you observe HAL in the movie 2001, you will notice that he not only entertains the crew by playing chess, he also carefully interprets any changes in crew behaviour or mood to determine their mental state and whether they continue to be fit for duty. This is his primary function in interacting with the crew. His other function is the control of every aspect of the ship, which he performs automatically in the background.

    Perhaps some younger readers do not remember the stories that circulated in those days that Stanley Kubrick played with the name HAL to send a message to his audience. If you take the next higher character in the alphabet for each letter in the name HAL, you will spell IBM.
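
    The letter shift is easy to check with a few lines of code. This is only a sketch of the one-letter shift described in the story; the function name is made up for the example.

```python
def shift_letters(word: str, offset: int = 1) -> str:
    """Shift each letter A-Z forward by offset places, wrapping Z back to A."""
    base = ord("A")
    return "".join(
        chr((ord(ch) - base + offset) % 26 + base) for ch in word.upper()
    )


if __name__ == "__main__":
    print(shift_letters("HAL"))
```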

    Artificial intelligence is not yet equal to that of HAL as depicted in the movie. It will be a long time before artificial intelligence can achieve sentience. Then, we will face a moral dilemma: can such a sentient program be shut down or is that tantamount to murder?



    In 1968 I worked at IBM as a Customer Engineer and serviced computers and their peripherals (Tape Drives, Disk Drives, Printers, Communications Interfaces) in the downtown Toronto area and vicinity.

    IBM was the foremost computer company at that time. Thomas Watson Jr. was the second President of IBM from 1952 to 1971. You can read all about him here:
    https://en.wikipedia.org/wiki/Thomas_Watson_Jr.

    Before that, Thomas J. Watson (his father) served as Chairman and CEO of IBM from 1914 to 1956. His bio is here:
    https://en.wikipedia.org/wiki/Thomas_J._Watson

    It is appropriate, therefore, that IBM named its artificial intelligence software Watson, after Thomas J. Watson, the first CEO, a name shared with the son who succeeded him; both led IBM to greatness by building improved computing machines. You can read about Watson (computer) here:
    https://en.wikipedia.org/wiki/Watson_(computer)


    I believe that artificial intelligence is about to go ballistic in the next few years. Here is a system that Amazon will have available in April 2018 that will allow everyone to dabble with machine learning and object recognition. This will be the ideal tool to implement the Dietitian App I mentioned in the previous segment of this post.
    https://aws.amazon.com/deeplens/
    https://aws.amazon.com/blogs/aws/deeplens/

    If you are interested in what the VoIP packet structure looks like, you can read about it here:
    https://www.vocal.com/voip/voice-over-ip-packet-structure/
    and here:
    https://www.viavisolutions.com/en-us/literature/voip-overview-white-paper-en.pdf
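
    For a feel of what those documents describe, here is a minimal sketch that decodes the fixed 12-byte RTP header that normally precedes the audio payload in a VoIP packet. It assumes the bytes have already been pulled out of a UDP datagram, for example from a Wireshark capture. The sample packet in the example is constructed by hand, not captured from a Google Home, and whether the speaker uses RTP at all is not established in this thread.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int        # 2 for current RTP
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int   # e.g. 0 = PCMU (G.711 u-law), 8 = PCMA (G.711 A-law)
    sequence: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the fixed RTP header from the start of a UDP payload."""
    if len(packet) < 12:
        raise ValueError("RTP header needs at least 12 bytes")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence=seq,
        timestamp=ts,
        ssrc=ssrc,
    )


if __name__ == "__main__":
    # Hand-built sample: version 2, payload type 0, followed by 160 payload bytes.
    sample = bytes([0x80, 0x00]) + struct.pack("!HII", 1, 160, 0x12345678) + b"\xff" * 160
    print(parse_rtp_header(sample))
```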
     
  17. Pommie

    Pommie Well-Known Member Most Helpful Member

    Joined:
    Mar 18, 2005
    Messages:
    10,763
    Likes:
    429
    Location:
    Brisbane Australia
    I agree that AI is going to boom in the coming years but can't see sentient AI in my lifetime or even this century. Do you know of any progress towards sentience?

    Mike.
     
  18. cowboybob

    cowboybob Well-Known Member Most Helpful Member

    Joined:
    Oct 22, 2011
    Messages:
    3,144
    Likes:
    493
    Location:
    James Island, SC
    Sadly, yes... But what an awesome movie! (I had been in the Navy for about a year when it came out.)

    And thanks for the links, VG.
     
  19. Val Gretchev

    Val Gretchev Member Forum Supporter

    Joined:
    Mar 14, 2009
    Messages:
    137
    Likes:
    16
    Location:
    Whitby, ON, Canada
    No, I don’t have any new information on research that is going on in that field. This article I just found seems to describe the current thinking quite well:
    https://www.psychologytoday.com/blog/mind-in-the-machine/201606/the-myth-sentient-machines

    I believe that computers of today (based on a stored program with arithmetic-logic unit hardware) will never become self-aware. A completely new computer architecture must be invented first. Perhaps neural network computers with a lot more network nodes might be the answer, or something entirely different.

    I agree with you that sentient computers will not occur in my time. I can only hope for somewhat intelligent programs executing a pre-defined decision matrix and fooling people into thinking that they are talking and interacting with a human. See the Turing test:
    https://en.wikipedia.org/wiki/Turing_test

    But, that is also very useful. Think about a Psychologist making a diagnosis on a patient. He may ask a number of questions of a patient and make notes of the answers. He already has a list of possible answers and what they might mean. He then draws on his experience having asked these questions of many patients and makes a conclusion. I think that a computer program having a vast database at its disposal can make a similar diagnosis.
     
  21. Val Gretchev

    Val Gretchev Member Forum Supporter

    Joined:
    Mar 14, 2009
    Messages:
    137
    Likes:
    16
    Location:
    Whitby, ON, Canada
    If anyone is interested in learning what makes a Google Home Speaker tick, here are a couple of links that show a unit being disassembled and identify the modules used in the hardware.

    Taking apart the Google Home! What's inside the Google Home?
    https://www.youtube.com/watch?v=diZL6t3cvhs

    Google Home Teardown
    https://www.ifixit.com/Teardown/Google+Home+Teardown/72684

    From the software point of view, the Google website discloses details of the protocols used in Google Talk.
    https://developers.google.com/talk/open_communications
    It is clear from this FAQ that Google Talk uses the XMPP protocol, and the FAQ lists the codecs for voice and video. Whether the Google Home speaker uses the same protocols is not stated there.
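
    To give an idea of what XMPP traffic looks like, here is a minimal sketch that builds a single chat message stanza, the basic unit exchanged once an XMPP stream is open. The function name and the addresses are hypothetical, and the sketch says nothing about what a Google Home actually sends.

```python
import xml.etree.ElementTree as ET


def build_chat_message(sender: str, recipient: str, text: str) -> str:
    """Return an XMPP <message/> stanza of type 'chat' as a string."""
    message = ET.Element(
        "message", {"from": sender, "to": recipient, "type": "chat"}
    )
    body = ET.SubElement(message, "body")
    body.text = text
    return ET.tostring(message, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical addresses for illustration only.
    print(build_chat_message("terminal@example.org", "server@example.org", "Hello"))
```

    A real client would also need the stream setup, TLS, and authentication steps, for which a dedicated XMPP library is the usual choice.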

    If you have a Google Home, you can try programming a game by duplicating this writer’s work.
    How To Build Your Own Action For Google Home Using API.AI
    https://www.smashingmagazine.com/2017/05/build-action-google-home-api-ai/

    Once you have this working, you can try a more complicated verbal exchange with Google Assistant.
     
  22. audioguru

    audioguru Well-Known Member Most Helpful Member

    Joined:
    Mar 16, 2004
    Messages:
    32,950
    Likes:
    994
    Location:
    Canada, of course!
    My daughter received a Google Home speaker as a Christmas gift, but certainly not from me. It sounds awful and does the same things a smartphone can do.

    When my daughter reads about something and cannot pronounce it properly, the Google Home speaker tries to understand, makes a few attempts, or gets all confused.

    The news says that much better sounding "Smart Speakers" are coming soon.
     
