What is text-to-speech? Thoughts on its evolution, use cases, and collaboration with robots

Hello, my name is Mitsui from Kodensha's Sales Department. Have you ever heard of text-to-speech software?
When you hear the term "text-to-speech," the first thing that comes to mind is probably a stereotypical cartoon robot, or the warbling voice you get by saying "We are aliens" into an electric fan (at least, that was the case for me).
However, when you listen to the latest speech synthesis, it sounds as if a person is really speaking.
The quality of voice synthesizers has improved so much that it is no wonder they are now used in e-learning materials.
With the Google Assistant and Siri on smartphones, you say a specific trigger word, ask a question, and the assistant answers aloud. That voice is produced by text-to-speech technology. Some of you may know that the same technology is used in the smart speakers "Google Home" and "Amazon Echo," which are currently attracting attention.
Around town, we increasingly hear synthesized voice guidance at ATMs in financial institutions and in station announcements. Because the quality is so high and we hear it without paying attention, many of us may not realize that it is actually text-to-speech.
In addition, recent speech synthesis uses the latest AI technology, known as "deep learning," to analyze your own voice and reproduce it, so that the synthesized speech sounds as if you were speaking yourself. The advancement of technology is truly amazing!
At our company, we use this text-to-speech technology in a solution with a name long enough to make you trip over your tongue: "J-SERVER Guidance," a multilingual automatic translation and text-to-speech system. It was first introduced for disaster prevention administrative radio (wide-area broadcasts that deliver the same information to all residents) in a Hokkaido tourist spot popular with foreign visitors and in a municipality in a district with many foreign residents.
Its use is now expanding: it is also employed to broadcast evacuation instructions at commercial facilities in the event of a disaster. In the future, it is expected to spread to hotels, inns, and public transportation in preparation for inbound tourism and the events of 2020.
While the need for foreign language services is increasing, the long-standing labor shortage means that robots are also being used to serve customers in more and more situations. This means that robots that do not talk like the robots of old will soon be hard at work!
Such an era may be just around the corner. I am looking forward to hearing the voice of our robot in the city and at sightseeing spots as I go about my work.