One of the strategic technology trends for 2017 is conversational systems. As defined by Tata Consultancy Services (TCS), enterprise conversational systems offer a messaging or conversation-driven user experience and facilitate contextual conversations around business events. Through connected APIs, enterprises can build conversational systems that aggregate business events from every area of the enterprise to facilitate people-to-people, people-to-systems, and systems-to-systems interactions.
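To make the idea concrete, here is a minimal sketch of that aggregation pattern. All names (`BusinessEvent`, `ConversationalHub`) and the event shapes are hypothetical illustrations, not any vendor's actual API: events arriving from different enterprise systems are normalized and rendered as messages in a single conversational channel.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class BusinessEvent:
    source: str   # originating system, e.g. "crm" or "inventory"
    kind: str     # event type, e.g. "order.created"
    payload: dict


class ConversationalHub:
    """Aggregates business events and turns them into conversation messages."""

    def __init__(self) -> None:
        self._formatters: Dict[str, Callable[[BusinessEvent], str]] = {}
        self.transcript: List[str] = []

    def register(self, kind: str, formatter: Callable[[BusinessEvent], str]) -> None:
        # Each event type gets its own human-readable rendering.
        self._formatters[kind] = formatter

    def publish(self, event: BusinessEvent) -> str:
        # Fall back to a generic rendering for unregistered event types.
        fmt = self._formatters.get(event.kind, lambda e: f"[{e.source}] {e.kind}")
        message = fmt(event)
        self.transcript.append(message)
        return message


hub = ConversationalHub()
hub.register("order.created",
             lambda e: f"New order #{e.payload['id']} for {e.payload['customer']}")
msg = hub.publish(BusinessEvent("crm", "order.created",
                                {"id": 42, "customer": "Acme"}))
print(msg)  # New order #42 for Acme
```

In a real deployment, the formatters would be replaced by NLP-driven summarization and the transcript by an actual messaging channel; the sketch only shows the fan-in of events into one conversation.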
Gartner's view of conversational systems ranges from bidirectional text or voice conversations (simple questions about the weather) to more complex interactions, such as collecting oral testimony from crime witnesses to generate a sketch of a suspect.
While modern conversational systems are still relatively simple (they often require considerable effort, careful enunciation, and rephrasing before the system can answer something substantial), conversational systems of the future will be able to adapt to many forms of request and to hear and understand 'complex' sentences. In fact, interesting companies in the market are working on advancing machines' capability to hear and understand humans.
Conversational systems of the future will not be limited to text and voice, though. Gartner suggests that they will enable people and machines to use multiple modalities (e.g., sight, sound, touch) to communicate across the digital device mesh (e.g., sensors, appliances, IoT systems). The "conversation" between the human and the machine uses all these modalities to create a comprehensive conversational experience.
One of the companies that have been exploring the opportunities of advanced conversational systems for a while now is Microsoft. In 2012, the company published research on The Conversational Web, emphasizing the growing interest of the speech, language, and human-computer interaction (HCI) scientific communities in creating a conversational interface to the web.
Microsoft suggested then that by combining natural language (spoken and written) with gesture, touch, and gaze, this natural user interface (NUI) could help individuals complete online tasks, find what they want, and answer any question as naturally as having a conversation – anytime, anywhere.
Now, why is this important for the business community across industries? Because Google's first search results page does not have to be a fixed model and concept; we don't necessarily need to look at a screen to find an answer to a question; there is no need to type a request to see it done; and there is no need to know which search engine is processing the request, as long as we receive a relevant answer.
The more mobile we get, the more important is the ability to obtain precise information efficiently, without loss of accuracy. And the more important it becomes to be able to automate and delegate day-to-day tasks to a trusted personal assistant, which can be any device lying invisibly in a pocket, waiting to hear the user's voice and command.
Conversational interfaces benefit businesses by enabling them to serve customers faster, better and cheaper.
As Parthasarathi V., Lead Consultant for the Workplace Reimagination practice at TCS Digital Enterprise, notes: "Through conversational software, an enterprise can simplify and reimagine business processes, and reduce and automate workflows through context-aware intelligence systems. It also enables users and systems to have meaningful interactions, working in tandem to meet business goals."
Aside from Microsoft, major technology companies have been investing effort to advance the user's contextual experience. Personal assistants like Siri, Google Now, Alexa, and Cortana aim to change a) the way people interact with each other, b) the way they interact with devices and information, and c) the way businesses interact with customers across apps and devices. Invisible but always present, assistants mediate interactions and ensure a seamless experience with whatever task a user wants to perform.
However, at the moment, the aforementioned assistants are hardly as 'smart' as they need to be to replace conventional experiences. Over time, as the AI powering these systems evolves, their accuracy in understanding humans and their precision in executing requests will significantly improve.
The transition to an ‘invisible presence’ will also be possible when the conversational interface loses its traditional sense and goes beyond microphones and speakers, as Gartner suggests.
As the device mesh evolves, we expect that connection and interface models will expand, and greater cooperative interaction between devices will develop. This will provide an immersive and continuous conversational experience. New input/output mechanisms will emerge using audio, video, touch, taste, smell and other sensory channels, such as radar, that extend beyond human senses. This will enable people to communicate with systems, and systems to communicate with people, in rich conversations that include more than text and voice.
As a result, applications that individuals interact with on an everyday basis will be accessible and 'orchestrated' across that mesh of devices, making the user experience continuous across time and space.
Users will be able to interact with an application in a dynamic multistep sequence that may last for an extended period. The experience will flow seamlessly across multiple devices and interaction channels. It will blend physical, virtual and electronic environments. And it will use real-time contextual information as the ambient environment changes, or as the user moves from one place to another.
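A minimal way to picture such a multistep, cross-device session is a single dialogue context that follows the user as devices change. The `DialogueSession` class below is purely illustrative (no real assistant exposes this interface): each device handover keeps the slots filled so far, so the conversation can resume mid-task.

```python
from typing import Dict, Optional


class DialogueSession:
    """One ongoing conversation whose context survives device handovers."""

    def __init__(self, user_id: str) -> None:
        self.user_id = user_id
        self.context: Dict[str, str] = {}   # slots filled so far
        self.device: Optional[str] = None   # device currently in use

    def attach(self, device: str) -> None:
        # Hand the ongoing conversation over to another device;
        # the accumulated context is deliberately left untouched.
        self.device = device

    def update(self, **slots: str) -> None:
        self.context.update(slots)

    def summary(self) -> str:
        return f"{self.user_id}@{self.device}: {self.context}"


session = DialogueSession("alice")
session.attach("phone")
session.update(intent="book_flight", origin="SFO")  # step 1, on the phone
session.attach("car")                               # the user gets in the car
session.update(destination="JFK")                   # step 2 continues there
print(session.summary())
```

The design choice the sketch highlights is that context belongs to the session, not to any one device – which is exactly what makes the "dynamic multistep sequence" span interaction channels.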
Microsoft and IBM are leading the way in conversational systems
Somewhat utopian at the moment, the future in which AI surrounds us and mediates a seamless experience of interacting with the world may arrive sooner than expected. Microsoft, which we mentioned before, made a major breakthrough in speech recognition just a couple of weeks ago: for the first time, a computer can recognize the words in a conversation as well as a person does.
The milestone will have broad implications for consumer and business products that can be significantly augmented by speech recognition. That includes consumer entertainment devices like the Xbox, accessibility tools such as instant speech-to-text transcription and personal digital assistants such as Cortana.
"This will make Cortana more powerful, making a truly intelligent assistant possible," said Harry Shum, the executive vice president who heads the Microsoft Artificial Intelligence and Research group.
There is another important player to mention: IBM. Shortly after Microsoft announced its historic achievement, IBM introduced Watson Virtual Agent, a cognitive conversational technology that lets businesses build and deploy conversational agents with ease. Watson Virtual Agent allows users – from startups and small businesses to enterprises – to easily and quickly build and train engagement bots from the cloud, harnessing the power of cognitive technologies. Companies like Staples and Autodesk are embracing services that go beyond simple, narrowly focused tools to sophisticated, full-blown virtual agents that rely on deep natural language processing capabilities to assist consumers.
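To show what "training an engagement bot" means at its most basic, here is a toy keyword-overlap bot. This is illustrative only and is not the Watson Virtual Agent API; real services replace the matching step below with deep natural language processing.

```python
from typing import List, Set, Tuple


class EngagementBot:
    """Toy bot: each intent is a keyword set paired with a canned reply."""

    FALLBACK = "Sorry, I didn't understand."

    def __init__(self) -> None:
        self._intents: List[Tuple[Set[str], str]] = []

    def train(self, keywords: Set[str], reply: str) -> None:
        self._intents.append((keywords, reply))

    def respond(self, utterance: str) -> str:
        words = set(utterance.lower().split())
        # Pick the intent whose keywords overlap the utterance the most.
        best = max(self._intents,
                   key=lambda intent: len(intent[0] & words),
                   default=(set(), self.FALLBACK))
        if not (best[0] & words):
            return self.FALLBACK
        return best[1]


bot = EngagementBot()
bot.train({"order", "status"}, "Let me check your order status.")
bot.train({"return", "refund"}, "I can help you start a return.")
print(bot.respond("status of my order"))  # Let me check your order status.
```

Even this crude matcher conveys the workflow the article describes: an operator "trains" the agent with examples and replies, and the service handles routing each customer utterance to the right response.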
Earlier this year, IBM also partnered with the University of Michigan to launch a $4.5-million collaboration to develop a new class of conversational technologies that will enable people to interact more naturally and effectively with computers. In Project Sapphire, IBM and the U-M Artificial Intelligence Lab will develop a cognitive system that functions as an academic adviser for undergraduate computer science and engineering majors at the university. The system will allow researchers to explore how smart machines interact with people in goal-driven dialogues.