Powering Applications with Conversational Speech Interfaces
The STVoiceFlow framework provides the STVoiceFlowRunner and STMediaRunner modules, which enable conversational speech interfaces in applications.
STVoiceFlowRunner Module
Enables applications to execute complex conversational speech interactions with their users, driven by configurable Voiceflows.
Features:
Application designers and developers configure Voiceflows, which define the adaptive conversational speech interactions between an application and its users.
Supports Voiceflow modules of multiple types with automatic and configured transitions among the Voiceflow modules.
Comprehensive real-time notifications of Voiceflow processing events, for integration with events received concurrently through the application's other user interfaces.
A powerful SDK and bidirectional dynamic data sharing allow the STVoiceFlow framework and the application to adapt and update the conversational speech user experience.
Automatic and configured handling of interruptions from the device or other programs.
Interfaces with the STMediaRunner module for execution of low-level media tasks on devices.
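The idea of Voiceflow modules linked by automatic and configured transitions, with event notifications sent to the application, can be pictured as a small state machine. The following is a minimal sketch under stated assumptions, not the STVoiceFlow SDK: `VoiceflowModule`, `run_voiceflow`, the module type labels, and the event names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class VoiceflowModule:
    """Hypothetical description of one step in a flow (not the real SDK type)."""
    module_id: str
    module_type: str                       # illustrative label, e.g. "play_audio" or "listen"
    default_next: Optional[str] = None     # automatic transition
    transitions: Dict[str, str] = field(default_factory=dict)  # configured: outcome -> module id


def run_voiceflow(
    modules: List[VoiceflowModule],
    start_id: str,
    execute: Callable[[VoiceflowModule], str],
    notify: Callable[[str, str], None],
    max_steps: int = 100,
) -> List[str]:
    """Walk the flow from start_id and return the ids of the modules visited."""
    by_id = {m.module_id: m for m in modules}
    visited: List[str] = []
    current: Optional[str] = start_id
    while current is not None and len(visited) < max_steps:
        module = by_id[current]
        notify("module_started", module.module_id)
        outcome = execute(module)          # the application or media layer does the real work
        notify("module_ended", module.module_id)
        visited.append(module.module_id)
        # A configured transition for this outcome wins; otherwise use the automatic one.
        current = module.transitions.get(outcome, module.default_next)
    return visited


if __name__ == "__main__":
    flow = [
        VoiceflowModule("greet", "play_audio", default_next="ask"),
        VoiceflowModule("ask", "listen", transitions={"yes": "confirm", "no": "goodbye"}),
        VoiceflowModule("confirm", "play_audio", default_next="goodbye"),
        VoiceflowModule("goodbye", "play_audio"),
    ]
    answers = {"ask": "yes"}               # stand-in for a recognized user reply
    path = run_voiceflow(
        flow,
        "greet",
        execute=lambda m: answers.get(m.module_id, "done"),
        notify=lambda event, module_id: print(event, module_id),
    )
    print(path)
```

The `max_steps` guard is a design choice of this sketch: a flow with cyclic transitions would otherwise never terminate.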
STVoiceFlowRunner module interprets and processes the following Voiceflow module types:
STMediaRunner Module
Executes complex low-level media tasks on devices, covering audio playback, audio recording, speech recognition, speech synthesis, audio session interruptions and audio route changes.
Features:
Supports continuous and discrete speech recognition tasks, with the option to perform on-device speech recognition.
Supports speech synthesis with runtime switching among various voices.
Integrates Voice Activity Detection with real-time event notifications.
Supports various audio formats for audio playback and audio recording.
Detection and processing of audio session interruptions and audio route changes.
Comprehensive real-time media event notifications to STVoiceFlowRunner module and to application.
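To illustrate what Voice Activity Detection with real-time event notifications might look like, here is a minimal energy-threshold sketch. It is an assumption-laden illustration, not the detector used by STMediaRunner: the function names, the event names `speech_started` and `speech_ended`, and the RMS threshold approach are all hypothetical.

```python
import math
from typing import Callable, Iterable, Sequence


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of one frame of audio samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_voice_activity(
    frames: Iterable[Sequence[float]],
    threshold: float,
    hangover_frames: int,
    on_event: Callable[[str, int], None],
) -> None:
    """Call on_event(name, frame_index) when speech starts or ends.

    Speech is considered ended only after `hangover_frames` consecutive
    quiet frames, so short pauses between words do not end the segment.
    """
    speaking = False
    quiet_run = 0
    for index, frame in enumerate(frames):
        if frame_rms(frame) >= threshold:
            quiet_run = 0
            if not speaking:
                speaking = True
                on_event("speech_started", index)
        elif speaking:
            quiet_run += 1
            if quiet_run >= hangover_frames:
                speaking = False
                on_event("speech_ended", index)


if __name__ == "__main__":
    silence, voice = [0.0] * 4, [0.5] * 4
    detect_voice_activity(
        [silence, voice, voice, silence, silence, silence],
        threshold=0.1,
        hangover_frames=2,
        on_event=lambda name, index: print(name, index),
    )
```

A production detector would typically use more robust features than raw energy; the callback shape is the point of the sketch.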
STMediaRunner module supports the following functions (and more):
Speech Recognition:
Continuous and discrete
Natural language and task-based
On-device and cloud-based
Complete and partial results
Speech Synthesis:
Multiple voices and languages
Audio streaming for playback of large text
Supports string and file text sources
Supports recording of synthesised audio to files
Audio Playback:
Support for multiple audio formats
Automatic repeat of audio playback
Time stamps to start and stop audio playback
Audio bookmark with notification to client
Audio Recording:
Support for multiple audio formats
Optionally managed by a Voice Activity Detector
Option to exclude silence audio
Audio bookmark with notification to client
Acoustic Echo Cancellation (if available on device)
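The playback features listed above (start and stop time stamps, audio bookmarks with client notification) can be sketched as follows. This is a hypothetical illustration only: `playback_window`, `play_with_bookmarks`, and the choice to express bookmarks as sample offsets are assumptions, not the STMediaRunner API, and no audio is actually played.

```python
from typing import Callable, Iterable, Tuple


def playback_window(start_s: float, stop_s: float, sample_rate: int) -> Tuple[int, int]:
    """Convert start/stop time stamps in seconds to a half-open sample range."""
    if start_s < 0 or stop_s < start_s:
        raise ValueError("time stamps must satisfy 0 <= start_s <= stop_s")
    return int(round(start_s * sample_rate)), int(round(stop_s * sample_rate))


def play_with_bookmarks(
    total_samples: int,
    window: Tuple[int, int],
    bookmarks: Iterable[int],
    chunk_size: int,
    on_bookmark: Callable[[int], None],
) -> int:
    """Simulate chunked playback of `window`, notifying the client at each bookmark.

    Returns the sample position at which playback stopped.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    start, stop = window
    stop = min(stop, total_samples)
    # Only bookmarks inside the played window can fire.
    pending = sorted(b for b in bookmarks if start <= b < stop)
    position = start
    while position < stop:
        end = min(position + chunk_size, stop)
        # A real player would hand samples [position, end) to the audio device here.
        while pending and pending[0] < end:
            on_bookmark(pending.pop(0))
        position = end
    return position


if __name__ == "__main__":
    window = playback_window(1.0, 3.0, sample_rate=1000)
    play_with_bookmarks(
        total_samples=5000,
        window=window,
        bookmarks=[500, 1500, 2999, 3000],
        chunk_size=400,
        on_bookmark=lambda sample: print("bookmark reached at sample", sample),
    )
```

Bookmarks outside the requested window are ignored in this sketch; whether the real module behaves that way is not stated in the text.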