Speech Tiles, Enabling Conversational Speech-Powered Applications

Enables applications to carry out complex conversational speech interactions with their users, driven by configurable Voiceflows.

Application designers and developers configure Voiceflows, which define the adaptive conversational speech interactions between an application and its users.
Supports multiple Voiceflow module types, with automatic and configured transitions among the modules.
Provides comprehensive real-time Voiceflow processing event notifications, which can be integrated with events received concurrently through other application user interfaces.
Powerful SDK and bi-directional dynamic data sharing allow the STVoiceFlow framework and the application to adapt and update the conversational speech user experience.
Automatic and configured handling of interruptions from the device or other programs.
Interfaces with the STMediaRunner module to execute low-level media tasks on devices.

Play Audio
- Single/multiple audio segments
- Recorded/Speech Synthesized
Record Audio
- Controlled by Voice Activity Detection
- Automatic termination
- Silence filtering
Audio Dialog
- Multiple audio playback schemas
- Continuous/discrete speech recognition
- User input and intent matching strategies
- Automatic error handling and re-prompting
Audio Listener
- Continuous audio playback with application feed
- Audio playback independent of speech recognition
- User input and intent matching strategies
Pause Resume
- Pauses Voiceflow processing
- Application controls Voiceflow processing continuation
Process
- Management of Voiceflow processing state and data stores
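
The module types and transitions listed above can be pictured as a small state machine. The sketch below is illustrative only and does not use the Speech Tiles SDK: the names Module, run_voiceflow, on_event, the "default" outcome key, and the module ids are all hypothetical, chosen to show how configured transitions and per-module event notifications might fit together.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Module:
    """Hypothetical stand-in for one Voiceflow module (not an SDK type)."""
    kind: str                                                  # e.g. "playAudio", "audioDialog"
    transitions: Dict[str, str] = field(default_factory=dict)  # outcome -> next module id


def run_voiceflow(
    modules: Dict[str, Module],
    start: str,
    outcomes: Dict[str, str],
    on_event: Optional[Callable[[str, str], None]] = None,
    max_steps: int = 50,
) -> List[str]:
    """Walk the modules from `start` and return the ids visited, in order.

    `outcomes` maps a module id to the outcome it reports (assumed input);
    modules without an entry report "default". `max_steps` guards against
    endless re-prompt loops.
    """
    visited: List[str] = []
    current: Optional[str] = start
    for _ in range(max_steps):
        if current is None:
            break
        module = modules[current]
        visited.append(current)
        if on_event is not None:
            on_event(current, module.kind)  # notify the application
        outcome = outcomes.get(current, "default")
        # Configured transition first, then the automatic "default" one.
        current = module.transitions.get(outcome, module.transitions.get("default"))
    return visited


flow = {
    "greet": Module("playAudio", {"default": "ask"}),
    "ask": Module("audioDialog", {"matched": "confirm", "noMatch": "ask"}),
    "confirm": Module("playAudio"),
}
events = []
path = run_voiceflow(flow, "greet", {"ask": "matched"},
                     on_event=lambda mid, kind: events.append((mid, kind)))
print(path)
```

In this sketch a "noMatch" outcome sends the dialog module back to itself, which is one simple way to model re-prompting; the real framework may handle it differently.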

Supports continuous and discrete speech recognition tasks, with an option to perform speech recognition on-device.
Supports speech synthesis with runtime switching among various voices.
Integrates Voice Activity Detection with real-time event notifications.
Supports various audio formats for audio playback and audio recording.
Detects and processes audio session interruptions and audio route changes.
Provides comprehensive real-time media event notifications to the STVoiceFlowRunner module and to the application.

Speech Recognition:
- Continuous and discrete
- Natural language and task-based
- On-device and cloud-based
- Complete and partial results
Speech Synthesis:
- Multiple voices and languages
- Audio streaming for playback of large text
- Supports string and file text sources
- Supports recording of synthesized audio to files
Audio Playback:
- Support for multiple audio formats
- Automatic repeat of audio playback
- Time stamps to start and stop audio playback
- Audio bookmark with notification to client
Audio Recording:
- Support for multiple audio formats
- Optionally managed by a Voice Activity Detector
- Option to exclude silence audio
- Audio bookmark with notification to client
Acoustic Echo Cancellation (if available on device)
Voice Activity Detection
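
To make the recording options above concrete (recording managed by a Voice Activity Detector, exclusion of silence, automatic termination), here is a minimal sketch using a simple energy threshold. It is not how STMediaRunner implements VAD, which the text does not describe: the function record_with_vad, its parameters, and the event names "speechStart" and "speechEnd" are assumptions made for illustration.

```python
from typing import Iterable, List, Sequence, Tuple


def record_with_vad(
    frames: Iterable[Sequence[float]],
    threshold: float = 0.1,
    max_trailing_silence: int = 3,
) -> Tuple[List[Sequence[float]], List[Tuple[str, int]]]:
    """Keep only frames judged to contain speech and stop after a silent run.

    frames: audio frames of float samples in [-1.0, 1.0] (assumed format).
    threshold: RMS energy at or above which a frame counts as speech.
    max_trailing_silence: consecutive silent frames after speech that end
    the recording.
    Returns (kept_frames, events), where events are (name, frame_index).
    """
    kept: List[Sequence[float]] = []
    events: List[Tuple[str, int]] = []
    speaking = False
    silent_run = 0
    for index, frame in enumerate(frames):
        # Root-mean-square energy of the frame; empty frames count as silence.
        energy = (sum(s * s for s in frame) / len(frame)) ** 0.5 if frame else 0.0
        if energy >= threshold:
            if not speaking:
                events.append(("speechStart", index))
                speaking = True
            silent_run = 0
            kept.append(frame)
        elif speaking:
            # Silent frame after speech: drop it and count toward termination.
            silent_run += 1
            if silent_run >= max_trailing_silence:
                events.append(("speechEnd", index))
                break
        # Silent frames before any speech are dropped without further action.
    return kept, events


silence, quiet_speech, loud_speech = [0.0] * 4, [0.5] * 4, [0.9] * 4
stream = [silence, quiet_speech, silence, quiet_speech,
          silence, silence, silence, loud_speech]
kept, events = record_with_vad(stream)
print(len(kept), events)
```

A production detector would typically use a more robust signal than raw frame energy; the threshold approach is used here only because it keeps the control flow easy to follow.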

STMediaRunner module Reference Guide