SIREN: Underwater Robot-to-Human Communication with Audio | Minnesota Interactive Robotics and Vision Laboratory

High-fidelity interaction is necessary for collaborative work between divers and autonomous underwater vehicles (AUVs). However, underwater environments are adversarial to standard forms of communication, making underwater human-robot interaction (UHRI) particularly challenging. Gestural communication has naturally become one of the most common methodologies for human-to-robot (H2R) communication in underwater environments, taking advantage of the already ubiquitous gestural languages used by divers. What then of the inverse task, robot-to-human (R2H) communication? Aside from power/status-indicating tones, audio communication has not been significantly explored for underwater robots. Sound travels well through water, but producing and comprehending it is challenging. Most commercially available speakers are designed to vibrate air rather than water, while underwater-compatible speakers tend to be quite expensive and incompatible with small AUVs. Additionally, human auditory processing is not well suited to comprehending sound underwater, which leads to confusion and garbling of complex signals such as speech. Due to these confounding challenges, audible communication from robots to humans underwater is largely unexplored.

The SIREN (Sound Indicators via Resonance Exciters uNderwater) project creates a novel audio-based communication system for underwater human-robot interaction. SIREN utilizes a surface transducer to produce sound by vibrating the frame of an underwater robot, essentially turning the robot’s outer surface into the vibrating membrane of a speaker. We employ this hardware to create the audible communication indicators of SIREN, which we refer to as sonemes. SIREN creates two forms of sonemes for robot-to-human communication: synthesized text-to-speech (TTS-sonemes) and synthesized musical indicators (Tone-sonemes). To profile the system’s capabilities with respect to underwater communication, we performed a substantial in-person human study with 12 participants. In this study, each participant was trained on one of the two soneme types and was then asked to identify communication from that system in a pool at various distances from the robot. This study’s results demonstrate that sound is a viable method of underwater communication. TTS-sonemes outperform Tone-sonemes at close distances but fail at farther distances, while Tone-sonemes remain recognizable as the distance to the robot increases.
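To make the idea of a synthesized musical indicator concrete, the following is a minimal Python sketch of how a short tonal pattern could be generated as audio samples. It is an illustration only and not SIREN's implementation: the sample rate, the note frequencies and durations, the `ATTENTION` pattern, and all function names are hypothetical choices made for this example.

```python
import math
import struct

SAMPLE_RATE = 16_000  # samples per second (assumed value, not taken from SIREN)


def synthesize_tone(freq_hz, duration_ms, amplitude=0.5):
    """Return float samples in [-amplitude, amplitude] for one sine tone."""
    n = SAMPLE_RATE * duration_ms // 1000  # integer arithmetic, exact sample count
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
        for i in range(n)
    ]


def synthesize_indicator(notes, gap_ms=50):
    """Concatenate (freq_hz, duration_ms) notes with short silences between them."""
    gap = [0.0] * (SAMPLE_RATE * gap_ms // 1000)
    samples = []
    for k, (freq_hz, duration_ms) in enumerate(notes):
        if k > 0:
            samples.extend(gap)  # silence separates consecutive notes
        samples.extend(synthesize_tone(freq_hz, duration_ms))
    return samples


def to_pcm16(samples):
    """Pack float samples as little-endian signed 16-bit PCM bytes."""
    ints = (int(max(-1.0, min(1.0, s)) * 32767) for s in samples)
    return struct.pack("<%dh" % len(samples), *ints)


# Hypothetical three-note rising pattern standing in for an "attention" indicator.
ATTENTION = [(440.0, 200), (554.37, 200), (659.25, 300)]

samples = synthesize_indicator(ATTENTION)
pcm_bytes = to_pcm16(samples)
```

The resulting bytes could be written to a file with Python's standard `wave` module and played through whatever audio output drives the transducer. Which patterns are actually distinguishable underwater is an empirical question of the kind the study above addresses; this sketch makes no claim about it.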

SIREN Sounds — Depiction of the two types of sonemes that SIREN can produce, asking a diver for their attention.

Soneme symbol clustering — Clustering of scuba diver sign language symbols, sourced from instructional scuba materials.

Student Lead: Michael Fulton
External collaborator: Rafa Absar, Assistant Professor of Computer Science and Cybersecurity, Metro State University, Saint Paul, MN, USA.