We are pleased to announce the publication of a new paper titled “Bottom-up modeling of phoneme learning: Universal sensitivity and language-specific transformation” in the journal Speech Communication. This study was conducted by Frank and Youngah.
The research investigates the emergence and development of universal phonetic sensitivity during early phonological learning using an unsupervised modeling approach. The authors trained autoencoder models on raw acoustic input from English and Mandarin to simulate bottom-up perceptual development, focusing on phoneme contrast learning.
The results demonstrate that phoneme-like categories and feature-aligned representational spaces can emerge from context-free acoustic exposure alone. The study reveals that universal phonetic sensitivity is a transient developmental stage that varies across contrasts and gradually gives way to language-specific perception, mirroring infant perceptual development. Different featural contrasts remain universally discriminable for varying durations over the course of learning. These findings support the view that universal sensitivity is not innately fixed but emerges through learning, and that early phonological development proceeds along a mosaic, feature-dependent trajectory.
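The core idea, that category structure can emerge from reconstruction alone, without any labels, can be illustrated with a deliberately minimal sketch. The snippet below is not the authors' model (they trained deep autoencoders on raw English and Mandarin audio); it uses a tiny tied-weight linear autoencoder on synthetic "acoustic" vectors drawn from two hypothetical phoneme clusters, and checks that the learned latent space keeps same-category tokens closer together than cross-category tokens:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for acoustic frames: two "phoneme" clusters in 20-D space.
n, d, k = 200, 20, 2
centers = rng.normal(scale=1.5, size=(2, d))
labels = rng.integers(0, 2, size=n)
X = centers[labels] + 0.5 * rng.normal(size=(n, d))

# Tied-weight linear autoencoder: encode Z = X W, decode X_hat = Z W^T.
# Training signal is reconstruction error only -- no category labels.
W = rng.normal(scale=0.1, size=(d, k))
lr = 1e-3
for _ in range(3000):
    Z = X @ W
    err = Z @ W.T - X                       # reconstruction error
    # Gradient of ||X W W^T - X||^2 w.r.t. the tied weight matrix W.
    grad = X.T @ err @ W + err.T @ X @ W
    W -= lr * grad / n

Z = X @ W
# Compare mean latent distances within vs. across the two clusters.
same = np.mean([np.linalg.norm(Z[i] - Z[j])
                for i in range(n) for j in range(i + 1, n)
                if labels[i] == labels[j]])
diff = np.mean([np.linalg.norm(Z[i] - Z[j])
                for i in range(n) for j in range(i + 1, n)
                if labels[i] != labels[j]])
print(same < diff)  # category structure emerges from reconstruction alone
```

The latent space ends up organized by the cluster structure of the input even though the objective never mentions categories, which is the unsupervised, bottom-up logic the study scales up to real speech.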
Tan, F. & Do, Y. (2025). Bottom-up modeling of phoneme learning: Universal sensitivity and language-specific transformation. Speech Communication.
The LDL team came together for an enjoyable lunch gathering at a hotpot restaurant. This event provided a wonderful opportunity for team members to connect over a shared meal, fostering stronger bonds while sparking discussions on ongoing research.
Our lab was well represented at the Annual Meeting on Phonology (AMP 2025) at UC Berkeley. Ivy, Frank, and Youngah not only enjoyed a fun Waymo ride, but also presented the following research:
Youngah, along with scholars from Harvard University, presented a paper titled “Investigating the Tone-Segment Asymmetry in Phonological Counting: A Learnability Experiment.” Her collaborators on this research were Jian Cui, Hanna Shine, and Jesse Snedeker.
Frank, Ivy, and Youngah presented a talk titled “Modeling Prosodic Development with Prenatal Audio Attenuation.”
Additionally, Youngah participated in a keynote panel discussion on “Future Directions in Deep Phonology” with other scholars, including Volya Kapatsinski, Joe Pater, Mike Hammond, Jason Shaw, and Huteng Dai.
Overall, AMP 2025 was a rewarding and excellent opportunity for our team to engage in deep intellectual conversations with leading experts in phonology, fostering new ideas and collaborations that will propel our research forward.
We are pleased to announce the publication of a new paper by Frank Lihui Tan and Youngah Do in the journal Linguistics Vanguard. The paper, titled “Attention-LSTM autoencoder simulation for phonotactic learning from raw audio input,” explores a novel approach to phonotactic learning using an attention-based long short-term memory (LSTM) autoencoder trained on raw audio input.
Unlike previous models that rely on abstract phonological representations, this study simulates early phonotactic acquisition stages by processing continuous acoustic signals. The research focuses on an English phonotactic pattern, specifically the distribution of aspirated and unaspirated voiceless stops. The model implicitly acquires phonotactic knowledge through reconstruction tasks, demonstrating its ability to capture essential phonotactic relations via attention mechanisms. The findings suggest that the model initially relies heavily on contextual cues to identify phonotactic patterns but gradually internalizes these constraints, reducing its dependence on specific phonotactic cues over time.
This study provides valuable insights into both computational modeling and infants’ phonotactic acquisition, highlighting the feasibility of early phonotactic learning models based on raw auditory input.
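The attention component can be illustrated in isolation. The sketch below is not the paper's attention-LSTM implementation; it is a generic scaled dot-product attention over a handful of hypothetical encoded acoustic frames, showing the mechanism by which a model can weight contextual cues: a query resembling one context frame concentrates its attention weight on that frame.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query attends over all context positions and returns a
    weighted sum of their values, plus the attention weights."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # query-context similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over context
    return weights @ V, weights

rng = np.random.default_rng(1)
T, d = 6, 8
# Stand-in for encoded acoustic frames: near-orthogonal feature vectors.
frames = np.eye(T, d) + 0.01 * rng.normal(size=(T, d))

# A query matching frame 2 puts most of its attention weight on frame 2.
query = frames[2:3]
out, w = scaled_dot_product_attention(query, frames, frames)
print(np.argmax(w))  # most-attended context position
```

As training progresses in the study, the model's reliance on such contextual weighting diminishes, consistent with the constraint being internalized rather than looked up from context each time.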
Tan, F. & Do, Y. (2025). Attention-LSTM autoencoder simulation for phonotactic learning from raw audio input. Linguistics Vanguard.
Youngah presents at the Media For All 2025 conference at the University of Hong Kong.
On May 30, 2025, Youngah, our Lab Principal Investigator, delivered a compelling keynote at the Media For All 2025 conference at The University of Hong Kong. Titled “Empowering Cultural Preservation and Inclusivity Through Technology: Innovations in Hong Kong Sign Language”, the address showcased our lab’s pioneering efforts to preserve Hong Kong Sign Language (HKSL) and promote inclusivity for the Deaf community.
Preserving HKSL’s Cultural Heritage
Our research focuses on safeguarding the linguistic and cultural richness of HKSL. Through meticulous documentation and archiving of HKSL signs, narratives, and dialogues, we are building a lasting repository to ensure this vital aspect of Hong Kong’s heritage endures. These efforts provide a foundation for cultural preservation, enabling future generations to engage with and learn from the Deaf community’s unique linguistic identity.
Breakthroughs in Sign Language Technology
Central to our work is an innovative HKSL handshape detection model, which leverages advanced machine learning to enhance the accuracy and speed of sign language recognition. This technology marks a significant leap forward in interpreting HKSL, enabling seamless communication. Key applications include:
A comprehensive HKSL curriculum designed for hearing learners, making the language accessible to a broader audience and fostering cross-community understanding.
Practical tools, such as real-time sign language interpretation for paramedic services, which ensures effective communication during emergencies, and accessible art exhibitions, which enrich cultural participation for Deaf individuals.
Building Bridges Between Communities
Our work goes beyond technology—it’s about building unity. By developing tools that facilitate communication, we aim to create a deeper connection between the Deaf and hearing communities. These efforts promote a society that celebrates diversity, embraces cultural heritage, and ensures inclusivity for all.
Youngah’s keynote resonated with attendees, sparking conversations about the role of technology in social good. The Media For All 2025 conference provided an ideal platform to share our vision, and we’re excited to continue this journey toward a more inclusive future.
Looking Ahead
The advancements shared in the keynote are just the beginning. Our team remains dedicated to pushing the boundaries of HKSL research and its applications. We invite collaborators, community partners, and stakeholders to join us in this mission to preserve HKSL and empower the Deaf community.
For more information about our work or to explore potential partnerships, please contact our lab through the Knowledge Exchange Office at The University of Hong Kong. Together, we can create a more inclusive and culturally rich society.
We are pleased to announce the publication of a new paper by Arthur, Thomas (joint first authors), Aaron, and Youngah in the journal Cognitive Linguistics.
The paper, titled “Iconic hand gestures from ideophones exhibit stability and emergent phonological properties: an iterated learning study,” explores the stability and phonological properties of iconic hand gestures associated with ideophones. Ideophones are marked words that depict sensory imagery and are usually considered iconic by native speakers. The study investigates how these gestures are transmitted across generations using a linear iterated learning paradigm.
The findings reveal that despite noise in the visual signal, participants’ hand gestures converged, indicating the emergence of phonological targets. Handshape configurations over time exhibited finger coordination reminiscent of unmarked handshapes observed in phonological inventories of signed languages. Well-replicated gestures were correlated with well-guessed ideophones from a spoken language study, highlighting the complementary nature of the visual and spoken modalities in formulating mental representations.
Thompson, A. L., Van Hoey, T., Chik, A. W. C., & Do, Y. (2025). Iconic hand gestures from ideophones exhibit stability and emergent phonological properties: An iterated learning study. Cognitive Linguistics.
We are pleased to announce the publication of a new paper by Samuel, Xiaoyu, Thomas, Bingzi, and Youngah. The paper, titled “Bilinguals’ Advantages in Executive Function: Learning Phonotactics and Alternation,” has been published in Second Language Research.
This study investigates the relationship between phonotactics and alternation in phonological acquisition and explores whether bilingual speakers have an advantage in learning alternation patterns that are not fully supported by phonotactics. Phonotactics refers to the legal sequences and structures within a language’s phonology, while alternation involves context-sensitive changes in morphemes. The research predicts that bilinguals, due to their enhanced executive function and multitasking abilities, will outperform monolinguals in handling multiple independent phonological pattern learning tasks simultaneously.
The findings reveal that bilingual participants successfully learned alternation patterns regardless of their consistency with stem-internal phonotactic patterns. In contrast, monolinguals only acquired alternation patterns with full phonotactic support. This suggests that bilingualism may confer advantages in managing phonotactics and alternation learning tasks simultaneously.
Sze, S. L., Yu, X., Van Hoey, T., Yu, B., & Do, Y. (2025). Bilinguals’ advantages in executive function: Learning phonotactics and alternation. Second Language Research.
We are delighted to announce that the paper “Learners’ generalization of alternation patterns from ambiguous data,” presented at the Annual Meeting on Phonology 2024 (AMP 2024), has been published in the conference proceedings. This paper is authored by Bingzi (former member of LDL and current PhD candidate at MIT), Ivy, and Youngah.
The published paper investigates how learners generalize phonological alternation patterns when faced with ambiguous data. It explores whether learners prefer simple or complex rules in their generalizations, shedding light on the biases and mechanisms underlying phonological learning.
The findings indicate that learners tend to favor simpler generalizations, contributing to our understanding of phonological acquisition and cognitive processes involved in language learning. This research represents a significant advancement in the study of phonological learning.
Yu, B., Zheng, S., & Do, Y. (2025). Learners’ generalization of alternation patterns from ambiguous data. Proceedings of the Annual Meetings on Phonology, 1(1), Article 1.
We are pleased to announce that Xiaoyu and Youngah’s paper, “Preference for Distinct Variants in Learning Sound Correspondences During Dialect Acquisition,” has been published in the journal Language and Speech.
This research delves into how learners acquire sound correspondences (SCs) in second dialect acquisition. SCs occur when sounds occupy corresponding positions in cognate words of related languages or dialects. While SCs can consist of both similar and distinct variants, the impact of this similarity on learning has been understudied.
In their study, Xiaoyu and Youngah investigated whether the degree of similarity between dialect variants affects SC learning. They employed an artificial language learning experiment where participants learned SCs between Standard Mandarin and “artificial dialects,” using a set of carefully controlled sound contrasts. The degree of similarity between the variants was evaluated using multiple measures, including phonetic and phonological metrics validated by typological evidence.
The findings revealed that while similarity did not impact the learning of simple one-to-one SCs, learners showed a preference for more distinct variants when the SC mapping structure was more complex (i.e., two-to-one or one-to-two mappings). This preference, however, only emerged when the dissimilarity between the variants was sufficiently large to cross a certain threshold.
This study demonstrates that although learners initially display a general lack of sensitivity to similarity differences, a preference for distinct variants emerges when SC mapping structures become more complex and the dissimilarity between variants reaches a critical level. This suggests that when acquiring complex SC patterns, learners seek out more salient cues, leading to an improved ability to differentiate between distinct variants.
Yu, X., & Do, Y. (2025). Preference for distinct variants in learning sound correspondences during dialect acquisition. Language and Speech.