Xiaoyu was invited to present a talk at the Department of Chinese Language and Literature, Peking University (PKU). In his presentation, he shared behavioral and neurophysiological evidence on how speakers learn and process sound correspondences during dialect contact, concluding with a discussion of the mental lexicon of bidialectals.
PKU’s Chinese linguists have a long-standing tradition of studying the sound correspondences among Chinese varieties and between Chinese languages and neighboring languages. Their work has made significant contributions to fields such as historical-comparative linguistics, language variation and change, and dialectology. Xiaoyu’s talk approached sound correspondences from alternative perspectives, sharing new findings derived from artificial language learning and neurolinguistic methods.
We are pleased to share a new publication from Ivy, Frank and Youngah in the Journal of Experimental Psychology: Learning, Memory, and Cognition. The paper, titled “Modeling the impact of prenatal audio attenuation on speech sound learning,” examines how human infants appear to have substantial knowledge of the sound structure of their native language at birth, despite the fact that the uterine environment strongly limits auditory input to low-frequency sounds.
The study explores whether this prenatal low-frequency exposure may actually support later speech sound learning rather than hinder it. To address this question, the authors trained neural network models in two stages designed to simulate prenatal and postnatal learning. During the prenatal stage, models were exposed to speech that was either naturally low-pass filtered, artificially high-pass filtered, or unfiltered. In the postnatal stage, all models were trained on full-frequency speech. Three neural network architectures were examined (a long short-term memory network, a convolutional neural network, and a residual neural network) to test whether the effects generalised across learning systems.
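For readers unfamiliar with the filtering conditions, the prenatal inputs can be pictured with a minimal sketch. This is an illustration only, not the authors' pipeline: the FFT-based filter, the function name `fft_filter`, the 500 Hz cutoff, and the two-tone toy signal are all assumptions chosen for the example, and the paper's actual filter design and cutoff are not given here.

```python
import numpy as np

def fft_filter(signal, sample_rate, cutoff_hz, keep="low"):
    """Brick-wall filter: keep FFT bins at or below the cutoff
    (keep="low") or strictly above it (keep="high")."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    if keep == "low":
        mask = freqs <= cutoff_hz
    elif keep == "high":
        mask = freqs > cutoff_hz
    else:
        raise ValueError("keep must be 'low' or 'high'")
    return np.fft.irfft(spectrum * mask, n=len(signal))

# Toy stand-in for speech: a 200 Hz component plus a 3000 Hz component.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate  # one second of samples
signal = np.sin(2 * np.pi * 200 * t) + np.sin(2 * np.pi * 3000 * t)

CUTOFF_HZ = 500  # hypothetical cutoff, chosen only for this illustration

prenatal_low = fft_filter(signal, sample_rate, CUTOFF_HZ, keep="low")    # low-pass condition
prenatal_high = fft_filter(signal, sample_rate, CUTOFF_HZ, keep="high")  # high-pass condition
prenatal_full = signal                                                   # unfiltered condition
```

In this toy case the low-pass condition retains only the 200 Hz component and the high-pass condition only the 3000 Hz component. A model in any of the three conditions would then move on to the same full-frequency signal in the postnatal stage.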
Across architectures, the results showed that prenatal exposure to low-frequency speech led to faster and more effective phonetic learning once full-frequency input became available. In contrast, exposure to high-frequency–only speech was less beneficial during prenatal learning. These findings suggest that the low-frequency sounds available before birth may provide a useful foundation that helps infants extrapolate to the richer speech input they encounter after birth, offering a computational explanation for early speech sound knowledge.
Zheng, S., Tan, F. & Do, Y. (2026). Modeling the impact of prenatal audio attenuation on speech sound learning. Journal of Experimental Psychology: Learning, Memory, and Cognition. Advance online publication.
We are pleased to announce a new publication by Jian, Hanna, Youngah and Jesse in the Proceedings of the Annual Meetings on Phonology. The paper, titled “Investigating the Tone-Segment Asymmetry in Phonological Counting: A Learnability Experiment,” examines how learners acquire rules that rely on counting either tones or segments, two fundamental components of spoken language.
Tone-segment asymmetry has long attracted attention in phonological theory, with many proposals suggesting that tones and segments behave differently in how they pattern across languages. This study provides the first experimental test of whether these typological differences are connected to how easily such patterns can be learned. Using an artificial-language learning paradigm, the authors compared learners’ ability to acquire a tonal counting rule with their ability to learn a structurally parallel segmental rule.
The results reveal that an unattested segmental counting pattern is significantly more difficult for learners than its tonal equivalent. This asymmetry in learnability suggests that cognitive biases may contribute to the distribution of tone‑ and segment‑based counting patterns observed cross‑linguistically.
Cui, J., Shine, H., Do, Y., & Snedeker, J. (2026). Investigating the tone-segment asymmetry in phonological counting: A learnability experiment. Proceedings of the Annual Meetings on Phonology, 2(1).
We are pleased to announce the publication of a new paper titled “Bottom-up modeling of phoneme learning: Universal sensitivity and language-specific transformation” in the journal Speech Communication. This study was conducted by Frank and Youngah.
The research investigates the emergence and development of universal phonetic sensitivity during early phonological learning using an unsupervised modeling approach. The authors trained autoencoder models on raw acoustic input from English and Mandarin to simulate bottom-up perceptual development, focusing on phoneme contrast learning.
The results demonstrate that phoneme-like categories and feature-aligned representational spaces can emerge from context-free acoustic exposure alone. The study reveals that universal phonetic sensitivity is a transient developmental stage that varies across contrasts and gradually gives way to language-specific perception, mirroring infant perceptual development. Different featural contrasts remain universally discriminable for varying durations over the course of learning. These findings support the view that universal sensitivity is not innately fixed but emerges through learning, and that early phonological development proceeds along a mosaic, feature-dependent trajectory.
Tan, F. & Do, Y. (2025). Bottom-up modeling of phoneme learning: Universal sensitivity and language-specific transformation. Speech Communication.
The LDL team came together for an enjoyable lunch gathering at a hotpot restaurant. This event provided a wonderful opportunity for team members to connect over a shared meal, fostering stronger bonds while sparking discussions on ongoing research.
Our lab was well represented at the Annual Meeting on Phonology (AMP 2025) at UC Berkeley. Ivy, Frank, and Youngah not only enjoyed a fun Waymo experience but also presented the following research:
Youngah, along with scholars from Harvard University, presented a paper titled “Investigating the Tone-Segment Asymmetry in Phonological Counting: A Learnability Experiment.” The co-authors were Jian Cui, Hanna Shine, and Jesse Snedeker.
Frank, Ivy, and Youngah presented a talk titled “Modeling Prosodic Development with Prenatal Audio Attenuation.”
Additionally, Youngah participated in a keynote panel discussion on “Future Directions in Deep Phonology” with other scholars, including Volya Kapatsinski, Joe Pater, Mike Hammond, Jason Shaw, and Huteng Dai.
Overall, AMP 2025 was a rewarding opportunity for our team to engage in in-depth conversations with leading experts in phonology, fostering new ideas and collaborations that will propel our research forward.
We are pleased to announce the publication of a new paper by Frank Lihui Tan and Youngah Do in the journal Linguistics Vanguard. The paper, titled “Attention-LSTM autoencoder simulation for phonotactic learning from raw audio input,” explores a novel approach to phonotactic learning using an attention-based long short-term memory (LSTM) autoencoder trained on raw audio input.
Unlike previous models that rely on abstract phonological representations, this study simulates early phonotactic acquisition stages by processing continuous acoustic signals. The research focuses on an English phonotactic pattern, specifically the distribution of aspirated and unaspirated voiceless stops. The model implicitly acquires phonotactic knowledge through reconstruction tasks, demonstrating its ability to capture essential phonotactic relations via attention mechanisms. The findings suggest that the model initially relies heavily on contextual cues to identify phonotactic patterns but gradually internalizes these constraints, reducing its dependence on specific phonotactic cues over time.
This study provides valuable insights into both computational modeling and infants’ phonotactic acquisition, highlighting the feasibility of early phonotactic learning models based on raw auditory input.
Tan, F. & Do, Y. (2025). Attention-LSTM autoencoder simulation for phonotactic learning from raw audio input. Linguistics Vanguard.
Youngah presents at the Media For All 2025 conference at the University of Hong Kong.
On May 30, 2025, Youngah, our Lab Principal Investigator, delivered a compelling keynote at the Media For All 2025 conference at The University of Hong Kong. Titled “Empowering Cultural Preservation and Inclusivity Through Technology: Innovations in Hong Kong Sign Language”, the address showcased our lab’s pioneering efforts to preserve Hong Kong Sign Language (HKSL) and promote inclusivity for the Deaf community.
Preserving HKSL’s Cultural Heritage
Our research focuses on safeguarding the linguistic and cultural richness of HKSL. Through meticulous documentation and archiving of HKSL signs, narratives, and dialogues, we are building a lasting repository to ensure this vital aspect of Hong Kong’s heritage endures. These efforts provide a foundation for cultural preservation, enabling future generations to engage with and learn from the Deaf community’s unique linguistic identity.
Breakthroughs in Sign Language Technology
Central to our work is an innovative HKSL handshape detection model, which leverages advanced machine learning to enhance the accuracy and speed of sign language recognition. This technology marks a significant leap forward in interpreting HKSL, enabling seamless communication. Key applications include:
A comprehensive HKSL curriculum designed for hearing learners, making the language accessible to a broader audience and fostering cross-community understanding.
Practical tools, such as real-time sign language interpretation for paramedic services, ensuring effective communication during emergencies, and art exhibition accessibility, enriching cultural participation for Deaf individuals.
Building Bridges Between Communities
Our work goes beyond technology—it’s about building unity. By developing tools that facilitate communication, we aim to create a deeper connection between the Deaf and hearing communities. These efforts promote a society that celebrates diversity, embraces cultural heritage, and ensures inclusivity for all.
Youngah’s keynote resonated with attendees, sparking conversations about the role of technology in social good. The Media For All 2025 conference provided an ideal platform to share our vision, and we’re excited to continue this journey toward a more inclusive future.
Looking Ahead
The advancements shared in the keynote are just the beginning. Our team remains dedicated to pushing the boundaries of HKSL research and its applications. We invite collaborators, community partners, and stakeholders to join us in this mission to preserve HKSL and empower the Deaf community.
For more information about our work or to explore potential partnerships, please contact our lab through the Knowledge Exchange Office at The University of Hong Kong. Together, we can create a more inclusive and culturally rich society.
We are pleased to announce the publication of a new paper by Arthur, Thomas (joint first authors), Aaron, and Youngah in the journal Cognitive Linguistics.
The paper, titled “Iconic hand gestures from ideophones exhibit stability and emergent phonological properties: an iterated learning study,” explores the stability and phonological properties of iconic hand gestures associated with ideophones. Ideophones are marked words that depict sensory imagery and are usually considered iconic by native speakers. The study investigates how these gestures are transmitted across generations using a linear iterated learning paradigm.
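As a rough picture of what a linear iterated learning chain looks like, the toy sketch below passes one item down a single line of learners, each reproducing what the previous learner produced with some noise. It illustrates the chain structure only and is not the study's procedure: representing a gesture as a list of numbers, the Gaussian noise, and the names `reproduce` and `run_chain` are all assumptions made for this example.

```python
import random

def reproduce(gesture, noise_sd, rng):
    """One learner's noisy reproduction of the gesture they observed."""
    return [value + rng.gauss(0.0, noise_sd) for value in gesture]

def run_chain(seed_gesture, generations, noise_sd, rng):
    """Linear chain: each generation has one learner, who sees only
    the output of the immediately preceding generation."""
    chain = [list(seed_gesture)]
    for _ in range(generations):
        chain.append(reproduce(chain[-1], noise_sd, rng))
    return chain

# Hypothetical seed gesture encoded as three arbitrary feature values.
rng = random.Random(0)
chain = run_chain([0.2, 0.8, 0.5], generations=5, noise_sd=0.05, rng=rng)

for generation, gesture in enumerate(chain):
    print(generation, [round(value, 3) for value in gesture])
```

Because each learner sees only the previous output, any changes accumulate along the chain, which is what allows stability or drift across generations to be measured.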
The findings reveal that despite noise in the visual signal, participants’ hand gestures converged, indicating the emergence of phonological targets. Handshape configurations over time exhibited finger coordination reminiscent of unmarked handshapes observed in phonological inventories of signed languages. Well-replicated gestures were correlated with well-guessed ideophones from a spoken language study, highlighting the complementary nature of the visual and spoken modalities in formulating mental representations.
Thompson, A. L., Van Hoey, T., Chik, A. W. C., & Do, Y. (2025). Iconic hand gestures from ideophones exhibit stability and emergent phonological properties: An iterated learning study. Cognitive Linguistics.