Opening and Closing Jaw Movements of Young Children who Stutter

Objective: In this longitudinal study, we investigated the hypothesis that kinematic measures of jaw movement produced by children who stutter (CWS) and children who do not stutter (CWNS) would differ between opening and closing speech gestures, across phonetic contexts, and across development. Methods: Mean amplitude, velocity, and duration of jaw opening and closing gestures during repeated productions of bilabial syllables were analyzed longitudinally at 1-year intervals for 13 CWS and 7 CWNS. The utterances ranged across four phonetic contexts: single-syllable, two-syllable, three-syllable, and six-syllable. For jaw movement transduction, a strain gauge was attached to a football helmet in a novel design to minimize head movement. All kinematic measures were made from jaw movement tracings in WinDaq (Dataq Instruments, Inc.) software, based on a standard millimeter-to-voltage conversion. Results: The main finding of the study was that opening gestures were produced by both CWS and CWNS with greater amplitude and duration compared to closing gestures. However, the kinematics associated with opening and closing jaw movements did not differ between CWS and CWNS, suggesting that the intrinsic articulatory dynamics of the two groups were similar. In addition, the adaptation of opening and closing jaw movement kinematics across the four phonetic contexts did not differ between the groups for either movement amplitude or velocity. However, CWS produced the closing gesture with significantly greater duration in the single-syllable condition compared to multi-syllable conditions, relative to CWNS. Finally, CWS and CWNS exhibited different longitudinal patterns for jaw amplitude and peak velocity. Conclusion: The speech motor systems of CWS and CWNS exhibited broadly similar organization of intrinsic articulatory dynamics, but groups may differ in how underlying dynamics are adapted to changing phonetic contexts across development.
It is possible that the speech fluency of CWS might benefit from greater articulatory constraints, as the main between-group difference was identified when degrees of freedom of movement were greatest. Implications of findings are discussed within the development of a hierarchically organized speech motor system.


Introduction
The speech motor skills of children and adults who stutter have been investigated in search of potential speech motor factors in developmental stuttering [1,2]. The aim of the current study was to investigate whether the speech motor systems of children who stutter (CWS) and children who do not stutter (CWNS) develop differently in terms of their intrinsic dynamics across phonetic contexts. Intrinsic dynamics are a preferred state of system organization [3]. Kinematic measures of amplitude, peak velocity, and duration across jaw opening and closing gestures were compared across four levels of phonetic length and three time points. It was hypothesized that CWS would have difficulty adapting to changing phonetic contexts, and thus reveal less flexible intrinsic dynamics compared to CWNS.
Opening and closing gestures: The syllable is considered a fundamental unit of speech production [4,5] that is often motorically realized as an opening gesture and a closing gesture. A gesture is defined as a linguistically significant movement within the vocal tract [6]. Opening movements are typically associated with vowel production, whereas closing movements are associated with consonant production [7]. Kinematic measures associated with gestures (i.e., peak velocity, duration, amplitude) have been used to evaluate how gestures are organized as functional units within a syllable [8]. In normal speakers, the time to peak velocity of closing movements across articulators (e.g., lower lip, upper lip, and jaw) is highly correlated, providing evidence of coordinative relations among articulators [9]. When both opening and closing movements are examined, peak velocity is highly correlated for closing movements, but this consistency is reduced for opening movements, suggesting distinct functional organizations for opening and closing gestures [7,8]. Opening and closing movements also differ across speaking rates, which may be interpreted as different functional organization across changing phonetic goals [10]. Such evidence points to a hierarchical organization of the speech motor system [11,12] such that the intrinsic dynamics of the speech motor system flexibly adapt to higher-level phonetic goals (e.g., syllable stress).

Journal of Speech Pathology & Therapy
Speech motor control and stuttering: A potential speech motor limitation in stuttering has typically been investigated by comparing adults who stutter (AWS) and adults who do not stutter (AWNS) on motor tasks [1]. A small number of studies of intergestural coordination have reported that AWS exhibit longer durations for movement and greater amplitude of closing gestures, but not opening gestures [13]. Similarly, differences have been reported between AWS and AWNS on jaw closing but not opening gestures [14]. Finally, it has been reported that AWS produced opening and closing movements with consistently reduced displacement and lower velocity than AWNS, although such differences did not reach statistical significance [15]. Taken together, these findings suggest that intrinsic dynamics of speech production may be atypical for AWS, particularly for closing gestures.
Because the onset of developmental stuttering typically occurs during preschool [16], kinematic studies of preschool-age children are needed to test whether a potential motor limitation is present close to stuttering onset. A recent study of preschool-age children reported no between-group differences for amplitude, peak velocity, and duration of opening and closing gestures of single-syllable productions, although male CWS exhibited reduced amplitude of displacement and velocity dynamic ranges [17]. To evaluate how the intrinsic dynamics of a speech motor system are flexibly adapted to new phonetic environments, it is necessary to compare kinematic measures of opening and closing gestures across phonetic contexts.
A number of studies have examined stability of repetitions of sequences of movements with the Spatiotemporal Index (STI) [18]. The STI has been used extensively [18,19] to investigate changes in the speech motor system over development as well as across different linguistic contexts. Studies have shown that STI decreases across development from young children to adults [18-20] but increases with utterance length and complexity [21,22] and when speech departs from habitual rate [15,18]. A recent study reported greater STI with increases in sentence length, but this did not differ between CWS and CWNS, although a group difference was reported for simple versus complex utterances [23]. In addition, a recent longitudinal study reported greater STI for CWS at the final visit of the study compared to CWNS, suggesting that phonetic context may be related to stuttering development [2].
The present study: The aim of the current study was to investigate whether a potential motor factor in developmental stuttering resides at the relatively low level of intrinsic dynamics or in adapting intrinsic dynamics to changing phonetic goals. Amplitude, duration, and velocity of jaw movements were compared across 1) opening and closing gestures to assess intrinsic dynamics, 2) phonetic length to assess adaptation, and 3) three longitudinal time points to assess development. It was hypothesized that if CWS have limitations in adapting intrinsic dynamics to changing phonetic contexts, then the adaptation of intrinsic dynamics across levels of phonetic length will differ between CWS and CWNS.

Method

Participants
Participants included 13 CWS and 7 CWNS at their first visit (Table 1). Participants were classified as "stuttering" if they produced at least three stuttering-like disfluencies (i.e., part-word repetitions, single-syllable whole-word repetitions, and disrhythmic prolongations) per 100 words, following classification criteria by Ambrose et al. [2]. All participants were part of a larger study directed by the University of Illinois (Ehud Yairi, PI). Because only 2 of 13 stuttering participants were confirmed to have persisted by the end of the study, comparison of persistent versus recovered children was not possible. Thus, only data from CWS participants while they were stuttering were included in this study.

Procedure
Jaw movements were transduced with a strain gauge system [24]. To avoid artifact due to neck flexion, a custom-made helmet was fashioned from a youth-sized football helmet (Figure 1). Multiple holes were drilled in the helmet to reduce weight, and a strain gauge mounting bracket was attached. When the helmet was placed on the child's head, foam wedges were inserted as needed to ensure that the child's head could not move inside the helmet. The strain gauge was attached to the inferior surface of the mandible at midline with plastic tubing and double-sided tape. With this helmet-strain gauge combination, artifacts due to head movements were minimized.

Table 1: Speech and language measures for children who stutter (CWS) and children who do not stutter (CWNS) at time 1. Weighted SLD is a measure of stuttering severity, including frequency and iterations per unit; language score is the Spoken score of the Test of Early Language Development, 3rd Edition (TELD-3 Spoken); phonology score is from the Hodson Assessment of Phonological Processes, 3rd Edition (HAPP-3); standard deviations in parentheses.

Strain gauge calibration
A calibration rod with two fixed pegs was used to determine the millimeter-to-voltage conversion. The calibration rod was temporarily affixed to the strain gauge apparatus on the helmet so that the voltage meter could be zeroed. Next, the strain gauge was moved a fixed distance on the calibration rod to achieve the target millimeter-to-voltage conversion, as displayed in the WinDaq acquisition software (Dataq Instruments, Inc.). Jaw movements and the acoustic signal were acquired in two channels at 5000 samples per second.
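The conversion described above can be illustrated with a minimal sketch, assuming the strain gauge response is linear over the calibrated range. The function names, the 10 mm peg spacing, and the voltage readings are hypothetical values chosen for illustration and are not taken from the study.

```python
def calibration_gain(peg_distance_mm, volts_at_first_peg, volts_at_second_peg):
    """Return the gain in mm per volt from a two-point calibration.

    Assumes a linear transducer response between the two pegs.
    """
    delta_v = volts_at_second_peg - volts_at_first_peg
    if delta_v == 0:
        raise ValueError("No voltage change between pegs; cannot calibrate.")
    return peg_distance_mm / delta_v


def volts_to_mm(samples_v, gain_mm_per_v, zero_v=0.0):
    """Convert voltage samples to displacement in mm relative to the zero point."""
    return [(v - zero_v) * gain_mm_per_v for v in samples_v]


# Hypothetical calibration: pegs 10 mm apart, meter zeroed at the first peg.
gain = calibration_gain(10.0, 0.0, 2.0)
jaw_mm = volts_to_mm([0.0, 0.5, 1.0], gain)
```

Zeroing the meter at the first peg, as in the procedure above, corresponds to `zero_v=0.0` in this sketch.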
The acoustic signal was obtained by placing a microphone near the child. The utterances produced by participants were /pap/, /bab/, and /mam/ across four conditions of phonetic length. The full set of observations is displayed in Table 2. Children were instructed to produce each utterance as prompted by the examiner at one-second intervals. Fifteen repetitions were attempted for each utterance, for a maximum of 180 per child.

Measurement and data reduction
Jaw displacement waveforms were smoothed at 30 Hz and associated velocity waveforms were derived for measures of peak amplitude (mm), duration (ms), and peak velocity (mm/s) for opening and closing jaw gestures (Figure 2). Only perceptually fluent tokens were included in the analysis. Before any measures were taken, a coder listened to each token to ensure that it did not contain any errors or disfluencies. The accompanying acoustic signal was used to confirm that each token was consistent with the desired target. The onset and offset of each gesture were defined as the points at which velocity reached 10% of the peak velocity associated with that gesture [8], and the point where velocity was zero was defined as the boundary between the two gestures (Figure 3). Means and standard deviations for each variable were calculated for statistical analysis.
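The segmentation rule above can be sketched as follows. This is an illustrative reconstruction, not the measurement procedure used in WinDaq: it assumes an already smoothed displacement trace containing one opening gesture followed by one closing gesture, with opening recorded as increasing displacement, and it uses a simple finite-difference velocity. All function names and the synthetic raised-cosine trace are hypothetical.

```python
import math


def finite_difference_velocity(x, fs):
    """Velocity (units/s) by central differences, one-sided at the ends."""
    v = [0.0] * len(x)
    for i in range(1, len(x) - 1):
        v[i] = (x[i + 1] - x[i - 1]) * fs / 2.0
    v[0] = (x[1] - x[0]) * fs
    v[-1] = (x[-1] - x[-2]) * fs
    return v


def segment_gestures(x, fs, threshold=0.10):
    """Measure one opening gesture followed by one closing gesture."""
    v = finite_difference_velocity(x, fs)
    i_open = max(range(len(v)), key=v.__getitem__)           # opening velocity peak
    i_close = min(range(i_open, len(v)), key=v.__getitem__)  # closing velocity peak
    if v[i_open] <= 0 or v[i_close] >= 0:
        raise ValueError("Expected an opening followed by a closing movement.")
    # Boundary between gestures: first sample after the opening peak
    # at which velocity is no longer positive (the velocity zero crossing).
    boundary = next(i for i in range(i_open, i_close + 1) if v[i] <= 0.0)
    # Onset: walk back from the opening peak while velocity stays >= 10% of peak.
    onset = i_open
    while onset > 0 and v[onset - 1] >= threshold * v[i_open]:
        onset -= 1
    # Offset: walk forward from the closing peak while speed stays >= 10% of peak.
    offset = i_close
    while offset < len(v) - 1 and -v[offset + 1] >= threshold * -v[i_close]:
        offset += 1

    def duration_ms(start, end):
        return (end - start) * 1000.0 / fs

    return {
        "opening": {"amplitude_mm": x[boundary] - x[onset],
                    "duration_ms": duration_ms(onset, boundary),
                    "peak_velocity_mm_s": v[i_open]},
        "closing": {"amplitude_mm": x[boundary] - x[offset],
                    "duration_ms": duration_ms(boundary, offset),
                    "peak_velocity_mm_s": -v[i_close]},
    }


# Synthetic token: a 10 mm raised-cosine open-close cycle lasting 0.4 s at 5000 Hz.
FS = 5000
N = 2000
trace_mm = [5.0 * (1.0 - math.cos(2.0 * math.pi * i / N)) for i in range(N + 1)]
measures = segment_gestures(trace_mm, FS)
```

Because the 10% criterion trims the slow start and end of each movement, the measured amplitudes in this sketch fall slightly short of the full 10 mm excursion of the synthetic trace.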

Reliability
Inter-rater reliability was assessed for gesture duration on approximately 20% of the data. Duration measures were chosen because they depend on accurate identification of the velocity peak, as gesture onset and offset were defined relative to 10% of peak velocity. The Pearson correlation between raters was r = 0.956.
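For reference, the reliability statistic can be computed as in the following sketch. The duration values are hypothetical and serve only to show the calculation; they are not data from the study.

```python
import math


def pearson_r(a, b):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("Need two sequences of equal length (at least 2 values).")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("Correlation is undefined for a constant sequence.")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical gesture durations (ms) measured by two raters on the same tokens.
rater_1 = [182.0, 175.5, 201.0, 168.0, 190.5]
rater_2 = [180.0, 177.0, 203.5, 165.0, 192.0]
reliability = pearson_r(rater_1, rater_2)
```

Note that Pearson's r indexes the consistency of the two raters' measurements rather than their absolute agreement, so a constant offset between raters would not lower it.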

Statistical analysis
The lmer function from the packages lme4 [25] and languageR [26] of the R software environment [27] was used for comparisons 1) between opening and closing gestures; 2) across four levels of phonetic length; 3) across three longitudinal time points; and 4) between CWS and CWNS. Mixed effects models were able to account for unique, random contributions of each participant, which was necessary given their unequal contributions to the dataset.
Three mixed effects models were used, corresponding to each of three dependent measures: mean amplitude, mean duration, and mean peak velocity. In addition to including participant as a random factor in each model, fixed factors included phonetic context, longitudinal time, opening or closing gesture, and stuttering versus non-stuttering group. In addition, other factors/variables of interest, including age and gender and the interactions group x gesture, group x phonetic context, and group x time, were considered in each model. If any resulted in a significantly better model fit, it remained in the model. Finally, degrees of freedom were calculated based on the Satterthwaite approximation using the statistical package lmerTest [28]. Thus, up to 24 observations per participant (4 phonetic contexts x 3 time points x 2 gestures) were collected across 20 participants, resulting in a relatively large amount of data given the total number of participants in each group.
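The size of the design can be made explicit with a short enumeration. This sketch only counts the cells of the context x time x gesture design and the resulting upper bound on rows per dependent measure; the variable names are illustrative, and the actual number of observations was lower wherever a participant did not contribute data to a cell.

```python
from itertools import product

# Factor levels as described in the text.
PHONETIC_CONTEXTS = ("single-syllable", "two-syllable", "three-syllable", "six-syllable")
TIME_POINTS = (1, 2, 3)
GESTURES = ("opening", "closing")
N_PARTICIPANTS = 20  # 13 CWS + 7 CWNS at the first visit

# Every combination of context, time point, and gesture for one participant.
design_cells = list(product(PHONETIC_CONTEXTS, TIME_POINTS, GESTURES))
cells_per_participant = len(design_cells)

# Upper bound on rows available to each mixed effects model.
max_rows_per_measure = cells_per_participant * N_PARTICIPANTS
```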

Results
Results are organized according to each of the kinematic measures. Measures are examined between groups and opening versus closing gesture, across conditions of phonetic length, and across developmental time.

Group observations
The total number of observations acquired across participants is presented in Table 3. Note that for each participant, observations were made for 2 gestures (opening and closing), across 4 levels of phonetic length, and across 3 time points. The kinematic measures made for each observation correspond to the waveform presented in Figure 3.

Gesture amplitude
First, no main effect was found for group, with CWS exhibiting similar amplitude compared to CWNS, but a main effect was found for gesture, with the opening gesture having overall greater amplitude compared to the closing gesture, t(310)=5.772, p<0.001. There was no interaction between group and gesture, suggesting that groups' intrinsic dynamics were similar across opening versus closing gestures. Second, there was a main effect for phonetic context, with conditions 2, 3, and 4 having significantly lower amplitude compared to condition 1 (p<0.001 for all comparisons). There was no interaction between group and phonetic context, suggesting that both groups similarly adapted intrinsic dynamics to changing phonetic contexts. Finally, there was a main effect for developmental time, with time 3 being associated with significantly lower amplitude compared to time 1, t(316)=5.733, p<0.001, and an interaction between group and time, with CWS exhibiting greater amplitude at time 3 compared to time 1, relative to CWNS, t(324)=4.418, p<0.001. Thus, CWNS and CWS may exhibit different developmental trajectories (Figure 3).

Gesture duration
A main effect was found for group, with CWS exhibiting greater kinematic duration compared to CWNS, t(56)=3.050, p=0.003. In addition, the opening gesture was significantly longer than the closing gesture, t(312)=5.639, p<0.001. There was no interaction between group and gesture, again suggesting that groups' intrinsic dynamics were similar across opening versus closing gestures. Second, a main effect was found for phonetic context, with conditions 2, 3, and 4 being associated with significantly shorter duration compared to condition 1 (p<0.001 for all comparisons). A related interaction was present between group and phonetic context, with CWS exhibiting significantly greater duration on condition 1 compared to conditions 2, 3, and 4, relative to CWNS (p<0.05 for all comparisons). As can be seen in Figure 4, in comparison to Figure 5, gesture duration was greater at condition 1 compared to conditions 2-4 for CWS, relative to CWNS, but only for the closing gesture (p<0.05 for all comparisons). Finally, no effect was found for developmental time.

Gesture velocity
First, there was no main effect for peak velocity for CWS compared to CWNS, nor was there a main effect for opening versus closing gesture. There was also no interaction between group and gesture, suggesting intrinsic dynamics were similar between groups. Second, there was a main effect for phonetic context, with condition 3, t(313)=2.813, p=0.005, and condition 4, t(313)=4.512, p<0.001, being associated with significantly lower peak velocity compared to condition 1. However, no interaction between group and phonetic length was present, suggesting that groups adapted similarly to changing phonetic contexts. Finally, a significant effect was found for developmental time, with mean peak velocity being significantly lower at time 3 compared to time 1, t(316)=4.107, p<0.001. This effect is qualified by the interaction between group and developmental time, t(325)=4.425, p<0.001, with CWS having significantly greater peak velocity at time 3 compared to time 1, relative to CWNS (Figure 6). This difference in peak velocity for CWS at time 3 is consistent with greater jaw amplitude at time 3, presumably to move the jaw a greater distance in a similar amount of time.

Discussion
In this study, we tested the hypothesis that CWS would differ from CWNS in terms of adapting intrinsic articulatory dynamics to changing phonetic contexts. The speech kinematic patterns for opening and closing gestures were compared across levels of phonetic length and three longitudinal time points. First, the groups exhibited similar kinematic patterns across opening and closing gestures, as no interaction of group by gesture was identified, providing evidence for broadly similar intrinsic dynamics. In addition, the groups adapted intrinsic dynamics similarly to changing phonetic contexts. Yet, CWS exhibited greater duration on the closing gesture compared to CWNS in the single-syllable condition, and the two groups appeared to show divergence in kinematics at time 3.

Opening versus closing movements
Our main finding is that children, regardless of fluency status, produced opening gestures that had greater amplitude and duration than closing gestures, consistent with previous research in adults [7,8,13]. This seems to suggest that intrinsic articulatory dynamics can differ between the two gestures, but that children learn this kinematic pattern early in life. Max et al. [13] suggested that opening and closing gestures are associated with a fundamentally different organization, yet the articulatory adjustments for closing movements remain tightly coupled to opening movements [7], thereby increasing phonetic constraints for the closing gesture. In contrast, articulatory demands appear greater for the opening gesture, as opening gestures entail greater degrees of freedom of articulator movement [29]. However, groups in the present study did not differ in terms of kinematic patterns across opening and closing gestures. These findings can be taken as evidence that the speech motor systems of CWS and CWNS remain similar in terms of intrinsic dynamics that allow basic speech motor patterns to be produced. Marked deviations in the production of these gestures would instead be expected in motor speech disorders, in contrast to stuttering.

Phonetic context
Embedding a single syllable in multiple phonetic contexts resulted in pronounced differences in gestural control between the production of a single syllable and multi-syllabic utterances. This suggests that the single-syllable condition might afford a different intergestural organization, owing to the greater phonetic constraints of the multi-syllabic conditions. In these multi-syllabic utterances, the gestures that appeared immediately before or after the target syllable might have reduced degrees of freedom of movement. For example, conditions 3 and 4 included the syllable [bai] prior to the target syllable, and conditions 2, 3, and 4 included a schwa syllable immediately after, which likely limited the degrees of freedom of movement available to the target syllable.
In spite of the generally similar movement dynamics, CWS differed prominently from CWNS in terms of movement duration in the single-syllable condition. This finding is consistent with research in adults showing that articulator movements of AWS are longer in duration than those of typical speakers [13]. However, this finding of greater duration for CWS was specific to the closing gesture of the single-syllable production. Assuming that degrees of freedom of movement are greater for the single-syllable condition, this is the only condition in which the target syllable is at the initial position of the utterance, perhaps suggesting speech motor organization is unique at this position compared to other positions. It is widely known that the initial position of an utterance is the point at which stuttering is most likely to occur in both adults and children [29,30]. Accordingly, CWS might have more difficulty when degrees of freedom are relatively high, such as at the point of speech initiation, implying that CWS benefit from greater articulatory constraints.
Consistent with this interpretation, the initial position of the utterance is also the location where planning demands are highest, and where higher-level linguistic planning gives way to articulatory control. For example, initiating an utterance takes into account higher-level, discourse-oriented factors such as when to initiate an utterance or, depending on social feedback, the coordination and timing of turn-taking with another speaker [31,32]. Utterance initiation might also involve distinct neural systems [33], as higher-level discourse-oriented factors cascade into lower-level articulatory dynamics [34]. Perhaps a common point of difficulty in stuttering is this transition from higher-level planning to lower-level sensorimotor control at the initiation of an utterance, consistent with a finding by Max et al. [12], who reported that articulatory movements of AWS are more similar to those of AWNS toward the end of utterances.

Development
The current study addressed whether adapting intrinsic dynamics to changing phonetic contexts would differ over development. CWS exhibited greater amplitude at time 3 compared to time 1, in contrast to the reduction exhibited by CWNS. This finding was also associated with greater peak velocity at time 3 for CWS compared to CWNS. It is worth noting that participants in this study were preschool-age, and similar to nonlinear change observed in other domains, a speech motor system could undergo reorganizations in the preschool years, as more efficient solutions are reached to accommodate experience. Intrinsic dynamics of opening and closing gestures could thus be established at a relatively early age in development. New contexts could be simple increases in utterance length, as in the current study, or involve the expression of novel ideas. Each level within a hierarchically organized system is subject to its own constraints, and behavior within a hierarchically organized system emerges according to the constraints any given level imposes on the level below [3]. Therefore, longitudinal differences in amplitude and peak velocity could indicate emergence of different speech motor solutions across different levels within a hierarchically organized speech motor system.

Intrinsic dynamics and spatiotemporal stability
An important question that emerges from this study is whether assessment of opening and closing gestures across changing phonetic contexts remains appropriate when compared with spatiotemporal stability of repeated movement trajectories. It is worth considering that two differing populations could exhibit similar STI values, but perform a task very differently. For example, greater amplitude for CWS compared to CWNS might not be apparent in stability measures. However, for a group showing differences in multiple parameters (duration, velocity and amplitude), the combined effect suggests variation in motor solutions at the utterance level. Thus, direct kinematic measures as per the current study reveal important information about how the speech motor system reaches gestural goals, while the STI reveals important information about how gestural goals are planned at the utterance level. Each approach to assessing the speech motor systems of young children could complement the other.

Limitations
A limitation of the present investigation is that data were acquired from only one articulator. It would have been ideal to also record, for example, the upper and lower lips, but using a single strain gauge was a considerable accomplishment considering the age of the youngest children in this study. Another limitation is the small sample size. Although it would have been ideal to acquire data from more participants, a number of participants dropped out over the course of the study. A final limitation is that insufficient kinematic data were acquired to compare recovered versus persistent groups of CWS. However, if intrinsic articulatory dynamics for opening and closing gestures are established early in development, differences from stuttering to recovery would not be expected. Further research should explore this possibility.

Conclusion
Overall, our results indicate that CWS and CWNS likely do not differ generally in terms of intrinsic dynamics, and may even adapt to changing phonetic contexts in a similar manner. Yet, CWS diverged from CWNS by producing the closing gesture in single-syllable contexts with greater duration. It is unclear if this points to a different developmental trajectory, but given the age of the children, this different developmental pattern did not arise from overt therapy or from long-term stuttering. It is possible that speech production in CWS may benefit more from phonetic contexts in which degrees of freedom are reduced, as relatively greater degrees of freedom at utterance onset may contribute to the higher likelihood of disfluency at utterance-initial position [29,30]. Findings from this study call for more investigations into the multiple levels of speech motor control, including the intrinsic dynamics and the varying linguistic and communicative contexts to which they must be adapted.