Behavioral inhibition and childhood stuttering.

Purpose— The purpose of this study was to assess the relation of behavioral inhibition to stuttering and speech/language output in preschool-age children who do (CWS) and do not stutter (CWNS). Method— Participants were preschool-age (ages 36 to 68 months), including 26 CWS (22 males) and 28 CWNS (13 males). Participants’ behavioral inhibition (BI) was assessed by measuring the latency to their sixth spontaneous comment during conversation with an unfamiliar experimenter, using methodology developed by Kagan, Reznick, and Gibbons (1989). In addition to these measures of BI, each participant’s stuttered and non-stuttered disfluencies and mean length of utterance (in morphemes) were assessed. Results— Among the more salient findings, it was found that (1) there was no significant difference in BI between preschool-age CWS and CWNS as a group, (2) when extremely high versus low inhibited children were selected, there were more CWS with higher BI and fewer CWS with lower BI when compared to their CWNS peers, and (3) more behaviorally inhibited CWS, when compared to less behaviorally inhibited CWS, exhibited more stuttering. Conclusions— Findings are taken to suggest that one aspect of temperament (i.e., behavioral inhibition) is exhibited by some preschool-age CWS and that these children stutter more than CWS with lower behavioral inhibition. The present results seem to support continued study of the association between young children’s temperamental characteristics and stuttering, the diagnostic entity (i.e., CWS versus CWNS), as well as stuttering, the behavior (e.g., frequency of stuttered disfluencies).


Temperament and developmental stuttering
Recent reviews suggest that increased attention is being paid to the possible association between temperament and childhood stuttering (Conture, Kelly, & Walden, in press; Kefalianos, Onslow, Block, Menzies, & Reilly, 2012; Seery, Watkins, Mangelsdorf, & Shigeto, 2007). Temperament, as defined by Rothbart and Bates (1998), can be described as "constitutionally-based individual differences in emotional, motor and attentional reactivity and self-regulation…(that are) relatively stable over time" (p.109). Sanson, Hemphill, and Smart (2004) further suggest that such "…constitutionally-based differences in behavioral style…are visible from the child's earliest years" (p.143). These constitutionally- or biologically-based components include genetic as well as non-genetic elements such as prenatal, environmentally-based variables (e.g., prenatal drug exposure), birth complications, and perinatal influences present in the child's early rearing environment.
Regarding the possible association between temperament and childhood stuttering, in a recent review, Kefalianos et al. (2012) cautiously concluded that there may be some association between temperament and stuttering during the preschool years. The cautious nature of their conclusion resulted from several factors, not the least of which were the relatively small number of published studies and inconsistencies among findings. However, using independent replication of findings as a guideline for "trustworthiness" of findings, Kefalianos et al. noted some consistencies, that is, preschool-age children who stutter (CWS), when compared to preschool-age children who do not stutter (CWNS), appear to exhibit (1) lower adaptability (Anderson, Pellowski, Conture, & Kelly, 2003; Howell et al., 2004; Schwenk, Conture, & Walden, 2007), (2) lower attention span/persistency (Howell et al., 2004; Karrass et al., 2006; Schwenk et al., 2007), (3) more negative quality of mood (Howell et al., 2004; Johnson, Walden, Conture, & Karrass, 2010), and (4) higher activity level (Embrechts, Ebben, Franke, & Van de Poel, 2000; Howell et al., 2004). In addition, several empirical studies, not reported in the above review, have also shown that CWS, when compared to CWNS, are more emotionally reactive to environmental stimuli (Wakaba, 1998) as well as lower in inhibitory control (i.e., the capacity to plan and suppress inappropriate approach responses under instructions or in novel or uncertain situations) and attention shifting (Eggers, De Nil, & Van den Bergh, 2010). In brief, various aspects of temperament appear to be associated with childhood stuttering, including attention, affect/mood, adaptability, reactivity to the environment, and inhibitory control.
Such between-group differences, although clearly warranting further empirical assessment, as suggested by Kefalianos et al. (2012), may reflect real differences between CWS and CWNS based on Eggers, De Nil, and Van den Bergh's (2009) findings. Specifically, Eggers et al. reported a similar, highly congruent three-factor temperament structure for children who stutter, children who do not stutter and children with vocal nodules. Eggers et al. concluded that any possible differences between CWS and CWNS on various indexes of temperament reflect real differences and are not the result of differences in underlying temperamental construct(s).
Based on the findings reviewed above, it has been theorized that the temperamental processes of children who stutter may also contribute to the difficulties these children have establishing normally fluent speech (Conture, Walden, Arnold, Graham, Harfield, & Karrass, 2006; Walden, Frankel, Buhr, Johnson, Conture, & Karrass, 2012). For example, Conture et al.'s (2006) Communication-Emotion (C-E) model suggests that temperamental factors (e.g., emotional reactivity or emotion regulation) may exacerbate the speech disfluencies of children who stutter. According to the C-E model, emotional reactivity may be associated with detecting/reacting to speech errors whereas emotion regulation may be related to changing, correcting and/or coping with covert/overt speech/language errors. These responses, according to the C-E model, are thought to contribute to quantitative (e.g., frequency) and/or qualitative (e.g., types and duration) change in stuttering. More recently, Conture and Walden (2012) proposed a dual diatheses-stressor framework (DD-S), in which diatheses (i.e., vulnerabilities) and stressors relating to emotion and speech-language processes are associated with childhood stuttering (for general review of diathesis-stress models, see Monroe & Simons, 1991). The DD-S model predicts that emotional reactivity, emotional regulation and their joint effects impact the frequency and severity of stuttering in preschool-age children.

Behavioral inhibition and developmental stuttering
To date, most studies of temperamental characteristics associated with childhood stuttering have relied upon caregiver reports (e.g., Anderson et al., 2003; Eggers et al., 2009, 2010; Karrass et al., 2006). Although questionnaires measuring temperamental characteristics have been shown to be valid (Rothbart, 2011; Rothbart & Bates, 1998; Thompson, 1999), it has been suggested that parents are biased informants (Strelau, 1998), and that a parent's report about a child's behaviors may not always yield the truest representation of the child's actual behaviors (Kiel & Buss, 2006). Thus, as Kagan (2007) suggests, in order to best understand the temperament of individuals, one should assess temperament from at least three perspectives: parental report, behavioral observation, and psychophysiology. Given that there is a relative lack of research relating temperament and childhood stuttering using methods other than parental reports, one reasonable next step would be to examine temperamental dimensions of children by means of behavioral observation.
Kagan and his colleagues (Kagan, Reznick, Clarke, Snidman, & Garcia-Coll, 1984; Kagan, Reznick, & Gibbons, 1989; Kagan, Snidman, & Arcus, 1998) reported that one temperamental characteristic, behavioral inhibition (BI), can be measured reliably by means of behavioral observation. Behavioral inhibition is a temperamental characteristic that is expressed as initial avoidance, distress, or subdued emotion when a person is exposed to unfamiliar people, places, and situations. Behavioral inhibition has been considered to be a salient temperamental trait because of its similarity to an animal's immobility when encountering a novel context, a biologically-prepared response thought to involve a state of fear (Blanchard & Blanchard, 1988). Kagan (1989) estimated that 10% to 15% of children are consistently inhibited or uninhibited. To measure BI of 4-year-old children, Kagan et al. (1989) used the child's latency to the 6th spontaneous comment and the total number of spontaneous comments during a speaking task. Their findings indicated that for the entire sample, there was no preservation of inhibited or uninhibited behaviors from 14 or 20 months to 4 years of age. However, for the extreme sample (top and bottom 20 percentiles), there was significant preservation of the two categories of behaviors (i.e., inhibited and uninhibited behaviors) from the second to the fourth year of life. Accordingly, they speculated that the construct of being inhibited and uninhibited to the unfamiliar is a qualitative category rather than a continuous dimension.
Behavioral inhibition has also been found to be associated with children's reactivity (Calkins, Fox, & Marshall, 1996; Kagan et al., 1998), which has been studied in association with childhood stuttering (e.g., Arnold, Conture, Key, & Walden, 2011; Johnson et al., 2010; Walden et al., 2012). For example, using behavioral observations, Kagan and colleagues (1998) reported that 4-year-old children, who had been classified as low reactive at 4 months, talked more often than 4-year-old children classified as high reactive infants. Based on these findings, Kagan et al. argued that there is a predictive relation between reactivity and behavioral inhibition. According to Kagan et al., the basic assumption underlying their argument is that behavioral inhibition is a "derivative" of reactivity because both reactive infants and inhibited children are thought to possess a low threshold of activation in the central nucleus of the amygdala and its projections to the hypothalamus, sympathetic chain, and cardiovascular system. Although reactivity can be observed/measured in infants as young as 4 months of age, young infants do not display marked differences in shyness toward strangers or timidity to novel events until they turn 9-12 months (Kagan & Snidman, 1991; Kagan et al., 1998). Thus, Kagan and his colleagues' speculation seems reasonable that a child's behavioral inhibition can be predicted by the child's reactivity as an infant rather than the other way around.
It should also be noted that most studies reviewed above regarding the association between temperament and childhood stuttering have focused on between-group (i.e., CWS vs. CWNS) differences in temperamental characteristics. Given that the C-E model suggests that temperamental factors may exacerbate CWS' speech disfluencies, a better understanding of the role of the temperamental characteristics in the development of stuttering might result from assessing how BI impacts the quantity of stuttered disfluencies within the group of CWS as well as CWNS. In the same vein, it would be interesting to investigate how BI impacts non-stuttered disfluencies (e.g., revisions) within CWS as well as CWNS, thus addressing the suggestion that there may be similar underlying mechanisms for stuttered and non-stuttered disfluencies (Johnson, 1959;Postma & Kolk, 1993;Williams, Silverman, & Kools, 1968).

Behavioral inhibition and mean length of utterance
Besides stuttering, BI has been speculated to impact several other aspects of children's speech-language production. For example, Paul and Kellogg (1997) reported that clinician ratings of "Approach/Withdrawal" correlated positively with mean length of utterance (MLU) for first grade children with a history of slow expressive language development. This finding suggests that children who were more "outgoing" produce longer and more complex language during their conversational speech. To date, however, there have been few empirical studies regarding the relation between temperamental characteristics and MLU in preschool-age CWS and CWNS. Given that stuttered disfluencies as well as non-stuttered disfluencies are most apt to occur on long and complex utterances (e.g., for further review of this issue, see Zackheim & Conture, 2003), the influence of BI on MLU appears to warrant assessment due to the possibility that MLU may serve as a mediator or suppressor of the association between BI and instances of stuttering. Depending on findings, MLU may or may not need to be analytically controlled as a covariate when assessing the relation between BI and stuttering frequency.

Research hypotheses
Therefore, the present study was designed to address the issue of whether behavioral inhibition is associated with childhood stuttering and speech-language output of children who do and do not stutter. This investigation addressed three specific issues. First, the study addressed whether there is a difference in BI between CWS and CWNS. It was hypothesized that preschool-age CWS, when compared to preschool-age CWNS, would exhibit more behavioral inhibition as indexed by longer latency to the 6th spontaneous comment. Second, the study investigated whether more behaviorally inhibited CWS and CWNS differ in quantity of instances of stuttered and non-stuttered disfluencies when compared to less behaviorally inhibited CWS and CWNS, respectively. It was hypothesized that more behaviorally inhibited CWS would exhibit more stuttered and non-stuttered disfluencies than less behaviorally inhibited CWS during conversation. Similarly, it was hypothesized that more behaviorally inhibited CWNS, when compared to less behaviorally inhibited CWNS, would exhibit more stuttered and non-stuttered disfluencies during conversation. Finally, the study examined the relation between BI and MLU for both CWS and CWNS. It was hypothesized that more behaviorally inhibited CWS, when compared to less behaviorally inhibited CWS, would exhibit shorter MLU. Similarly, it was hypothesized that more behaviorally inhibited CWNS, when compared to less behaviorally inhibited CWNS, would exhibit shorter MLU. Overall, it was thought that findings from this empirical study would increase our understanding of possible group differences in BI between preschool-age CWS and CWNS as well as whether preschool-age children's behavioral inhibition impacts their stuttered and non-stuttered disfluencies as well as speech-language output.
All participants were paid volunteers, and were referred to the Vanderbilt Bill Wilkerson Center for participation by their caregiver. Caregivers were informed of the study via (a) a free, widely read parent-oriented magazine, (b) local health care provider, or (c) self/professional referral to the Vanderbilt Bill Wilkerson Hearing and Speech Center.
No child had received formal treatment for stuttering or other communication disorders prior to participation in the present study. However, it was not possible to reliably determine whether any of the parents of CWS had received, prior to the time of testing, informal advice, counseling or guidance regarding strategies to increase their child's fluency, either directly (i.e., face-to-face) or indirectly (i.e., online); hence, this is not reported in this study. Also, participants had no known or reported hearing, neurological, developmental, academic, intellectual, or emotional problems. This study's protocol was approved by the Institutional Review Board at Vanderbilt University, Nashville, Tennessee. For each participant, parents signed informed consent, and their children assented.

Excluded participants-From an initial group of 36 CWS and 42 CWNS, 10 CWS and 14 CWNS were excluded. Two participants (1 CWS, 1 CWNS) were excluded because they did not meet inclusion criteria for speech/language ability, 2 participants (1 CWS, 1 CWNS) were excluded because they did not produce the minimum number of spontaneous comments (i.e., 6) to be included in the analysis, 9 participants (5 CWS, 4 CWNS) were excluded because their recorded utterances were not sufficiently intelligible for analysis, 4 participants (1 CWS, 3 CWNS) were excluded because the condition of conversing with a single unfamiliar examiner was not met (e.g., the mother, who was in the room, also engaged in the conversation with the child), and 7 participants (2 CWS, 5 CWNS) were excluded due to missing speech/language data or technical problems (e.g., late-started recording). This resulted in 26 CWS and 28 CWNS who served as participants for the final data analyses.

Children who stutter (CWS)-A child was considered a CWS if he or she (a) exhibited three or more stuttered disfluencies (i.e., sound/syllable repetitions, monosyllabic whole word repetitions, and sound prolongations) per 100 words of conversational speech based on a 300-word sample (Bloodstein, 1995; Curlee, 2007) and (b) received a total overall score of 11 or above (i.e., a severity equivalent of at least "mild") on the Stuttering Severity Instrument-3 (SSI-3, Riley, 1994).

2.2.2. Children who do not stutter (CWNS)-A child was considered a CWNS if he or she (a) exhibited two or fewer stuttered disfluencies per 100 words of conversational speech based on a 300-word sample and (b) received an overall score of 8 or less (a severity equivalent of less than "mild") on the SSI-3.
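The two sets of talker-group criteria above can be written as a simple decision rule. The following is a minimal sketch rather than the authors' procedure: the function name `classify_talker_group` is hypothetical, and returning `None` for a child who meets neither set of criteria is an assumption, since the text does not say how such children were handled.

```python
from typing import Optional


def classify_talker_group(stuttered_per_100_words: float,
                          ssi3_total: int) -> Optional[str]:
    """Apply the talker-group criteria described in the text.

    stuttered_per_100_words: stuttered disfluencies per 100 words,
        based on a 300-word conversational sample.
    ssi3_total: total overall SSI-3 score.
    Returns "CWS", "CWNS", or None when neither set of criteria is met
    (the None case is an assumption made for this sketch).
    """
    # CWS: three or more stuttered disfluencies per 100 words AND SSI-3 >= 11
    if stuttered_per_100_words >= 3 and ssi3_total >= 11:
        return "CWS"
    # CWNS: two or fewer stuttered disfluencies per 100 words AND SSI-3 <= 8
    if stuttered_per_100_words <= 2 and ssi3_total <= 8:
        return "CWNS"
    return None
```

Both conditions must hold for a child to be assigned to a group, so a child with few stuttered disfluencies but an SSI-3 score of 9 or 10 falls outside both groups in this sketch.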
Furthermore, each participant passed a bilateral pure tone hearing screening. These tests were administered to each child during the visit to the Vanderbilt Bill Wilkerson Center.

Race
The participant's race was obtained via parental interview. The CWS group included 1 biracial, 8 African American and 17 Caucasian participants; the CWNS group included 3 African American and 25 Caucasian participants.

Socioeconomic status (SES)
Each participant's SES was determined through application of the Four-Factor Index of Social Status (Hollingshead, 1975), on the basis of maternal and paternal occupation and educational levels. Scores ranged from 18.5 to 66; a higher score suggests higher SES. There was no significant difference in SES between CWS (M = 42.08, SD = 13.68) and CWNS (M = 47.04, SD = 11.93), t(50) = 1.393, p = .170, d = −0.39.
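The reported effect size can be approximately re-derived from the group means and standard deviations. The sketch below assumes Cohen's d computed with an equally weighted pooled standard deviation (the root mean square of the two SDs); the text does not state which formula was used, and the function name `cohens_d` is hypothetical.

```python
import math


def cohens_d(mean_a: float, sd_a: float, mean_b: float, sd_b: float) -> float:
    """Cohen's d from summary statistics.

    Assumption: the pooled SD weights both groups equally, because the
    number of children with SES data in each group is not reported.
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# SES summary statistics reported in the text (CWS first, then CWNS)
d_ses = cohens_d(42.08, 13.68, 47.04, 11.93)
print(round(d_ses, 2))
```

Under this assumption the value rounds to the reported d of −0.39; the negative sign reflects the lower mean SES score in the CWS group.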

Procedures
Participants were observed in a free-play situation with an unfamiliar adult examiner. The child and adult examiner engaged in a loosely-structured free play conversation centered around age-appropriate toys (e.g., barnyard toy set) situated between the examiner and the child. All conversations were video-recorded for subsequent analyses. All examiner-child conversations were child-directed; the examiner provided prompts to elicit utterances only when the child had largely stopped talking.
Systematic Analysis of Language Transcripts, research version 2008 (SALT 2008; Miller & Iglesias, 2008) was used to transcribe a child's utterances and to measure (1) MLU, and (2) the presence/absence of three types of spontaneous comments (i.e., unprovoked utterances, questions, elaborations of answers) from the audio-video recordings. To measure the frequency of various types of speech disfluencies (i.e., sound/syllable repetitions, monosyllabic whole-word repetitions, sound prolongations, revisions, interjections, and phrase repetitions), an online or real-time analysis of fluency (Conture, 2001) based on 300-word conversational speech samples was used. Each participant's latency to the 6th spontaneous comment was measured to the nearest one-hundredth of a second with a stopwatch during observation of the audio-video recordings.
2.6.1. Transcription-The SALT-transcribed examiner-child conversational samples were analyzed using several SALT conventions (Buhr & Zebrowski, 2009; Richels, Buhr, Conture, & Ntourou, 2010) based on Brown (1973), with the exception of the segmentation convention. First, pause and intonation contour were used to determine the completion of a sentence. If children produced more than two independent clauses joined by conjunctions without pausing or changing intonation contour, utterances were segmented after the second conjoined clause (Miles, Chapman, & Sindberg, 2006). Second, affirmatives and negatives such as yes and no were not analyzed for MLU except in cases in which yes or no responses were immediately followed by a full clause (Johnston, 2001). Third, utterances of a nonlinguistic nature such as screams or cries were not transcribed. Fourth, sentences that included unintelligible words were marked with a single "x" for each unintelligible syllable and were not analyzed for MLU. Fifth, non-stuttered disfluencies (e.g., revisions) were placed within parentheses so as not to be included in the linguistic analysis.

Description of variables
2.7.1. Talker group-Talker group (CWS vs. CWNS) was the independent variable for the first hypothesis, with talker group inclusion criteria as described above.

2.7.2. Latency to the 6th spontaneous comment-The index of BI, the latency (in seconds) to the 6th spontaneous comment (SC), was obtained during an examiner-child conversation in a similar manner to Kagan et al.'s (1998) study by measuring the time a child took from the initiation of his/her first utterance to the initiation of his/her 6th SC in the child's speech sample. A spontaneous comment was defined as "any remark that is not a direct answer to the examiner's question" (Kagan et al., p.1487). For example, as shown in Table 1, a spontaneous comment was coded "when a child elaborated an answer, asked the examiner a question, or remarked on an incident in the child's life" (Kagan et al., p.1487).
To prevent the duration of participants' stuttered disfluencies from confounding the measure of latency to the 6th SC, the total duration of instances of stuttering occurring during the epoch from the first utterance to the 6th SC was subtracted from the measure of latency to the 6th SC. Likewise, to prevent participants' unintelligible utterances from confounding the measure of latency to the 6th SC, the total duration of unintelligible utterances that occurred during the epoch above was subtracted from the measure of latency to the 6th SC. However, the time taken up by the examiner's utterances was retained in the aforementioned epoch because our post-hoc analysis indicated no significant difference (U(14)=21.0, Z=−1.155, p=.248, d=1.00) between children with higher BI versus lower BI in the proportion of time spent by examiner's prompts per children's latency to the 6th SC.
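The latency adjustment described above amounts to a subtraction over the epoch from the first utterance to the 6th SC. The sketch below is illustrative only: the function name `adjusted_latency`, its parameter names, and the example durations are hypothetical, not values from the study.

```python
from typing import Sequence


def adjusted_latency(first_utterance_onset: float,
                     sixth_sc_onset: float,
                     stuttering_durations: Sequence[float],
                     unintelligible_durations: Sequence[float]) -> float:
    """Latency (in seconds) to the 6th spontaneous comment, with time spent
    in stuttering and in unintelligible utterances removed.

    Examiner talk time is deliberately not subtracted, matching the text.
    All durations must fall within the first-utterance-to-6th-SC epoch.
    """
    raw_latency = sixth_sc_onset - first_utterance_onset
    return raw_latency - sum(stuttering_durations) - sum(unintelligible_durations)


# Hypothetical child: 60 s raw latency, 2 s of stuttering, 3 s unintelligible
example = adjusted_latency(2.0, 62.0, [1.5, 0.5], [3.0])
print(example)
```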

Higher behavioral inhibition versus lower behavioral inhibition group-Similar to the methodology employed by Kagan et al. (1989), in the present study, children whose latency to the 6th SC was in the top 15% of the distribution were assigned to the higher BI group whereas children whose latency to the 6th SC was in the bottom 15% of the distribution were assigned to the lower BI group. The fifteenth and eighty-fifth percentiles were chosen as cut-off percentiles because previous researchers have suggested that 10 to 15 percent of children are consistently shy or spontaneous (Garcia-Coll, Kagan et al., 1984; Kagan, 1989).
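The extreme-group assignment can be sketched as a ranking of children by latency followed by selection of the two tails. This is a minimal illustration under stated assumptions: the function name `extreme_bi_groups` is hypothetical, the group size is taken as 15% of the sample rounded to the nearest whole child, and ties at the cut-off are broken by sort order, none of which the text specifies.

```python
from typing import Dict, List, Tuple


def extreme_bi_groups(latencies: Dict[str, float],
                      proportion: float = 0.15) -> Tuple[List[str], List[str]]:
    """Return (higher_bi_ids, lower_bi_ids) from latencies to the 6th SC.

    latencies maps a child identifier to latency in seconds.
    Longer latency is treated as higher behavioral inhibition.
    """
    group_size = round(len(latencies) * proportion)  # assumption: nearest integer
    ranked = sorted(latencies, key=latencies.get)    # shortest latency first
    lower_bi = ranked[:group_size]
    higher_bi = ranked[len(ranked) - group_size:]
    return higher_bi, lower_bi


# Hypothetical latencies for a sample of 54 children
sample = {f"child_{i:02d}": float(i) for i in range(54)}
higher, lower = extreme_bi_groups(sample)
print(len(higher), len(lower))
```

With 54 children, 15% of the sample rounds to 8 children per tail, which matches the group size of 8 per BI group reported for the extreme-group analysis.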

Frequency of stuttered and non-stuttered disfluencies-The first author and three independent trained coders measured the frequency of stuttered and non-stuttered speech disfluencies during a 300-word conversation sample obtained during child-examiner play. As mentioned above, and based on Vanderbilt University's Disfluency Count Sheet (Conture, 2001), stuttered disfluencies included sound/syllable repetitions, monosyllabic whole-word repetitions, and sound prolongations, and non-stuttered disfluencies included phrase repetitions, interjections, and revisions.
2.7.5. Mean length of utterance (MLU)-Mean length of utterance in morphemes was computed mainly using Brown's (1973) procedure, with differences due to adherence to SALT conventions. MLU was based on a participant's 10-minute conversation with an examiner. The current study included only the first 10 minutes of conversational speech of each participant, given that 10 minutes was the longest time period of conversation common to all fifty-four participants.
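Both speech measures reduce to simple ratios. The sketch below shows the arithmetic only; the function names are hypothetical, and it does not reproduce the SALT or Brown (1973) rules that determine which utterances and morphemes are counted.

```python
from typing import Sequence


def disfluencies_per_100_words(disfluency_count: int,
                               total_words: int = 300) -> float:
    """Disfluency frequency expressed per 100 words of a speech sample."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return 100 * disfluency_count / total_words


def mean_length_of_utterance(morphemes_per_utterance: Sequence[int]) -> float:
    """MLU in morphemes: total morphemes divided by number of utterances."""
    if not morphemes_per_utterance:
        raise ValueError("at least one utterance is required")
    return sum(morphemes_per_utterance) / len(morphemes_per_utterance)


# Hypothetical values: 9 stuttered disfluencies in 300 words; three utterances
print(disfluencies_per_100_words(9))
print(mean_length_of_utterance([3, 5, 4]))
```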

Data analyses
2.8.1. Between-group comparisons-To assess possible between-group (i.e., CWS vs. CWNS) demographic (i.e., SES and chronological age) differences, independent samples t-tests were employed. Because the latency to the 6th SC was not normally distributed for this study, analysis of covariance (ANCOVA) with a log transformation of the dependent variable was used to evaluate whether the index of BI differed between the talker groups.

Within-group comparisons-Because the latency to the 6th SC was not normally distributed for this study, non-parametric analyses (e.g., Spearman rank-order correlations) were performed to assess the relation of the index of BI to stuttered and non-stuttered frequencies and MLU among CWS and CWNS.
2.9. Measurement reliability
2.9.1. Inter-judge reliability-Approximately 20% of the total final data corpus of each talker group (5 age-matched CWS and 6 age-matched CWNS) was selected at random to assess inter-judge measurement reliability for the latency to the 6th spontaneous comment and MLU. A speech-language pathology post-doctoral fellow served as a reliability coder. The reliability coder was blind to talker group. Comparison between the first author and the reliability coder's measures of the latency to the 6th spontaneous comment revealed strong inter-judge reliability (Spearman's rho =.940, p<.01). Similarly, comparison between the first author and the reliability coder's measures of MLU revealed strong inter-judge reliability (Spearman's rho =.991, p<.01).
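A Spearman rank-order correlation is a Pearson correlation computed on ranks, which is why it does not require normally distributed scores. The sketch below is a plain-Python illustration, not the software used in the study; the function names and the example data are hypothetical, and tied values are given their average rank.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # 1-based average rank of the run
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    n = len(rank_x)
    mean_x, mean_y = sum(rank_x) / n, sum(rank_y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    spread_x = sum((a - mean_x) ** 2 for a in rank_x)
    spread_y = sum((b - mean_y) ** 2 for b in rank_y)
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical latencies (s) and stuttered disfluencies per 100 words
rho = spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(round(rho, 3))
```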
To assess inter-judge measurement reliability for frequency of stuttered and non-stuttered disfluencies among the four coders, the four coders re-coded approximately 20% of the data corpus (5 CWS and 6 CWNS). Comparison among four coders' judgments of frequency of stuttered disfluencies revealed strong inter-judge reliability (Mean Spearman's rho =.940, p<.01; range from .843 to .957). Similarly, comparison among four coders' judgments of frequency of non-stuttered disfluencies revealed significant but relatively modest inter-judge reliability (Mean Spearman's rho =.761, p<.05; range from .576 to .899).

Intra-judge reliability
Approximately 20% of the total final data corpus of each talker group (5 CWS and 6 CWNS) was selected at random to assess intra-judge measurement reliability for the latency to the 6th spontaneous comment and MLU. Comparison between the first author's initial and subsequent re-measurement of latency to the 6th spontaneous comment revealed high intra-judge reliability (Spearman's rho =.973, p<.01). Similarly, comparison between the first author's initial and subsequent re-measurement of MLU revealed high intra-judge reliability (Spearman's rho =.945, p<.01).
Assessment of intra-judge reliability for frequency of stuttered and non-stuttered speech disfluencies was based on approximately 20% of the total data corpus (5 CWS and 6 CWNS) that the first author had initially coded. Comparison of the first author's initial and re-coding for stuttered disfluencies indicated strong intra-judge reliability (Spearman's rho =.956, p<.01). Similarly, comparison of the initial and re-coding for non-stuttered disfluencies indicated strong intra-judge reliability (Spearman's rho=.833, p<.01).

Mean length of utterances (MLU)-An analysis of covariance (ANCOVA) was conducted to determine whether there was a difference in MLU between preschool-age CWS and CWNS as well as between boys and girls regardless of talker group. Results indicated no main effect on MLU of either talker group or gender, after controlling for the influence of age.

Normality of distribution of measured variables
Normality of distribution was assessed for all variables. Assessment of histograms displaying each variable's distribution, aided by the judgment of an experienced statistician, indicated normal distributions for: frequency of non-stuttered disfluencies and MLU for CWS as well as frequency of stuttered and non-stuttered disfluencies and MLU for CWNS. Conversely, distributions for frequency of stuttered disfluencies and latency to the 6th SC for CWS as well as CWNS' latency to the 6th SC were not normally distributed (i.e., lognormal or Poisson in nature). Therefore, statistical procedures appropriate to either normal or non-normal distributions were employed in subsequent analyses.

Findings regarding a priori hypotheses
3.3.1. Between-group differences in latency to the 6th SC-In order to prevent the violation of the assumption of normality, a log of the outcome of the dependent variable (i.e., latency to the 6th SC) was used for analysis. Results of this ANCOVA (Table 2) indicated no significant difference between preschool-age CWS and CWNS in behavioral inhibition as indexed by latencies to the sixth spontaneous comment (SC) after controlling for gender and between-group differences in language ability (i.e., TELD-3 receptive and EVT-2).
However, it could be argued that across time, characteristic behavioral style is more manifest in the extremes, as was found in Kagan et al.'s (1989) work. Therefore, as previously described in the method section, we selected those children (in both talker groups) whose latency to the 6th SC was in the top and the bottom 15% of the distribution (8 children in each BI group). Assessment of the two extreme groups (i.e., higher vs. lower BI) indicated a significant relation between talker group (CWS vs. CWNS) and higher versus lower BI group, χ² = 4.267, df=1, p<.05, Cramer's V=.516 (Table 3), there being more CWS with higher and fewer CWS with lower BI when compared to CWNS. Unlike the overall statistical analysis above, for the extreme group analysis, gender was not controlled because (a) previous researchers have reported similar proportions of boys and girls for each extreme group (i.e., children with higher BI vs. lower BI), and (b) statistical assessment of gender difference between the extreme BI groups indicated no significant association between gender and the two extreme groups (χ² = .291, df=1, p=.590, Cramer's V=.134).

Relations between latency to the 6th SC and frequency of speech disfluencies within talker group-As shown in Figure 1, results indicated that for preschool-age CWS, increased latency to the 6th SC was associated with more stuttered disfluencies (Spearman's rho=.49, p=.011). However, as shown in Table 4, there was no significant correlation between latency to the 6th SC and frequency of non-stuttered disfluencies for CWS. For CWNS, there was no significant relation between latency to the 6th SC and frequency of either stuttered or non-stuttered disfluencies.
When extremely high versus low inhibited children were selected for CWS (4 children in each BI group), results of a Mann-Whitney U test indicated that there was a significant difference in frequency of stuttered disfluencies per 100 words between the higher (M=13.75, SD=1.13) and lower (M=8.42, SD=3.43) BI groups, U(6)=.000, Z=−2.309, p=.021, d=2.09, with higher BI CWS exhibiting more stuttered disfluencies than lower BI CWS. However, there was no significant difference for CWNS in frequency of stuttered disfluencies between the higher and lower BI groups. Likewise, for either CWS or CWNS, there was no significant difference in frequency of non-stuttered disfluencies between higher and lower BI groups.
As shown in Table 4, there was no significant relation between latency to the 6th SC and MLU for either CWS or CWNS. When extremely high versus low inhibited children were selected within each talker group (4 children in each BI group), results of a Mann-Whitney U test indicated no significant difference in MLU between higher and lower BI groups for either CWS or CWNS. However, as might be expected, post-hoc analysis indicated a significantly positive correlation between TELD expressive language score and MLU for both CWS (Spearman's rho=.470, p<.05) and CWNS (Spearman's rho=.515, p<.01).
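Two of the statistics reported for the extreme-group analyses can be checked from values stated in the text. The sketch below assumes that Cramer's V was computed as the square root of chi-square divided by N times the smaller of (rows − 1) and (columns − 1), with N = 16 (8 children in each BI group), and that the Mann-Whitney Z is the normal approximation without tie or continuity correction; the text does not state either choice, and the function names are hypothetical.

```python
import math


def cramers_v(chi_square: float, n: int, n_rows: int = 2, n_cols: int = 2) -> float:
    """Cramer's V effect size for a chi-square test of association."""
    return math.sqrt(chi_square / (n * min(n_rows - 1, n_cols - 1)))


def mann_whitney_z(u: float, n1: int, n2: int) -> float:
    """Normal approximation to the Mann-Whitney U statistic.

    Assumption: no tie correction and no continuity correction.
    """
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u


# Talker group by BI group: chi-square = 4.267 with 16 children in total
v_talker_group = cramers_v(4.267, 16)
# Higher vs. lower BI CWS: U = 0 with 4 children in each group
z_stuttering = mann_whitney_z(0, 4, 4)
print(round(v_talker_group, 3), round(z_stuttering, 3))
```

Under these assumptions both values agree with those reported (V = .516, Z = −2.309). A U of zero means that every higher BI child who stutters produced more stuttered disfluencies than every lower BI child who stutters.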

CWS' number of spontaneous comments, words and stuttered disfluencies during the first versus second five minutes of conversation-
Results of a Wilcoxon matched-pairs signed-ranks test (Z=.235, p<.05) indicated that CWS produced significantly fewer spontaneous comments during the first (M=12.65, SD=8.80) compared to the second five minutes of conversation (M=16.31, SD=9.89, d=−0.39). Consistent with the above finding, a Wilcoxon matched-pairs signed-ranks test (Z=2.52, p<.05) revealed that CWS produced significantly fewer words during the first (M=96.27, SD=50.50) than the second five minutes (M=115, SD=53.97, d=−0.36) of conversation.
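The effect sizes reported in this section can be reproduced from the published means and standard deviations, assuming d was computed as the mean difference divided by the root mean square of the two standard deviations (the equal-n pooled SD). That formula is an assumption; the text does not state which variant of d was used. The function name below is hypothetical.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d with the SDs pooled as a root mean square.

    This pooling is exact only for equal group sizes and is assumed
    here, not taken from the source.
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# First vs. second five minutes of conversation (values from the text).
d_comments = cohens_d(12.65, 8.80, 16.31, 9.89)
d_words = cohens_d(96.27, 50.50, 115.0, 53.97)

# Higher vs. lower BI CWS, stuttered disfluencies per 100 words.
d_stuttering = cohens_d(13.75, 1.13, 8.42, 3.43)

print(round(d_comments, 2), round(d_words, 2), round(d_stuttering, 2))
# -0.39 -0.36 2.09
```

Note that the first two comparisons are within-subject, so an effect size based on the SD of the paired differences would also be defensible; the pooled-SD form is shown only because it matches the reported values.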

Main findings: An overview
The present study resulted in three main findings. The first main finding indicated that there was no significant difference in BI between preschool-age CWS and CWNS as a group. The second main finding indicated that when extremely high versus low inhibited children were selected, there were more CWS with higher BI and fewer CWS with lower BI when compared to their CWNS peers. The third main finding indicated that more behaviorally inhibited CWS, when compared to less behaviorally inhibited CWS, exhibited more stuttered disfluencies. The general implications of each of these three findings will be discussed below.

Difference in BI between CWS and CWNS
The first hypothesis, that preschool-age CWS, when compared to preschool-age CWNS, would exhibit more behavioral inhibition, was not supported by our findings. However, when extremely high versus low inhibited children were compared, there were significantly more CWS with higher BI and fewer CWS with lower BI than among their CWNS peers.
One possible explanation for these equivocal findings is based on Kagan et al.'s (1989) suggestion that behavioral inhibition may be manifest more in the extreme groups than in the entire sample. Consistent with the present study, Kagan and his colleagues reported diverging findings depending on whether the entire sample or two extreme groups were used for analysis. They found that when the entire sample was included for analysis, there was no preservation of behavioral differences from the second to the fourth year of life. In contrast, when children were selected to represent the extremes of behavioral inhibition, there was a significant preservation of behavioral differences from the second to the fourth year of life. Based on these findings, Kagan et al. (p.845) suggested that "continuous dimensions do not always capture the most essential structural or functional properties of the entities being compared".
Thus, the present finding regarding higher versus lower BI children seems to suggest that, at least for the extreme groups, a predisposition toward behavioral inhibition may be associated with the development or exacerbation of stuttering. Conversely, one might speculate that uninhibited behavioral characteristics are associated with amelioration of the child's speech disfluencies, a speculation that must await future empirical study for support or refutation.

Relations between BI and frequency of speech disfluencies within talker group
The second hypothesis, that more behaviorally inhibited CWS and CWNS, when compared to less behaviorally inhibited CWS and CWNS, would exhibit more stuttered and non-stuttered disfluencies during conversation, was partially confirmed by our findings. That is, more behaviorally inhibited CWS, when compared to less behaviorally inhibited CWS, exhibited more stuttered disfluencies. This prediction was based on Conture et al.'s (2006) C-E model, which suggests that temperamental factors may exacerbate speech disfluencies of children who stutter. Therefore, this finding appears to support not only our prediction, but also the C-E model. Treon (2010) suggested that there is a debate regarding whether emotion plays a causal role in stuttering or the other way around, an issue sometimes referred to as the "directionality of effect". At present, however, it seems premature to establish the directionality of effect between BI and stuttering based on current findings. As suggested by Kefalianos et al. (2012), a better understanding of the directionality of this relation may be achieved by longitudinal studies of children whose temperament is determined prior to stuttering onset.

Relations between BI and MLU within talker group
The third hypothesis, that more behaviorally inhibited CWS and CWNS, when compared to less behaviorally inhibited CWS and CWNS, would exhibit shorter MLU, was not supported by our findings. This result is inconsistent with Paul and Kellogg's (1997) finding that more outgoing children tend to show longer MLU than less outgoing children. Such inconsistency may be explained by the fact that the participants in Paul and Kellogg's study were older (i.e., 1st grade) than this study's preschool-age participants (i.e., ages 3-5). Perhaps Paul and Kellogg's suggestion that MLU at the age of 6 may be seen not as a measure of language complexity, but of a temperamental characteristic (e.g., withdrawal/approach), does not apply to preschool-age children. Indeed, our ancillary finding indicated that MLU is more associated with linguistic (e.g., expressive language) than temperamental domains (e.g., behavioral inhibition) for preschool-age children. Therefore, it may be worthwhile in future studies to investigate whether there is a relation between MLU and BI for school-age children or adults who stutter.

CWS' number of SC, words and stuttered disfluencies during the first versus second five minutes of conversation
The ancillary findings indicated that preschool-age CWS produced significantly fewer spontaneous comments (SC) as well as fewer words and marginally significantly fewer stuttered disfluencies during the first as compared to the second five minutes of conversation. These post-hoc findings are interesting because they assess changes in number of SC, number of words and frequency of stuttered disfluencies as a function of time within a conversation. These findings suggest that as CWS "warm up" during conversational interaction and become more comfortable with their conversational partners, they are apt to produce more spontaneous comments that lead to increased number of words and hence increased stuttered disfluencies. This result may argue for the use of longer temporal samples of the child's speaking to better accommodate the tendency of some preschool-age CWS to be "slow-to-warm-up" communicatively (see Sawyer & Yairi, 2006 for related findings and discussion).

Caveats
First, the relatively small number of participants (i.e., 26 CWS and 28 CWNS) in the present study more than likely limited statistical power, which likely influenced the inferential statistical analyses of the data. However, the seemingly adequate representativeness of the participants would seem to lend present findings a reasonable degree of credence.
Second, unlike Kagan et al.'s (1989) methodology that used an aggregate score of latency to the child's 6th SC and the number of spontaneous comments, we used the latency (to the 6th SC) measure only. The present authors chose this variable because, as Gest (1997, p. 467) suggested, "latency measures are often the most sensitive indicators of individual differences, possibly because they are obtained at the start of an interaction, when novelty is at a maximum". However, the total number of SCs was not measured in this study because the duration of each participant's recorded conversation varied from 10 to 40 minutes. Such different durations of conversational samples may result from the fact that the present study was designed to obtain 300 words of conversation to assess speech disfluencies. Thus, for the present study, it was not considered appropriate to use the total number of SCs as a dependent variable because different conversation times may affect the number of SCs; that is, a longer conversation may inflate, and a shorter conversation may deflate, the total number of SCs. Furthermore, the latency to the 6th SC and the total number of SCs appear to be redundant. Indeed, it was found that this latency (to the 6th SC) measure was significantly correlated (r=−.61, p<.001) with the number of spontaneous comments.
Third, there were relatively small expected frequencies (i.e., expected frequencies less than 5; total sample size of 16) in the chi-square test that was computed for the first hypothesis. However, Camilli and Hopkins (1979) demonstrated that even with quite small expected frequencies, the test produces few Type I errors in the 2×2 case as long as the total sample size is greater than or equal to 8. Similarly, Howell (2002, p.159) argued that "with small sample sizes, power is more likely to be a problem than inflated Type I error rates."
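For the 2×2 chi-square test discussed in this caveat, the reported effect size can be recovered from the reported test statistic and the total sample size of 16 using the usual definition of Cramér's V, V = sqrt(χ² / (N · (k − 1))), where k is the smaller of the number of rows and columns. The sketch below is a consistency check under that standard definition, not a reconstruction of the authors' analysis; the function name is hypothetical and the cell counts of Table 3 are not used.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V from a chi-square statistic and total sample size."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi2 / (n * (k - 1)))


# Talker group by BI group: chi-square = 4.267, N = 16, 2x2 table
# (values from the text).
v = cramers_v(4.267, 16, 2, 2)
print(round(v, 3))  # 0.516
```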

Conclusion
In summary, preschool-age CWS, when compared to their CWNS peers, are more likely to exhibit extremely high behavioral inhibition and less likely to exhibit extremely low behavioral inhibition. Furthermore, CWS' behavioral inhibition appears to be associated with higher frequency of stuttered disfluencies.
Findings of this study provide further support for recent theories of childhood stuttering (e.g., Conture et al., 2006) that propose that temperamental characteristics and/or related emotional processes are associated with childhood/developmental stuttering. The present findings are based on direct behavioral observation, a methodology in need of further corroboration through the use of caregiver questionnaires as well as psychophysiological measures of temperamental characteristics.
Thus, present results seem to support the continued use of direct behavioral observation to study the association between young children's temperamental characteristics and stuttering, the diagnostic entity (i.e., CWS vs. CWNS), as well as stuttering, the behavior (e.g., frequency of stuttered disfluencies). By triangulating caregiver questionnaire, direct behavioral observation and psychophysiological means for investigating the association between temperament and stuttering, future researchers can improve our understanding of the role, if any, that emotion may play in the onset, development and maintenance of childhood stuttering.

Highlights
• Behavioral inhibition (BI) of preschool-age CWS and CWNS was assessed by measuring the latency to their sixth spontaneous comment during conversation with an unfamiliar experimenter.
• No difference in BI was found between preschool-age CWS and CWNS as a group.
• However, in the extreme BI groups, there were more CWS in higher BI and fewer CWS in lower BI group than CWNS.
• CWS' BI correlates with their frequency of stuttered disfluencies.

Spearman rank-order correlation between latency to the 6th spontaneous comment and frequency of stuttered disfluencies per 100 words for preschool-age CWS (N=26).

Table 1 Types and Examples of Spontaneous Comments Measured in the Present Study during Preschool-age Children's (N=54) Conversation with an Examiner. Adapted from Kagan et al. (1998).

Types of spontaneous comments (SC) Example
Unprovoked comment (