SURFACE at Syracuse University
Provider Type and Depression Treatment Adequacy

We investigate the effect of initial provider (primary care physician, psychiatrist, or non-physician mental health specialist) on the adequacy of subsequent treatment for persons with depression. Our data are from MarketScan®, a medical and pharmacy insurance claims database, which we use to estimate models of the likelihood of treatment for depression and the likelihood that any treatment received is adequate. Patients initially seeing psychiatrists are most likely to receive adequate treatment. Provider type has a statistically and medically significant effect on whether any treatment occurs but a smaller effect on treatment adequacy among treated patients. The results show the importance of provider type in treatment patterns, but the effects on patient outcomes are yet to be determined definitively.


Introduction
Depression is a widespread illness in the United States and elsewhere, and although cost-effective treatments exist, many persons remain untreated or inappropriately treated. The popular press has noted the apparent paradox that mental health treatment has significant long-term benefits to society and yet is the subject of stigma, which perpetuates a cycle of under-treatment (Cloud 1999). Although many mentally ill persons who are untreated are outside the social safety net, even among persons covered by health insurance plans the treatment of depression is frequently inadequate and not cost-effective. We present evidence on how the course of treatment for persons with depression varies by type of initial care provider.
To treat depression cost-effectively, patients would ideally be matched with therapies most cost-effective for them individually. Mental health systems often perform poorly at matching patients with their ideal providers, creating access problems for patients who could be helped and creating waste from patients who consume resources that are of little therapeutic benefit (McGuire 1995). To treat depressed enrollees more cost-effectively, health plans need to study the effects of directing patients into particular pathways of care. An important component of directing a patient's course of treatment is the provider who first diagnoses the depression. A key collateral issue is then how the course of treatment compares among the various types of providers so as to understand how health insurance systems, both directly through their benefit structures and indirectly through care decisions of mental health providers, influence the adequacy of care for depression.
The role of provider specialty is particularly important in treating mental conditions such as depression, where the range of treatment options available to a patient may depend heavily on the training of the provider treating the patient. General medical practitioners, psychiatrists, psychologists, and psychiatric social workers or other non-physician mental health specialists can treat patients suffering from depression, and many patients are treated by more than one type of professional. Because each provider type has unique training and expertise, the course of treatment is likely to vary among provider types. The consequences of provider type differences for patients with mental disorders are still not clearly established, and our contribution is to investigate how provider type affects treatment adequacy among patients diagnosed with depression.
We find that provider type is significantly related to the adequacy of treatment of a depressed person. A depressed person initially diagnosed by a general medical practitioner is about seven times more likely to go untreated than a similar depressed person initially diagnosed by a psychiatrist. No treatment is up to 15 times more likely to occur if the depressed person is diagnosed by a non-physician provider than if the person is diagnosed by a psychiatrist. Although type of initial provider affects the dimension of treatment adequacy via the existence of any depression treatment, we find that initial provider type is much less relevant to whether the amount of the treatment selected is considered adequate. Any connection between initial provider and eventual patient outcome concerning depression largely occurs through whether treatment is initiated at all rather than through intensity of treatment once initiated.

Background Literature
The role of the health care provider has been of much research concern because of its potential importance in determining the course of treatment of certain ailments. Specialty of the health care provider has been shown to have wide-ranging medically and economically significant effects on treatment intensity, costs, and outcomes.

Brief Review of Provider Specialty Impact
The general relationship between provider specialty and pattern of care and costs is still unclear. Interconnections seem to depend crucially on the particular diagnosis under study. For example, outcomes of persons suffering a myocardial infarction vary according to whether the patient was admitted to the hospital by a generalist or a specialist (Jollis et al. 1996). In contrast, among persons with symptoms of knee osteoarthritis, physician specialty and specific management practices seem not to account for variations in patient outcomes (Mazzuca et al. 1997). Research on asthma patients suggests that treatment intensity may be greater for persons treated by specialists (Engel et al. 1989). Although treatment costs are necessarily higher for asthma patients treated by specialists, the greater treatment intensity does not clearly result in better outcomes (Freund et al. 1988). As a final overview example, we note that costs of treating episodes of various musculoskeletal conditions seem to be lower for persons referred to a specialist earlier in the course of treatment (Nyman et al. 1998). The role of provider specialty in medical and economic outcomes seems to be context specific.

Brief Review of Provider Specialty in Mental Health Care
Similar to results we have described above, the connections among treatment modes, provider types, costs, and outcomes for users of mental health services are not generalizable.
Still, some results are worth noting briefly. Compared to usual depression care by general medical practitioners, patients receiving guideline-based pharmacotherapy and psychotherapy interventions (via pharmacotherapy provided by trained primary care physicians using scripted interpersonal psychotherapy) had superior outcomes at modestly higher cost (Lave et al. 1998).
Because depression treatment studies suggest there are important style differences across the various types of mental health care providers, some researchers have concluded that, where mental health care is concerned, policies should channel patients away from primary care providers because general medical practitioners too often administer superficial treatment (Mechanic 1990;Wells et al. 1996). Some research also suggests that mental health care initiated by general medical providers is as expensive as treatment initiated by psychiatrists and more expensive than treatment initiated by non-medical providers (Holmes and Deb 1998).
Looking within treatment for mental illness, research on differences in counseling styles for depression across the provider specialties of psychiatry, psychology, and general medicine has found that general practitioners counsel less than psychiatrists and psychologists.
Furthermore, compared with other mental health specialists, master's level clinicians reveal lower skill at counseling for psychosocial problems. Although pharmacotherapy seems to be more important than other treatment factors concerning the dual outcomes of cost of care and pharmacotherapy completion, targeted concurrent psychotherapy can be cost-effective (Dobrez et al. 2000).
Differences in the type of care delivered by one provider type compared to another are, in part, determined by variation in education, training, and other provider factors. Care differences can also be partly determined by the preferences of patients, who choose providers based on the types of care they offer. Based on the limited existing literature, it appears that patients' preferences are more likely to affect the decision to seek any care rather than the type of care (Frank and Kamlet 1989; Ettner and Herman 1997; Fortney, Rost, and Zhang 1998; Swindle et al. 2000). So, patient demographic factors, such as age, gender, health, and mental health status, have not been found to predict well the choice of particular provider specialties. Only geographic proximity and insurance have been shown to have much influence on patients' use of psychiatrists and mental health specialists (Fortney, Rost, and Zhang 1998; Ettner and Herman 1997).
Our contribution is to add to the understanding of the importance of provider type in caring for depression. We answer two questions concerning the extensive and intensive dimensions of anti-depression treatment: How does initial provider type matter in whether treatment occurs at all? If treatment does occur, are there important differences in treatment adequacy across provider types?

Conceptual Framework
Our starting point is that medical treatment is a joint decision of the patient and a health care provider concerning both the extensive and intensive margins of treatment (Pohlmeier and Ulrich 1995). We adopt a two-stage decision making view of depression treatment in which the first decision is whether treatment begins and the second decision, conditional on the initiation of treatment, is the intensity (adequacy) of treatment. More specifically, if the latent index $y_1^*$, which represents the difference between the depressed patient's (utility) benefits and costs of treatment, is positive, then treatment occurs. Similarly, if the latent index $y_2^*$, which represents the difference between the depressed patient's (utility) benefits and costs of intensive (adequate) treatment conditional on treatment beginning, is positive, then treatment is adequate. Expressed algebraically, the conceptual model that we estimate describes the joint likelihood that treatment occurs and is adequate:

$$y_1 = \text{Prob}(\text{treatment}) = f_1(X_1) \tag{1}$$

and

$$y_2 = \text{Prob}(\text{adequate treatment} \mid y_1^* > 0) = f_2(X_2) \tag{2}$$

where $X_1$ are observed exogenous variables affecting the likelihood of treatment and $X_2$ are observed exogenous variables affecting treatment intensity for patients treated, which will have elements common to $X_1$. Because the conceptual model of depression treatment in Equations (1) and (2) most logically represents treatment and treatment adequacy as (joint) probabilistic outcomes, it must be made stochastic by allowing the underlying net utility indexes, $y_j^*$, to have unmeasured components in addition to the observed patient and health care provider attributes in $X_j$ ($j = 1, 2$).
The empirical representation that we will estimate admits stochastic parts of Equations (1) and (2) that are correlated, which represents a latent common factor, so that the model is in the spirit of the well-known generalized multi-equation regression framework known as seemingly unrelated regressions or SUR (Greene 2003). Our ultimate research objective is to produce estimates of the differential or so-called marginal effect (ME) on each of the two outcomes of interest in anti-depression treatment from having a psychiatrist versus another type of initial provider.
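To make the notion of a marginal effect for a qualitative regressor concrete, the following is a minimal sketch, assuming a single-equation probit with a linear index and a 0/1 indicator (for example, psychiatrist as initial provider). The function name, the argument names, and the numerical values are hypothetical illustrations and are not estimates from this study; the paper does not state the exact formula used to compute its marginal effects.

```python
from statistics import NormalDist


def probit_discrete_effect(index_without, coef_dummy):
    """Change in P(y = 1) when a 0/1 regressor switches from 0 to 1.

    index_without: linear index x'b evaluated with the indicator set to 0
                   (hypothetical value supplied by the user)
    coef_dummy:    probit coefficient on the indicator (hypothetical value)
    """
    cdf = NormalDist().cdf  # standard normal cumulative distribution function
    return cdf(index_without + coef_dummy) - cdf(index_without)


# Hypothetical inputs chosen only for illustration.
effect = probit_discrete_effect(index_without=0.0, coef_dummy=0.5)
print(round(effect, 3))
```

Because the normal cdf is nonlinear, the same coefficient implies a different marginal effect at different baseline index values, which is why such effects are usually reported at specified covariate values or averaged over the sample.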

Empirical Framework
We undertake a multivariate empirical model focusing on two dimensions of treatment adequacy: whether treatment occurs at all and whether any treatment occurring is intensive enough to be considered adequate. The independent variable of most interest in our study is qualitative -provider type. We first determine in a single equation context whether persons diagnosed with depression are treated (with psychotherapy, pharmacotherapy, or both) versus remaining untreated after diagnosis. Using a multiple equation model considering jointly the extensive and intensive dimensions of anti-depression treatment, we then examine the influence of provider type on whether treatment is intensive enough to be considered adequate among patients treated. Our multi-equation models adjust for nonrandom assignment of treatment and address whether our conclusions are robust to the possibility that patients select the particular type of initial provider based on latent (to the researcher) factors underlying likely treatment and its adequacy.

The Bivariate Probit Model
Let $y_1$ be the binary variable indicating whether treatment occurs and $y_2$ be the binary variable indicating whether any treatment occurring is deemed medically adequate. The bivariate probit model we use for our core results estimates $y_1 = f_1(X_1) + e_1$ and $y_2 = f_2(X_2) + e_2$, where $X_j$ is the matrix of independent variables ($j = 1, 2$). The error terms ($e_j$'s) in the bivariate probit are taken as following a standard bivariate normal distribution in which the $e_j$'s each have mean zero and unit variance and a covariance of $\rho$. The particular variant of the bivariate probit we use for our core results takes account of the non-random composition of the data for the second stage of anti-depression treatment because the model incorporates statistically in the likelihood function that data on adequacy of treatment ($y_2$ and $X_2$) exist only when there is treatment ($y_1 = 1$) (Greene 2003).
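For reference, the textbook log-likelihood of a probit model with sample selection can be written as follows. This is a sketch under the assumption of linear indexes, $f_j(X_j) = X_j\beta_j$; the coefficient vectors $\beta_j$ and the symbols $\Phi$ (standard normal cdf) and $\Phi_2$ (standard bivariate normal cdf with correlation given after the semicolon) are notation introduced here, not taken from the original text.

```latex
\ln L = \sum_{y_1 = 0} \ln \Phi(-X_1\beta_1)
      + \sum_{y_1 = 1,\; y_2 = 1} \ln \Phi_2(X_1\beta_1,\; X_2\beta_2;\; \rho)
      + \sum_{y_1 = 1,\; y_2 = 0} \ln \Phi_2(X_1\beta_1,\; -X_2\beta_2;\; -\rho)
```

Untreated patients contribute only through the first term, which is how the likelihood reflects that adequacy is observed only when $y_1 = 1$. When $\rho = 0$ the expression separates into two independent probit likelihoods.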

Data
Our sample is from the MarketScan® database, which contains standardized medical and prescription claims of enrollees in private employer-provided health plans across the United States, representing about six percent of corporate health care insurance expenditures. Because sample members are in similar health insurance situations overall, insurance generosity and particular mental health coverage differences are implicitly held constant in our empirical results, which also means that we do not examine the issue of how health insurance parameters influence initial provider and subsequent treatment adequacy. We use insurance claims data from 1990-1994 for patients diagnosed with depression, including outpatient procedure codes and prescription claims. To be included in our sample, patients must have claims information for the entire six months before diagnosis and for the entire 12 months after diagnosis. Because depression in children and in geriatric populations may require special consideration, we limit our sample to patients ages 18 to 65. As noted in the summary statistics in Table 1, our study sample size is 5,562 persons; 74 percent are women, and the mean age is 41.
Patients' diagnostic categories reflect the first listed depressive disorder identified on the index claim during the episode. As seen in Table 1, approximately 11 percent of patients were diagnosed with single episode major depressive disorder, 9 percent with recurrent major depressive disorder, 67 percent with dysthymic disorder, and 13 percent with depression not otherwise specified. About 22 percent of sample patients had their initial diagnosis from a general medical provider, 44 percent from a psychiatrist, 6 percent from a psychologist, and 34 percent from a non-physician mental health therapist, which we subsequently refer to as a master's level clinician.
Our data represent episodes of care for persons diagnosed with depression. Each episode is indexed by the initial occurrence of a claim with a diagnosis of a depressive disorder, including single and recurrent episode major depression, dysthymia, and depression not otherwise specified. We call the first six-month period of each episode the pre-diagnosis period.
Although symptoms of depression may exist, there is no record of either diagnosis or treatment for depression during the pre-diagnosis period. The 12 months following the index claim comprise the follow-up period, during which we use the claims data to characterize treatment as falling into one of four categories: (1) medication only, (2) psychotherapy only, (3) combination therapy, or (4) no treatment. As noted in Table 1, about 10 percent of patients had treatment using antidepressant medication only, 35 percent had treatment using psychotherapy only, about 20 percent had treatment with both medication and psychotherapy, and 35 percent were not treated at all within the health plan system after diagnosis.
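The four-way classification of the follow-up period can be sketched as below. This is a minimal illustration, assuming each patient's follow-up claims have already been reduced to two counts; the function and argument names are hypothetical and do not correspond to MarketScan® field names.

```python
def treatment_category(n_antidepressant_rx, n_psychotherapy_claims):
    """Classify a 12-month follow-up period into one of four treatment categories.

    Both arguments are counts of claims in the follow-up period
    (hypothetical pre-aggregated inputs).
    """
    has_medication = n_antidepressant_rx > 0
    has_psychotherapy = n_psychotherapy_claims > 0
    if has_medication and has_psychotherapy:
        return "combination therapy"
    if has_medication:
        return "medication only"
    if has_psychotherapy:
        return "psychotherapy only"
    return "no treatment"


# Illustrative patients (made-up counts).
for rx, visits in [(3, 0), (0, 5), (2, 4), (0, 0)]:
    print(rx, visits, treatment_category(rx, visits))
```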
It is important to mention, and then argue against, a possible important omitted variable in our data and associated omitted variable bias in our empirical results. In a general setting, provider type may be determined by health insurance parameters, and the effects of provider type on depression treatment could greatly reflect insurance differences across patients, rather than any independent effect of provider type. For example, insufficient insurance coverage could encourage people to seek care initially from a psychologist and then not to pursue subsequent treatment on economic grounds. There is some support in the literature for the perspective that care is hard to afford based on studies of persons seeking mental health treatment (Fortney, Rost, and Zhang 1998;Holmes and Deb 1998). Many other studies that include a larger, community perspective find that there are many forces affecting provider choice other than insurance, and that non-insurance factors are also probably more important than insurance (Swindle et al. 2000).
Because most Americans also do not know what their mental health benefit is, insurance seems unlikely to be a crucial factor in care seeking behavior initially (Mickus, Colenda, and Hogan 2000). Insurance generosity, as measured by expected out-of-pocket payments, also does not predict use of certain providers over others, such as psychiatrists versus general medical practitioners (Holmes and Deb 1998). So, the most important point for our research objective is that the marginal influence of provider type should not be confounded with that of insurance generosity in our particular data set because the entire cohort we study has similar employer-provided health insurance coverage.
A limitation of our data is that the provider effects we estimate have been cleansed of any confounding effects of insurance differences across patients; we control for insurance coverage indirectly as our data are for persons with similar coverage. Stratification by similar insurance also means that our results are silent concerning how insurance generosity might influence, or even dominate, the provider effects on adequacy of depression care.
In studying the link between provider type and adequacy of treatment, one might also want to consider more deeply than we can with our data the issue of whether there are important latent differences across people designated as depressed by different providers in the first place.
Does a diagnosis of a depressive disorder on a claim from a counselor mean the same thing as on a claim from a psychiatrist?
We also take note of an unusually high percentage of persons with dysthymic disorder in our data. Prior studies using MarketScan® have noted that about 30 percent of claims list dysthymia as the primary depression diagnosis, which is about half the incidence of a dysthymic disorder in our sample. To investigate the impact of the high incidence of dysthymia, we stratified the sample into persons who received an antidepressant, which is a cohort similar to data in other studies (Hylan et al. 1999), and persons who did not receive an antidepressant. Rates of dysthymia were more than 75 percent in the group of patients who did not receive an antidepressant, a finding that may offer a partial explanation for the apparently high rates of inadequate care among non-physicians. Many persons with long standing depressive symptomatology are appropriately treated by intermittent psychotherapy at a frequency of two or three visits per year. Thus, our finding that many patients with dysthymia who were treated by non-physicians receive inadequate care could be a consequence of how we construct an episode, and future research on optimal care for dysthymia may prove valuable.

Outcome Measures
We first examine, ceteris paribus, whether provider choice is related to the likelihood a depressed person receives no treatment following initial diagnosis, which is a fundamental form of inadequate treatment. We define treatment as the situation where a patient files a claim for any antidepressant prescription or any psychotherapy during the year following diagnosis. Expert guidelines generally recommend specific lengths of treatment, which can be assessed in claims data such as ours. We considered subjects as receiving adequate care if the process of care was consistent with either an adequate course of medication or psychotherapy.
The empirical results we present for stage two of our research are based on a low threshold level of treatment adequacy. Adherence to medication guidelines means there were four or more antidepressant prescriptions filled during the first six months following the index date. Our adequacy threshold for antidepressant treatment has been used in previous research and shown to be a clinically relevant marker (Melfi et al. 1998;Hylan et al. 1999).
A minimum of six sessions is generally considered consistent with the experts' recommendations for adequate psychotherapy. Measuring adequate psychotherapy is complicated by the recognition that many episodes of major depression may spontaneously remit, and a period of so-called watchful waiting may be appropriate. Therefore, as few as two follow-up psychotherapy sessions may be appropriate. Here adequate care allows for clinically appropriate watchful waiting, and we initially use a minimum of two psychotherapy claims as our measure of adequate psychotherapy. We then consider how our conclusions might change if adequate anti-depression treatment requires more than two psychotherapy sessions.
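The low-threshold adequacy rule described above can be sketched as a simple predicate. This is an illustration only, assuming prescription fills are counted over the first six months after the index date and psychotherapy claims over the follow-up period; the function name, argument names, and the idea of pre-aggregated counts are hypothetical.

```python
def is_adequate(rx_fills_first_six_months, psychotherapy_claims,
                min_rx_fills=4, min_psychotherapy_claims=2):
    """Return True if care meets either the medication or the psychotherapy threshold.

    Defaults follow the thresholds stated in the text: four or more antidepressant
    fills in the first six months, or at least two psychotherapy claims.
    """
    adequate_medication = rx_fills_first_six_months >= min_rx_fills
    adequate_psychotherapy = psychotherapy_claims >= min_psychotherapy_claims
    return adequate_medication or adequate_psychotherapy


# Made-up patient: no medication, two psychotherapy claims.
print(is_adequate(0, 2))                              # low threshold
print(is_adequate(0, 2, min_psychotherapy_claims=4))  # stricter psychotherapy threshold
```

Keeping the psychotherapy threshold as a parameter makes it straightforward to repeat the analysis under a more demanding definition of adequate psychotherapy.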

Empirical Results
The numbers in parentheses in the last column of Table 2 highlight the heterogeneity across initial providers in the overall proportion of patients receiving adequate treatment: 88 percent for psychiatrists, 49 percent for general medical providers, 37 percent for psychologists, and 20 percent for master's level therapists. Table 2 also identifies the substantial heterogeneity in treatment initiation across initial providers. The proportion of depressed patients who are untreated ranges from about five percent for psychiatrist initial providers to 74 percent for master's level therapist initial providers. Does the wide difference in treatment incidence across providers remain after statistical adjustment for patient characteristics?

Whether Treatment Occurs
The specific other factors held constant, or so-called control variables, in our regression of whether treatment occurs are measures of the patient's personal characteristics and health status. Personal characteristics in our regressions include age and gender. We capture health status by the number of comorbid physical conditions and specific depression diagnosis, such as dysthymia where one wants to adjust the estimated effect of provider choice for the possibility that low-grade long-term depression often goes untreated. Finally, our model also includes the total costs of medical claims in the three months before the depression diagnosis as an indication of the patient's attachment to the health care system. The ceteris paribus differences in treatment incidence across provider types, as revealed by the estimated marginal effects in Table 3, are similar to the differences in average treatment incidence of Table 2 because in our data there is little difference in the mixes of patients seen across providers. Simple average differences in treatment propensities across provider types do not dramatically misrepresent the intrinsic heterogeneity among providers in whether there is any treatment at all. As measured by absence of any treatment, general medical practitioners are about seven times more likely to offer no treatment than psychiatrists; master's level therapists and psychologists are about 11 to 15 times more likely to offer no treatment than psychiatrists.

The estimated marginal effects in
Finally, we note that our estimated probit model of interpersonal differences in whether treatment is received in Table 3 is informative in an overall sense. A naïve prediction mechanism for which patients would receive treatment is to place all persons into the mode, treated, situation. The baseline naïve prediction mechanism would then correctly categorize 65 percent of patients; our probit model correctly predicts the treatment situation for 80 percent of patients in our sample.
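The naive benchmark in this comparison is the share of observations falling in the modal outcome. A minimal sketch, using a made-up outcome vector whose proportions match the 65/35 split reported in the text (the function name and the data are hypothetical):

```python
from collections import Counter


def modal_baseline_accuracy(outcomes):
    """Share of cases classified correctly when every case is assigned the modal outcome."""
    counts = Counter(outcomes)
    modal_count = counts.most_common(1)[0][1]  # frequency of the most common outcome
    return modal_count / len(outcomes)


# Made-up data: 65 treated (1) and 35 untreated (0) patients.
outcomes = [1] * 65 + [0] * 35
print(modal_baseline_accuracy(outcomes))
```

A fitted model is informative in this sense only to the extent that its share of correct predictions exceeds this baseline.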

Treatment Adequacy Among Treated Patients
The list of variables we hold constant in studying how provider type influences treatment adequacy among the treated also includes the patient's age, sex, and number of comorbidities.
When studying persons treated for depression, we also include more specific information about the depression diagnosis. Finally, control variables include several measures of health and the propensity to use health care services: pre-diagnosis medical costs; the number of pre-diagnosis physician visits, prescriptions, and emergency room visits; as well as an indicator for any prior non-psychiatric hospital admission.
We have noted that for patients overall there is a 30 percentage point difference in the dimension of treatment adequacy that is whether treatment is initiated: 65 percent of all depressed patients received treatment, and 35 percent did not (Table 2). The first stage of our empirical research examined the importance of initial provider and various possible additional contributors to individual patient differences underlying the 30 percentage point difference along the extensive margin of treatment. We now examine the complementary intensive margin of anti-depression treatment. We are interested in the question of whether, given patient characteristics, there are important differences across providers in the likelihood that a patient's treatment adheres to broad national guidelines for minimally acceptable treatment of depression. Among other issues, is there a benefit to the patient in terms of treatment adequacy from so-called one-stop shopping in that psychiatrists are the only providers offering both psychotherapy and pharmacotherapy?
Concerning intensity of treatment, the data in Table 2 indicate that 57 percent of all patients received adequate treatment, while 87 percent of treated patients received adequate treatment. Among patients receiving any treatment, 92 percent who initially saw a psychiatrist were treated adequately, while 50 percent who initially saw a general medical practitioner were treated adequately.
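As an arithmetic check on these figures, the overall adequacy rate decomposes into the treatment rate times the adequacy rate among the treated, on the assumption (consistent with the text) that untreated patients count as inadequately treated. Using the rounded Table 2 values quoted in the text:

```latex
\Pr(\text{adequate}) = \Pr(\text{treated}) \times \Pr(\text{adequate} \mid \text{treated})
                     \approx 0.65 \times 0.87 \approx 0.57,
\qquad
\frac{\Pr(\text{adequate} \mid \text{treated, general medical})}{\Pr(\text{adequate} \mid \text{treated, psychiatrist})}
                     \approx \frac{0.50}{0.92} \approx 0.54
```

The second ratio is the basis for describing treated patients of general medical practitioners as about half as likely to receive adequate treatment as treated patients of psychiatrists.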
Most important for the second stage of our research is that there are much larger differences in treatment inadequacy across providers on the extensive margin than across providers on the intensive margin. That is, persons initially seeing master's level therapists are 15 times more likely to receive no treatment than persons initially seeing psychiatrists; the biggest difference in treatment adequacy among the treated is that persons initially seeing general medical providers are about half as likely to receive adequate treatment as persons initially seeing psychiatrists. We therefore expect to find smaller ceteris paribus differences in treatment inadequacy across the initial providers of treated patients than the difference we found across the initial providers concerning the inadequacy measure of whether any treatment at all occurred. An implication of sample selection bias if one were to study treatment adequacy in isolation is that there are latent factors common to the outcomes of whether any treatment is begun and whether treatment is adequate among patients who are treated. Because the estimated correlation between the error terms of the two probit equations in our bivariate probit model is insignificantly different from zero, the results in Table 4 do not indicate a latent common factor that might underlie a problematic sample, one determined by the outcome to be explained. In our data there seems to be no evidence of a feedback effect from likely treatment adequacy to the decision to begin treatment.

Robustness Check: Adequacy Measure and Depression Type
It is important to examine whether our focal result, that patients who initially see psychiatrists are more likely to be treated and for that treatment to be adequate, still holds if we consider two basic data issues concerning outcome measure and sample composition. To check the robustness of our conclusions thus far, we re-estimated our bivariate probit model first using a more stringent definition of adequacy on the psychotherapy dimension and then using only the sub-sample of persons with major depression.
As our first basic robustness check, we re-ran the bivariate probit model of Table 4 using a stricter definition of adequacy: at least four (rather than two) psychotherapy visits for treatment to be termed adequate. The possible latent common factor influencing both the likelihood of treatment and treatment adequacy among the treated (the selection bias effect) was again statistically insignificant (P = 0.67), and the marginal effect of psychiatrist on the likelihood of treatment adequacy was again positive and statistically significant, although a third smaller when we doubled the number of psychotherapy visits in our treatment adequacy threshold (ME = 0.102, P = 0.004).
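In our second basic robustness check we re-estimated the models in Tables 3 and 4 using only the sub-sample of persons with major depression. The marginal effect of psychiatrist remained positive and statistically significant (ME = 0.104, P = 0.007).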
In summary, when we consider a more restrictive definition of adequacy or a more restrictive estimation sample including only patients with major depression, having a psychiatrist as the initial provider maintains its positive differential over the other types of providers. Taken in conjunction with the results of Tables 3 and 4, a patient initially seeing a psychiatrist is as much as four times more likely to receive treatment and a treated patient 1.25 times more likely to receive adequate treatment. Our core result continues to hold, which is that patients initially seeing a psychiatrist are more likely to receive treatment and for that treatment to be adequate with the differential effect of a psychiatrist initial provider much larger on the extensive than intensive margin of treatment.

Robustness Check: Endogenous Provider Choice
As the final segment of our empirical examination of the link between provider type and course of depression treatment, we consider more fully the issue that the type of initial provider is obviously not randomly assigned in our observational data. We again focus our robustness checking on the coefficient of psychiatrist. We elaborate, conceptually and empirically with additional regression results, on two situations: (1) initial provider type depends on observable patient characteristics, and (2) initial provider type depends on a latent common factor that reflects the propensity of the patient to seek treatment from a psychiatrist and to participate in an adequate treatment regimen. Tables 3 and 4 adjust for one type of non-randomly determined (endogeneity of) provider type whereby the mental health provider is determined by and therefore correlated with observable patient characteristics. To ignore any connection between the characteristics of the patients and the provider type regressor would lead the researcher to attribute to provider the dual effects of the patient's characteristics on whom he or she sees initially and the additional effects of the patient's characteristics on treatment adequacy and intensity. We avoid some obvious omitted variable biases in the estimated effects of provider types on treatment incidence and adequacy by including as control regressors measured aspects of the patients' demographic characteristics, health, and type of depression. The results in Tables 3 and 4 are purged of the most basic type of non-randomization of initial provider, which is that the patient's demographic characteristics, health, and depression type are not unrelated to the provider type.

The regressions in
There is another, more statistically subtle, type of endogeneity that is not necessarily purged in the results of Table 4: there may also be a latent common factor between provider type (specifically, whether one initially sees a psychiatrist) and treatment incidence and intensity that makes a person more likely both to want to see a psychiatrist and to submit to the treatment regime a psychiatrist typically offers. Such a latent common factor, if present but un-modeled, would make the results of Table 4 overstate the positive differential impact of a psychiatrist on treatment outcomes.
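The direction of this bias can be illustrated with a small simulation. The sketch below is not based on our data or our estimated models: the data-generating process, the function name `naive_probit_effect`, the intercept of -0.2, and the true psychiatrist effect of 0.5 are all assumptions chosen for illustration. Because the only regressor is a dummy variable, the single-equation probit coefficient has a closed form, namely the difference in inverse normal CDFs of the two group treatment rates.

```python
import random
from statistics import NormalDist


def naive_probit_effect(rho, delta=0.5, n=100_000, seed=0):
    """Simulate treatment under a latent common factor and return the
    single-equation probit coefficient on the psychiatrist dummy.

    rho   : assumed correlation between the two unobserved error terms
    delta : assumed true effect of a psychiatrist on the latent index
    All parameter values are hypothetical, not estimates from the paper.
    """
    rng = random.Random(seed)
    treated = [0, 0]  # treated counts by psychiatrist status (0 or 1)
    count = [0, 0]    # sample counts by psychiatrist status
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)  # unobservable driving provider choice
        # e has unit variance and correlation rho with u
        e = rho * u + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        psych = 1 if u > 0 else 0
        treat = 1 if -0.2 + delta * psych + e > 0 else 0
        count[psych] += 1
        treated[psych] += treat
    p1 = treated[1] / count[1]
    p0 = treated[0] / count[0]
    inv = NormalDist().inv_cdf
    # With one dummy regressor the probit fit is saturated, so the
    # coefficient equals the difference in inverse normal CDFs.
    return inv(p1) - inv(p0)


for rho in (-0.5, 0.0, 0.5):
    print(f"rho = {rho:+.1f}: naive coefficient = {naive_probit_effect(rho):.3f}")
```

Under these assumptions the naive coefficient is close to the true value of 0.5 only when the correlation is zero; a positive correlation inflates it and a negative correlation deflates it.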
The most obvious way to deal with the issue of a latent common factor here is again through a multivariate probit model (Greene 2003). We were unsuccessful at estimating a trivariate probit model of provider type, treatment, and treatment adequacy among the treated (the likelihood function would not converge). As a second-best solution, we estimated two additional bivariate probit models that focus on the possible differential impact of a psychiatrist: (1) a bivariate probit model of whether a psychiatrist was the initial provider and then whether the patient got any treatment, and (2) a bivariate probit model of whether a psychiatrist was the initial provider and then whether the patient who received any treatment got adequate treatment.
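For reference, a bivariate probit with a latent common factor is commonly written along the following lines. The notation is an illustrative assumption rather than the exact specification estimated: $S_i$ indicates that a psychiatrist was the initial provider, $Y_i$ is the outcome (any treatment in the first model, adequate treatment among the treated in the second), $Z_i$ and $X_i$ are the regressor lists of the two equations, and we assume the psychiatrist indicator enters the outcome equation directly with coefficient $\delta$.

```latex
\begin{aligned}
S_i^{*} &= Z_i\gamma + u_i, &\quad S_i &= \mathbf{1}\left[S_i^{*} > 0\right],\\
Y_i^{*} &= X_i\beta + \delta S_i + \varepsilon_i, &\quad Y_i &= \mathbf{1}\left[Y_i^{*} > 0\right],\\
\begin{pmatrix} u_i \\ \varepsilon_i \end{pmatrix}
&\sim N\!\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}
\right).
\end{aligned}
```

In this formulation the latent common factor is summarized by $\rho$. When $\rho = 0$ the two equations can be estimated separately as single-equation probits without bias in $\delta$.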
Together, the two additional bivariate probit models further purify our conclusions for possible endogeneity of provider type, while continuing to address possible selection bias in that only treated patients are in the sample for the portion of the model examining treatment adequacy. 7 In the bivariate probit model for whether the psychiatrist was the initial provider and whether the patient was in turn treated, the estimated latent common factor is negative and statistically insignificant (ρ̂ = −0.121, P = 0.23). Persons who, for reasons not related to their observed characteristics, are more likely to seek treatment from a psychiatrist are not also more likely to submit to treatment. In the bivariate probit model that includes a latent common factor between having a psychiatrist as an initial provider and treatment adequacy, the estimated latent common factor is also negative and statistically insignificant at conventional levels (ρ̂ = −0.146, P = 0.10). Persons who, for reasons unrelated to the observed characteristics, are more likely to seek treatment from a psychiatrist are not also more likely to submit to more intense (adequate) treatment.
The message from the results of our two additional bivariate probit models is straightforward. There is a positive and notable differential effect of a psychiatrist initial provider that is not altered by the type of possible (latent common factor) endogeneity that is un-modeled in Tables 3 and 4. 8

Discussion
Even after adjusting for patient mix, the disparities among provider types in whether any treatment occurs are much larger than the disparities across providers in whether treatment initiated is adequate. A stark result emerging is that depressed persons initially diagnosed by non-physician mental health specialists are far less likely to be treated at all for depression than persons diagnosed by medical providers, whether psychiatric or general practitioners. Most noteworthy in our data is that 75 percent of untreated depressed patients had dysthymia, and 72 percent of the untreated suffering from dysthymia initially saw a master's level therapist. What are possible reasons for the much higher incidence of non-treatment of depressed patients initially seen by non-medical providers?
Premature termination from the treatment of mental disorders, including the specialized case of immediate dropout after a single visit, is a long-standing problem for the mental health system. In a comprehensive review, Baekeland and Lundwall (1975) identify at least 15 distinct variables that predict dropout, including patient social isolation, denial, passive-aggressiveness, family attitudes and behavior, therapist attitudes, and discrepancies in expectations between the patient and therapist. Indeed, multiple factors often contribute at the level of an individual patient. We contribute to the literature following recent advances in pharmacotherapy and brief psychotherapy techniques by highlighting the magnitude of the emerging distinction between various providers of treatment.
Differences in dropouts in our data complicate our statistical analysis by introducing the possibility of selection bias. For example, one possible scenario underlying our main result is that persons who initially seek treatment from master's level therapists or psychologists have an underlying latent (to the researcher) predisposition not to follow through with treatment subsequently. The predisposed patient we have in mind is drawn to a non-medical provider in order to find out about treatment but then decides against it or seeks follow-up care through self-help groups or pastoral counseling (Swindle et al. 2000). Another possibility for the relatively high incidence of treatment among medical providers is data construction. Physicians may code depression as the visit reason only for patients they believe likely to follow through with treatment, at least in the period immediately following the physician's diagnosis (Rost et al. 1994). A third possibility is that there are relevant unmeasured differences in the severity of depression across patients, which lead patients less in need of continued care to seek out psychologists and master's level therapists. It follows directly that there is a need to use multivariate techniques to mitigate selection effects.
Although the data we use are from the early 1990s, we believe our results could have important implications for how mental health care is delivered currently. Specifically, the associations between provider type and treatment provision are consistent with recent findings that depression frequently remains unrecognized (Hirschfeld et al. 1997) and is poorly treated when recognized (Young et al. 2001). The typical quality of care for depression has changed little even though the number of people who receive mental health benefits through carved-out managed behavioral health care has increased rapidly (Findlay 1999). Provider networks established by managed care are heavily reliant on non-physician mental health specialists (Goldman 2001). Our regression estimates may suggest a reason for continued low-quality treatment, even as access to mental health treatments, both psychosocial and pharmaceutical, has expanded. Much of the new care depends on the initial evaluation by non-physician specialists, but little follow-up care is being provided.
We offer an explanation for the observation that many who receive care from non-physicians have only a single visit with no subsequent care. Most Americans (Swindle et al. 2000) and most depressed patients (Dwight-Johnson et al. 2000) express preferences for talk therapy over medication treatment, but patients may simply not understand the nature of talk therapy or the wide variation in the nature of psychosocial treatments that are offered. Poorly understood wide variation in psychosocial treatments may act as an impediment to seeking further care for the large proportion of patients who apparently prefer psychotherapy.
The stark result presented here is that non-treatment is an extreme form of inadequate treatment that varies systematically with initial provider type, ceteris paribus. Untreated depression is an important issue because there are notable consequences, including unneeded reductions in well-being and lengthening of time depressed (Berndt et al. 1998, 2000; Sacket and Torrence 1978; Anton and Revicki 1995; Murray and Lopez 1996; Fryback et al. 1993). The main policy implication of our research is that clearer understanding of how the health care system might better serve depressed persons may require researchers to focus on the link between initial provider and non-initiation of treatment. A logical first step is to understand why so many apparently depressed persons receive no further treatment following the diagnosis, which would allow more focused programs that improve access and quality of care for a devastating illness.

1. We discuss the role of insurance on provider choice below in the Data section under Empirical Framework.

2. Later we consider the possibility that the type of provider the patient sees initially may be based on the provider's likely amount of treatment.
3. An additional advantage of the bivariate probit is that X1 and X2 need not differ because of the non-linearity of the model. Conceptually, the same things will affect both margins of treatment. Rather than arbitrarily eliminate variables from X1 and X2, we present results for the case where X1 ≡ X2. Robustness checks for versions of the model with X1 ≠ X2 left the coefficients of interest unchanged in terms of general magnitudes and statistical significance.

4. We ignore so-called brief and prolonged depressive reaction disorders because they are a response to an identifiable stressor and typically not treated with either medication or counseling.

5. We did not pursue binary outcome regression models that are not based on the assumption of a particular error distribution, such as the normal that we use, the logistic that is the basis of the logit, or the uniform that is the basis of the linear probability model. The most popular of the so-called semi-parametric binary outcome models is the maximum score, or M-Score, estimator. By construction the M-Score will give the best in-sample fit. However, the M-Score and other semi-parametric estimators do not reveal marginal effects of the independent variables, which are the focus of our research (Greene 2003).

6. For the reasons mentioned above and tractability we also did not pursue estimating a semi-parametric bivariate outcomes model.

7. To allow the additional bivariate probit models to have maximal chance to locate a significant latent common factor, we estimated them without identical regressor lists in the two equations. The regressor lists for each equation were selected by running single-equation probit regressions and using in the bivariate models the variables that had P ≤ 0.10 in the three simple probit models for whether the initial provider is a psychiatrist, whether treatment occurs, and whether treatment is adequate.

8. If we were to ignore the lack of statistical significance of ρ in our ancillary bivariate probits, the calculated differential impact of a psychiatrist is larger because a negative value of ρ means that persons who, for reasons unknown to the researcher, are more likely to see a psychiatrist are also less likely to be treated or to receive adequate treatment. Controlling for psychiatrists' patients' possible latent propensities not to be treated or not to be treated adequately enlarges the estimated differential effect of having a psychiatrist as an initial provider on treatment incidence and adequacy.

Note: Adequacy of treatment equation based on 3,636 treated individuals. The estimated coefficient of correlation between the error terms of the first-stage probit explaining the existence of treatment and the second-step probit above explaining treatment adequacy among the treated (ρ) is 0.205 with a standard error of 0.641 and associated P-value of 0.749. Model estimate of probability of guideline adherence, given treatment, is 86.4 percent. Source: Authors' calculations.
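The P-value reported in the note for ρ can be reproduced from the estimate and its standard error. The sketch below assumes a two-sided Wald test of ρ = 0 under a standard normal approximation; the helper name `wald_p_value` is hypothetical and is used here only for illustration.

```python
from math import erfc, sqrt


def wald_p_value(estimate, std_error):
    """Return the z statistic and the two-sided p-value for H0: parameter = 0,
    using a standard normal approximation (an assumption of this sketch)."""
    z = estimate / std_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = erfc(abs(z) / sqrt(2))
    return z, p


# Values taken from the table note: rho = 0.205, standard error = 0.641.
z, p = wald_p_value(0.205, 0.641)
print(f"z = {z:.3f}, p = {p:.3f}")  # z = 0.320, p = 0.749
```

The result matches the reported P-value of 0.749, which is consistent with the significance of ρ having been assessed with a test of this form.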