Some Heuristics about Elliptic Curves

We give some heuristics for counting elliptic curves with certain properties. In particular, we rederive the Brumer–McGuinness heuristic for the number of curves with positive/negative discriminant up to $X$, which is an application of lattice-point counting. We then introduce heuristics that allow us to predict how often we expect an elliptic curve $E$ with even parity to have $L(E, 1) = 0$. We find that we expect there to be about $c_1 X^{19/24} (\log X)^{3/8}$ curves with $|\Delta| < X$ with even parity and positive (analytic) rank; since Brumer and McGuinness predict $cX^{5/6}$ total curves, this implies that, asymptotically, almost all even-parity curves have rank 0. We then derive similar estimates for ordering by conductor, and conclude by giving various data regarding our heuristics and related questions.


INTRODUCTION
We give some heuristics for counting elliptic curves with certain properties. In particular, we rederive the Brumer-McGuinness heuristic for the number of curves with positive/negative discriminant up to X, which is an application of lattice-point counting. We then introduce heuristics (with refinements from random matrix theory) that allow us to predict how often we expect an elliptic curve E with even parity to have L(E, 1) = 0.
It turns out that we roughly expect a curve with even parity to have $L(E, 1) = 0$ with probability proportional to the square root of its real period, and since we have an upper bound of size $1/\Delta^{1/12}$ on the real period, this leads us to the prediction that almost all curves with even parity should have $L(E, 1) \neq 0$. By the conjecture of Birch and Swinnerton-Dyer, this says that almost all such curves have rank 0.
We then make similar heuristics for enumeration by conductor. The first task here is simply to count curves with conductor up to X, and for this we use heuristics involving how often large powers of primes divide the discriminant. On making this estimate, we are then able to imitate the argument we made previously, and thus derive an asymptotic for the number of curves with even parity and L(E, 1) = 0 under the ordering by conductor.
We again get the heuristic that almost all curves with even parity should have $L(E, 1) \neq 0$.
We then make a few remarks regarding how often curves should have nontrivial isogenies and/or torsion under different orderings, and then present some data regarding average ranks and the proportion of rank-2 curves. In particular, we give new evidence that the proportion of rank-2 curves goes to zero; this involves a careful "random" sampling of curves whose conductor is larger than previously considered, and we require an analysis of the variation of the real period to ensure that our sample is not overly biased.
We conclude by giving data for the Mordell-Weil lattice distribution of rank-2 curves, and speculating about symmetric power L-functions.

THE BRUMER-MCGUINNESS HEURISTIC
First we rederive the Brumer-McGuinness heuristic [Brumer and McGuinness 90] for the number of elliptic curves whose absolute discriminant is less than a given bound X; the technique here is essentially lattice-point counting, and we derive our estimates via the assumption that these counts are well-approximated by the areas of the given regions.

Conjecture 2.1. (Brumer-McGuinness.) The number $A_{\pm}(X)$ of elliptic curves over $\mathbb{Q}$ whose minimal (integral) discriminant has absolute value less than $X$ is asymptotically given, separately for positive and negative discriminant, by an explicit constant multiple of $X^{5/6}$ involving constants $\alpha_+$ and $\alpha_-$ respectively.
As indicated by Brumer and McGuinness, the identity $\alpha_- = \sqrt{3}\,\alpha_+$ was already known to Legendre and is related to complex multiplication (CM). These constants can be expressed in terms of beta integrals since $\alpha_+ = \frac{1}{3} B\left(\frac{1}{2}, \frac{1}{6}\right)$ and $\alpha_- = B\left(\frac{1}{2}, \frac{1}{3}\right)$. Recall that every elliptic curve over $\mathbb{Q}$ has a unique integral minimal model $y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6$ with $a_1, a_3 \in \{0, 1\}$ and $|a_2| \le 1$.
Fix one of the 12 choices of $(a_1, a_2, a_3)$. Since these are all bounded by 1, the discriminant is approximately $-64a_4^3 - 432a_6^2$. So we essentially wish to count the number of $(a_4, a_6)$ lattice points with $|64a_4^3 + 432a_6^2| \le X$, where we note that Brumer and McGuinness divide the curves according to the sign of the discriminant. The lattice-point count for $a_1 = a_2 = a_3 = 0$ is given by the area integrals over the two corresponding regions, one for each sign of the discriminant.
These integrals are probably known, but I am unable to find a reference. The integrals respectively simplify¹ to constant multiples of $X^{5/6}$. This counts all models of curves; if we eliminate nonminimal models, for which we have $p^4 \mid c_4$ and $p^{12} \mid \Delta$ for some prime $p$, we expect to accrue an extra factor² of $1/\zeta(10)$. From this we get the conjecture of Brumer and McGuinness presented above.
1 As N. D. Elkies indicated to us, we can introduce a parameter $a$ into the integral, differentiate under the integral sign, then substitute $t^2 + a = ax^3$, and finally integrate again to obtain $I(1)$.
2 Note that some choices of $(a_1, a_2, a_3)$ necessarily have odd discriminant, but the other choices compensate to give the proper Euler factors at 2 (and 3). A more direct way of getting the $1/\zeta(10)$ factor is to note that nonminimality at $p$ occurs when $c_4/p^4$ and $c_6/p^6$ satisfy the Connell congruences [Connell 91] we mention below. A referee points out that Brumer [Brumer 92, Lemma 4.3] obtains this same $1/\zeta(10)$ factor via sieving.
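As a rough illustration of the lattice-point counting in this section, the following sketch counts pairs $(a_4, a_6)$ for the choice $a_1 = a_2 = a_3 = 0$, split by the sign of $-64a_4^3 - 432a_6^2$. The function name `count_by_sign` and the box bounds `a4_max`, `a6_max` are illustrative assumptions and do not come from the text; truncating to a finite box ignores the thin regions near the cusp $64a_4^3 + 432a_6^2 = 0$, so the output only approximates the areas discussed here.

```python
def count_by_sign(X, a4_max, a6_max):
    """Count (a4, a6) in a box with 0 < |disc| <= X, split by sign of disc.

    disc = -64*a4**3 - 432*a6**2 is the approximate discriminant for
    a1 = a2 = a3 = 0.  The box bounds are a hypothetical truncation.
    Returns (number with disc > 0, number with disc < 0).
    """
    positive = negative = 0
    for a4 in range(-a4_max, a4_max + 1):
        for a6 in range(-a6_max, a6_max + 1):
            disc = -64 * a4**3 - 432 * a6**2
            if 0 < disc <= X:
                positive += 1
            elif -X <= disc < 0:
                negative += 1
    return positive, negative


if __name__ == "__main__":
    for X in (10**3, 10**4):
        # Box a few times larger than the natural scales X^(1/3), X^(1/2).
        a4_max = 4 * round(X ** (1 / 3)) + 1
        a6_max = 4 * round(X ** 0.5) + 1
        pos, neg = count_by_sign(X, a4_max, a6_max)
        # Last column: total count normalized by the predicted growth X^(5/6).
        print(X, pos, neg, (pos + neg) / X ** (5 / 6))
```

At such small $X$ the counts are dominated by boundary effects, so this is only a way to see the shape of the computation, not a test of the constants.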

COUNTING CURVES OF EVEN PARITY WHOSE CENTRAL L-VALUE VANISHES
Due to work of Wiles [Wiles 95, Taylor and Wiles 95] and others [Diamond 96, Conrad et al. 99, Breuil et al. 01], we know that elliptic curves over $\mathbb{Q}$ are modular, and this implies that the completed L-function $\Lambda(E, s) = \Gamma(s)\,(\sqrt{N}/2\pi)^s L(E, s)$ extends to an entire function and satisfies the functional equation $\Lambda(E, s) = \pm\Lambda(E, 2 - s)$. When the plus sign occurs, we say that $E$ has even parity. (See [Silverman 92] for the definitions of the conductor $N$ and L-function $L(E, s)$ of an elliptic curve $E$.) We now try to count elliptic curves $E$ with even parity for which $L(E, 1) = 0$. Throughout this section, $E$ will denote a curve with even parity, and we shall order curves by discriminant. Via the conjectural 50-50 principle, we expect that under any reasonable ordering, half of the elliptic curves should have even parity.³ In particular, we predict that there are asymptotically $A_{\pm}(X)/2$ curves with even parity and positive/negative discriminant up to $X$.
Our main tool will be random matrix theory, which gives a heuristic for predicting how often L(E, 1) is small. We could alternatively derive a cruder heuristic by assuming that the order of the Shafarevich-Tate group is a random square integer in a given interval, but random matrix theory has the advantage of being able to predict a more explicit asymptotic. Our principal heuristic is the following.
Heuristic 3.1. The number $R(X)$ of elliptic curves $E/\mathbb{Q}$ with even parity and $L(E, 1) = 0$ and minimal absolute discriminant less than $X$ is given asymptotically by $R(X) \sim cX^{19/24} (\log X)^{3/8}$ for some constant $c > 0$.
In particular, note that we get the prediction that almost all curves with even parity have $L(E, 1) \neq 0$ under this ordering.

Random Matrix Theory
Originally developed in mathematical statistics by Wishart [Wishart 28] in the 1920s and then in mathematical physics (especially the spectra of highly excited nuclei) by Wigner [Wigner 55], Dyson, Mehta, and others (particularly [Marčenko and Pastur 67]), random matrix theory [Mehta 04] has now found some applications in number theory, the earliest being the oft-told story of Dyson's remark to Montgomery regarding the pair correlation of zeros of the Riemann ζ-function.
Based on substantial numerical evidence, random matrix theory appears to give reasonable models for the distribution of L-values in families, though the issue of what constitutes a proper family is a delicate one (see particularly [Conrey et al. 05,Section 3], where the notion of family comes from the ability to calculate moments of L-functions rather than from algebraic geometry).
The family of quadratic twists of a given elliptic curve $E : y^2 = x^3 + Ax + B$ is given by $E_d : dy^2 = x^3 + Ax + B$ for square-free $d$. The work (most significantly a monodromy computation) of Katz and Sarnak [Katz and Sarnak 99] regarding families of curves over function fields implies that when we restrict to quadratic twists with even parity, we should expect that the L-functions are modeled by random matrices with even orthogonal symmetry. This means that local statistics involving the distribution of spacings of zeros of the L-functions should be the same as the statistics concerning the distribution of the spacings of the eigenangles of matrices taken randomly from $SO(2M)$ with respect to Haar measure. Furthermore, the distribution of the special values of such L-functions should be related to the distribution of the evaluations at 1 of the characteristic polynomials of such matrices.
An argument based on frequency of small L-values and discretization (similar to the below) then gives that the number of $d$ with $E_d$ of even parity, $|d| < D$, and $L(E_d, 1) = 0$ is given by $c_E D^{3/4} (\log D)^{b_E}$, where $b_E$ takes on four possible values (see [Delaunay and Watkins 07]) depending on the splitting behavior of the cubic polynomial $x^3 + Ax + B$, while $c_E$ has yet to be determined explicitly. Various data have been given by Rubinstein [Conrey et al. 06] to lend credence to this guess. We note that the exponent $3/4$ can already be suspected from work of Waldspurger [Waldspurger 81], which relates⁴ to $E$ a modular form of weight $3/2$ whose $d$th coefficient $c(d)$ is such that $c(d)^2$ is proportional to $L(E_d, 1)$.
In particular, the Ramanujan conjecture predicts that the $d$th coefficient is bounded by $d^{1/4+\epsilon}$, and so, assuming a reasonable distribution, the probability that it is zero is about one in $d^{1/4}$. Summing over $d$ up to $D$ then gives the crude heuristic of about $D^{3/4}$ vanishings (possibly due to Sarnak).
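The summation step in this crude heuristic can be checked numerically. The sketch below sums the assumed vanishing probability $d^{-1/4}$ over all $1 \le d \le D$ and compares it with $\frac{4}{3} D^{3/4}$; the function names are illustrative, and the restriction to square-free $d$ of even parity is deliberately ignored, since it should only change the constant and not the exponent.

```python
def crude_vanishing_count(D):
    """Sum the heuristic vanishing probability d**(-1/4) over 1 <= d <= D."""
    return sum(d ** -0.25 for d in range(1, D + 1))


def integral_approximation(D):
    """Integral of t**(-1/4) from 0 to D, namely (4/3) * D**(3/4)."""
    return 4.0 / 3.0 * D ** 0.75


if __name__ == "__main__":
    for D in (10**3, 10**4, 10**5):
        s = crude_vanishing_count(D)
        # The ratio tends to 1, which is the D^(3/4) growth of the heuristic.
        print(D, s, s / integral_approximation(D))
```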
Though we have no exact function-field analogue for considering the set of all elliptic curves of even parity, we brazenly assume (largely from looking at the sign in the functional equation) that the symmetry type is again orthogonal with even parity.⁵ What this means is that we want to model properties of the L-function via random matrices taken from $SO(2M)$ with respect to Haar measure. Here we wish the mean density of zeros of the L-functions to match the mean density of eigenvalues of our matrices, and so, as in [Keating and Snaith 00], we should take $2M \approx 2\log N$.
We suspect that the L-value distribution is approximately given by the distribution of the evaluations at 1 of the characteristic polynomials of our random matrices. At the crude level, this distribution is determined entirely by the symmetry type, while finer considerations are distinguished via arithmetic considerations.
With this assumption, via the moment conjectures of [Keating and Snaith 00] and then using Mellin inversion, as $t \to 0$ we have (see [Conrey et al. 02, (21)]) the estimate (3-1): $\mathrm{Prob}[L(E, 1) < t] \sim \alpha(E)\, t^{1/2} M^{3/8}$. This heuristic is stated for fixed $M \approx \log N$, but we shall also allow $M \to \infty$. It is not easy to understand this probability, since both the constant $\alpha(E)$ and the matrix size $M$ depend on $E$. We can take curves with $e^M \le N \le e^{M+1}$ to mollify the impact of the conductor, but in order to average over a set of curves, we need to understand how $\alpha(E)$ varies. One idea is that $\alpha(E)$ separates into two parts, one of which depends on local structure (Frobenius traces) of the curve, and the other of which depends only on the size of the conductor $N$. Letting $G$ be the Barnes G-function (such that $G(z+1) = \Gamma(z)G(z)$ with $G(1) = 1$) and $M = \log N$, we have in (3-2) a factorization $\alpha(E) = \alpha_R(M) \cdot \alpha_A(E)$, in which $\alpha_R(M)$ is expressed via $G$ and the arithmetic factor $\alpha_A(E) = \prod_p F(p)$ is an Euler product whose local factors involve $L_p$, where $L_p(X) = (1 - a_pX + pX^2)^{-1}$ when $p \nmid \Delta$ and $L_p(X) = (1 - a_pX)^{-1}$ otherwise; see [Conrey et al. 02, (10)] evaluated at $k = -\frac{1}{2}$, though that equation is wrong at primes that divide the discriminant; see [Conrey et al. 07, (20)], where $Q$ should be taken to be 1. Note that the Sato-Tate conjecture [Tate 65] implies that $a_p^2$ is $p$ on average, and this implies that the above Euler product converges.⁶

Discretization of the L-Value Distribution
We let $\tau_p(E)$ be the Tamagawa number of $E$ at the (possibly infinite) prime $p$, and write $\tau(E) = \prod_p \tau_p(E)$ for the Tamagawa product and $T(E)$ for the size of the torsion group. We also write $\Omega_{re}(E)$ for the real period, and $S(E)$ for the size of the Shafarevich-Tate group when $L(E, 1) \neq 0$, with $S(E) = 0$ when $L(E, 1) = 0$. (For precise definitions of the Tamagawa numbers, torsion group, periods, and Shafarevich-Tate group, see [Silverman 92], though below we give a brief description of some of these.) We wish to assert that sufficiently small values of $L(E, 1)$ actually correspond to $L(E, 1) = 0$. We do this via the conjectural formula of Birch and Swinnerton-Dyer, which asserts that $L(E, 1) = \Omega_{re}(E)\,\tau(E)\,S(E)/T(E)^2$. Our discretization⁷ will be that $L(E, 1) < \Omega_{re}(E)\,\tau(E)/T(E)^2$ implies $L(E, 1) = 0$. Note that we are using only that $S(E)$ takes on integral values, and do not use the (conjectural) fact that it is square. Using (3-1), we estimate the number of curves with positive (for simplicity) discriminant less than $X$ and even parity and $L(E, 1) = 0$ via a lattice-point sum $W(X)$ over pairs $(c_4, c_6)$, in which each curve is weighted by the probability from (3-1) at $t = \Omega_{re}(E)\,\tau(E)/T(E)^2$. We need to introduce congruence conditions on $c_4$ and $c_6$ to make sure that they correspond to a minimal model of an elliptic curve. The paper [Stein and Watkins 02] uses the work of Connell [Connell 91] in a different context to get that there are 288 classes of $(c_4 \bmod 576,\ c_6 \bmod 1728)$ that can give minimal models, and so we get a factor of $288/(576 \cdot 1728) = 1/3456$, assuming that each congruence class has the same impact on all the entities in the sum. Indeed, this independence (on average) of various quantities with respect to $c_4$ and $c_6$ is critical in our estimation of $W(X)$.
There is also the question of nonminimal models, from which (as in the Brumer-McGuinness heuristic) we get a factor of 1/ζ(10).
Here $\bar{\alpha}_R$ is the limit $2^{1/8} G(1/2)\,\pi^{-1/4}$ of $\alpha_R(M)$ as $M \to \infty$, while $\bar{\alpha}_A$ is a suitable average of the arithmetic factors $\alpha_A(E)$, and $\beta(\sqrt{\tau})$ is the average of the square root of the Tamagawa product. We have also approximated $\log N \approx \log \Delta$ and assumed that the torsion is trivial; below we will give these heuristics justification (on average). Note that everything left in the integral is a smooth function of $u_4$ and $u_6$.
We shall first evaluate the integral in $\hat{W}(X)$ given these suppositions, and then try to justify the various assumptions that are inherent in this guess.⁸ For convenience, we try to list all the heuristic assumptions we have made.
• Lattice-point sums are well approximated by areal integrals.
• The small-value probability estimate (3-1) holds, via random matrix theory.
• There is independence among the arithmetic factor $\alpha_A(E)$, the Tamagawa products, and the real period.
• We can replace $\log N$ by $\log \Delta$, and torsion can be ignored.
8 Note that our methods do not readily generalize to positive rank, since there is no apparent way to model the heights of points (and thus the regulator). A referee points out that Lang [Lang 83] gives some bounds, and perhaps suggests a distribution, but this seems insufficient for our purposes.
Changing variables in the $\hat{W}$-integral gives a Jacobian of the form $432/\Delta^{1/6}$ times a function of $\mu$ alone. Thus the variables are nicely separated, and since the $\mu$-integral converges, we do indeed get the asymptotic $\hat{W}(X) \sim cX^{19/24} (\log X)^{3/8}$. A similar argument can be given for curves with negative discriminant. This concludes our derivation of Heuristic 3.1, and now we turn to giving some reasons for our expectation that the arithmetic factors can be mollified by taking their averages.

Expectations for Arithmetic Factors on Average
In the next section we shall explain (among other things) why we expect that $\log N \approx \log \Delta$ for almost all curves, and in Section 5, we shall recall the classical parameterizations of $X_1(N)$ due to Fricke to indicate why we expect that the torsion size is trivial outside a sparse set of curves. Here we show how to compute the various averages (with respect to ordering by discriminant) of the square root of the Tamagawa product and the arithmetic factors $\alpha_A(E)$. For both heuristics, we make the assumption that curves satisfying the discriminant bound $|\Delta| \le X$ behave essentially the same as those that satisfy $|c_4| \le X^{1/3}$ and $|c_6| \le X^{1/2}$. That is, we approximate our region by a big box. We write $D$ for the absolute value of $\Delta$, and consider how often high powers of primes divide $D$.
3.4.1 Primes Dividing the Discriminant. We wish to know how often a prime divides the discriminant to a high power. Fix a prime $p \ge 5$ with $p$ much smaller than $X^{1/3}$. We estimate the probability that $p^k \mid \Delta$ by considering all $p^{2k}$ choices of $c_4$ and $c_6$ modulo $p^k$; that is, we count the number of solutions $C(p^k)$ to the congruence $c_4^3 - c_6^2 = 1728\Delta \equiv 0 \pmod{p^k}$. This auxiliary curve $c_4^3 = c_6^2$ is singular at $(0, 0)$ over $\mathbb{F}_p$, and has $(p - 1)$ nonsingular $\mathbb{F}_p$-solutions that lift to $p^{k-1}(p - 1)$ points modulo $p^k$.
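The point counts quoted above can be confirmed by brute force for small prime powers. This is a minimal sketch with illustrative function names: `count_solutions` computes $C(p^k)$ directly, and `count_nonsingular` counts only the solutions with $p \nmid c_4$, which are the lifts of the nonsingular points.

```python
def count_solutions(p, k):
    """Brute-force C(p^k): pairs (c4, c6) mod p^k with c4^3 = c6^2 mod p^k."""
    m = p ** k
    # Tabulate how many residues c6 give each value of c6^2 mod p^k.
    squares = {}
    for c6 in range(m):
        r = c6 * c6 % m
        squares[r] = squares.get(r, 0) + 1
    return sum(squares.get(pow(c4, 3, m), 0) for c4 in range(m))


def count_nonsingular(p, k):
    """Solutions mod p^k with p not dividing c4 (lifts of nonsingular points)."""
    m = p ** k
    return sum(
        1
        for c4 in range(m)
        if c4 % p != 0
        for c6 in range(m)
        if (c4**3 - c6**2) % m == 0
    )


if __name__ == "__main__":
    for p in (5, 7, 11):
        # C(p) = p, so the chance that p divides the discriminant is 1/p.
        print(p, count_solutions(p, 1), count_nonsingular(p, 2))
```

For $k = 1$ the count is $p$ (the $p - 1$ nonsingular points together with the singular point), matching the probability $(p^2 - p)/p^2$ that $p$ does not divide the discriminant.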
For $p^k$ sufficiently small, our $(c_4, c_6)$-region is so large that we can show that the probability that $p^k \mid \Delta$ is $C(p^k)/p^{2k}$. We assume that big primes act (on average) in the same manner, while a similar heuristic can be given for $p = 2, 3$. Curves with $p^4 \mid c_4$ and $p^6 \mid c_6$ will not be given by their minimal model; indeed, we want to exclude these curves, and so we will multiply our probabilities by $\kappa_p = (1 - 1/p^{10})^{-1}$ to make them conditional on this criterion. For instance, the above counting of points says that there is a probability of $(p^2 - p)/p^2$ that $p \nmid D$, and so on conditioning on minimal models, we get $\kappa_p(1 - 1/p)$ for this probability.
What is the probability $P_m(p, k)$ that a curve given by a minimal model has multiplicative reduction at $p \ge 5$ and $p^k \parallel D$ for some $k > 0$? In terms of Kodaira symbols, this is the case of $I_k$. For multiplicative reduction we need that $p \nmid c_4$ and $p \nmid c_6$. These events are assumed independent, and each has a probability $(1 - 1/p)$ of occurring. If we assume these conditions and work modulo $p^k$, there are $(p^k - p^{k-1})$ such choices for both $c_4$ and $c_6$, and of the resulting $(c_4, c_6)$ pairs we noted above that $p^{k-1}(p - 1)$ of them have $p^k \mid D$. So, given a curve with $p \nmid c_4$ and $p \nmid c_6$, we have a probability of $1/(p^{k-1}(p - 1))$ that $p^k \mid D$, which gives $1/p^k$ for the probability that $p^k \parallel D$. In symbols, we have that (for $p \ge 5$ and $k \ge 1$) the probability before conditioning is $(1 - 1/p)^2/p^k$. Including the conditional probability for minimal models, we get $P_m(p, k) = \kappa_p (1 - 1/p)^2/p^k$ for $p \ge 5$ and $k \ge 1$. Note that summing this over $k \ge 1$ gives $\kappa_p(1 - 1/p)/p$ for the probability for an elliptic curve to have multiplicative reduction at $p$.

What is the probability $P_a(p, k)$ that a curve given by a minimal model has additive reduction at $p \ge 5$ and $p^k \parallel D$ for some $k > 0$? We shall temporarily ignore the factor $\kappa_p = (1 - 1/p^{10})^{-1}$ from nonminimal models and include it at the end. We must have that $p \mid c_4$ and $p \mid c_6$, and thus get that $k \ge 2$. For $k = 2, 3, 4$, which correspond to Kodaira symbols II, III, and IV respectively, the computation is not too bad: we get that $p^2 \parallel D$ exactly when $p \mid c_4$ and $p \parallel c_6$, so that the probability is $(1 - 1/p)/p^2$; for $p^3 \parallel D$ we need $p \parallel c_4$ and $p^2 \mid c_6$ and thus get $(1 - 1/p)/p^3$ for the probability; and for $p^4 \parallel D$ we need $p^2 \mid c_4$ and $p^2 \parallel c_6$, and so get $(1 - 1/p)/p^4$ for the probability. Note that the case $k = 5$ cannot occur. Thus we have (for $p \ge 5$) the formula $P_a(p, k) = (1 - 1/p)/p^k$ for $k = 2, 3, 4$.

More complications occur for $k \ge 6$, where now we split into two cases depending on whether additive reduction persists on taking the quadratic twist by $p$. This occurs when $p^3 \mid c_4$ and $p^4 \mid c_6$, and we denote by $P^n_a(p, k)$ the probability that $p^k \parallel D$ in this subcase. Just as above, we get that $P^n_a(p, k) = (1 - 1/p)/p^{k-1}$ for $k = 8, 9, 10$. These are respectively the cases of Kodaira symbols IV*, III*, and II*. For $k = 11$ we have $P^n_a(p, k) = 0$, while for $k \ge 12$ our condition of minimality implies that we should take $P^n_a(p, k) = 0$.

We denote by $P^t_a(p, k)$ the probability that $p^6 \mid D$ with either $p^2 \parallel c_4$ or $p^3 \parallel c_6$. First we consider curves for which $p^7 \mid D$, and these have multiplicative reduction at $p$ upon twisting. In particular, these curves have $p^2 \parallel c_4$ and $p^3 \parallel c_6$, and the probability of this is $(1 - 1/p)^2/p^5$. Consider $k \ge 7$. We then take $c_4/p^2$ and $c_6/p^3$ both modulo $p^{k-6}$, and get that $p^{k-6} \parallel (D/p^6)$ with probability $1/p^{k-6}$ in analogy with the above. So we get that $P^t_a(p, k) = (1 - 1/p)^2/p^{k-1}$ for $k \ge 7$. This corresponds to the case of $I^*_{k-6}$. Finally, for $p^6 \parallel D$ (which is the case $I^*_0$) we get a probability of $\frac{1}{p^2} \cdot \frac{1}{p^3}$ that $p^2 \mid c_4$ and $p^3 \mid c_6$, and since there are $p$ points mod $p$ on the auxiliary curve we get a conditional probability of $(p^2 - p)/p^2$ that $p^6 \parallel D$. So we get that $P^t_a(p, 6) = (1 - 1/p)/p^5$.

We now impose our current notation on the previous paragraphs, and naturally let $P^t_a(p, k) = 0$ and $P^n_a(p, k) = P_a(p, k)$ for $k \le 5$. Our final result, now including the factor $\kappa_p$, is that $P^n_a(p, k) = \kappa_p(1 - 1/p)/p^k$ for $k = 2, 3, 4$ and $P^n_a(p, k) = \kappa_p(1 - 1/p)/p^{k-1}$ for $k = 8, 9, 10$, while $P^t_a(p, 6) = \kappa_p(1 - 1/p)/p^5$ and $P^t_a(p, k) = \kappa_p(1 - 1/p)^2/p^{k-1}$ for $k \ge 7$, with $P^n_a(p, k)$ and $P^t_a(p, k)$ equal to zero for other $k$. We conclude by defining $P_0(p, k)$ to be zero for $k > 0$ and to be the probability $(1 - 1/p^{10})^{-1}(1 - 1/p)$ that $p \nmid D$ for $k = 0$. We can easily check that we really do have the required probability relation $\sum_{k=0}^{\infty} \left[ P_m(p, k) + P^n_a(p, k) + P^t_a(p, k) + P_0(p, k) \right] = 1$, since the cases of multiplicative reduction give $\kappa_p(1 - 1/p)/p$; the cases of Kodaira symbols II, III, and IV give $\kappa_p(1/p^2 - 1/p^5)$; the cases of Kodaira symbols IV*, III*, and II* give $\kappa_p(1/p^7 - 1/p^{10})$; the cases of $I^*_k$ summed for $k \ge 1$ give $\kappa_p(1 - 1/p)/p^6$; the case of $I^*_0$ gives $\kappa_p(1 - 1/p)/p^5$; and the sum of these with $P_0(p, 0) = \kappa_p(1 - 1/p)$ does indeed give us 1. We could do a similar (more tedious) analysis for $p = 2, 3$, but this would obscure our argument.
The heuristics we used in deriving these probabilities were that the curves with $|\Delta| \le X$ act like those in a big box with $|c_4| \le X^{1/3}$ and $|c_6| \le X^{1/2}$, and that the effect of large primes dividing the discriminant can be estimated in a similar manner as with the small primes.
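The probability relation of this subsection can be verified in exact arithmetic. The sketch below encodes the six class totals quoted in the text for a prime $p \ge 5$, with $\kappa_p = (1 - 1/p^{10})^{-1}$, and checks that they sum to 1; the function name and dictionary labels are illustrative only.

```python
from fractions import Fraction


def reduction_type_totals(p):
    """Total probability of each reduction class at a prime p >= 5.

    Uses the closed forms quoted in the text, conditioned on minimality
    through kappa_p = 1 / (1 - p**-10).
    """
    p = Fraction(p)
    kappa = 1 / (1 - p ** -10)
    return {
        "p does not divide D": kappa * (1 - 1 / p),
        "multiplicative (I_k, k >= 1)": kappa * (1 - 1 / p) / p,
        "II, III, IV": kappa * (1 / p**2 - 1 / p**5),
        "I*_0": kappa * (1 - 1 / p) / p**5,
        "I*_k, k >= 1": kappa * (1 - 1 / p) / p**6,
        "IV*, III*, II*": kappa * (1 / p**7 - 1 / p**10),
    }


if __name__ == "__main__":
    for p in (5, 7, 11):
        totals = reduction_type_totals(p)
        # Exact rational arithmetic, so the sum should print as 1.
        print(p, sum(totals.values()))
```

Using `Fraction` avoids any rounding, so the identity is checked exactly rather than to floating-point tolerance.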
3.4.2 Tamagawa Averages. Given a curve of absolute discriminant D, we can now compute the expectation for its Tamagawa number. We consider primes p | D with p ≥ 5, and compute the local Tamagawa number t(p); this can be done as in [Cohen 93, Algorithm 7.5.1] (with a corrected line 2 of step 3 in early printings).
When $E$ has multiplicative reduction at $p$ and $p^k \parallel D$, then $t(p) = k$ if $-c_6$ is square mod $p$, and otherwise, $t(p) = 1, 2$, depending on whether $k$ is odd or even. So the average of $\sqrt{t(p)}$ for this case is $\frac{1}{2}(\sqrt{k} + 1)$ or $\frac{1}{2}(\sqrt{k} + \sqrt{2})$ for $k$ odd or even respectively. When $E$ has potentially multiplicative reduction at $p$ with $p^k \parallel D$, for $k$ odd we have $t(p) = 4, 2$, depending on whether $(c_6/p^3) \cdot (\Delta/p^k)$ is square mod $p$, and for $k$ even we have $t(p) = 4, 2$, depending on whether $\Delta/p^k$ is square mod $p$. In both cases the average of $\sqrt{t(p)}$ is $\frac{1}{2}(2 + \sqrt{2})$. In the case of $I^*_0$ reduction where we have $p^6 \parallel D$, we have that $t(p) = 1, 2, 4$, corresponding to the number of roots modulo $p$ of the associated cubic. For the remaining cases, when $p^2 \parallel D$ or $p^{10} \parallel D$ we have $t(p) = 1$, while when $p^3 \parallel D$ or $p^9 \parallel D$ we have $t(p) = 2$. Finally, when $p^4 \parallel D$ we have $t(p) = 3, 1$, depending on whether $-6c_6/p^2$ is square mod $p$, and similarly when $p^8 \parallel D$ we have $t(p) = 3, 1$, depending on whether $-6c_6/p^4$ is square mod $p$, so that the average of $\sqrt{t(p)}$ in both cases is $\frac{1}{2}(1 + \sqrt{3})$. We record these local averages as functions of $k$, one for the cases counted by $P^n_a$ (where $k = 2, 3, 4, 8, 9, 10$) and one for those counted by $P^t_a$, each taken to be zero for other $k$. We define the expected square root of the Tamagawa number $K(p)$ at $p$ by the sum (3-5) over $k$ of these local averages weighted by the corresponding probabilities, and assume that all the primes act independently to get that the expected global⁹ Tamagawa number is given by the product of the $K(p)$ over all primes $p$. The convergence of this product follows from an analysis of the dominant $k = 0, 1, 2$ terms of (3-5), which gives a behavior of $1 + O(1/p^2)$. So we get that the Tamagawa product is a constant on average, which we do not bother to compute explicitly (we would need to consider $p = 2, 3$ more carefully to get a precise value).

Arithmetic Averages.
To compute the average value of $\alpha_A(E) = \prod_p F(p)$ in (3-2), we similarly assume that each prime acts independently.¹⁰ We then compute the average value for each prime by calculating the distribution of $F(p)$ for all the curves modulo $p$ (including those with singular reduction, and again making the slight adjustment for nonminimal models). This gives some constant for the average $\bar{\alpha}_A$ of $\alpha_A(E)$, which we again do not compute explicitly. Note that $\prod_p F(p)$ converges if we assume the Sato-Tate conjecture [Tate 65], since then we have that $a_p^2$ is $p$ on average.

RELATION BETWEEN CONDUCTOR AND DISCRIMINANT
We now give heuristics for how often we expect the ratio between the absolute discriminant and the conductor to be large. The main heuristic we derive is the following.
Heuristic 4.1. The number $B(X)$ of elliptic curves over $\mathbb{Q}$ whose conductor is less than $X$ satisfies $B(X) \sim cX^{5/6}$ for some explicit constant $c > 0$.

Remark 4.2.
It must be noted that the data of Cremona [Cremona 06] do not coincide with this heuristic; in fact, the growth seems almost linear in the conductor, for taking the curves with conductor in the range 40,000-130,000 in his database and doing a log-log regression yields a best-fit exponent of 0.98, which is much closer to 1 than to $5/6$. An upper bound of $B(X) \ll X^{1+\epsilon}$ is explicated in [Duke and Kowalski 00, Section 3.1].
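A log-log regression of the kind mentioned in this remark can be sketched as follows. The input used here is synthetic, following an exact power law, and serves only to show that the fitted slope recovers the exponent; it is not Cremona's data, and the helper name `loglog_slope` is illustrative.

```python
import math


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic counts following an exact power law, NOT Cremona's data.
    xs = [40000 + 10000 * i for i in range(10)]
    ys = [3.0 * x ** (5 / 6) for x in xs]
    print(loglog_slope(xs, ys))
```

With real counts the fitted exponent over a short conductor range can differ noticeably from the asymptotic one, which is the phenomenon the remark describes.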
9 Note that the Tamagawa number at infinity is 1 when $E$ has negative discriminant and otherwise is 2, the former occurring approximately $\sqrt{3}/(1 + \sqrt{3}) \approx 63.4\%$ of the time.
10 This argumentative technique can also be used to bolster our assumption that using Connell's conditions should be independent of other considerations.

Remark 4.3.
We claim that the constant c here can be made explicit, but this would require a more careful analysis at p = 2, 3 than we wish to describe here.
To derive this heuristic, we estimate the proportion of curves with a given ratio of (absolute) discriminant to conductor. Since the conductor is often the squarefree kernel of the discriminant, by way of explanation we first consider the behavior of $f(n) = n/\mathrm{sqfree}(n)$. The probability that $f(n) = 1$ is given by the probability that $n$ is square-free, which is classically known to be $1/\zeta(2) = 6/\pi^2$. Given a prime power $p^m$, to have $f(n) = p^m$ says that $n = p^{m+1}u$, where $u$ is square-free and coprime to $p$. The probability that $p^{m+1} \parallel n$ is $(1 - 1/p)/p^{m+1}$, and given this, the conditional probability that $n/p^{m+1}$ is square-free is $(6/\pi^2) \cdot (1 - 1/p^2)^{-1}$. Extending this multiplicatively beyond prime powers, we get that $\mathrm{Prob}[f(n) = q] = \frac{6}{\pi^2} \prod_{p^m \parallel q} \frac{1 - 1/p}{p^{m+1}(1 - 1/p^2)} = \frac{6}{\pi^2} \cdot \frac{1}{q} \prod_{p \mid q} \frac{1}{p + 1}$. In particular, the average of $f(n)^{\gamma}$ exists for $\gamma < 1$; in our elliptic curve analogue, we will require such an average for $\gamma = 5/6$. We note that it is an interesting question¹¹ to prove an asymptotic for $\sum_{n \le X} n/\mathrm{sqfree}(n)$.
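The densities for $f(n) = n/\mathrm{sqfree}(n)$ described above can be compared with a direct count. In this sketch the function names are illustrative, and `predicted_density` implements only the prime-power case $f(n) = p^m$ stated in the text; the empirical densities over $n \le 20000$ should agree with it only approximately.

```python
import math


def squarefree_kernel(n):
    """Product of the distinct primes dividing n (trial division)."""
    kernel, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            kernel *= d
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        kernel *= n
    return kernel


def density_of_f(value, limit):
    """Proportion of 1 <= n <= limit with n / sqfree(n) == value."""
    hits = sum(
        1 for n in range(1, limit + 1) if n // squarefree_kernel(n) == value
    )
    return hits / limit


def predicted_density(p, m):
    """Heuristic probability that f(n) = p**m for a prime p and m >= 1."""
    return (1 - 1 / p) / p ** (m + 1) * (6 / math.pi**2) / (1 - 1 / p**2)


if __name__ == "__main__":
    limit = 20000
    print("f = 1:", density_of_f(1, limit), 6 / math.pi**2)
    print("f = 2:", density_of_f(2, limit), predicted_density(2, 1))
```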

Derivation of the Heuristic
We keep the notation $D = |\Delta|$ and wish to compute the probability that $D/N = q$ for a fixed positive integer $q$. For a prime power $p^v$ with $p \ge 5$, the probability that $p^v \parallel (D/N)$ is given by the following: the probability that $E$ has multiplicative reduction at $p$ and $p^{v+1} \parallel D$, that is, $P_m(p, v + 1)$; plus the probability that $E$ has additive reduction at $p$ and $p^{v+2} \parallel D$, that is, $P_a(p, v + 2)$; and the contribution from $P_0(p, v)$, which is zero for $v > 0$ and for $v = 0$ is the probability that $p$ does not divide $D$. So, writing $v = v_p(q)$, we get (with a similar modified formula for $p = 2, 3$) the formula (4-1): $\mathrm{Prob}[D/N = q] = \prod_p \left[ P_m(p, v + 1) + P_a(p, v + 2) + P_0(p, v) \right]$. We emphasize that this probability is with respect to curve-ordering by discriminant (as in the last section), and as previously, we have assumed that the primes act independently, that curves with $|\Delta| \le X$ act like those in a big box, and that the effect of large primes is similar to that from small primes. Writing $\alpha = \alpha_+ + \alpha_-$, from Conjecture 2.1 (applied with $qX$ in place of $X$ for the curves with $D/N = q$) we have in (4-2) that $B(X)$ is asymptotically $X^{5/6}$ times a constant multiple of the sum $\sum_q q^{5/6}\, \mathrm{Prob}[D/N = q]$, and if this last sum converges, we then get Heuristic 4.1.
11 The saddle-point method as indicated in [Tenenbaum 88] and [Burris and Yeats 05] might be applicable, but it appears to involve quite careful estimation to achieve an asymptotic rather than a log-asymptotic. It was pointed out to us by G. Tenenbaum that [Schwarz 65] improves on the result of [de Bruijn 62], though the result is not that explicit.
To show that the last sum in (4-2) does indeed converge, we get an upper bound for the probability in (4-1). We have that $P_m(p, v + 1) \le 1/p^{v+1}$ and $P_a(p, v + 2) \le 2/p^{v+1}$, which implies $\mathrm{Prob}[D/N = q] \ll \prod_{p \mid q} 3/p^{v_p(q)+1}$. We then estimate $\sum_q q^{5/6}\, \mathrm{Prob}[D/N = q] \ll \prod_p \left( 1 + \sum_{v \ge 1} \frac{3\,p^{5v/6}}{p^{v+1}} \right)$, and the last product is seen to be convergent on comparison to $\zeta(7/6)^3$. Thus we have shown that the last sum in (4-2) converges, so that Heuristic 4.1 follows. We note that Fouvry, Nair, and Tenenbaum [Fouvry et al. 92] have shown that the number of minimal models with $D \le X$ is at least $cX^{5/6}$, and that the number of curves with $D \le X$ with Szpiro ratio $\frac{\log D}{\log N} \ge \kappa$ is no more than $c_{\epsilon} X^{1/\kappa + \epsilon}$ for every $\epsilon > 0$.

Dependence of D/N and the Tamagawa Product
We assume that $D/N$ should be independent of the real period, but the Tamagawa product and $D/N$ should be somewhat related.¹² We compute the expected square root of the Tamagawa product when $D/N = q$. As with (4-1) and using the local averages defined in (3-3) and (3-4), we find that this is given by an Euler product whose local factors involve $v_1 = v + 1$, $v_2 = v + 2$, and $v = v_p(q)$.

The Comparison of log ∆ with log N
We now want to compare $\log \Delta$ with $\log N$, and explicate the replacement therein in Guess 3.2. In order to bound the effect of curves with large $D/N$, we bound the tail probability that $D/N \ge Y$ via the estimate for (4-1) and Rankin's trick (that is, bounding the characteristic function of $q \ge Y$ by $(q/Y)^{1-\alpha}$ for a parameter $0 < \alpha < 1$ that will be chosen optimally). Using $p^{\alpha} - 1 \ge \alpha \log p$ in the penultimate step, then the prime number theorem to bound the resulting sum over primes, and finally taking $\alpha = 1/\sqrt{\log Y}$, we obtain a bound for this tail in terms of constants $\tilde{c}$, $c$ (this result is stronger than needed).
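However, a more pedantic derivation of Guess 3.2 does not simply allow replacing $\log N$ by $\log \Delta$, but requires analysis (assuming $\Omega_{re}(E)$ to be independent of $q$) of the analogous integral, with prefactor $\bar{\alpha}_R \bar{\alpha}_A/(3456\,\zeta(10))$, in which the bracketed term is a $q$-sum of $\eta(q)(\log(\Delta/q))^{3/8} \cdot \mathrm{Prob}[D/N = q]$. The above estimate on the tail of the probability and a simple bound on $\eta(q)$ in terms of the divisor function shows that we can truncate the $q$-sum at $Y$ with an error of $O_{\epsilon}(1/Y^{1-\epsilon})$ (for all $\epsilon > 0$), and choosing (say) $Y = e^{\sqrt{\log X}}$ gives us that $\log(\Delta/q) \sim \log \Delta$ (note that we have restricted to $\Delta > \sqrt{X}$). So the bracketed term becomes the desired $\sum_{q < Y} \eta(q)(\log \Delta)^{3/8} \cdot \mathrm{Prob}[D/N = q] \sim \beta(\sqrt{\tau})(\log \Delta)^{3/8}$, on noting that the $q$-part of the sum converges to $\beta(\sqrt{\tau})$ as $Y \to \infty$.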

Counting Curves with Vanishing L-value
We now estimate the number of elliptic curves E with even parity and L(E, 1) = 0 when ordered by conductor.
From Guess 3.2 we get that the number of even-parity curves with $0 < \Delta < qX$, $D/N = q$ and $L(E, 1) = 0$ is given by the corresponding $\hat{W}$-integral, and we sum this over all $q$. As we argued above, the tail of the sum does not affect the asymptotic (and so we can take $\log \Delta \sim \log N$ in $\hat{W}$), and again we get that the $q$-sum converges. This then gives the desired asymptotic for the number of even-parity curves with conductor less than $X$ and vanishing central L-value (after arguing similarly for curves with negative discriminant).

Relations to Other Work
It is proposed by Hindry [Hindry 05, Conjectures 5.4 and 5.5] that a theorem of Brauer-Siegel type might hold for elliptic curves; that is, it should be that nonvanishing values of $L(E, 1)$ are bounded quite far away (say $1/\log N$) from 0. This would say that the product of the regulator and #Ш cannot be too small. We view this as unlikely; already in the rank-zero case we can see no reason why there should not be infinitely many curves with trivial Shafarevich-Tate group. Indeed, having #Ш = 1 should be approximately as common as having positive rank according to the above discretization methodology. The main difference between the elliptic curve case and that for number fields is that the latter deals with L-values at the edge of the critical strip, while our interest is in central values.
We might also point out that a guesstimate of $X^{19/24}$ curves of rank 2 up to $X$ can also come from a couple of different methods. One method is to consider the (conjectural) BSD formula, under which $L(E, 1)\,T(E)^2/(\Omega_{re}(E)\,\tau(E))$ equals #Ш(E) when $E$ has rank 0, and 0 otherwise, and note that the torsion and Tamagawa contributions are small compared to the reciprocal of the real period.¹³ A generalization of the Lindelöf hypothesis implies that $L(E, 1)$ is bounded above by something like $\log N$; this is also small compared to $1/\Omega_{re}$, and so we view $S(E)$ as possibly taking values from 0 up to about $1/\Omega_{re}$. Since $S(E)$ should be an integral square, this gives a crude probability of 1 in $1/\sqrt{\Omega_{re}}$ of a curve of even parity having a vanishing central value. Summing over curves (which inter alia uses that $1/\Omega_{re}$ is typically about $\Delta^{1/12}$), this gives¹⁴ a rough count of $X^{19/24}$. A different method to obtain $X^{19/24}$ is to estimate the number of integral points on the variety $y^2 = x^3 + Axz^2 + Bz^3$ for various ranges of $(A, B, x, y, z)$. Though highly speculative, especially for larger ranks where arithmetic considerations may dominate, this predicts an upper bound of size $X^{(21-r)/24}$ for the number of curves of rank $r$, yielding the asserted $X^{19/24}$ for $r = 2$. This will be discussed further in a forthcoming paper with A. Granville.
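The exponent arithmetic behind this rough count can be written out. As a sketch, assume that the number of even-parity curves with absolute discriminant up to $x$ is $c\,x^{5/6}$ and that each has a vanishing central value with probability about $\Delta^{-1/24}$ (the square root of a real period of typical size $\Delta^{-1/12}$); the constant $c$ is a placeholder, and all arithmetic and logarithmic factors are ignored.

```latex
\sum_{|\Delta| \le X} \Delta^{-1/24}
  \approx \int_1^X x^{-1/24}\, d\bigl(c\,x^{5/6}\bigr)
  = \frac{5c}{6} \int_1^X x^{-5/24}\, dx
  \approx \frac{5c}{6} \cdot \frac{24}{19}\, X^{19/24}
  = \frac{20c}{19}\, X^{19/24}.
```

The same exponent $19/24$ is what $(21 - r)/24$ gives at $r = 2$.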

TORSION AND ISOGENIES
We can also count curves that have a given torsion group or isogeny structure. For instance, an elliptic curve with a 2-torsion point can be written as an integral model in the form $y^2 = x^3 + ax^2 + bx$, where $\Delta = 16b^2(a^2 - 4b)$; thus, by lattice-point counting, we estimate about $\sqrt{X}$ curves with absolute discriminant less than $X$. The effect on the conductor can perhaps more easily be seen by using the Fricke parameterization $c_4 = (t + 16)(t + 64)T^2$ and $c_6 = (t - 8)(t + 64)^2T^3$ of curves with a rational 2-isogeny, and then substituting $t = p/q$ and $V = T/q$ to get $c_4 = (p + 16q)(p + 64q)V^2$ and $c_6 = (p - 8q)(p + 64q)^2V^3$, so that $\Delta = (c_4^3 - c_6^2)/1728 = pq^2(p + 64q)^3V^6$. The summation over the twisting parameter $V$ just multiplies our estimate by a constant, while ABC estimates imply that there should be no more than $X^{2/3+\epsilon}$ coprime pairs $(p, q)$ with the square-free kernel of $pq(p + 64q)$ smaller than $X$ in absolute value.
$^{13}$ This can be made precise; below we note that $1/\Omega_{\rm re} \gg \Delta^{1/12}$, while the torsion is bounded and the Tamagawa product is bounded by a divisor function.
$^{14}$ This is vaguely related to the Sarnak estimate of $D^{3/4}$ for the count of vanishings in families of quadratic twists, but relies only on the size of the real period.
So we get the heuristic that almost all curves have no 2-torsion, even under ordering by conductor. Indeed, the exceptional set is so sparse that we can ignore it in our calculations. A similar argument applies for other isogenies, and more generally for splitting of division polynomials. Also, the results [Duke 97] for exceptional primes are applicable here, albeit with a different ordering.
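The substitution $t = p/q$, $V = T/q$ in the Fricke parameterization can be checked mechanically. The following is a minimal sketch using exact rational arithmetic; the function names are hypothetical, and the closed form $pq^2(p+64q)^3V^6$ for the discriminant is what one obtains by expanding $(c_4^3 - c_6^2)/1728$, which is where the expression $pq(p+64q)$ in the square-free kernel comes from.

```python
from fractions import Fraction


def fricke_invariants(p, q, V):
    """Return (c4, c6) from the Fricke parameterization with t = p/q, T = V*q."""
    t = Fraction(p, q)
    T = Fraction(V) * q
    c4 = (t + 16) * (t + 64) * T**2
    c6 = (t - 8) * (t + 64) ** 2 * T**3
    return c4, c6


def discriminant(c4, c6):
    """Discriminant from the standard relation 1728 * Delta = c4^3 - c6^2."""
    return (c4**3 - c6**2) / 1728


def predicted_discriminant(p, q, V):
    """Closed form obtained by expanding the substitution by hand."""
    return p * q**2 * (p + 64 * q) ** 3 * V**6


if __name__ == "__main__":
    for p, q, V in [(1, 1, 1), (3, 2, 1), (-5, 7, 2)]:
        c4, c6 = fricke_invariants(p, q, V)
        print(p, q, V, discriminant(c4, c6), predicted_discriminant(p, q, V))
```

The denominators cancel, so $c_4$, $c_6$, and $\Delta$ are integers whenever $p$, $q$, $V$ are.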

EXPERIMENTS
We wish to provide some experimental data for the above heuristics. However, it is difficult to distinguish numerically between 19/24 and 5/6 in the predictions $R(X) \sim cX^{19/24}(\log X)^{3/8}$ and $A_\pm(X) \sim c_\pm X^{5/6}$. Therefore, we instead try to refute the "null hypothesis," namely that there should be a positive proportion of rank-2 curves. In particular, the two large data sets of [Brumer and McGuinness 90] and [Stein and Watkins 02] for curves of prime conductor up to $10^8$ and $10^{10}$ show little drop in the proportion of rank-2 curves, and an even smaller drop in the observed average (analytic) rank.
These results led some to speculate that the average rank might (asymptotically) be greater than 0.5, with a positive proportion of elliptic curves having rank 2 or more.
Brumer and McGuinness considered about 310,700 curves with prime conductor less than $10^8$ and found an average rank of about 0.978, while Stein and Watkins extended this to over 11 million curves with prime conductor up to $10^{10}$ and found an average rank of about 0.964. Both data sets are expected to be nearly exhaustive$^{15}$ among curves with prime conductor up to the given limit. To extend the data in a computationally feasible manner, we chose a selection of curves with prime conductor of size $10^{14}$. It is nontrivial to get a good data set, since we must account for congruence conditions on the elliptic curve coefficients and the variation of the size of the real period.

Average Analytic Rank for Curves with Prime Conductor near $10^{14}$
As in [Stein and Watkins 02], we divided the $(c_4, c_6)$ pairs into 288 congruence classes with $(\tilde c_4, \tilde c_6) = (c_4 \bmod 576,\ c_6 \bmod 1728)$.
Many of these classes force the prime 2 to divide the discriminant, and thus do not produce any curves of prime conductor. For each class $(\tilde c_4, \tilde c_6)$, we took the 10,000 parameter selections $(c_4, c_6) = \bigl(576(1000 + i) + \tilde c_4,\ 1728(100000 + j) + \tilde c_6\bigr)$ for $(i, j) \in [1..10] \times [1..1000]$, and then of these 2,880,000 curves, took the 89,913 models that had prime discriminant (note that all the discriminants are positive). This gives us good distribution across congruence classes, and while the real period does not vary as much as possible, below we will attempt to understand how this affects the average rank. It then took a few months to compute the (suspected) analytic ranks for these curves. We got about 0.937 for the average rank. We then did a similar experiment for curves with negative discriminant given by $(c_4, c_6) = \bigl(576(-883 + i) + \tilde c_4,\ 1728(100000 + j) + \tilde c_6\bigr)$ for $(i, j) \in [1..10] \times [1..1000]$, took the subset of 89,749 curves with prime conductor, and found the average rank to be about 0.869. This discrepancy between positive and negative discriminant is also in the Brumer-McGuinness and Stein-Watkins data sets, and indeed was noted in [Brumer and McGuinness 90].$^{16}$ We do not average the results from positive and negative discriminants; the Brumer-McGuinness conjecture, Conjecture 2.1, implies that the split is not 50-50.
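The parameter selection just described can be sketched in a few lines. This is an illustration only, not the code used for the experiment: the function names are hypothetical, the primality test is a standard Miller-Rabin with fixed bases, and the sketch does not check that a residue class $(\tilde c_4, \tilde c_6)$ actually corresponds to an integral minimal model, which the real computation must do.

```python
def is_probable_prime(n):
    """Miller-Rabin with the first twelve primes as bases (deterministic for n < 3.3e24)."""
    if n < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for p in bases:
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True


def discriminant(c4, c6):
    """Return (c4^3 - c6^2)/1728, or None when this is not an integer."""
    num = c4**3 - c6**2
    return num // 1728 if num % 1728 == 0 else None


def sample_box(c4_res, c6_res, c4_offset=1000, c6_offset=100000, ni=10, nj=1000):
    """Yield the (c4, c6) selections for one residue class (c4_res, c6_res)."""
    for i in range(1, ni + 1):
        for j in range(1, nj + 1):
            yield 576 * (c4_offset + i) + c4_res, 1728 * (c6_offset + j) + c6_res


def prime_discriminant_models(c4_res, c6_res, **box):
    """Keep the selections whose discriminant is an integer of prime absolute value."""
    models = []
    for c4, c6 in sample_box(c4_res, c6_res, **box):
        disc = discriminant(c4, c6)
        if disc is not None and is_probable_prime(abs(disc)):
            models.append((c4, c6, disc))
    return models
```

For the negative-discriminant box one would pass `c4_offset=-883`. As a sanity check on `discriminant`, the curve 11a has $(c_4, c_6) = (16, -152)$ and discriminant $-11$; its residue class is $(16, 1576)$.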
In any case, our results show a substantial drop in the average rank, which, at the very least, indicates that the average rank is not constant in the range we considered. The alternative statistic of frequency of positive rank for curves with even parity also showed a significant drop. For curves of prime positive discriminant it was 44.1% for Brumer-McGuinness and 41.7% for Stein-Watkins, but only 36.0% for our data set; for curves of negative discriminant and prime conductor, these numbers are 37.7%, 36.4%, and 31.3%.

Variation of Real Period
Our random sampling of curves with prime conductor of size $10^{14}$ must account for various properties of the curves if our results are to possess legitimacy. Above, we speculated that the real period plays the most significant role, and so we wish to understand how our choice has affected it. Indeed, as was pointed out to us by X.-F. Roblot, the variation of real period from enumerating in a large $(c_4, c_6)$-box is quite different from the result of enumerating by discriminant. However, while this discrepancy with the distribution of the real period may be the weakest link in our experiment, we can still make a reasonable comparison between data sets, due to our assertion that only the size of the real period should matter.
$^{16}$ "An interesting phenomenon was the systematic influence of the discriminant sign on all aspects of the arithmetic of the curve."
To judge the effect that variation of the real period might have, we did some comparisons with the Stein-Watkins database. First consider curves of positive prime discriminant, and write $E$ as $y^2 = 4x^3 + b_2x^2 + 2b_4x + b_6$ and $e_1 > e_2 > e_3$ for the real roots of the cubic. We looked at curves with even parity and considered the frequency of positive rank as a function of the root quotient $t = \frac{e_1 - e_2}{e_1 - e_3}$, noting that $\Omega_{\rm re}\Delta^{1/12}$ depends only on $t$.$^{17}$ The curves we considered all had $0.617 < t < 0.629$. However, in analogy to our consideration of curves ordered by conductor, before counting curves with extra rank we should first simply count curves. Figure 1 indicates the distribution of the root quotient $t$ for the curves of prime (positive) discriminant and even parity from the Stein-Watkins database (more than two million curves meet the criteria). The x-axis is divided up into bins of size 1/1000; there are more than one hundred times as many curves with $t < 0.001$ as with $0.500 < t < 0.501$, with the most extremal dots not even appearing on the graph.
Next we plot the frequency of $L(E, 1) = 0$ as a function of the root quotient in Figure 2. Since there are only about one thousand curves in some of our bins, we do not get such a nice graph. Note that the leftmost and especially the rightmost dots are much below their nearest neighbors and that the graph slopes down in general and drops more at the end. We see no evidence that our results should be overly biased. In particular, the frequency of $L(E, 1) = 0$ is 41.7% among all curves of even parity and prime discriminant in the Stein-Watkins database, and is 42.8% for the 12,324 such curves with $0.617 < t < 0.629$. The function plotted (labeled on the right axis) in Figure 2 is $\Omega_{\rm re}\Delta^{1/12}$ as a function of $t$; note that this goes to zero as $t \to 0, 1$. There is nothing canonical about the choice of our $t$ parameter, and we chose it more for convenience than anything else.
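To make the role of the root quotient concrete, here is a minimal numerical sketch. It assumes the classical AGM formula $\pi/\mathrm{agm}(\sqrt{e_1-e_3}, \sqrt{e_1-e_2})$ for the period of $y^2 = 4(x-e_1)(x-e_2)(x-e_3)$ and the discriminant $16\prod_{i<j}(e_i-e_j)^2$; the normalization may differ by a constant factor from the $\Omega_{\rm re}$ used in the text, and all function names are hypothetical. Under these assumptions the product of period and $\Delta^{1/12}$ is scale-invariant and reduces to a function of $t$ alone.

```python
import math


def agm(a, b):
    """Arithmetic-geometric mean of two positive reals."""
    for _ in range(60):  # convergence is quadratic; 60 steps is far more than needed
        a, b = (a + b) / 2, math.sqrt(a * b)
    return a


def root_quotient(e1, e2, e3):
    """t = (e1 - e2)/(e1 - e3) for real roots e1 > e2 > e3."""
    return (e1 - e2) / (e1 - e3)


def period_times_disc(e1, e2, e3):
    """Period times Delta^(1/12), computed directly from the roots (assumed normalization)."""
    omega = math.pi / agm(math.sqrt(e1 - e3), math.sqrt(e1 - e2))
    disc = 16 * ((e1 - e2) * (e1 - e3) * (e2 - e3)) ** 2
    return omega * disc ** (1 / 12)


def normalized_period(t):
    """The same quantity written as a function of the root quotient t in (0, 1)."""
    return 2 ** (1 / 3) * math.pi * (t * (1 - t)) ** (1 / 6) / agm(1.0, math.sqrt(t))


if __name__ == "__main__":
    for t in (0.001, 0.25, 0.5, 0.62, 0.75, 0.999):
        print(f"t = {t:5.3f}   period * Delta^(1/12) = {normalized_period(t):.4f}")
```

The printed values fall off toward both ends of the interval, consistent with the remark that the plotted function goes to zero as $t \to 0, 1$.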
Similar computations can be made in the case of negative discriminant, which we briefly discuss for completeness (again restricting to curves with even parity where appropriate). Let $r$ be the real root of the cubic polynomial $4x^3 + b_2x^2 + 2b_4x + b_6$, and $Z > 0$ the imaginary part of the conjugate pair of nonreal roots. Letting [...] $(1 + 9c^2/4)^{1/12}\,\mathrm{agm}\bigl(1,\ \tfrac{1}{2} + \tfrac{3c}{4\sqrt{1 + 9c^2/4}}\bigr)$. We renormalize by taking $C = \tfrac{1}{2} + \arctan(c)/\pi$, and graph the distribution of curves versus $C$ in Figure 3. The symmetry of the graph might indicate that the coordinate transform is reasonable.$^{19}$ All our curves have $0.555 < C < 0.557$.
Next we plot the frequency of $L(E, 1) = 0$ as a function of the root quotient in Figure 4. Again we also graph the function $\Omega_{\rm re}|\Delta|^{1/12}$ on the right axis. Here the drop-off is more pronounced than with the curves of positive discriminant. Note the floating dot around $C = \tfrac{1}{2}$. Indeed, the hundred closest curves with $C < \tfrac{1}{2}$ all have positive rank; this breaks down when the barrier $\tfrac{1}{2}$ is crossed. This is not particularly a mystery: these curves have $a_6 = 0$ and/or $b_6 = 1$, and thus have an obvious rational point. Recall that $C = \tfrac{1}{2}$ corresponds to $c = 0 = r$. We again see no evidence that our results should be biased. In particular, the frequency of $L(E, 1) = 0$ is 36.4% among all curves of even parity and negative prime discriminant in the Stein-Watkins database, and is 37.0% for the 4695 such curves with $0.555 < C < 0.557$.

Other Considerations
The idea that the "probability" that a curve of even parity possesses positive rank should be proportional to $\sqrt{\Omega_{\rm re}}$ is perhaps overly simplistic; in particular, it is not borne out too precisely by the Stein-Watkins data set. We consider curves of positive prime discriminant with even parity; for those with $0.64 < \Omega_{\rm re} < 0.65$ we have 78,784 curves, of which 45.9% have positive rank, while of the 9872 with $0.32 < \Omega_{\rm re} < 0.325$, we have 36.0% with positive rank, for a ratio of 1.28, which is not too close to $\sqrt{2}$. One consideration here is that we have placed a discriminant limit on our curves, and there are curves with larger discriminant and $0.32 < \Omega_{\rm re} < 0.325$ that we have not considered. This, however, is in contrast to the idea that only the real period should be of import.
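The comparison with $\sqrt{2}$ is a one-line computation. As a rough check, taking the midpoints 0.645 and 0.3225 of the two period ranges as representative values (an assumption made here for illustration):

```latex
\[
\frac{0.645}{0.3225} = 2
\quad\Longrightarrow\quad
\text{predicted ratio } \sqrt{2} \approx 1.414,
\qquad
\text{observed ratio } \frac{45.9\%}{36.0\%} \approx 1.28 .
\]
```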
One possibility is that curves with small discriminant and/or large real period have smaller probability of $L(E, 1) = 0$ than our estimate of $c\sqrt{\Omega_{\rm re}}$ would suggest. Indeed, it might be argued (perhaps due to arithmetic considerations, or perhaps explicit formulas for the zeros of L-functions) that curves with such small discriminant cannot realize their nominal expected frequency of positive rank.
Unfortunately, we cannot do much to quantify these musings, since the effect would likely be in a secondary term, making it difficult to detect experimentally. Note also that a relative depression of rank for curves of small discriminant would give a reason for the near-constant average rank observed by Brumer-McGuinness and Stein-Watkins.

Mordell-Weil Lattice Distribution for Rank-2 Curves
We have other evidence that curves of small discriminant might not behave quite as expected. We undertook to compute generators for the Mordell-Weil group for all 2,143,079 curves of (analytic) rank 2 of prime conductor less than $10^{10}$ in the Stein-Watkins database.$^{20}$ J. E. Cremona ran his mwrank program [Cremona 05] on all these curves, and it was successful in provably finding the Mordell-Weil group for 2,114,188 of these. For about 2500 curves, the search region was too big to find the 2-covering quartics via invariant methods, while around 8500 curves had a generator of large height that could not be found, and over 18,000 had 2-Selmer rank greater than 2. We then used the FourDescent machinery of Magma, which reduced the number of problematic curves to 54. Of these, 19 have analytic Ш of 16.0, and we expect that either 3-descent or 8-descent [Stamminger 05] will complete (assuming the generalized Riemann hypothesis to compute the class group) the Mordell-Weil group verification; for the 35 other curves, there is likely a generator of height more than 225, which we did not attempt to find.$^{21}$
We then looked at the distribution of the Mordell-Weil lattices obtained from the induced inner product from the height pairing; since all of our curves have rank 2, we get 2-dimensional lattices. We are not so interested in the size of the obtained lattices, but more in their shape. Via the use of lattice reduction (which reduces to continued fractions in this case), given any two generators we can find the point $P$ of smallest positive height on the curve. By normalizing $P$ to be the unit vector, we then get a vector in the upper half-plane corresponding to another generator $Q$.
Via the standard reduction algorithm, we can translate $Q$ so that it corresponds to a point in the fundamental domain for the action of $\mathrm{SL}_2(\mathbf{Z})$. Finally, by replacing $Q$ by $-Q$ if necessary, we can ensure that this point is in the right half of the fundamental domain (in other words, we must choose an embedding for our Mordell-Weil lattice). In this manner, for each rank-2 curve we associate a unique point $z = x + iy$ in the upper half-plane with $x^2 + y^2 \ge 1$ and $0 \le x \le \tfrac{1}{2}$. With no other guidance, we might expect that the obtained distribution for the $z$ is given by$^{22}$ the Haar measure $(dx\,dy)/y^2$. We find, however, that this is not borne out too well by experiment. In particular, we should expect that $\frac{1/2}{\pi/6} \approx 95.5\%$ of the curves should have $y \ge 1$, while the experimental result is about 93.5%. Furthermore, we should expect that the proportion of curves with $y \ge Y$ should die off like $1/Y$ as $Y \to \infty$; however, we get that 35.6% of the curves have $y \ge 2$, only 9.7% have $y \ge 4$, while 1.97% have $y \ge 8$ and 0.35% have $y \ge 16$.
$^{20}$ We also computed the Mordell-Weil group for curves with higher ranks but do not describe the obtained data here.
$^{21}$ A bit more searching might resolve a few of the outstanding cases, but the extremal case of [0, 0, 1, −237882589, −1412186639384] appears to have a minimal generator of height more than 600, so that other methods are likely to be needed in order to find it. Indeed, T. A. Fisher [Fisher 07] has recently used 6-descent and 12-descent to find the missing generators on these 35 curves.
$^{22}$ Siegel [Siegel 45] similarly uses Haar measure to put a natural measure on n-dimensional lattices of determinant 1.
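The passage from two generators to a point of the fundamental domain, and the Haar-measure prediction it is compared against, can be sketched as follows. This is a minimal illustration with hypothetical function names: it takes as input the three entries of the height-pairing Gram matrix (the heights of $P$ and $Q$ and their pairing), applies ordinary Gauss reduction of a positive definite binary form, and uses the fact that the half fundamental domain has hyperbolic area $\pi/6$ while the part with $y \ge Y$ (for $Y \ge 1$) has area $1/(2Y)$. The observed percentages in the final loop are the ones quoted above.

```python
import math


def reduce_gram(a, b, c):
    """Gauss-reduce the positive definite Gram matrix [[a, b], [b, c]]."""
    while True:
        if c < a:
            a, c = c, a          # make the first vector the shorter one
        m = round(b / a)         # translate the second vector by -m times the first
        c = c - 2 * m * b + m * m * a
        b = b - m * a
        if a <= c:
            return a, b, c


def lattice_shape(h_P, pairing, h_Q):
    """Return (x, y) with 0 <= x <= 1/2 and x^2 + y^2 >= 1 for the lattice shape."""
    a, b, c = reduce_gram(h_P, pairing, h_Q)
    x = abs(b) / a               # replacing Q by -Q if necessary
    y = math.sqrt(a * c - b * b) / a
    return x, y


def haar_fraction_above(Y):
    """Proportion of the half fundamental domain with y >= Y under dx dy / y^2 (Y >= 1)."""
    return (1 / (2 * Y)) / (math.pi / 6)


if __name__ == "__main__":
    observed = {1: 93.5, 2: 35.6, 4: 9.7, 8: 1.97, 16: 0.35}  # percentages from the text
    for Y, obs in observed.items():
        print(f"y >= {Y:2d}: Haar {100 * haar_fraction_above(Y):5.2f}%   observed {obs}%")
```

The table this prints makes the discrepancy in the vertical distribution explicit: the Haar prediction halves with each doubling of $Y$, while the observed proportions fall off much faster.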
The validity of the vertical distribution data might be arguable based on concerns regarding the discriminant cutoff of our data set, but the horizontal distribution is also skewed. If we consider only curves with y ≥ 1, then we should get uniform distribution in the x-aspect; however, Table 1 shows that we do not have such uniformity.
We cannot say whether these unexpected results from the experimental data are artifacts of choosing curves with small discriminant; it is just as probable that our Haar-measure hypothesis concerning the lattice distribution is simply incorrect. At the suggestion of D. B. Zagier, we made a density plot of the ratio between the experimental and conjectural counts for bins in the fundamental domain; see Figure 5, where the axes are switched and the y-coordinate is plotted logarithmically. At the right edge of the graph, the conjectural amount (≈ 1500 per bin) is typically ten times the experimental amount; this increases to a factor of 100 for y ≈ 70. Overall, our data seem to imply that the lattices are not as skinny and are less orthogonal than might be guessed.

Symmetric Power L-Functions
Similar to questions about the vanishing of $L(E, s)$, we can ask about the vanishing of the symmetric power L-functions $L(\mathrm{Sym}^{2k-1}E, s)$. We refer the reader to [Martin and Watkins 06] for more details about this, but mention that due to conjectures of Deligne and more generally Bloch and Beȋlinson [Rapoport et al. 88], we expect that we should have a formula similar to that of Birch and Swinnerton-Dyer, stating that the central value $L(\mathrm{Sym}^{2k-1}E, k)$, divided by a suitable product of powers of the periods $\Omega_+$ and $\Omega_-$ (together with a contribution from the conductor), should be rational with small denominator. Here, for $k$ odd, $\Omega_+$ is the real period and $\Omega_-$ the imaginary period, with this reversed for $k$ even. As noted in [Buhler et al. 97], the order of vanishing of the central L-value should be related to the rank of the Griffiths group of the symmetric power variety.
Ignoring the contribution from the conductor, and crudely estimating that $\Omega_+ \approx \Omega_- \approx 1/\Delta^{1/12}$, an application of discretization as before gives that the probability that $L(\mathrm{Sym}^{2k-1}E, s)$ has even parity and that $L(\mathrm{Sym}^{2k-1}E, k) = 0$ is bounded from above by $c(\log \Delta)^{3/8} \cdot \sqrt{1/\Delta^{k^2/12}}$. Again following the analogy of the above, we can then get an upper bound $c_k(\epsilon)X^{5/6 - k^2/24 + \epsilon}$ (for every $\epsilon > 0$) for the number of curves with conductor less than $X$ with even-signed symmetric $(2k-1)$st power and $L(\mathrm{Sym}^{2k-1}E, k) = 0$.
It could be argued that we should order curves according to the conductor of the symmetric power L-function rather than that of the curve, but we do not think such concerns are that relevant to our imprecise discussion. In particular, the above estimate predicts that there are finitely many curves with extra vanishing when k ≥ 5 (that is, finitely many extra vanishings for the ninth symmetric power and beyond).
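The threshold $k \ge 5$ follows from the sign of the exponent $5/6 - k^2/24$ in the upper bound. A small sketch (the function name is hypothetical) tabulates it exactly:

```python
from fractions import Fraction


def vanishing_exponent(k):
    """Exponent 5/6 - k^2/24 in the upper bound for the (2k-1)st symmetric power."""
    return Fraction(5, 6) - Fraction(k * k, 24)


if __name__ == "__main__":
    for k in range(1, 7):
        e = vanishing_exponent(k)
        verdict = "finitely many expected" if e < 0 else "growing count allowed"
        print(f"k = {k} (Sym^{2 * k - 1}): exponent {e}  -> {verdict}")
```

For $k = 1$ this recovers the exponent $19/24$ of the rank heuristic; the exponent is still positive ($1/6$) for the seventh symmetric power and first becomes negative ($-5/24$) for the ninth.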
It should be said that this heuristic will likely mislead us about curves with complex multiplication, for which the symmetric power L-function factors (it is imprimitive in the sense of the Selberg class), with each factor having a 50% chance of having odd parity. However, even ignoring CM curves, the data$^{23}$ of [Martin and Watkins 06] find a handful of curves for which the 9th, 11th, and even the 13th symmetric powers appear (up to 12 digits of precision) to have a central zero of order 2. We find this surprising, and it casts some doubt about the validity of our methodology of modeling vanishings.

Quadratic Twists of Higher Symmetric Powers
The techniques we used earlier in this paper have also been used to model vanishings in quadratic twist families, and we can extend the analyses to symmetric powers.
6.6.1 Non-CM Curves. We fix a non-CM curve $E$ and let $E_d$ be its $d$th quadratic twist, taking $d$ to be a fundamental discriminant. From an analogue of the Birch and Swinnerton-Dyer conjecture we expect to get a rational number with small denominator from the quotient$^{24}$ of $L(\mathrm{Sym}^3E_d, 2)$ by the corresponding product of periods of $E_d$. We have that this period product scales like $1/d^2$ relative to that of $E$ (with the periods reversed when $d < 0$), and so we expect the number of fundamental discriminants $|d| < D$ such that $L(\mathrm{Sym}^3E_d, s)$ has even parity with $L(\mathrm{Sym}^3E_d, 2) = 0$ to be given crudely (up to log factors) by $\sum_{d<D} c/\sqrt{d^2}$. So we expect about $(\log D)^b$ quadratic twists with double zeros for the third symmetric power; generalizing predicts finitely many extra vanishings for higher (odd) powers.
We took the curves 11a: [0, −1, 1, 0, 0] and 14a: [1, 0, 1, −1, 0], and computed either $L(\mathrm{Sym}^3E_d, 2)$ or $L'(\mathrm{Sym}^3E_d, 2)$ for all fundamental discriminants $d$ with $|d| < 5000$. We did the same for 15a: [1, 1, 1, 0, 0] for $|d| < 4000$. We then looked at the number of vanishings (to nine digits of precision). For 11a we found 58 double zeros and one triple zero (indicated by a star in Table 2), while for 14a we found 88 double zeros and three triple zeros, and 15a yielded 83 double zeros and two triple zeros. It is quite difficult to accrue much data, mostly due to the fast growth of the conductor; for elliptic curves, Rubinstein [Conrey et al. 06] does not compute $L(E_d, 1)$ directly, but rather uses the Waldspurger formula (as made explicit in works such as [Pacetti and Tornaría 08]) and then just computes the coefficients of a modular form of weight 3/2 by enumerating lattice points in an ellipsoid.
$^{24}$ The contribution from the conductor actually comes from nonintegral Tamagawa numbers from the Bloch-Kato exponential map, and in the case of quadratic twists, the twisting parameter $d$ should not appear in the final expression.
6.6.2 CM Curves. Next we consider CM curves, for which we can compute significantly more data, but the modeling of vanishings is slightly different. Let $E$ be a rational elliptic curve with CM, and $\psi$ its Hecke character. We shall take $\psi$ to be "twist-minimal"; this is not the same as the "canonical" character of Rohrlich [Rohrlich 80, Rohrlich and Montgomery 80], but rather we just take $E$ to be a minimal (quadratic) twist. Indeed, we shall consider only 11 different choices of $E$, given (up to isogeny class) by 27a, 32a, 36a, 49a, 121a, 256a, 256b, 361a, 1849a, 4489a, and 26569a, noting that 27a/36a and 32a/256b are respectively cubic and quartic twist pairs. In the tables herein, these can appear in a briefer format, such as $67^2$ for 4489a.
We normalize the Hecke L-function $L(\psi, s)$ to have $s = 1$ be the center of the critical strip. For $d$ a fundamental discriminant, we let $\psi_d$ be the Hecke Grössencharakter $\psi$ twisted by the quadratic Dirichlet character of discriminant $d$. Finally, note that the symmetric powers $L(\mathrm{Sym}^{2k-1}\psi, s)$ are just $L(\psi^{2k-1}, s)$, where we must take $\psi^{2k-1}$ to be the primitive underlying Grössencharakter.
We then expect $L(\psi^3, 2)(2\pi)/\Omega_{\rm im}(E)^3$ to be rational with small denominator. We can then use discretization as before to count the expected number of fundamental discriminants $|d| < D$ for which the L-function $L(\psi_d^3, s)$ has even parity but vanishes at the central point; since the cube of the period of the twist scales like $1/d^{3/2}$, we expect the number of discriminants $d$ with even parity and $L(\psi_d^3, 2) = 0$ to be crudely given by $\sum_{d<D} 1/\sqrt{d^{3/2}}$, so we should get about $D^{1/4}$ such discriminants up to $D$. Alternatively, we note that $\psi^3$ corresponds to a weight-4 modular form, so that there is a weight-5/2 Shimura lift of it whose coefficients are related to $L(\psi_d^3, 2)$ via the Waldspurger correspondence; on considering how often these coefficients should vanish, we obtain a similar heuristic.
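The growth rates claimed for the two twist sums are easy to check numerically. The sketch below is a simplification: it sums over all positive integers $d < D$ rather than over fundamental discriminants (which only changes the constant), and the function name is hypothetical. The exponent 1 corresponds to the non-CM sum $\sum c/\sqrt{d^2}$, and the exponent 3/4 to the CM sum $\sum 1/\sqrt{d^{3/2}}$.

```python
import math


def twist_sum(D, exponent):
    """Sum of d^(-exponent) over integers 1 <= d < D."""
    return sum(d ** (-exponent) for d in range(1, D))


if __name__ == "__main__":
    for D in (10**3, 10**4, 10**5):
        non_cm = twist_sum(D, 1.0)    # compare with log D
        cm = twist_sum(D, 0.75)       # compare with 4 * D^(1/4)
        print(f"D = {D:>6}: sum 1/d = {non_cm:7.3f} (log D = {math.log(D):6.3f}),"
              f"  sum d^(-3/4) = {cm:7.3f} (4 D^(1/4) = {4 * D ** 0.25:7.3f})")
```

The first sum tracks $\log D$ up to an additive constant, while the second tracks $4D^{1/4}$, in line with the predictions of about $(\log D)^b$ and $D^{1/4}$ vanishing twists respectively.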
For higher symmetric powers, we expect that the analogous quotient of $L(\psi_d^{2k-1}, k)$ by a power of the period is rational with small denominator, and thus that there should be finitely many quadratic twists of even parity with vanishing central value.
We took the above eleven CM curves and took their (fundamental) quadratic$^{25}$ twists up to $10^5$. We must be careful to exclude twists that are isogenous to other twists. In particular, we need to define a primitive discriminant for a curve with CM by an order of the field $K$; this is a fundamental discriminant $d$ such that $\mathrm{disc}(K)$ does not divide $d$, except that for $K = \mathbf{Q}(i)$, a discriminant $d > 0$ is additionally primitive when $8 \mid d$. Note also that 27a and 36a have the same symmetric cube L-function. Table 3 lists our results for counts of central double zeros (to 32 digits) for the L-functions of the third, fifth, and seventh symmetric powers.$^{26}$ Tables 4 and 5 list the primitive discriminants that yield the double zeros. The notable signedness can be explained via the sign of the functional equation.$^{27}$ We are unable to explain the paucity of double zeros for twists of 26569a; [Liu and Xu 04] has the latest results on the vanishing of such L-functions, but their bounds are far from the observed data. Similarly, the last-listed double zero for 4489a at 67,260 seems a bit odd. We stress, however, that we fully expect the asymptotic prediction of $cD^{1/4}(\log D)^b$ to be correct here, our suspicion being that the constant $c$ for 26569a is rather small.
$^{25}$ The quartic twists of 32a and cubic twists of 27a/36a might also give interesting data; already in the early 1980s, N. M. Stephens computed that the seventh symmetric power of the Hecke character for $y^2 = x^3 + 127x$ yields a double central vanishing. See [Greenberg 83] for related information.
There appear to be implications vis-à-vis higher vanishings in some cases; for instance, except for 27a, in the thirteen cases that $L(\psi_d^5, s)$ has a double zero at $s = 3$, we have that $L(\psi_d, s)$ also has a double zero at $s = 1$.
Similarly, the seventh symmetric power for the 27,365th twist of 121a has a double zero, as does the third symmetric power, while the L-function of the twist itself has a triple zero. Also, the 22,909th twist of 36a has double zeros for its first, third, and fifth powers (note that 36a does not appear in Table 4, since the data are identical to those for 27a).

Comparison between the CM and Non-CM Cases.
For the twist computations for the symmetric powers, we can go much further (about 20 times as far) in the CM case because the conductors do not grow as rapidly. For the third symmetric power, the crude prediction is that we should have (asymptotically) many more extra vanishings for twists in the CM case than in the non-CM case, but this is not borne out by the data. Additionally, we have no triple zeros in the CM case (where the data set is almost one hundred times as large), while we already have six for the non-CM curves.$^{28}$ This is directly antithetical to our suspicion that there should be more extra vanishings in the CM case. As before, this might cast some doubt on our methodology of modeling vanishings.
In [Rodriguez Villegas and Zagier 91, Section 8], the authors mention the possibility of a formula of Waldspurger type for the twists of the Hecke Grössencharakters, but it does not seem that an exact formula has ever appeared. Using the techniques developed by Basmaji and Frey [Frey 94], we are able to compute the weight-5/2 lift for (say) the symmetric cube of the Hecke Grössencharakter attached to 49a. However, since we are currently unable to write it as a twisted ternary theta series as in [Rosson and Tornaría 07], it does not seem to aid our computations. In the non-CM case, it has been noted by R. Schulze-Pillot that a special case of a result of Ramakrishnan and Shahidi [Ramakrishnan and Shahidi 07] reinterprets the symmetric cube L-function as a weight-3 spinor L-function associated to a degree-2 Siegel modular form; again we are currently unable to use this in our computations.