## New Pattern Found In Prime Numbers 509

Posted
by
Soulskill

from the benford-and-sons dept.

from the benford-and-sons dept.

stephen.schaubach writes

*"Spanish Mathematicians have discovered a new pattern in primes that surprisingly has gone unnoticed until now. 'They found that the distribution of the leading digit in the prime number sequence can be described by a generalization of Benford's law. ... Besides providing insight into the nature of primes, the finding could also have applications in areas such as fraud detection and stock market analysis. ... Benford's law (BL), named after physicist Frank Benford in 1938, describes the distribution of the leading digits of the numbers in a wide variety of data sets and mathematical sequences. Somewhat unexpectedly, the leading digits aren't randomly or uniformly distributed, but instead their distribution is logarithmic. That is, 1 as a first digit appears about 30% of the time, and the following digits appear with lower and lower frequency, with 9 appearing the least often.'"*
## Re:Other bases? (Score:5, Informative)

Benson's Law is actually independent of the number base used. It wouldn't be much of a mathematical property if it wasn't. No matter how you convert a number, you will always see the same bias.

## The real article, and what it does and doesn't say (Score:3, Informative)

You can find the mathematicians' article at http://www.citeulike.org/group/3214/article/3664693 [citeulike.org] or http://arxiv.org/pdf/0811.3302 [arxiv.org] (pdf warning).

I find it interesting that the article doesn't prove any theorems. At least searching for the word "theorem" in the pdf only gives references to other theorems. Searching for "proof" gives no hits.

That leaves me thinking: what does this article tell us that we couldn't find out ourselves by ripping through some prime numbers? I thought the real power of math was to say something 100% certain about some infinitude of stuff, so we don't have to go and check every case by hand.

Oh well, I guess every open question needs some results on the form "this holds for all n <= bignum"; say, like the Goldbach Conjecture (every even number n > 2 is the sum of two primes).

## Re:Other bases? (Score:5, Informative)

logarithmicnature of the distribution of the digits, AFAIK.My math degree is getting dusty, but I'm pretty sure that the same pattern could be represented in another base by changing their generalization of Benford's law to include it, and the distribution would look like log(x)/log(9) or log(x)/log(11). Remember, changing the base of a logarithm is easy: for example, log(x)/log(

e) = ln(x)So you get the same distribution, different base.

## Isn't this obvious? (Score:1, Informative)

TFA does provide many details, but I believe everyone doing maths is taught that the number of primes before some number n is approached by n/ln n ( http://en.wikipedia.org/wiki/Prime_counting_function). This has been known for more that a century.

As specified by this formula, the prime density decreases, so it seems obvious that if one considers all primes with a set number of digits, fewer start with a nine than with a one.

Maybe I'm missing something but this does not seem to provide any new information or new pattern.

## Re:Other bases? (Score:5, Informative)

from benfords law link:

"The result holds regardless of the base in which the numbers are expressed, although the exact proportions change."

## What the arXiv paper says (Score:2, Informative)

Before I begin, I am a math phd candidate, but not in number theory. The following is probably better than a lay interpretation, but not an expert either.

Basically, they have generalized BL (Benson's Law) to get a GBL. They then tested the primes in the range [1,10^11] against GBL, and verified they were satisfied. They

DID NOT PROVE THIS HOLDS FOR ALL PRIMES!!!They then went on to conjecture the applications of this to other areas (finance, etc).Though the result is interesting, I really see this paper as a conjecture on the nature of primes, related to the Riemann zeta function. (From what I understand someone has proved zeros of the zeta function follow GBL.)

## Re:Stock market analysis? (Score:2, Informative)

Benford's law can be used to detect fraud (the article states this, I don't have any reason to doubt it). They studied primes and found a pattern that is associated with a related property that they are calling Generalized Benford's Law. Presumably, the generalized rule can be used to detect a wider range of unnatural activity than Benford's law itself.

## Re:On the density of prime numbers (Score:5, Informative)

Maybe if I had read the prime number theorem, I would have known that it's O(n / log n), which is somewhat bigger...

## Benford's law explained (Score:2, Informative)

This reminds me of an interesting article [dspguide.com] (PDF) I found a while back which explains Benford's law nicely. To quote:

## Re:Other bases? (Score:5, Informative)

Numbers are objects, I wish people would understand that numbers are just distinctions. The whole of mathematics is really just a language of form and structure, a system to systematize and decribe structure and forms (relationships are a type of form).

## Re:Other bases? (Score:5, Informative)

Benford's law works by the observation that, when numbers come up in certain real world contexts, the fluctuations you get in numbers should be proportional to the numbers themselves. Phrased differently, variations tend to be relative, not absolute. Because of this, if you have a very large range of random numbers from many real world measurements, then you would expect the number between t and t*(1.0001) not to vary too much for small changes in t. Let us try to use this observation very coarsely. Among the numbers with 6 digits, the number that look like 1xxxxxx (those between 100000 and 200000) should be about the same the number between 200000 and 400000. The same thing happens with the numbers with 5 digits or 7 digits or n digits (assuming that you have a wide range of random numbers, and the numbers are the kind that come from certain sorts of real world measurements). Additionally, you can get distributions for the first two digits, the first three digits, etc.

This observation doesn't depend on the base that you're working with.

Now, with the prime numbers, they have a distribution that is different from a lot of real world measurement data. The number of primes between n and n+d is approximately d/ln(n), where ln is the log with base e and d is small compared to n. So the number of primes between 500000 and 600000 is about 100000/ln(500000), and the number of primes between 500000 and 600000 is about 100000/ln(600000). By using this, and being slightly more careful, one can determine fairly easily the distribution of the leading terms of the prime numbers.

This is not a hard result. I would say that any professional mathematician who knew about the basic distribution of the primes could derive the distribution of the leading digis of the prime numbers fairly easily if anybody actually asked them to. The reason nobody mentioned this before is that nobody actually cares. While Benford's law does have applications to fraud detection, this new result does not. It's one of those things that makes people say "ooh, a pattern!" but which is just an easy and somewhat mundane corollary to a well known theorem.

## Not surprising at all (Score:4, Informative)

If you had asked me about the distribution of first digits of prime #s yesterday I probably would have guessed logarithmic, regardless of base (except for binary, of course).

Think about it. We know that primary # are distributed logarithmically. A set of N digit #s has equal subsets of numbers starting with 1, 2, 3, etc. Those subsets are equal in size, exclusive and completely ordered with respect to each other. So it follows that the # of prime #s in consecutive subsets would be a logarithmic function. And if you add the sizes of prime subsets for each starting digit, you'll still get a logarithmic distribution.

Nothing to see here, move along.

m

## Re:If you're dealing with phone numbers (Score:5, Informative)

In the original system design, all area codes had a middle digit of 0 or 1. The convention was that a middle digit of 1 was used for area codes that only covered part of a state, while a middle digit of 0 was used for area codes that covered entire states. Furthermore, an area code could not begin with a 1 or a 0. and an area code with a middle digit of 1 couldn't have 1 as the third digit. (This left the shortest dial time area code for a statewide code as 201, which went to New Jersey.)

As early as the late 1950s, the idea of single area codes for some states went out the window (with NJ splitting into 201 and 609 in 1958) because of increasing population and proliferation of phone service.

By the late 1980s, the rules were further changed to allow for area codes with middle digits other than 1 or 0. Area codes like 989 and 979 weren't introduced until the late 1980s at the very earliest, by which point very few people were still using rotary phones. At one point, I had heard that the middle digit value of 9 was reserved for the future to allow for four digit area codes, but I can't vouch for the accuracy of that recollection. There are plenty of other rules, some of which you can see summarized here... [wikipedia.org]

-JMP

## Re:Enron (Score:4, Informative)

At smaller scales than Enron, Benford and other related number distribution analysis schemes are indeed used to find fraud.

Know the time report you fill out for the company you work at? There's a chance that it passes through a filter. If you make up figures for how long you spend at certain tasks, someone higher up may see it with a footnote saying "High probability of data being fictitious". If this pattern repeat itself over months, don't be surprised if your chances of retaining your job diminishes.

On the other hand, you can use the statistics for your own advantage too. Not the least in games and gambling, where guessing your competitor's status and actions can change the odds quite dramatically. Including games and gambling you don't think of as games and gambling, like placing bids in auctions.

## Re:Other bases? (Score:5, Informative)

They are also distributed as Benson's law describes, providing that k is not a rational power of the base. IAAM.

## Re:Other bases? (Score:3, Informative)

Right, but the law is regarding distribution among the digits of possible first digits. Given that 1 is the only possible first digit in base 2, it could still be said to hold.

I don't actually know, I'm guessing here. But it seems like in base-8, you wouldn't be looking to include the digit 9 in your distribution analysis.

## Re:If you're dealing with phone numbers (Score:4, Informative)

Nice idea, but you give the phone company too much credit. In the old days telephone switches still used physical relays (this is well before transistors were invented). This significantly limited the number of connections in progress each switch could handle. Since switches are expensive, you naturally wanted to pass on the call as fast as possible so you could free up the switch for the next caller. A number like '212' wasn't just easy to dial, it was fast — remember this is the era of pulse dialing as well, so a '9' took literally 9 times longer to dial than a '1'. Assigning fast numbers like '212' to New York saved

moneyfor the phone company because Ma Bell could buy fewer switches. Any benefit to the customer was purely accidental.-JS

## Re:Other bases? (Score:5, Informative)

You mean, how many are Mercene primes [wikipedia.org]?

## Re:I'd go for base 12 (Score:4, Informative)

There on on 60's in a minute and 60 of those in an hour because we decided there should be.Actually, it's because the Sumerians and Babylonians used a base-60 counting system.

## Re:Other bases? (Score:5, Informative)

But how many would contain all 1s? Answer that, and provide a proof for your answer, and you'll make math history.

For those who didn't get it: it's not known whether there are infinitely many Mersenne primes [wikipedia.org], which have this form in binary (they're primes of the form 2^n - 1). Similarly, if you could figure out how many primes have

onlytheir first and last bits equal to 1, you would answer a longstanding question about Fermat primes [wikipedia.org] (which are primes of the form 2^n + 1).## Re:Other bases? (Score:3, Informative)

And sure enough, the formula says that the probability of 1 being the first digit of a binary number is 1.