Post List

Mathematics posts


  • May 14, 2013
  • 09:30 PM
  • 30 views

Four color problem, odd Goldbach conjecture, and the curse of computing

by Artem Kaznatcheev in Evolutionary Games Group

For over twenty-three hundred years, at least since the publication of Euclid’s Elements, the conjecture and proof of new theorems has been the sine qua non of mathematics. The method of proof is at “the heart of mathematics, the royal road to creating analytical tools and catalyzing growth” (Rav, 1999; pg 6). Proofs are not […]... Read more »

Rav, Y. (1999) Why Do We Prove Theorems?. Philosophia Mathematica, 7(1), 5-41. DOI: 10.1093/philmat/7.1.5  

  • May 13, 2013
  • 05:30 AM
  • 43 views

Quasi-magical thinking and superrationality for Bayesian agents

by Artem Kaznatcheev in Evolutionary Games Group

As part of our objective and subjective rationality model, we want a focal agent to learn the probability that others will cooperate given that the focal agent cooperates () or defects (). In a previous post we saw how to derive point estimates for and (and learnt that they are the maximum likelihood estimates): , […]... Read more »

  • May 8, 2013
  • 11:30 PM
  • 38 views

Evolutionary games in set structured populations

by Artem Kaznatcheev in Evolutionary Games Group

We have previously discussed the importance of population structure in evolutionary game theory, and looked at the Ohtsuki-Nowak transform for analytic studies of games on one of the simplest structures — random regular graphs. However, there is another extremely simple structure to consider: a family of inviscid sets. We can think of each agent as [...]... Read more »

Tarnita, C., Antal, T., Ohtsuki, H., & Nowak, M. (2009) Evolutionary dynamics in set structured populations. Proceedings of the National Academy of Sciences, 106(21), 8601-8604. DOI: 10.1073/pnas.0903019106  

  • April 26, 2013
  • 05:27 AM
  • 87 views

Tartaglia-Pascal triangle and quantum mechanics

by Marco Frasca in The Gauge Connection

The paper I wrote with Alfonso Farina and Matteo Sedehi about the link between the Tartaglia-Pascal triangle and quantum mechanics is now online (see here). This paper contains as a statement my theorem that provides a connection between the square root of a Wiener process and the Schrödinger equation that aroused a lot of interest [...]... Read more »

  • April 20, 2013
  • 03:17 AM
  • 79 views

Probabilities reveal shape of climate change

by Andy Extance in Simple Climate

David Stainforth from the London School of Economics and his colleagues have developed a new way to analyse weather data and understand which aspects of climate have changed most on a local level, showing European trends with less than a 2% chance of happening at random.... Read more »

Chapman, S., Stainforth, D., & Watkins, N. (2013) On estimating local long-term climate trends. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 371(1991), 20120287-20120287. DOI: 10.1098/rsta.2012.0287  

  • April 16, 2013
  • 11:37 AM
  • 64 views

Homing Pigeons Never Stop Learning Ways to Get Home

by Elizabeth Preston in Inkfish





A young homing pigeon must learn quickly how to find its way home from the strange neighborhoods where humans insist on leaving it. At first the bird does this by relying on its crudest instincts, returning to its roost along a route full of youthful zigzags. Over time, though, it refines its methods. A mature pigeon takes a much simpler route, because it has drawn itself a more complex map.

Homing pigeons have been subjected to all kinds of research. The latest study used GPS devices, which the birds carried in little Teflon backpacks. Ingo Schiffner of the Queensland Brain Institute and Roswitha Wiltschko of Goethe-Universität Frankfurt studied pigeons at three different ages: juveniles (6 to 7 months old), yearlings (the same pigeons in their second year of life, after going through a training program), and older trained birds (at least two years of age).

Wearing their tracking harnesses, the birds were released from various sites that ranged from 3.2 to 23.5 kilometers away from their home loft in Frankfurt, Germany. Here are the routes some of the birds took when returning to their home (the square) from a release site 6.8 kilometers away (the triangle):






You'll notice that some pigeons traveled by more, shall we say, scenic routes than others. The researchers calculated each bird's "efficiency" and found that the youngest group of pigeons were the least direct fliers. On average, they traveled more than three times the distance of a straight-line trip between the two points. (The pigeon researchers, in a bit of a mixed-species metaphor, refer to this ideal trip as a "beeline.")



The two older groups of birds were much more efficient, flying no more than 25 percent farther than they needed to. Since their youthful zigzagging days, they had gone through a training program that had them practicing as far as 40 kilometers from their home roost. Now more familiar with the features of the landscape around their home, they could navigate it easily.
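The comparison above can be sketched in a few lines of Python. This is only an illustration: the tracks are made-up coordinates in kilometers, and the "detour ratio" (path length divided by beeline distance) is an assumed stand-in for the efficiency measure, whose exact definition the post does not give.

```python
import math

def path_length(points):
    """Total length of a track given as a list of (x, y) positions in km."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def detour_ratio(points):
    """Path length divided by the straight-line (beeline) distance."""
    beeline = math.dist(points[0], points[-1])
    return path_length(points) / beeline

# Hypothetical tracks: release site at (0, 0), home loft 6.8 km away.
zigzag = [(0.0, 0.0), (1.7, 3.0), (3.4, -3.0), (5.1, 3.0), (6.8, 0.0)]
direct = [(0.0, 0.0), (3.4, 0.5), (6.8, 0.0)]

print(f"zigzag track: {detour_ratio(zigzag):.2f} x beeline")
print(f"direct track: {detour_ratio(direct):.2f} x beeline")
```

A ratio near 1 corresponds to the near-direct flights of the older birds; a ratio near 3 corresponds to the juveniles' wandering routes.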



But that didn't mean the pigeons stopped refining their internal maps after their first year. Schiffner and Wiltschko also calculated something called the "correlation dimension," which is the number of factors that seem to be contributing to a system—in this case, pigeon navigation.



Previous research has suggested that homing pigeons have multiple tools in their navigational toolkit. In various experiments, "pigeons have been deprived of visual cues, magnetic cues, olfactory cues, infrasound cues, and their gravitational sense," Schiffner says, "yet pigeons are still able to find their way home." Rather than relying on just one tool at a time, they seem to use several.



The correlation dimension is meant to count how many tools each pigeon uses to complete a trip. The youngest pigeons usually hovered close to 2. But year-old pigeons had a somewhat higher score, and the oldest pigeons were closer to 3. In his previous research, Schiffner says, pigeons have seemed to use as many as 4 types of navigational cues simultaneously. 



This suggests that pigeons keep refining their mental maps as they age, adding new elements—visual landmarks, say, or the smell of a local factory—to others such as sunlight and magnetic fields. "I cannot say yet which factors pigeons are using," Schiffner says, but he believes the factors add up with age. He and Wiltschko report their results in the Journal of Experimental Biology. Schiffner adds, "I assume that pigeons continue to learn and integrate new information into their navigational map as they grow older."

Attaching GPS devices to animals is currently trendy; there's a whole new journal dedicated to the subject. But humans have a long history of rigging our technologies to pigeons. At the start of the 20th century, German apothecary Julius Neubronner designed and patented a little camera on a harness for homing pigeons to carry (he had previously used the birds to ferry prescriptions and drugs for his patients).

The German military toyed with using Neubronner's pigeon-camera technology for reconnaissance during World War I. With the cameras hooked to timers, the birds could take pictures above enemy lines and carry them back home. These days we're attaching our instruments to the accommodating birds not for the sake of spying on our enemies, but to decode the secrets of the pigeons themselves.






Schiffner, I., & Wiltschko, R. (2013). Development of the navigational system in homing pigeons: increase in complexity of the navigational map Journal of Experimental Biology DOI: 10.1242/jeb.085662

Images: Homing pigeons by Amanda Dague (via Flickr); figure from Schiffner and Wiltschko; pigeon cameras by Julius Neubronner (via Wikimedia Commons).

... Read more »

  • April 15, 2013
  • 05:30 AM
  • 23 views

Mathematical Turing test: Readable proofs from your computer

by Artem Kaznatcheev in Evolutionary Games Group

We have previously discussed the finicky task of defining intelligence, but surely being able to do math qualifies? Even if the importance of mathematics in science is questioned by people as notable as E.O. Wilson, surely nobody questions it as an intelligent activity? Mathematical reasoning is not necessary for intelligence, but surely it is sufficient? [...]... Read more »

  • April 13, 2013
  • 07:22 AM
  • 101 views

Alternate histories back unique modern warmth claims

by Andy Extance in Simple Climate

Creating and averaging thousands of slightly different historic temperature records shows that Northern hemisphere 21st century temperatures are almost certainly unique in the last 600 years, according to Harvard University’s Martin Tingley.... Read more »

  • April 8, 2013
  • 12:00 PM
  • 99 views

Programming playground: Cells as (quantum) computers?

by Artem Kaznatcheev in Evolutionary Games Group

Nearly a year ago, the previous post in this series introduced a way for programmers to play around with biology: a model that simulated the dynamics of a whole cell at unprecedented levels of details. But what if you want to play with the real thing? Can you program a living cell? Can you compute [...]... Read more »

Bonnet J, Yin P, Ortiz ME, Subsoontorn P, & Endy D. (2013) Amplifying Genetic Logic Gates. Science. PMID: 23539178  

  • April 4, 2013
  • 02:19 PM
  • 61 views

Kids Learn Better When Teachers Wave Their Hands

by Elizabeth Preston in Inkfish




Maybe it's no mistake that we talk about "grasping" new ideas. When we find our hands moving wildly as we try to explain something, maybe we shouldn't feel ridiculous. Research in math classrooms has found that kids learned better when a teacher used gestures—and their grip on the new material improved even more after the lesson ended.

Teachers who gesture more or less while they speak can have other differences too, of course: they might use different intonation or vocabulary, or have more or less energy. University of Iowa psychologist Susan Wagner Cook and her coauthors, though, were only interested in the effect of teachers' hand motions. To isolate this factor, they created a series of videos.

In the videos, aimed at elementary schoolers, a teacher taught a single scripted lesson. The subject of the lesson was equivalence, the idea that what's on one side of an "=" must be equal to the other side.

In one set of videos, the teacher used her hand to indicate "one side" and "the other side" of an equation. A second set of videos showed the same teacher reading the same script, but she kept her hands at her sides. The researchers made several recordings and chose the ones in which the teacher's intonation was the most consistent, ensuring that the only difference between the lessons was her hands.

The kids who watched the videos were 184 boys and girls from 22 classrooms in central Michigan schools. Most were in second or third grade, and a few were in fourth. The kids had taken a pretest to make sure they weren't already familiar with this mathematical idea.

Each classroom watched a videotaped lesson, either the one with hand gestures or the one without. Immediately afterward, they took a test with questions such as:

7 + 2 + 4 = 7 + __
Kids who had understood the lesson would answer "6."

A day later, the kids had a second set of test questions sprung on them. First they answered the same type of questions that they had the day before. Then they saw a second set of questions designed to make them "transfer" the rules they'd learned to new situations. For second graders, this meant trickier addition problems such as:

6 + 4 + 2 = __ + 3
in which none of the numbers on the right side matched the left. Third and fourth graders had to transfer their new skills to multiplication problems such as:

5 x 2 x 3 = __ x 3

Kids who had seen the lesson with gestures did significantly better than the no-gesture kids on the first test. A day later, they again outperformed the hands-free group—and beat their own test scores from the day before. Their understanding of the lesson seemed to have gotten even better in the 24 intervening hours. (This wasn't true of kids who watched the hands-free lesson.) Finally, the gesture group did better on the test of transferred skills.

A couple of factors could explain why students learned better from a gesturing teacher, the authors write. Hand movements might help them pay better attention to the teacher, for example. And seeing a repeated hand motion across different problems might reinforce how those problems are similar.

That doesn't answer the question of why students continued to improve over the next 24 hours. Susan Wagner Cook explains that shortly after we form new memories, those memories are stabilized or "consolidated" in our minds. Consolidation can make new memories even stronger.

"We do know that motor memory is often consolidated during sleep," Cook says. Seeing another person's hands moving may have built motor (movement) memories in kids' minds, as if they were pointing and waving their own hands. "One possibility is that memories encoded with gesture are more likely to be consolidated during sleep," Cook says. "We are trying to figure this out!"

Although being able to point to the two sides of an equation seems like a clear advantage in this particular lesson, Cook says the benefit of gesturing goes beyond arithmetic—or even math. Other studies have shown that hand motions help kids learn in a wide range of subjects.

What's new is the idea that gestures help in the future, not only the present. Cook points out that even though some kids learned the lesson just fine without gestures, they didn't show the same improvement over time that the other kids did. Instead of only clarifying, gestures may help kids grasp their new knowledge more tightly. (Please imagine a fist-closing gesture to drive home this idea.)


Image: sleepinyourhat (via Flickr)

Cook, S., Duffy, R., & Fenn, K. (2013). Consolidation and Transfer of Learning After Observing Hand Gesture Child Development DOI: 10.1111/cdev.12097

... Read more »

  • March 29, 2013
  • 02:00 AM
  • 94 views

Individual versus systemic risk in asset allocation

by Yunjun Yang in Evolutionary Games Group

Proponents of free markets often believe in an “invisible hand” that guides an economic system without external controls like government regulations. Therefore a highly efficient economic equilibrium can be created if all market participants act purely out of self-interest. In the paper titled “Individual versus systemic risk and the Regulator’s Dilemma”, Beale et al. (2011) applied [...]... Read more »

Beale N., Rand D. G., Battey H., Croxson K., May R. M., & Nowak M. A. (2011) Individual versus systemic risk and the Regulator's Dilemma. Proceedings of the National Academy of Sciences, 108(31), 12647-12652. DOI: 10.1073/pnas.1105882108  

  • March 19, 2013
  • 04:30 AM
  • 130 views

Mathematical models in finance and ecology

by Artem Kaznatcheev in Evolutionary Games Group

Theoretical physicists have the reputation of an invasive species — penetrating into other fields and forcing their methods. Usually these efforts simply irritate the local researchers, building a general ambivalence towards field-hopping physicists. With my undergraduate training primarily in computer science and physics, I’ve experienced this skepticism first hand. During my time in Waterloo, I [...]... Read more »

May RM, Levin SA, & Sugihara G. (2008) Ecology for bankers. Nature, 451(7181), 893-5. PMID: 18288170  

  • March 14, 2013
  • 06:40 AM
  • 152 views

A Higgs particle but which one?

by Marco Frasca in The Gauge Connection

After the Moriond conference last week, and while the Moriond QCD and Aspen conferences are still running, an important conclusion can be drawn: the one given in this CERN press release. The particle announced on 4th July last year is for certain a Higgs particle as it has spin 0, positive parity and couples [...]... Read more »

Marco Frasca. (2013) Revisiting the Higgs sector of the Standard Model. arXiv. arXiv: 1303.3158v1

Marco Frasca. (2009) Exact solutions of classical scalar field equations. J.Nonlin.Math.Phys.18:291-297,2011. arXiv: 0907.4053v2

  • March 13, 2013
  • 08:46 AM
  • 170 views

Predicting Technological Progress: Putting Moore’s Law to the Test

by gunnardw in The Beast, the Bard and the Bot

Being able to predict the pace of technological development could be quite useful for a lot of people. No surprise then, that several models (or ‘laws’) have been posited that aim to describe how technological progress will unfurl (the most famous one probably being Moore’s law, for those interested: original article here). However, these laws [...]... Read more »

  • March 13, 2013
  • 02:40 AM
  • 258 views

Brain Lateralization - Logical Left vs Creative Right

by Vivek Misra in Beautiful Mind

Broad generalizations are often made in popular psychology about one side or the other having characteristic labels, such as "logical" for the left side or "creative" for the right. These labels need to be treated carefully; although a lateral dominance is measurable, both hemispheres contribute to both kinds of processes.

In psychology and neurobiology, the theory is based on what is known as the lateralization of brain function. So does one side of the brain really control specific functions? Are people either left-brained or right-brained? Like many popular psychology myths, this one has a basis in fact that has been dramatically distorted and exaggerated.

Language functions such as grammar, vocabulary and literal meaning are typically lateralized to the left hemisphere, especially in right-handed individuals. Although 95% of right-handed people have left-hemisphere dominance for language, 18.8% of left-handed people have right-hemisphere dominance for language function. Additionally, 19.8% of the left-handed have bilateral language functions. Even within various language functions (e.g., semantics, syntax, prosody), degree (and even hemisphere) of dominance may differ. The processing of visual and auditory stimuli, spatial manipulation, facial perception, and artistic ability are represented bilaterally, but may show right hemisphere superiority. Numerical estimation, comparison and online calculation depend on bilateral parietal regions while exact calculation and fact retrieval are associated with left parietal regions, perhaps due to their ties to linguistic processing. Dyscalculia is a neurological syndrome associated with damage to the left temporo-parietal junction. This syndrome is associated with poor numeric manipulation, poor mental arithmetic skill, and the inability to either understand or apply mathematical concepts.

The right brain-left brain theory grew out of the work of Roger W. Sperry, who was awarded the Nobel Prize in 1981. While studying the effects of epilepsy, Sperry discovered that cutting the corpus callosum could reduce or eliminate seizures.

However, these patients also experienced other symptoms after the communication pathway between the two sides of the brain was cut. For example, many split-brain patients found themselves unable to name objects that were processed by the right side of the brain, but were able to name objects that were processed by the left side of the brain. Based on this information, Sperry suggested that language was controlled by the left side of the brain.

Depression is linked with a hyperactive right hemisphere, with evidence of selective involvement in "processing negative emotions, pessimistic thoughts and unconstructive thinking styles", as well as vigilance, arousal and self-reflection, and a relatively hypoactive left hemisphere, "specifically involved in processing pleasurable experiences" and "relatively more involved in decision-making processes". Additionally, "left hemisphere lesions result in an omissive response bias or error pattern whereas right hemisphere lesions result in a commissive response bias or error pattern." The delusional misidentification syndromes, reduplicative paramnesia and Capgras delusion are also often the result of right hemisphere lesions. There is evidence that the right hemisphere is more involved in processing novel situations, while the left hemisphere is most involved when routine or well-rehearsed processing is called for.

Later research has shown that the brain is not nearly as dichotomous as once thought. For example, recent research has shown that abilities in subjects such as math are actually strongest when both halves of the brain work together.

Taylor, I. & Taylor, M. M. (1990). Psycholinguistics: Learning and using Language. Pearson. ISBN 978-0-13-733817-7. p. 367

Beaumont, J.G. (2008). Introduction to Neuropsychology, Second Edition. The Guilford Press. ISBN 978-1-59385-068-5. Chapter 7

Ross, E., & Monnot, M. (2008). Neurology of affective prosody and its functional–anatomic organization in right hemisphere. Brain and Language, 104 (1), 51-74 DOI: 10.1016/j.bandl.2007.04.007

... Read more »

George MS, Parekh PI, Rosinsky N, Ketter TA, Kimbrell TA, Heilman KM, Herscovitch P, & Post RM. (1996) Understanding emotional prosody activates right hemisphere regions. Archives of neurology, 53(7), 665-70. PMID: 8929174  

Dehaene, S., Piazza, M., Pinel, P., & Cohen, L. (2003) THREE PARIETAL CIRCUITS FOR NUMBER PROCESSING. Cognitive Neuropsychology, 20(3-6), 487-506. DOI: 10.1080/02643290244000239  

  • March 11, 2013
  • 04:00 PM
  • 166 views

Ecological public goods game

by Artem Kaznatcheev in Evolutionary Games Group

As an evolutionary game theorist working on cooperation, I sometimes feel like a minimalist engineer. I spend my time thinking about ways to design the simplest mechanisms possible to promote cooperation. One such mechanism that I accidentally noticed (see bottom left graph of results from summer 2009) is the importance of free space, or — [...]... Read more »

Hauert, C., Holmes, M., & Doebeli, M. (2006) Evolutionary games and population dynamics: maintenance of cooperation in public goods games. Proceedings of the Royal Society B: Biological Sciences, 273(1600), 2565-2571. DOI: 10.1098/rspb.2006.3600  

  • March 6, 2013
  • 11:20 AM
  • 176 views

Hey Hey! We’re The Monkeys!

by Miss Behavior in The Scorpion and the Frog

A tamarin rock star (photographed by Ltshears at Wikimedia)

Our moods change when we hear music, but not all music affects us the same way. Slow, soft, higher-pitched, melodic songs soothe us; upbeat classical music makes us more alert and active; and fast, harsh, lower-pitched, dissonant music can rev us up and stress us out. Why would certain sounds affect us in specific emotional ways? One possibility is an overlap between how we perceive music and how we perceive human voice. Across human languages, people talk to their babies in slower, softer, higher-pitched voices than they speak to adults. And when we’re angry, we belt out low-pitched growly tones. The specific vocal attributes that we use in different emotional contexts are specific to our species… So what makes us so egocentric as to think that other species might respond to our music in the same ways that we do?

A serene tamarin ponders where he placed his smoking jacket (photographed by Michael Gäbler at Wikimedia)

Chuck Snowdon, a psychologist and animal behaviorist at the University of Wisconsin in Madison, and David Teie, a musician at the University of Maryland in College Park, teamed up to ask whether animals might respond more strongly to music if it were made specifically for them. Cotton-top tamarins are squirrel-sized monkeys from northern Colombia that are highly social and vocal. As in humans (and pretty much every other vocalizing species studied), they tend to make higher-pitched tonal sounds when in friendly states and lower-pitched growly sounds when in aggressive states. But tamarin vocalizations have different tempos and pitch ranges than ours.

Chuck and David musically analyzed recorded tamarin calls to determine the common attributes of the sounds they make when they are feeling friendly or when they are aggressive or fearful. Then they composed music based on these attributes, essentially creating tamarin happy-music and tamarin death metal. They also composed original music based on human vocal attributes. They played 30-second clips of these different music types to pairs of tamarins and measured their behavior while the song was being played and for the first 5 minutes after it had finished. They compared these behavioral measures to the tamarins’ behavior during baseline periods (time periods not associated with the music sessions).

An example of happy tamarin music (Copyright by David Teie and available through Biology Letters) can be found here.

An example of aggressive tamarin music (Copyright by David Teie and available through Biology Letters) can be found here.

As the researchers had predicted, tamarins were much more affected by tamarin music than by human music. Happy tamarin music seemed to calm them, causing the tamarins to move less and eat and drink more in the 5 minutes after the music stopped. Compared to the happy tamarin music, the aggressive tamarin music seemed to stress them out, causing the tamarins to move more and show more anxious behaviors (like bristling their fur and peeing) after the music stopped. The tamarins also showed lesser reactions to the human music. They showed less anxious behavior after the happy human music played and moved less after the aggressive human music played. So, human voice-based music also affected the tamarins to some degree, but not as strongly. This may be because there are some aspects of how we communicate emotions with our voice that are the same in tamarins. (How did the tamarin music make you feel?)

Can you imagine what we could do with this idea of species-specific music? Well, David and Chuck did! They have since developed music for cats using similar techniques. Although they're still working on the paper, they have said that the cats preferred and were more calmed by cat music compared to human music. You can find samples and get your own copies here.

We often think of vocal signals conveying messages in particular sounds, like words and sentences. But calls seem to do much more than that, making the emotions and behaviors of those listening resemble the emotions of those calling.

Want to know more? Check this out:

Snowdon, C., & Teie, D. (2009). Affective responses in tamarins elicited by species-specific music. Biology Letters, 6 (1), 30-32 DOI: 10.1098/rsbl.2009.0593

... Read more »

  • March 4, 2013
  • 09:13 PM
  • 140 views

How Many English Tweets are Actually Possible?

by Jon Wilkins in Lost in Transcription

So, recently (last week, maybe?), Randall Munroe, of xkcd fame, posted an answer to the question "How many unique English tweets are possible?" as part of his excellent "What If" series. He starts off by noting that there are 27 letters (including spaces), and a tweet length of 140 characters. This gives you $27^{140}$ -- or about $10^{200}$ -- possible strings.

Of course, most of these are not sensible English statements, and he goes on to estimate how many sensible ones there are. This analysis is based on Shannon's estimate of the entropy rate for English -- about 1.1 bits per letter. This leads to a revised estimate of $2^{140 \times 1.1}$ English tweets, or about $2 \times 10^{46}$. The rest of the post explains just what a hugely big number that is -- it's a very, very big number.
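As a quick check of the arithmetic above, here is a minimal Python sketch. It works in base-10 logarithms so the results read as powers of ten; the variable names are illustrative and not from the original post.

```python
import math

ALPHABET_SIZE = 27   # 26 letters plus the space
TWEET_LENGTH = 140   # characters per tweet
ENTROPY_RATE = 1.1   # bits per character, the estimate quoted above

# log10 of the number of arbitrary 140-character strings: 27^140
log10_all_strings = TWEET_LENGTH * math.log10(ALPHABET_SIZE)

# log10 of the entropy-rate estimate: 2^(140 * 1.1)
log10_entropy_count = TWEET_LENGTH * ENTROPY_RATE * math.log10(2)

print(f"all strings: about 10^{log10_all_strings:.1f}")
print(f"entropy-rate estimate: about 10^{log10_entropy_count:.1f}")
```

The two exponents come out at roughly 200.4 and 46.4, in line with the figures quoted.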

The problem is that this number is also wrong.

It's not that the calculations are wrong. It's that the entropy rate is the wrong basis for the calculation.

Let's start with what the entropy rate is. Basically, given a sequence of characters, how easy is it to predict what the next character will be. Or, how much information (in bits) is given by the next character above and beyond the information you already had.

If the probability of a character being the $i$th letter in the alphabet is $p_i$, the entropy of the next character is given by

$H = -\sum_i p_i \log_2 p_i$

If all characters (26 letters plus space) were equally likely, the entropy of the character would be $\log_2 27$, or about 4.75 bits. If some letters are more likely than others (as they are), it will be less. According to Shannon's original paper, the distribution of letter usage in English gives about 4.14 bits per character. (Note: Shannon's analysis excluded spaces.)
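The entropy formula is easy to try out directly. The sketch below is illustrative only: the uniform case reproduces the 4.75-bit figure, while the skewed distribution is a made-up example (not real English letter frequencies) showing that unequal probabilities lower the entropy.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# 27 equally likely characters (26 letters plus space).
uniform = [1 / 27] * 27

# Hypothetical skewed distribution: one character takes half the mass,
# and the remaining 26 share the other half equally.
skewed = [0.5] + [0.5 / 26] * 26

print(f"uniform: {entropy_bits(uniform):.2f} bits")
print(f"skewed:  {entropy_bits(skewed):.2f} bits")
```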

But, if you condition the probabilities on the preceding character, the entropy goes down. For example, if we know that the preceding character is a b, there are many letters that might follow, but the probability that the next character is a c or a z is less than it otherwise might have been, and the probability that the next character is a vowel goes up. If the preceding letter is a q, it is almost certain that the next character will be a u, and the entropy of that character will be low, close to zero, in fact.

When we go to three characters, the marginal entropy of the third character will go down further still. For example, t can be followed by a lot of letters, including another t. But, once you have two ts in a row, the next letter almost certainly won't be another t.

So, the more characters in the past you condition on, the more constrained the next character is. If I give you the sequence "The quick brown fox jumps over the lazy do_," it is possible that what follows is "cent at the Natural History Museum," but it is much more likely that the next letter is actually "g" (even without invoking the additional constraint that the phrase is a pangram). The idea is that, as you condition on longer and longer sequences, the marginal entropy of the next character asymptotically approaches some value, which has been estimated in various ways by various people at various times. Many of those estimates are in the ballpark of the 1.1 bits per character estimate that gives you $10^{46}$ tweets.
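The effect of conditioning can be seen even on a toy text. The following sketch estimates the entropy of the next character with and without knowledge of the preceding character, using simple bigram counts. The sample sentence and function names are assumptions for illustration, and a sample this small badly underestimates the true entropy of English; only the direction of the comparison is meaningful.

```python
import math
from collections import Counter

def next_char_entropy_bits(text):
    """Empirical entropy of a character, ignoring what came before it."""
    counts = Counter(text[1:])
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def conditional_entropy_bits(text):
    """Empirical entropy of a character given the one preceding it."""
    pairs = Counter(zip(text, text[1:]))   # (previous, next) counts
    prev_counts = Counter(text[:-1])       # how often each previous char occurs
    total = sum(pairs.values())
    return -sum(
        (n / total) * math.log2(n / prev_counts[prev])
        for (prev, _), n in pairs.items()
    )

sample = "the quick brown fox jumps over the lazy dog and the quiet queen"

print(f"unconditioned: {next_char_entropy_bits(sample):.2f} bits")
print(f"given previous character: {conditional_entropy_bits(sample):.2f} bits")
```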

So what's the problem?

The problem is that these entropy-rate measures are based on the relative frequencies of use and co-occurrence in some body of English-language text. The fact that some sequences of words occur more frequently than other, equally grammatical sequences of words, reduces the observed entropy rate. Thus, the entropy rate tells you something about the predictability of tweets drawn from natural English word sequences, but tells you less about the set of possible tweets.

That is, that $10^{46}$ number is actually better understood as the inverse of an estimate of the likelihood that two random tweets are identical, when both are drawn at random from 140-character sequences of natural English language. This will be the same as the number of possible tweets only if all possible tweets are equally likely.

Recall that the character following a q has very low entropy, since it is very likely to be a u. However, a quick check of Wikipedia's "List of English words containing Q not followed by U" page reveals that the next character could also be space, a, d, e, f, h, i, r, s, or w. This gives you eleven different characters that could follow q. The entropy rate gives you something like the "effective number of characters that can follow q," which is very close to one.
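To make the contrast concrete, here is a small sketch comparing the count of possible followers of q with the "effective number" implied by the entropy, computed as two raised to the entropy. The probabilities are invented for illustration (99% for u, the remainder split evenly); only the list of eleven characters comes from the text.

```python
import math

# Hypothetical probabilities for the character that follows "q".
followers = {"u": 0.99}
others = [" ", "a", "d", "e", "f", "h", "i", "r", "s", "w"]
for ch in others:
    followers[ch] = 0.01 / len(others)

entropy = -sum(p * math.log2(p) for p in followers.values())

possible_count = len(followers)   # how many characters CAN follow q
effective_count = 2 ** entropy    # how many "effectively" follow q

print(f"possible followers:  {possible_count}")
print(f"effective followers: {effective_count:.2f}")
```

Under these assumed probabilities the effective count stays just above one, even though eleven distinct characters are possible.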

When we want to answer a question like "How many unique English tweets are possible?" we want to be thinking about the analog of the eleven number, not the analog of the very-close-to-one number.

So, what's the answer then?

Well, one way to approach this would be to move up to the level of the word. The OED has something like 170,000 entries, not counting archaic forms. The average English word is 4.5 characters long (5.5 including the trailing space). Let's be conservative, and say that a word takes up seven characters. This gives us up to twenty words to work with. If we assume that any sequence of English words works, we would have $4 \times 10^{104}$ possible tweets.

The xkcd calculation, based on an English entropy rate of 1.1 bits per character, predicts only 10^46 distinct tweets. 10^46 is a big number, but 10^104 is a much, much bigger number, bigger than 10^46 squared, in fact.

If we impose some sort of grammatical constraints, we might assume that not every word can follow every other word and still make sense. Now, one can argue that the constraint of "making sense" is a weak one in the specific context of Twitter (see, e.g., Horse ebooks), so this will be quite a conservative correction. Let's say the first word can be any of the 170,000, and each of the following zero to nineteen words is constrained to 20% of the total (34,000). This gives us 2 x 10^91 possible tweets.
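The constrained count follows the same pattern. This sketch assumes exactly the rule described above: a free first word, then zero to nineteen further words, each drawn from 20% of the vocabulary.

```python
VOCABULARY = 170_000
FOLLOWERS = VOCABULARY // 5     # 20% of the vocabulary, i.e. 34,000
MAX_FOLLOWING = 19              # up to nineteen words after the first

# One free first word, then k constrained words for k = 0..19.
constrained = sum(VOCABULARY * FOLLOWERS ** k for k in range(MAX_FOLLOWING + 1))

digits = str(constrained)
print(f"about {digits[0]} x 10^{len(digits) - 1} tweets")
print("less than (10^46)^2:", constrained < (10 ** 46) ** 2)
```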

That's less than 10^46 squared, but just barely.

10^91 is 100 billion times the estimated number of atoms in the observable universe.

By comparison, 10^46 is teeny tiny. 10^46 is only one ten-thousandth of the number of atoms in the Earth.

In fact, for random sequences of six-letter words (seven characters including the space) to total only 10^46 tweets, we would have to restrict ourselves to a vocabulary of just 200 words.
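The 200-word figure can be checked by inverting the count: find the vocabulary size V such that V to the twentieth power is about 10^46. This is a sketch that considers only full twenty-word tweets.

```python
TARGET_EXPONENT = 46    # the entropy-based estimate, as a power of ten
WORDS_PER_TWEET = 20    # 140 characters at seven characters per word

# Solve V ** WORDS_PER_TWEET == 10 ** TARGET_EXPONENT for V.
vocabulary_needed = 10 ** (TARGET_EXPONENT / WORDS_PER_TWEET)
print(f"vocabulary needed: about {vocabulary_needed:.0f} words")
```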

So, while 10^46 is a big number, large even in comparison to the expected waiting time for a Cubs World Series win, it actually pales in comparison to the combinatorial potential of Twitter.

One final example. Consider the opening of Endymion by John Keats: "A thing of beauty is a joy for ever: / Its loveliness increases; it will never / Pass into nothingness;" 18 words, 103 characters. Preserving this sentence structure, imagine swapping out various words, Mad-Libs style, introducing alternative nouns for thing, beauty, loveliness, nothingness, alternative verbs for is, increases, will / pass, alternative prepositions for of, into, and alternative adverbs for for ever and never.

Given 10000 nouns, 100 prepositions, 10000 verbs, and 1000 adverbs, we can construct 10^38 different tweets without even altering the grammatical structure. Tweets like "A jar of butter eats a button quickly: / Its perspicacity eludes; it can easily / swim through Babylon;"

That's without using any adjectives. Add three adjective slots, with a panel of 1000 adjectives, and you get to 10^47 -- just riffing on Endymion.
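The Mad-Libs arithmetic is a product over slots. In this sketch the slot counts are read off the substitutions listed above (four nouns, three verbs, two prepositions, two adverbs); the dictionary layout is just one convenient way to write it down.

```python
import math

# (number of slots, size of the word panel) for each part of speech
slots = {
    "noun": (4, 10_000),         # thing, beauty, loveliness, nothingness
    "verb": (3, 10_000),         # is, increases, will / pass
    "preposition": (2, 100),     # of, into
    "adverb": (2, 1_000),        # for ever, never
}

without_adjectives = math.prod(size ** n for n, size in slots.values())
with_adjectives = without_adjectives * 1_000 ** 3   # three adjective slots

print(without_adjectives == 10 ** 38)
print(with_adjectives == 10 ** 47)
```

Both comparisons come out true, since every panel size is a power of ten and the exponents simply add: 16 + 12 + 4 + 6 = 38, plus 9 for the adjectives.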

So tweet on, my friends.

Tweet on.

C. E. Shannon. (1951) Prediction and Entropy of Written English. Bell System Technical Journal, 30, 50-64.

  • March 4, 2013
  • 01:50 PM
  • 164 views

Distributed control of uncertain systems using superpositions of linear operators - Likelihood calculus paper series review part 3

by Travis DeWolf in studywolf

The third (and final, at the moment) paper in the likelihood calculus series from Dr. Terrence Sanger is Distributed control of uncertain systems using superpositions of linear operators. Carrying the torch for the series right along, here Dr. Sanger continues investigating the development of an effective, general method of controlling systems operating under uncertainty. This is the paper that delivers on all the promises of building a controller out of a system described by the stochastic differential operators we’ve been learning about in the previous papers. In addition to describing the theory, there are examples of system simulation with code provided! Which is a wonderful, and sadly uncommon, thing in academic papers, so I’m excited. We’ll go through a comparison of Bayes’ rule and Markov processes (described by our stochastic differential equations), go quickly over the stochastic differential operator description, and then dive into the control of systems. The examples and code run-through I’m going to have to save for another post, though, just to keep the size of this post reasonable.... Read more »

  • March 3, 2013
  • 12:55 PM
  • 188 views

Why Are People Bad at Evaluating Risks?

by Eric Horowitz in peer-reviewed by my neurons

Using evidence or data to communicate risk to the American public can be a fool’s errand. The most publicized “la, la, la, I can’t hear you!” moments involve people ignoring dangers that threaten ideology or political beliefs. Others may choose to ignore risks because immediate short-term pleasures are too alluring. [...]... Read more »

