The Rodriguez File (part 4)

A reader responded to Daniel Eastwood’s earlier efforts to solve ‘The Rodriguez File’:

“Mrs Rodriguez may have meant ‘straight vertical lines of 5 or 6 spaces’. You have drawn lines that have different angles. Would that make a difference?”

Naturally, Dan Eastwood again dug into the file and replied:

“Yes it does. I interpreted her question to mean the diagonal line ending at about the extra ‘e’ before lucubrations. Rereading this now, I think I misunderstood her intent. Vertical lines are much easier to count though:
(Length, # of): 2, 23; 3, 2; 4, 0; 5, 1; 6, 1. This is an average length of 0.33, but a Poisson distribution (my original hypothesis) is most certainly not correct. Unfortunately, this makes the math harder. This isn’t a complete answer, but it’s the best I can do now:

There are 36 cases where a space on one line is followed by a space on the subsequent line (this counts longer lines several times) out of 15 subsequent rows. This is an average of 36/15 = 2.4 per pair of (subsequent) rows. In an 80 character line, the probability that a space will be followed in the next row by another space is 2.4/80 = 0.03.

Assuming the characters are random and the rows independent, then the probability of a line of length 2 being followed by a third subsequent space is (0.03)^2 = 0.0009, a fourth is (0.03)^3 = 0.000027, and so on.

So the probability of getting straight lines of spaces 5 or 6 rows long is pretty small. Even without the math, I would guess that my assumptions of randomness and independence are probably not true. […] Short of creating a bootstrap simulation of a random distribution of words in the paragraph and the ‘lines’ that result, I don’t see any way to get at this problem. The distribution of words isn’t really random, so maybe it’s not too surprising that those lines appear.”
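For readers who want to try the simulation Eastwood describes, here is a minimal sketch in Python. It is not his method, only one way the idea might be set up. The function names, the 80-column wrap via `textwrap`, and the choice to resample words with replacement are all assumptions made for illustration, and the original typesetting of the Rodriguez text (justification, hyphenation) is not reproduced. The first helper simply rechecks the arithmetic of the quoted tally.

```python
import random
import textwrap
from collections import Counter

# Tally quoted in the reply: vertical run length -> number of runs of that length.
TALLY = {2: 23, 3: 2, 4: 0, 5: 1, 6: 1}


def tally_summary(tally):
    """Return (number of runs, mean run length, space-over-space pairs)."""
    n_runs = sum(tally.values())
    mean_length = sum(length * n for length, n in tally.items()) / n_runs
    # A run of length k contains k - 1 vertically adjacent pairs of spaces.
    pairs = sum((length - 1) * n for length, n in tally.items())
    return n_runs, mean_length, pairs


def vertical_space_runs(lines):
    """Count maximal vertical runs of spaces (length >= 2), keyed by length."""
    runs = Counter()
    width = max((len(line) for line in lines), default=0)
    for col in range(width):
        length = 0
        # The empty sentinel row flushes a run that reaches the last line.
        for line in list(lines) + [""]:
            if col < len(line) and line[col] == " ":
                length += 1
            else:
                if length >= 2:
                    runs[length] += 1
                length = 0
    return runs


def simulate_longest_runs(words, width=80, trials=1000, seed=0):
    """Hypothetical bootstrap: resample words with replacement, rewrap them,
    and record the longest vertical run of spaces seen in each trial."""
    rng = random.Random(seed)
    longest = Counter()
    for _ in range(trials):
        sample = rng.choices(words, k=len(words))
        lines = textwrap.wrap(" ".join(sample), width=width)
        runs = vertical_space_runs(lines)
        # A trial with no run of length >= 2 is recorded as 1.
        longest[max(runs, default=1)] += 1
    return longest
```

Calling `tally_summary(TALLY)` gives 27 runs, a mean length of 63/27 (about 2.33), and 36 space-over-space pairs, which matches the 36 cases in the reply. The quoted 0.33 therefore appears to be the mean excess over the minimum length of 2, though that reading is an inference. To run the simulation, one would pass the words of the paragraph, for example `simulate_longest_runs(text.split())`, where `text` is the Rodriguez paragraph (not reproduced here), and look at how often the longest run reaches 5 or 6.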

The Rodriguez Text with Vertical Lines

Improbable Research