A couple of days ago… well, maybe a bit more: last week, I bumped into photocopies of Francis Galton’s original data set, which he put together through a survey in which he paid 500 pounds (!!!) to every family that completed his questionnaire. Interestingly enough, one of the conditions of the survey was that he would return the completed questionnaire to the family after processing it, which he did. Yes, I’m looking at you, crookbook/shmancestry kindergarten: you’d better start learning fast from the relative of Charles Darwin.

Anyway, I transcribed it myself (and checked it several times) and then processed it, because the categorical ‘short’-‘tall’ (or even ‘idiotic’) entries had to be replaced with imputed numbers. The photocopies (displayed via links to pictures hosted on another site) and the resulting CSV files, including the main one, “galton-family-heights-imputed-final.csv”, are in the virtual organization Galton’s data. I will write about this data set and what I’m going to do with it a little bit later.
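
For the curious, here is a minimal sketch of what this kind of imputation can look like in Python. It is not the exact procedure behind the files above: the column names, the sample rows and the placeholder heights are all made up for illustration, and what to do with an entry like ‘idiotic’ is a judgment call that the sketch simply leaves as missing.

```python
import csv
import io

# Hypothetical placeholder heights in inches, for illustration only.
# These are NOT the values used to build the real CSV files.
PLACEHOLDER_HEIGHTS = {"short": 60.0, "tall": 70.0}


def impute_height(entry, placeholders=PLACEHOLDER_HEIGHTS):
    """Return (height, imputed) for one raw cell of the height column.

    Numeric cells pass through unchanged. Known categorical labels are
    replaced by their placeholder and flagged as imputed. Anything else
    (for example 'idiotic') is left as None so it can be reviewed by hand.
    """
    text = entry.strip().lower()
    try:
        return float(text), False
    except ValueError:
        pass
    if text in placeholders:
        return placeholders[text], True
    return None, False


# Made-up sample rows; the family labels and column names are assumptions.
SAMPLE = "family,child,height\nA,1,71.5\nA,2,tall\nB,1,short\nB,2,idiotic\n"


def impute_rows(csv_text):
    """Parse CSV text and add imputed heights plus an 'imputed' flag."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        height, imputed = impute_height(row["height"])
        rows.append({**row, "height": height, "imputed": imputed})
    return rows


if __name__ == "__main__":
    for row in impute_rows(SAMPLE):
        print(row)
```

Keeping the ‘imputed’ flag next to each value makes it easy to separate measured heights from guessed ones later on.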

P.S. I really feel for the little fifth daughter of family #176, marked as ‘idiotic’ in Galton’s data set. She was less than 57.5 inches tall… Was her name ‘Cinderella’? Who knows.

Later.