Smoking data


Between 1972 and 1974, a survey of the electoral roll, concerned largely with thyroid disease and heart disease, was carried out in Whickham, a mixed urban and rural district near Newcastle upon Tyne, United Kingdom (Tunbridge et al. 1977). Twenty years later, a follow-up study was conducted (Vanderpump et al. 1995).

The present data contain observations on the n = 1,314 women who were classified at the initial survey either as current smokers or as never having smoked. For information, a further 162 women had smoked but stopped, and only 18 had no recorded smoking habits; neither group is included here. The data contain the 20-year survival status of every woman.


Variable  Explanation (Unit)
age       Age at initial survey (years)
smoker    yes/no: smoking status at initial survey (yes = current smoker, no = never smoked)
death     yes/no: yes = died within 20 years of the initial survey, no = still alive

Data download

Format       Download
text (csv)   smoking.csv
rda (for R)  smoking.rda

Load data into R
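A minimal sketch of one way to read the downloaded files, under stated assumptions: both files are assumed to sit in the current working directory, the CSV is assumed to have a header row with the columns age, smoker and death coded as in the variable table above, and `read_smoking` is a hypothetical helper name introduced here for illustration. The name of the object stored in smoking.rda is not documented on this page, so the sketch prints whatever `load()` restores instead of guessing it.

```r
# Hypothetical helper (not part of the data set): read the CSV file and
# turn the yes/no columns into factors with "no" as the reference level.
# Assumes a header row with the columns age, smoker, death.
read_smoking <- function(path = "smoking.csv") {
  d <- read.csv(path, stringsAsFactors = FALSE, strip.white = TRUE)
  d$smoker <- factor(d$smoker, levels = c("no", "yes"))
  d$death  <- factor(d$death,  levels = c("no", "yes"))
  d
}

# CSV version: only runs if the file has been downloaded.
if (file.exists("smoking.csv")) {
  smoking <- read_smoking("smoking.csv")
  str(smoking)
  # Cross-tabulate smoking status against 20-year survival
  print(table(smoker = smoking$smoker, death = smoking$death))
}

# rda version: load() returns the names of the objects it restores,
# so printing them shows what the data set is called in this file.
if (file.exists("smoking.rda")) {
  restored <- load("smoking.rda")
  print(restored)
}
```

Setting the factor levels explicitly keeps "no" as the baseline category in later tables and models; any value other than "yes" or "no" would become NA, which makes unexpected coding easy to spot.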



Vanderpump MPJ, Tunbridge WMG, French JM, et al. The incidence of thyroid disorders in the community: a twenty-year follow-up of the Whickham survey. Clin Endocrinol. 1995;43:55–69.

Tunbridge WMG, Evered DC, Hall R, et al. The spectrum of thyroid disease in a community: the Whickham survey. Clin Endocrinol. 1977;7:481–493.

Appleton DR, French JM, Vanderpump MPJ. Ignoring a covariate: an example of Simpson’s paradox. Am Statistician. 1996;50:340–341.

Created: 2020-10-31 Sat 13:45