Primary Biliary Cirrhosis, sequential data

This data set is a follow-up to the original PBC data set, and contains the follow-up laboratory data for each study patient. An analysis based on the enclosed data is found in Murtaugh PA, Dickson ER, Van Dam GM, Malinchoc M, Grambsch PM, Langworthy AL, Gips CH. "Primary biliary cirrhosis: prediction of short-term survival based on repeated patient visits." Hepatology. 20(1.1):126-34, 1994.

The primary PBC data set contains only baseline measurements of the laboratory parameters. This data set contains multiple laboratory results, but only on the first 312 patients. Some baseline data values in this file differ from the original PBC file, for instance because of data errors in prothrombin time and age that were discovered after the original analysis, during research work on dfbeta residuals. (These two data points are discussed in Fleming and Harrington, figure 4.6.7.) Another major difference is that substantially more follow-up was available for many of the patients at the time this data set was assembled.

One "feature" of the data deserves special comment. The last observation before death or liver transplant often has many more missing covariates than other data rows. The original clinical protocol for these patients specified visits at 6 months, 1 year, and annually thereafter. At these protocol visits lab values were obtained for a large pre-specified battery of tests. "Extra" visits, often undertaken because of worsening medical condition, did not necessarily have all this lab work. The missing values are thus potentially informative, and violate the usual "missing at random" (MCAR or MAR) assumptions made in analyses. Because of the earlier published results on the Mayo PBC risk score, however, the 5 variables involved in that computation were usually obtained, i.e., age, bilirubin, albumin, prothrombin time, and edema score.
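The pattern described above can be checked with a simple tabulation. The following is a minimal Python sketch, not part of the original distribution: the row layout (case, status, day, list of lab values), the use of None for a missing value, and the toy rows themselves are all assumptions made for illustration, and the numbers are invented rather than taken from the PBC file.

```python
from collections import defaultdict

def missing_by_visit_type(rows):
    """Mean number of missing covariates: final visit before an event vs. all other visits.

    rows: iterable of (case_id, status, day, values), where status follows the
    coding 0=alive, 1=transplanted, 2=dead and values is a list whose missing
    entries are None (an assumed convention for this sketch).
    """
    by_case = defaultdict(list)
    for case_id, status, day, values in rows:
        by_case[case_id].append((day, status, values))

    final_before_event, other = [], []
    for visits in by_case.values():
        visits.sort(key=lambda v: v[0])  # order visits by day
        for i, (_, status, values) in enumerate(visits):
            n_missing = sum(v is None for v in values)
            is_last = i == len(visits) - 1
            if is_last and status in (1, 2):
                final_before_event.append(n_missing)
            else:
                other.append(n_missing)

    def mean(xs):
        return sum(xs) / len(xs) if xs else float("nan")

    return {"final_before_event": mean(final_before_event), "other": mean(other)}

# Invented toy rows, NOT real PBC data.
toy_rows = [
    (1, 2, 0,   [1.1, 250.0, 3.5]),
    (1, 2, 180, [1.4, 260.0, 3.4]),
    (1, 2, 400, [3.9, None, None]),  # unscheduled visit with partial labs
    (2, 0, 0,   [0.8, None, 4.0]),
    (2, 0, 365, [0.9, 210.0, None]),
]
summary = missing_by_visit_type(toy_rows)
print(summary)
```

A large gap between the two means on the real file would be consistent with the informative missingness described above, although such a tabulation alone does not establish the mechanism.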

  • case number
  • number of days between registration and the earlier of death, transplantation, or study analysis time
  • status: 0=alive, 1=transplanted, 2=dead
  • drug: 1=D-penicillamine, 0=placebo
  • age in days, at registration
  • sex: 0=male, 1=female
  • day: number of days between enrollment and this visit date; the remaining values on the line of data refer to this visit
  • presence of ascites: 0=no, 1=yes
  • presence of hepatomegaly: 0=no, 1=yes
  • presence of spiders: 0=no, 1=yes
  • presence of edema: 0=no edema and no diuretic therapy for edema; .5=edema present without diuretics, or edema resolved by diuretics; 1=edema despite diuretic therapy
  • serum bilirubin in mg/dl
  • serum cholesterol in mg/dl
  • albumin in gm/dl
  • alkaline phosphatase in U/liter
  • SGOT in U/ml (serum glutamic-oxaloacetic transaminase; the enzyme name has subsequently changed to "AST" in the medical literature)
  • platelets per cubic mm / 1000
  • prothrombin time in seconds
  • histologic stage of disease
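
The variable list above can be turned into a small reader. The following Python sketch rests on several assumptions that the description does not confirm: that each data line holds the 19 values in the listed order separated by whitespace, and that a missing value is written as a single "." (the usual SAS convention). The field names are hypothetical labels chosen for this example, and the sample line is invented.

```python
# Hypothetical short names for the 19 variables, in the order listed above.
FIELDS = [
    "case", "futime", "status", "drug", "age_days", "sex", "day",
    "ascites", "hepatomegaly", "spiders", "edema", "bili", "chol",
    "albumin", "alk_phos", "sgot", "platelet", "protime", "stage",
]

def parse_line(line, missing="."):
    """Parse one whitespace-delimited data line into a dict of floats.

    The `missing` token is an assumption; it is mapped to None.
    """
    tokens = line.split()
    if len(tokens) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} values, got {len(tokens)}"
        )
    return {
        name: (None if tok == missing else float(tok))
        for name, tok in zip(FIELDS, tokens)
    }

# Invented example line, NOT taken from the real file; cholesterol is missing.
sample = "7 1500 0 0 18000 1 365 0 1 0 0.5 1.2 . 3.6 900 80.5 210 10.8 2"
record = parse_line(sample)

# Age is stored in days at registration; convert to years for readability.
age_years = record["age_days"] / 365.25
print(record["case"], record["day"], record["chol"], round(age_years, 1))
```

Keeping missing values as None rather than a numeric placeholder makes it harder to average over them by accident, which matters here because the missingness is potentially informative.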
    sample SAS code to read the data