Laboratory information is an important part of clinical diagnostics, and many laboratory test results make for good predictor variables (Donze et al., 2013; Sahni et al., 2018). The LABS table includes fields that describe the laboratory test name, abbreviation, LOINC code, and result:
There are several ways to include lab information in the final table. One is to include the raw lab result as a continuous variable. However, this leads to a problem because the result would be NULL for most labs. We could work around this issue by imputing a value in the normal range when the result is missing. Another approach is to use a binary variable that indicates whether a lab test result is in the abnormal range. This solves the missing data problem, since a missing result is simply coded as zero. However, with this method a BNP value of 1,000 (which indicates severe CHF) would be indistinguishable from a BNP value of 350 (which indicates mild CHF). We will demonstrate both approaches in this chapter.
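The two encodings described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names, the imputed "normal" value of 50, and the abnormal cutoff of 100 are hypothetical placeholders chosen for the example, not clinical reference ranges and not values taken from the LABS table.

```python
from typing import Optional

# Hypothetical values for illustration only; not clinical reference ranges.
BNP_NORMAL_FILL = 50.0       # assumed "normal range" value used for imputation
BNP_ABNORMAL_CUTOFF = 100.0  # assumed threshold above which BNP is flagged


def bnp_continuous(value: Optional[float]) -> float:
    """Approach 1: keep the raw result, imputing a normal value when missing."""
    return BNP_NORMAL_FILL if value is None else value


def bnp_abnormal_flag(value: Optional[float]) -> int:
    """Approach 2: 1 if the result is abnormal, 0 if it is normal or missing."""
    return int(value is not None and value > BNP_ABNORMAL_CUTOFF)


# Three hypothetical patients: no BNP drawn, BNP of 350, BNP of 1,000.
bnp_results = [None, 350.0, 1000.0]
continuous = [bnp_continuous(v) for v in bnp_results]
flags = [bnp_abnormal_flag(v) for v in bnp_results]
```

In this sketch the continuous encoding keeps 350 and 1,000 distinct but depends on the imputed value, while the binary encoding handles the missing result cleanly but assigns the same flag to both abnormal patients.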
Also note that the Lab_value field sometimes contains special characters, for example in the troponin result. These characters will need to be removed and the lab values interpreted accordingly. Culture results (not included in this example) are entirely textual, often naming specific bacterial strains rather than reporting numbers.
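One possible way to clean such values is sketched below. It assumes the special characters are things like a leading `<` or `>`, thousands separators, or trailing units, which the text does not specify; the helper name `parse_lab_value` is hypothetical. Treating a result such as `<0.01` as the number 0.01 is only one interpretation, and a real pipeline might record the comparator separately.

```python
import re
from typing import Optional

# Matches the first plain decimal number in a string, e.g. "0.01" in "<0.01".
_NUMBER = re.compile(r"\d*\.?\d+")


def parse_lab_value(raw: Optional[str]) -> Optional[float]:
    """Return the first number found in a raw Lab_value string, else None.

    Hypothetical helper: comparator symbols, units, and commas are dropped,
    and purely textual results yield None.
    """
    if raw is None:
        return None
    match = _NUMBER.search(raw.replace(",", ""))
    return float(match.group()) if match else None


# Hypothetical raw values: a censored result, a value with a thousands
# separator, a purely textual result, and a missing result.
cleaned = [parse_lab_value(v) for v in ["<0.01", "1,000", "no growth", None]]
```

Textual results, such as culture reports, come back as `None` here and would need separate handling.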
Again, this is a simplified example; many of the common labs that would be drawn for these patients (for example, WBC count, hemoglobin, sodium, potassium, and so on) are excluded here.