Clinical study data is usually collected without knowing in advance what kind of data will be collected. In addition, the set of all data points that can apply to a patient in a given clinical study is almost always a superset of the data points actually recorded for that patient. As a result, clinical data resembles sparse data with an evolving schema. To help researchers at the Moffitt Cancer Center better manage clinical data, a tool called GURU was developed. It uses the Entity Attribute Value model to handle sparse data and lets users manage a database entity’s attributes without any changes to the database table definition. The Entity Attribute Value model’s read performance improves as the data becomes sparser, but it was observed to perform many times worse than a wide table when the attribute count is not sufficiently large. Ultimately, the design trades read performance for flexibility in the data schema.
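To make the trade-off concrete, the following is a minimal sketch of an Entity Attribute Value layout using Python's built-in `sqlite3` module. It is not GURU's actual schema: the table name `patient_eav`, its columns, and the sample attributes are all hypothetical and chosen only for illustration. It shows that a new attribute is added as a row rather than as a column, and that reading the data back in wide form requires a pivot step.

```python
import sqlite3


def build_eav_demo():
    """Create an in-memory EAV table (hypothetical schema, not GURU's)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE patient_eav ("
        " entity_id INTEGER NOT NULL,"
        " attribute TEXT NOT NULL,"
        " value TEXT,"
        " PRIMARY KEY (entity_id, attribute))"
    )
    # Only recorded data points are stored; missing ones take no space.
    rows = [
        (1, "age", "54"),
        (1, "tumor_stage", "II"),
        (2, "age", "61"),
        (2, "smoker", "yes"),
    ]
    conn.executemany("INSERT INTO patient_eav VALUES (?, ?, ?)", rows)
    return conn


def read_entity(conn, entity_id):
    """Return the recorded attributes of one entity as a dict."""
    cur = conn.execute(
        "SELECT attribute, value FROM patient_eav WHERE entity_id = ?",
        (entity_id,),
    )
    return dict(cur.fetchall())


def pivot_wide(conn):
    """Rebuild a wide-table view: one dict per entity, None where unrecorded."""
    attributes = [
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT attribute FROM patient_eav ORDER BY attribute"
        )
    ]
    wide = {}
    query = "SELECT entity_id, attribute, value FROM patient_eav"
    for entity_id, attribute, value in conn.execute(query):
        wide.setdefault(entity_id, dict.fromkeys(attributes))[attribute] = value
    return attributes, wide


if __name__ == "__main__":
    conn = build_eav_demo()
    # A new attribute needs no ALTER TABLE, just another row.
    conn.execute("INSERT INTO patient_eav VALUES (?, ?, ?)", (1, "ecog_score", "1"))
    print(read_entity(conn, 1))
    print(pivot_wide(conn))
```

The pivot in `pivot_wide` is the extra work that a wide table avoids on reads, which is one plausible reason an EAV design can lag behind a wide table when the data is dense.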
Identifier | oai:union.ndltd.org:USF/oai:scholarcommons.usf.edu:etd-8278 |
Date | 04 November 2017 |
Creators | Quintero, Michael C. |
Publisher | Scholar Commons |
Source Sets | University of South Florida |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | Graduate Theses and Dissertations |