  • This article explains how to improve Hospital Management with Data Science by Predicting the Length of Stay and Readmission Risk of patients. It complements the Hospital Management AI App demo available from the Spotfire Interactive Demo Gallery.

     

    Introduction

    Hospitals are among the most complex organizations to manage because of the volume of patient data and the diversity of their management processes and assets. Recent work with our customer, the National University Hospital System (NUHS), showed us how much data science can improve a hospital's daily operations and patient care. We therefore used Spotfire and Spotfire Data Science to develop the Hospital Management AI Application. It provides data analytics on historical patient data that requires privacy protection, and advanced analytics that use machine learning models to predict the Length of Stay and Readmission Risk for patients.


     

    For demo purposes, we use a public dataset of US patients that contains over 200,000 historical visit records, and we have built analytical workflows on top of it. The end-to-end analytics are divided into the following four parts.


    Figure 1. Cover page that shows the different parts of the demo

    Part 1 Understand Historical Data of Patients

    Our historical patient dataset contains typical information that a hospital would collect, such as timestamps for the different stages of the admission process, how patients arrived at the hospital, the resources required for treatment, and total expenses up to discharge.


    Figure 2. Dashboard Part 1 for Historical Data Exploration

    The most important aspect to monitor on the dashboard is the admission process. We therefore created KPIs for the waiting time from admission to triage and from triage to consultation, along with a patient journey view that traces the full journey of a selected patient. Marking rows in the data table makes it easy to compare waiting times across hours of the day and months of the year, giving a better understanding of the admission process, especially during peak hours.
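
    As a rough illustration of how such waiting-time KPIs can be derived from the admission timestamps, the following Python sketch averages both waits per admission hour. The field names (`admission`, `triage`, `consultation`) and the sample records are assumptions made for this example and are not taken from the demo dataset.

```python
from datetime import datetime
from statistics import mean

FMT = "%Y-%m-%d %H:%M"

# Hypothetical visit records; the field names are assumptions for illustration.
visits = [
    {"admission": "2023-01-05 08:10", "triage": "2023-01-05 08:25", "consultation": "2023-01-05 09:05"},
    {"admission": "2023-01-05 08:40", "triage": "2023-01-05 09:10", "consultation": "2023-01-05 10:00"},
    {"admission": "2023-01-05 14:00", "triage": "2023-01-05 14:05", "consultation": "2023-01-05 14:20"},
]

def minutes_between(start, end):
    """Return the elapsed minutes between two timestamp strings."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 60

def waiting_kpis_by_hour(records):
    """Average both waiting times, grouped by the hour of admission."""
    by_hour = {}
    for record in records:
        hour = datetime.strptime(record["admission"], FMT).hour
        by_hour.setdefault(hour, []).append((
            minutes_between(record["admission"], record["triage"]),
            minutes_between(record["triage"], record["consultation"]),
        ))
    return {
        hour: {
            "admission_to_triage": mean(w[0] for w in waits),
            "triage_to_consultation": mean(w[1] for w in waits),
        }
        for hour, waits in by_hour.items()
    }

kpis = waiting_kpis_by_hour(visits)
for hour in sorted(kpis):
    print(hour, kpis[hour])
```

    Grouping by hour mirrors the peak-hour comparison described above; grouping by month would only require changing the key that is extracted from the admission timestamp.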

    To support resource management, a bar chart shows the number of patients registered by month, with selection boxes for filtering patients by bed type. Users can see the trend in registrations over time and apply the built-in time-series forecasting function to estimate future demand for different bed types, as shown in Figure 3 and Figure 4 below.


    Figure 3. Applying the time-series data forecasting function to the bar chart in Spotfire
     

    Figure 4. Patients registered by month in history with forecast values in dotted lines
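
    To give a feel for what a time-series forecast of monthly registrations involves, here is a minimal Python sketch of Holt's linear trend method (double exponential smoothing). It is an illustration only: it is not claimed to be the algorithm behind the Spotfire forecast function, and the smoothing parameters and sample counts are assumptions.

```python
def holt_forecast(series, horizon, alpha=0.5, beta=0.3):
    """Forecast `horizon` future values with Holt's linear trend method.

    alpha smooths the level and beta smooths the trend; both values
    are illustrative assumptions, not tuned settings.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + step * trend for step in range(1, horizon + 1)]

# Hypothetical monthly registrations for one bed type.
monthly_patients = [410, 425, 440, 430, 455, 470, 465, 480]
forecast = holt_forecast(monthly_patients, horizon=3)
print([round(v, 1) for v in forecast])
```

    A seasonal method would suit data with strong yearly patterns better; the trend-only version is shown because it stays short.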

    Part 2 Protect Sensitive Information of Patients

    We have included a second dataset of patient profiles in this demo to show options for protecting sensitive information. This is an essential step to ensure that sensitive information is accessible only to specific users.

    Figure 5. Dashboard Part 2 for displaying data analytics with sensitive data protection

    Effective methods for hiding sensitive information include masking string columns with a special character, permuting strings at the character level, adding random deviation to numerical columns, and aggregating numbers into ranges. These can be applied by adding a data transformation or writing a few lines of Python code in the Spotfire dashboard. The result is shown in Figure 6.

    Figure 6. Masked patient profile data

    To illustrate how this was achieved, consider an example that shows only the first and last characters of some string columns and masks the remaining characters with a random number of asterisks. The new columns can be calculated with a simple Python function, as displayed in Figure 7.

    Figure 7. Applying Python function for data masking in Spotfire
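
    The function in Figure 7 is only available as a screenshot, so the following is an independent Python sketch of the masking techniques described in this part, not a reproduction of the demo code. The function names, the `max_stars` limit, the 10% deviation, and the range width are all assumptions chosen for illustration.

```python
import random

def mask_middle(value, rng=random, max_stars=8):
    """Keep the first and last characters; replace the rest with 1..max_stars asterisks."""
    text = str(value)
    if len(text) <= 2:
        return "*" * len(text)  # too short to reveal any character
    stars = "*" * rng.randint(1, max_stars)
    return text[0] + stars + text[-1]

def add_random_deviation(number, rng=random, max_fraction=0.1):
    """Perturb a numeric value by up to +/- max_fraction of its magnitude."""
    return number * (1 + rng.uniform(-max_fraction, max_fraction))

def to_range(number, width=10):
    """Aggregate a number into a fixed-width range label such as '30-39'."""
    lower = (int(number) // width) * width
    return f"{lower}-{lower + width - 1}"

# Hypothetical patient profile; the column names are assumptions.
profile = {"name": "Jonathan", "age": 34, "weight_kg": 82.0}
masked = {
    "name": mask_middle(profile["name"]),
    "age": to_range(profile["age"]),
    "weight_kg": round(add_random_deviation(profile["weight_kg"]), 1),
}
print(masked)
```

    Using a random number of asterisks, rather than one per hidden character, also hides the length of the original string.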

    Part 3 Predict the Length of Stay for Patients

    Hospital length of stay is measured, for in-hospital admissions, as the number of days that the patient stays in the hospital during a single admission. Knowing the predicted length of stay helps provide an optimal level of patient care and improves the effectiveness of hospital resource management.
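
    The definition above can be sketched in a few lines of Python. This assumes that admission and discharge timestamps are available as strings in the format shown; the function name and timestamp format are hypothetical and are not taken from the demo.

```python
from datetime import datetime

def length_of_stay_days(admitted, discharged, fmt="%Y-%m-%d %H:%M"):
    """Length of stay in (fractional) days for a single admission."""
    start = datetime.strptime(admitted, fmt)
    end = datetime.strptime(discharged, fmt)
    if end < start:
        raise ValueError("discharge precedes admission")
    return (end - start).total_seconds() / 86400

# Hypothetical admission: three and a half days in hospital.
los = length_of_stay_days("2023-03-01 10:00", "2023-03-04 22:00")
print(los)
```

    A value computed this way from historical records would serve as the target that a Length of Stay model learns to predict.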


    Figure 8. Dashboard Part 3 for predicting the Length of Stay for patients

    This dashboard page shows information about the machine learning model for predicting the Length of Stay, including the validation results and model summary, as well as the predictions for test patients. The model is created in Spotfire Data Science, where data scientists can develop workflows to process the patients' historical data, perform feature engineering, train the predictive models, and deploy the models to the production environment, as displayed in Figure 9. From the control panel on the dashboard, users can click a button to trigger the workflows in Data Science that train the Length of Stay prediction model, and the results are automatically refreshed on the dashboard. Spotfire users can build, use, and execute a workflow in Spotfire Data Science through the Data Function for Data Science in Spotfire. For more information, see the article "Data Function for Spotfire Data Science - Team Studio in Spotfire", which explains step by step how to set up the connection.

    Figure 9. Hospital management workflows in TIBCO Data Science

    Users can define various workflows in TIBCO Data Science using out-of-the-box drag-and-drop operators to achieve different goals. For instance, the workflow in Figure 10 shows how to replace null values, generate more features, and apply correlation filtering for feature selection, and the workflow in Figure 11 shows how to train and export a machine learning model and then use it to make predictions on test data. Because the workflows leverage Spark pipelines and optimizations, the calculations remain fast on large datasets.

    Figure 10. The workflow of feature generation for modeling in TIBCO Data Science
     

    Figure 11. The workflow of modeling in TIBCO Data Science
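
    For readers who want to see the feature preparation steps outside the drag-and-drop environment, the following plain Python sketch mimics null replacement, feature generation, and one common form of correlation filtering (dropping a feature that is highly correlated with one already kept). It does not reproduce the operators used in the workflows; the mean-fill strategy, the 0.9 threshold, and the feature names are assumptions.

```python
from statistics import mean

def fill_nulls(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5

def correlation_filter(features, threshold=0.9):
    """Keep a feature only if it is not highly correlated with one already kept."""
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical features; names and values are assumptions for illustration.
age = fill_nulls([30, None, 50, 70])
features = {
    "age": age,
    "age_months": [a * 12 for a in age],  # generated feature, redundant with age
    "num_procedures": [1, 4, 2, 3],
}
selected = correlation_filter(features)
print(selected)
```

    Here the generated `age_months` column is perfectly correlated with `age`, so the filter keeps only one of the two.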

    Part 4 Predict the Readmission Risk for Patients

    Readmission is another critical metric for improving hospital management. It can be defined as a patient being admitted to the hospital again within a given time interval after discharge. Hospital readmission is undesirable for patients' well-being and expensive for the hospital. Identifying patients with a high readmission risk can therefore help physicians make better decisions about treatment and the discharge process, improve patients' well-being, and reduce total cost.
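
    As a minimal sketch of this definition, the Python below labels each stay "Yes" when the same patient is admitted again within a window after discharge. The 30-day window is an assumption chosen for illustration (the interval used in the demo is not stated), and the record fields are hypothetical.

```python
from datetime import date, timedelta

def flag_readmissions(admissions, window_days=30):
    """Label each stay "Yes" if the patient is admitted again within the window."""
    by_patient = {}
    for stay in admissions:
        by_patient.setdefault(stay["patient_id"], []).append(stay)

    labels = {}
    window = timedelta(days=window_days)
    for stays in by_patient.values():
        stays.sort(key=lambda s: s["admitted"])
        # Pair each stay with the patient's next stay (None for the last one).
        for current, following in zip(stays, stays[1:] + [None]):
            readmitted = (
                following is not None
                and following["admitted"] - current["discharged"] <= window
            )
            labels[current["stay_id"]] = "Yes" if readmitted else "No"
    return labels

# Hypothetical admission history; field names are assumptions.
admissions = [
    {"stay_id": 1, "patient_id": "A", "admitted": date(2023, 1, 1), "discharged": date(2023, 1, 5)},
    {"stay_id": 2, "patient_id": "A", "admitted": date(2023, 1, 20), "discharged": date(2023, 1, 22)},
    {"stay_id": 3, "patient_id": "B", "admitted": date(2023, 2, 1), "discharged": date(2023, 2, 3)},
    {"stay_id": 4, "patient_id": "A", "admitted": date(2023, 6, 1), "discharged": date(2023, 6, 2)},
]
labels = flag_readmissions(admissions)
print(labels)
```

    The resulting "Yes"/"No" label per stay is the kind of binary target that a classification model can be trained on.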


    Figure 12. Dashboard Part 4 for predicting the Readmission Risk for patients

    We also use Spotfire Data Science to perform the machine learning here. Because each patient's readmission status is calculated and flagged as a binary label ("Yes"/"No"), binary classification algorithms such as Logistic Regression and Gradient Boosting are used to train models that predict the readmission risk. On the dashboard, blue highlights patients who are predicted to be at high risk of readmission, and yellow highlights the probability associated with the predicted readmission risk.
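
    To make the binary classification step concrete, here is a small, self-contained Python sketch of logistic regression trained by batch gradient descent. It is a teaching example under stated assumptions and not the model or the operators used in the demo: the two scaled features, the training rows, the learning rate, and the 0.5 high-risk threshold are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(rows, labels, learning_rate=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            for j, v in enumerate(x):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias

def readmission_probability(weights, bias, x):
    """Predicted probability that a patient is readmitted."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))

# Hypothetical training data: [scaled prior admissions, scaled length of stay].
rows = [[0.0, 0.1], [0.2, 0.2], [0.0, 0.3], [0.6, 0.6], [0.8, 0.8], [1.0, 0.7]]
labels = [0, 0, 0, 1, 1, 1]  # 1 = readmitted ("Yes"), 0 = not readmitted ("No")

weights, bias = train_logistic_regression(rows, labels)
probability = readmission_probability(weights, bias, [0.9, 0.9])
high_risk = probability >= 0.5  # assumed threshold for flagging high risk
print(round(probability, 3), high_risk)
```

    The model outputs a probability, and a separate threshold turns it into a high-risk flag, which corresponds to the two highlighted values on the dashboard.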

    Summary

    In summary, the Hospital Management AI App demonstrates the following:

    • Hospital administrators can use a customized Spotfire dashboard to generate insights from patient data using machine learning, to make better decisions and reduce management costs, without needing to understand how to build the data science models.

    • Masking sensitive data for privacy protection can be achieved using built-in functions or Python scripts available in Spotfire.
    • Training machine learning models with large datasets for predicting patients' length of stay and for predicting readmission risk can become simple tasks using the data science platform.

    If you would like to explore more AI Apps, please visit the Spotfire Demo Gallery.

