Introduction
Hospitals are among the most complex organizations to manage because of the volume of patient data and the diversity of management processes and assets. Recent work with our customer, the National University Hospital System (NUHS), showed us how much data science can improve a hospital's daily operations and patient care. We therefore used Spotfire and Spotfire Data Science to develop this Hospital Management AI Application. It provides data analytics on historical patient data that requires privacy protection, and advanced analytics that use machine learning models to predict the Length of Stay and Readmission Risk for patients.
For demo purposes, we use a public dataset of US patients that contains over 200,000 historical visit records, and we have built analytical workflows on top of it. The end-to-end analytics are divided into the following four parts.
Figure 1. Cover page that shows the different parts of the demo
Part 1 Understand Historical Data of Patients
Our historical patient dataset contains the typical information that a hospital collects, such as the timestamps of the different stages of the admission process, how the patients arrived at the hospital, the resources required for treatment, and the total expenses up to discharge.
Figure 2. Dashboard Part 1 for Historical Data Exploration
The admission process is the most important thing to monitor on the dashboard. We therefore created KPIs for the waiting time from admission to triage and from triage to consultation, along with a patient journey view that traces every stage for a selected patient. Marking rows in the data table makes it easy to compare waiting times across hours of the day and months of the year, which gives a better understanding of the admission process, especially during peak hours.
To support resource management, a bar chart shows the number of patients registered by month, and selection boxes filter patients by bed type. Users can see the trend in registrations over time and apply the built-in time-series forecast function to estimate future demand for each bed type, as shown in Figure 3 and Figure 4 below.
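As a rough illustration of the waiting-time KPIs described above, the following Python sketch averages the two waits per hour of admission. The record layout, field names, and sample values are hypothetical and are not taken from the demo dataset; in the dashboard itself these figures come from Spotfire visualizations rather than from code like this.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

FMT = "%Y-%m-%d %H:%M"

# Hypothetical visit records; the field names are illustrative only.
visits = [
    {"admission": "2023-01-05 08:10", "triage": "2023-01-05 08:25", "consultation": "2023-01-05 09:05"},
    {"admission": "2023-01-05 08:40", "triage": "2023-01-05 09:10", "consultation": "2023-01-05 10:00"},
    {"admission": "2023-01-05 14:00", "triage": "2023-01-05 14:05", "consultation": "2023-01-05 14:20"},
]


def minutes_between(start, end):
    """Return the elapsed minutes between two timestamp strings."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 60


def waiting_time_by_hour(records):
    """Map admission hour -> (mean admission-to-triage, mean triage-to-consultation) in minutes."""
    buckets = defaultdict(list)
    for r in records:
        hour = datetime.strptime(r["admission"], FMT).hour
        buckets[hour].append(
            (minutes_between(r["admission"], r["triage"]),
             minutes_between(r["triage"], r["consultation"]))
        )
    return {
        hour: (mean(a for a, _ in waits), mean(c for _, c in waits))
        for hour, waits in sorted(buckets.items())
    }


kpis = waiting_time_by_hour(visits)
for hour, (to_triage, to_consult) in kpis.items():
    print(f"{hour:02d}:00  admission->triage {to_triage:.1f} min, triage->consultation {to_consult:.1f} min")
```

Grouping by month instead of hour only requires changing the bucket key.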
Figure 3. Applying the time-series data forecasting function to the bar chart in Spotfire
Figure 4. Patients registered by month in history with forecast values in dotted lines
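To give a feel for what a time-series forecast on monthly registrations does, here is a minimal sketch of Holt's linear exponential smoothing in plain Python. It is a simplified stand-in and should not be read as the algorithm behind Spotfire's built-in forecast; the function name, the smoothing parameters, and the monthly counts are all assumptions made for illustration.

```python
def holt_forecast(series, horizon, alpha=0.5, beta=0.3):
    """Forecast `horizon` future points with Holt's linear (level + trend) smoothing.

    alpha smooths the level and beta smooths the trend; both are illustrative defaults.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations to estimate a trend")
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    # Extend the last level along the last trend estimate.
    return [level + step * trend for step in range(1, horizon + 1)]


# Hypothetical monthly registrations for one bed type.
monthly_registrations = [820, 845, 860, 900, 915, 950]
next_quarter = holt_forecast(monthly_registrations, horizon=3)
print([round(v) for v in next_quarter])
```

This variant ignores seasonality, so it would understate recurring peaks such as winter admissions; a seasonal method is more appropriate when the monthly pattern repeats year over year.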
Part 2 Protect Sensitive Information of Patients
We have included another dataset of patient profiles in this demo to show options for protecting sensitive information. This is an essential step to ensure that sensitive information is accessible only to specific users.
Figure 5. Dashboard Part 2 for displaying data analytics with sensitive data protection
Effective methods for hiding sensitive information include masking string columns with a special character, applying string permutation at the character level, adding random deviation to numerical columns, and aggregating numbers into ranges. These can be applied by adding a data transformation or by writing a few lines of Python code in the Spotfire dashboard. The result is shown in Figure 6.
Figure 6. Masked patient profile data
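Three of the techniques listed above (random deviation, aggregation into ranges, and character-level permutation) can be sketched in a few lines of Python. The function names, the 10% deviation, and the bucket width of 10 are assumptions chosen for illustration, not the settings used in the demo.

```python
import random


def add_random_deviation(value, max_fraction=0.1, rng=random):
    """Shift a number by a random amount of up to +/- max_fraction of its size."""
    return value * (1 + rng.uniform(-max_fraction, max_fraction))


def to_range(value, width=10):
    """Replace an exact non-negative number with the range that contains it, e.g. 47 becomes '40-49'."""
    lower = (int(value) // width) * width
    return f"{lower}-{lower + width - 1}"


def permute_characters(text, rng=random):
    """Shuffle the characters of a string so the original value is not readable at a glance."""
    chars = list(text)
    rng.shuffle(chars)
    return "".join(chars)


print(to_range(47))                         # age shown as a range
print(round(add_random_deviation(1250.0)))  # expense with noise added
print(permute_characters("S1234567A"))      # identifier with characters shuffled
```

Note that character permutation keeps every original character, so it only obscures a value and is weaker than replacing characters outright.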
To illustrate how this was achieved, consider an example that shows only the first and last characters of some string columns and masks the other characters with a random number of asterisks. The new columns are calculated with a simple Python function, as displayed in Figure 7.
Figure 7. Applying Python function for data masking in Spotfire
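The function in Figure 7 is an image and is not reproduced here. A minimal sketch of the same idea, keeping the first and last characters and replacing the middle with a random number of asterisks, might look like the following. The function name, the star-count bounds, and the handling of very short values are assumptions, and the wiring into a Spotfire data function is omitted.

```python
import random


def mask_middle(text, min_stars=3, max_stars=8, rng=random):
    """Keep the first and last characters and replace the rest with a random run of '*'.

    A random number of asterisks hides the original length of the value.
    """
    if text is None:
        return None
    text = str(text)
    if len(text) <= 2:
        # Showing the first and last character would reveal the whole value.
        return "*" * len(text)
    stars = "*" * rng.randint(min_stars, max_stars)
    return text[0] + stars + text[-1]


for name in ["Jonathan", "Mei Ling", "Al"]:
    print(mask_middle(name))
```

Because the number of asterisks is drawn at random, two calls on the same value can return different strings; seed the generator or pass a `random.Random` instance if repeatable output is needed.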
Part 3 Predict the Length of Stay for Patients
A patient's length of stay is measured, for in-hospital admissions, as the number of days the patient stays in the hospital during a single admission. Knowing the predicted length of stay helps provide an optimal level of patient care and improves the effectiveness of hospital resource management.
Figure 8. Dashboard Part 3 for predicting the Length of Stay for patients
This dashboard page shows information about the machine learning model for predicting the Length of Stay, including the validation results and model summary, as well as the predictions for test patients. The model is created in Spotfire Data Science, where data scientists can develop workflows to process the patients' historical data, carry out feature engineering, train the predictive models, and deploy the models to the production environment, as displayed in Figure 9. From the control panel on the dashboard, users can click a button to trigger the workflows that train the Length of Stay prediction model, and the results are refreshed automatically on the dashboard. Spotfire users can build, use, and execute a workflow in Spotfire Data Science through the Data Function for Data Science in Spotfire. For more information, see the article "Data Function for Spotfire Data Science - Team Studio in Spotfire", which explains step by step how to set up the connection.
Figure 9. Hospital management workflows in TIBCO Data Science
Users can define various workflows in TIBCO Data Science using out-of-the-box drag-and-drop operators to achieve different goals. For instance, the workflow in Figure 10 shows how to replace null values, generate more features, and apply correlation filtering for feature selection, and the workflow in Figure 11 shows how to train and export a machine learning model and then use it to make predictions on test data. Because the workflows leverage Spark pipelines and optimizations, the calculations run quickly even on large datasets.
Figure 10. The workflow of feature generation for modeling in TIBCO Data Science
Figure 11. The workflow of modeling in TIBCO Data Science
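For readers who think in code rather than drag-and-drop operators, two of the steps in Figure 10 (null replacement and correlation filtering) can be approximated in plain Python. This is a sketch of one common variant, which fills nulls with the column mean and drops any feature whose absolute Pearson correlation with an already kept feature reaches a threshold. The operators in the workflow may behave differently, and the column names, data, and 0.9 threshold are hypothetical.

```python
from math import sqrt
from statistics import mean


def fill_nulls_with_mean(column):
    """Replace None entries with the mean of the values that are present."""
    fill = mean(v for v in column if v is not None)
    return [fill if v is None else v for v in column]


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def correlation_filter(features, threshold=0.9):
    """Keep features in order, skipping any that is highly correlated with one already kept."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature columns; the third is a rescaled copy of the second.
raw = {
    "age": [34, 51, None, 68, 45],
    "num_procedures": [4, 1, 3, 2, 5],
    "num_procedures_x2": [8, 2, 6, 4, 10],
}
clean = {name: fill_nulls_with_mean(col) for name, col in raw.items()}
selected = correlation_filter(clean)
print(selected)
```

The greedy, order-dependent filter is the simplest choice; a production workflow would more likely rank features by their relationship to the target before deciding which member of a correlated pair to keep.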
Part 4 Predict the Readmission Risk for Patients
Readmission is another critical metric for improving hospital management. It can be defined as a patient being admitted to the hospital again within a given time interval after being discharged. Hospital readmission is undesirable for patients' well-being and expensive for the hospital. Identifying patients with a high readmission risk can therefore help physicians make better decisions on treatment and the discharge process, improve patients' well-being, and reduce the total cost.
Figure 12. Dashboard Part 4 for predicting the Readmission Risk for patients
We again use Spotfire Data Science for the machine learning. Each patient's readmission status is calculated and flagged as a binary label ("Yes"/"No"), so binary classification algorithms such as Logistic Regression and Gradient Boosting are used to train models that predict the readmission risk. On the dashboard, blue highlights the patients predicted to be at high risk of readmission, and yellow highlights the probability of the predicted readmission risk.
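The binary label itself can be derived from admission and discharge dates. The sketch below flags a stay as "Yes" when the same patient is admitted again within a chosen window after discharge. The 30-day default, the tuple layout, and the sample stays are assumptions for illustration; the article does not state which interval the demo uses.

```python
from collections import defaultdict
from datetime import date, timedelta


def flag_readmissions(stays, window_days=30):
    """Label each stay 'Yes' if the same patient is admitted again within window_days of discharge.

    stays is a list of (patient_id, admit_date, discharge_date) tuples;
    the returned labels are aligned with the input order.
    """
    by_patient = defaultdict(list)
    for idx, (patient_id, admit, discharge) in enumerate(stays):
        by_patient[patient_id].append((admit, discharge, idx))

    labels = ["No"] * len(stays)
    window = timedelta(days=window_days)
    for visits in by_patient.values():
        visits.sort()  # chronological by admission date
        for (_, discharge, idx), (next_admit, _, _) in zip(visits, visits[1:]):
            if timedelta(0) <= next_admit - discharge <= window:
                labels[idx] = "Yes"
    return labels


# Hypothetical stays for two patients.
stays = [
    ("P1", date(2023, 1, 1), date(2023, 1, 5)),
    ("P1", date(2023, 1, 20), date(2023, 1, 25)),
    ("P2", date(2023, 2, 1), date(2023, 2, 3)),
    ("P1", date(2023, 6, 1), date(2023, 6, 4)),
]
print(flag_readmissions(stays))
```

The resulting "Yes"/"No" column is what a binary classifier such as Logistic Regression or Gradient Boosting would then be trained to predict.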
Summary
In summary, the Hospital Management AI App demonstrates the following:
- Hospital administrators can use a customized Spotfire dashboard to generate insights from patient data using machine learning, to make better decisions and reduce management costs, without needing to understand how to build the data science models.
- Masking sensitive data for privacy protection can be achieved using built-in functions or Python scripts available in Spotfire.
- Training machine learning models with large datasets for predicting patients' length of stay and for predicting readmission risk can become simple tasks using the data science platform.
If you would like to explore more AI Apps, please visit Spotfire Demo Gallery.