Time Series Modeling for Mortality Predictions

1 Jun 2025

📂 Source code unavailable due to collaboration restrictions.

This project works with complex healthcare data from the PhysioNet 2012 Challenge dataset, which contains 48 hours of intensive care data used to predict patient mortality. The data are noisy, sparse, irregularly sampled, and multivariate. We handled them through preprocessing, exploration, supervised learning, and representation learning, and used large language models (LLMs) for embedding extraction and analysis.
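As a rough illustration of what handling irregularly sampled, sparse measurements can look like, here is a minimal sketch that buckets one patient's long-format records onto an hourly grid, forward-fills gaps, and standardizes the result. It is not the project's code: the column names (`minutes`, `parameter`, `value`), the hourly resolution, forward-filling, and zero imputation after standardization are all assumptions chosen for the example.

```python
import pandas as pd


def to_hourly_grid(records: pd.DataFrame, n_hours: int = 48) -> pd.DataFrame:
    """Pivot one patient's long-format measurements onto an hourly grid.

    Assumes hypothetical columns: 'minutes' (since admission),
    'parameter' (variable name), and 'value' (measurement).
    """
    df = records.copy()
    # Bucket each irregular timestamp into an hour, capping at the last hour.
    df["hour"] = (df["minutes"] // 60).clip(upper=n_hours - 1).astype(int)
    # Average repeated measurements of the same variable within an hour.
    grid = df.pivot_table(
        index="hour", columns="parameter", values="value", aggfunc="mean"
    )
    # Make every hour present, then carry the last observation forward.
    return grid.reindex(range(n_hours)).ffill()


def standardize(grid: pd.DataFrame, mean: pd.Series, std: pd.Series) -> pd.DataFrame:
    """Z-score each variable; still-missing entries become 0 (the mean)."""
    safe_std = std.replace(0.0, 1.0)  # avoid division by zero
    return ((grid - mean) / safe_std).fillna(0.0)
```

In practice the `mean` and `std` passed to `standardize` would be computed on the training split only, so that no statistics leak from validation or test patients.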

My Contributions

Performed data preprocessing and exploratory analysis, including data transformation, missing-value handling, standardization, and distribution visualization
Implemented representation learning using an LSTM Autoencoder, trained linear probes on the extracted embeddings for comparison, and visualized the embeddings with t-SNE
Wrote the overall discussion and summarized the key findings
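To make the representation-learning step more concrete, the following is a minimal PyTorch sketch of an LSTM autoencoder with a linear probe trained on its frozen embeddings. It is an illustration only, not the project's implementation: the single-layer architecture, the hidden size of 32, the repeat-the-embedding decoder, and the random tensors standing in for patient data and mortality labels are all assumptions.

```python
import torch
from torch import nn


class LSTMAutoencoder(nn.Module):
    """Encode a sequence into one vector, then reconstruct the sequence."""

    def __init__(self, n_features: int, hidden_size: int = 32):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.decoder = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.output = nn.Linear(hidden_size, n_features)

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        # Use the final hidden state of the last layer as the embedding.
        _, (h_n, _) = self.encoder(x)
        return h_n[-1]  # shape: (batch, hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.encode(x)
        # Feed the embedding to the decoder at every time step.
        repeated = z.unsqueeze(1).repeat(1, x.size(1), 1)
        decoded, _ = self.decoder(repeated)
        return self.output(decoded)


torch.manual_seed(0)
x = torch.randn(8, 48, 5)  # placeholder: 8 patients, 48 hours, 5 variables
labels = torch.randint(0, 2, (8, 1)).float()  # placeholder outcome labels

# 1) Train the autoencoder on reconstruction error.
model = LSTMAutoencoder(n_features=5)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(5):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), x)
    loss.backward()
    optimizer.step()

# 2) Freeze the encoder and extract embeddings.
with torch.no_grad():
    embeddings = model.encode(x)

# 3) Train a linear probe (logistic regression) on the embeddings.
probe = nn.Linear(embeddings.size(1), 1)
probe_optimizer = torch.optim.Adam(probe.parameters(), lr=1e-2)
for _ in range(5):
    probe_optimizer.zero_grad()
    probe_loss = nn.functional.binary_cross_entropy_with_logits(
        probe(embeddings), labels
    )
    probe_loss.backward()
    probe_optimizer.step()
```

Because the probe is a single linear layer, its performance reflects how linearly separable the outcome is in the embedding space, which is what makes it a useful point of comparison against fully supervised models. The same `embeddings` tensor could also be passed to a t-SNE implementation for visualization.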

Below are some results from our report.

[Figure: Data Exploration]
[Figure: Model Performance Comparison]
[Figure: LSTM Autoencoder Architecture]

data-preprocessing
representation-learning
healthcare
python
pytorch