Data Science Projects
For the past 5 years, I have worked in the healthcare industry, applying data science, machine learning, statistics and data management to clean, transform, analyze, quality-check, build and integrate large datasets using Python, R and SQL.
In my current role on a 5-year longitudinal project, I manage a diverse range of data, including protected health information, clinical interview notes, cognitive assessment data, physiological data, neuroimaging data (MRI, EEG, PET), and biological assay data (DNA, drug screen). The data is stored in varying formats, structures, and systems, so I am responsible for integrating and managing it efficiently across multiple databases and servers (PostgreSQL, XNAT, LabKey, REDCap, Microsoft Access, Tableau Server, SciNet HPC).
In addition, I use data science libraries to extract insights from the collected data. For instance, I clean, transform and condense clinical assessment data from thousands of questions into concise, meaningful summaries, making it easier for clinicians to understand their clients' functioning. I also analyze recruitment data to determine the effectiveness of recruitment strategies and to predict future sample size.
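As a minimal sketch of how item-level assessment answers can be condensed into summaries like those mentioned above, the example below averages questions within subscales and counts skipped items. It is illustrative only and not the project's code: the participant IDs, question names, subscale mapping, and the mean-of-answered-items scoring rule are all hypothetical, and plain Python stands in for whichever data science library would be used in practice.

```python
from statistics import mean

# Hypothetical item-level responses: one record per participant, None = skipped question.
responses = [
    {"participant_id": "P01", "q1": 3, "q2": 2, "q3": 1, "q4": 0},
    {"participant_id": "P02", "q1": 1, "q2": 0, "q3": 1, "q4": 2},
    {"participant_id": "P03", "q1": 2, "q2": 3, "q3": None, "q4": 1},
]

# Hypothetical mapping from subscale name to the questions that belong to it.
subscales = {"mood": ["q1", "q2"], "sleep": ["q3", "q4"]}


def summarize(records, mapping):
    """Collapse item-level answers into one mean score per subscale."""
    summary = []
    for record in records:
        row = {"participant_id": record["participant_id"]}
        for name, items in mapping.items():
            answered = [record[q] for q in items if record[q] is not None]
            # Mean of answered items only; None if every item was skipped.
            row[f"{name}_mean"] = mean(answered) if answered else None
            # Missing-item count, so a reader can judge how complete each score is.
            row[f"{name}_n_missing"] = len(items) - len(answered)
        summary.append(row)
    return summary


summary = summarize(responses, subscales)
for row in summary:
    print(row)
```

Reporting the number of missing items next to each score is one possible design choice: it lets a clinician see at a glance whether a summary rests on a complete set of answers.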
To convey my findings effectively to stakeholders, I visualize data using interactive dashboards and write at a professional, technical level. For example, I design Tableau dashboards, build R Shiny apps, and create HTML reports with Plotly graphs using R Markdown.