Synthetic data generation
Generative modelling of medical data for anonymisation purposes
Synthetic data generation with GANs
Using healthcare data to develop artificial intelligence (AI) models raises issues of personal integrity and regulation. Patient data usually cannot be shared freely, which limits its usefulness for building AI solutions.
What and why?
This project explores generative modelling techniques, specifically generative adversarial networks (GANs), for generating synthetic data, and examines how synthetic data affects model performance. It also compares machine learning models trained on real data with models trained on synthetic data, both in terms of performance and in terms of data leakage.
Contribution
Main tasks:
- test GANs to generate artificial data (images and text),
- use synthetic data (conditional and unconditional GANs) for balancing classes and examine biases,
- use augmentation for balancing classes,
- test different ratios of real to fake data (using provided models, with the help of master's students),
- explain classification results using XAI methods,
- examine controllability in the latent space (master's students),
- combine text and image data in multimodal classification task.
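The class-balancing and real/fake ratio tasks above can be illustrated with a minimal sketch. It is not the project's actual pipeline: the function names `mix_real_and_synthetic` and `balance_with_synthetic`, the `(item, label)` tuple layout, and the label strings are all hypothetical, and only the Python standard library is used so that the idea stays independent of any particular GAN or data loader.

```python
import random
from collections import Counter


def mix_real_and_synthetic(real, synthetic, synthetic_fraction, seed=0):
    """Keep all real samples and add synthetic ones so that roughly
    `synthetic_fraction` of the returned list is synthetic (hypothetical helper)."""
    if not 0.0 <= synthetic_fraction < 1.0:
        raise ValueError("synthetic_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Solve s / (len(real) + s) == synthetic_fraction for s.
    n_synth = round(len(real) * synthetic_fraction / (1.0 - synthetic_fraction))
    # Never ask for more synthetic samples than are available.
    n_synth = min(n_synth, len(synthetic))
    mixed = list(real) + rng.sample(list(synthetic), n_synth)
    rng.shuffle(mixed)
    return mixed


def balance_with_synthetic(real, synthetic, seed=0):
    """Top up each class to the size of the largest real class, using
    synthetic samples that carry the same label (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(label for _, label in real)
    target = max(counts.values())
    balanced = list(real)
    for label, count in counts.items():
        # Synthetic samples of this class, e.g. from a conditional GAN.
        pool = [sample for sample in synthetic if sample[1] == label]
        n_missing = min(target - count, len(pool))
        balanced += rng.sample(pool, n_missing)
    rng.shuffle(balanced)
    return balanced


# Usage with placeholder file names and assumed labels.
real = [(f"real_{i}.jpg", "benign") for i in range(90)]
real += [(f"real_{i}.jpg", "malignant") for i in range(90, 100)]
synthetic = [(f"fake_{i}.jpg", "malignant") for i in range(200)]

balanced = balance_with_synthetic(real, synthetic)
half_fake = mix_real_and_synthetic(real, synthetic, synthetic_fraction=0.5)
```

Sweeping `synthetic_fraction` over several values and training one classifier per mixture is one simple way to compare real/fake ratios. Which ratios the project actually tested is not stated here.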
Technologies used: Python, PyTorch
Methods used: Deep Neural Networks, Skin Diseases Detection and Recognition, Explainable artificial intelligence, Multimodal learning
Deliverables
GitHub code:
- Modified version of the StyleGAN2-ADA for skin lesions generation
- Experiments using different XAI methods and ISIC2020 dataset
- Tabular data generation
- Multimodality for skin lesions classification
Medium posts:
- Artificial Intelligence In Healthcare: Is synthetic data the future for improving medical diagnosis? | by Sylwia Majchrowska and Sandra Carrasco | Towards Data Science
- Artificial Intelligence in Healthcare Part II | by Sandra Carrasco and Sylwia Majchrowska | MLearning.ai
- On the evaluation of Generative Adversarial Networks | by Sandra Carrasco and Sylwia Majchrowska | Towards Data Science
Presentations/workshops:
- Inauguration of AICC at SUH 29th November 2021: EYE FOR AI
- Workshops for Paris University AI4Healthcare 10th February 2022: ai4healthcare workshops
- Women in Data Science Ljubljana 12th March 2022: WiDS2022 Ljubljana workshops
- Event for Chalmers students at AI Sweden 21st April 2022: Edge Lab - SUH