Presentation: Filling in the Data Gaps with AI-driven Data Augmentation
Dec 07, 2022
To correct for bias or rebalance underrepresented populations in datasets, teams have long relied on data oversampling: duplicating rows of data to increase the size of an underrepresented group. But simple oversampling may not lead to robust changes in model predictions, and more sophisticated methods can be difficult to implement and maintain on complicated datasets. Today, there is a better way: data augmentation with synthetic data. By using deep neural networks to train synthetic data models on real-world data, teams can gain fine-grained control over rebiasing their data, generating net new samples by seeding conditional models with the demographic distributions they need. Join Ander Steele, PhD, Lead Data Scientist at Tonic.ai, to learn more about filling the data gaps and shifting the bias in your data to make it more representative of the populations your use cases need.
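
For readers unfamiliar with the baseline, here is a minimal sketch of simple oversampling as described above, written in plain Python. The function name `oversample`, the `group` and `income` fields, and the toy rows are all hypothetical and chosen only for illustration; this is not code from the presentation or from any Tonic.ai product.

```python
import random
from collections import Counter


def oversample(rows, group_key, seed=0):
    """Duplicate randomly chosen rows until every group matches the largest group's size."""
    rng = random.Random(seed)

    # Bucket rows by the value of the grouping column.
    by_group = {}
    for row in rows:
        by_group.setdefault(row[group_key], []).append(row)

    target = max(len(members) for members in by_group.values())

    balanced = []
    for members in by_group.values():
        balanced.extend(members)
        # Sample with replacement from the existing rows to fill the gap.
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# Hypothetical toy data: group "B" is underrepresented.
rows = [
    {"group": "A", "income": 52},
    {"group": "A", "income": 61},
    {"group": "A", "income": 47},
    {"group": "A", "income": 58},
    {"group": "B", "income": 39},
    {"group": "B", "income": 44},
]

balanced = oversample(rows, "group")
counts = Counter(row["group"] for row in balanced)
distinct_b = {row["income"] for row in balanced if row["group"] == "B"}
print(counts, distinct_b)
```

After balancing, both groups have the same number of rows, but the set of distinct values in group "B" is unchanged: every added row is an exact copy of an existing one. Generating new samples from a conditional synthetic data model is meant to address that limitation.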