Fantastic Data and Where to Find Them: The Importance of Knowing What (and Who) Is Missing
ML applications are expanding rapidly as we leverage pre-trained models and ML APIs from the likes of Google, Amazon, IBM, and Microsoft. Innovation comes with risk, however. By depending heavily on these pre-trained models and on transfer learning, applications also depend heavily on the data on which the models were trained, for better or for worse. Often, these models freeze in bias and cause your application to under-serve many of your users. We will discuss the data that backs these models, how those datasets were constructed, who and what is missing, and the important effects of those gaps.