Big data has come into our lives in numerous ways, and many of them are a scourge. Big data, in and of itself, is not to blame, but the uses to which it is put are often outrageous. Cathy O'Neil is a mathematician and data scientist who put a lot on the line by releasing information on how this data is used prior to her retirement. She has also taken the time to write this book so that the average reader can pick it up and understand the dangers of Big Data and how it is being used.
Algorithms are sets of processes built on two things: Big Data, meaning what happened in the past, and a definition of success. They are implemented with math, which many people are afraid of or do not understand well enough.
You may wonder why this is so important. The answer lies in the parts of this book that cover racism.
Maybe a deeper set of questions is needed to reveal the issue at hand.
1) If big data is based on the past, we can look to the example of systemic racism against Black people, in which banks used location-based reasons to decline loans to minorities. The past is flawed, and that is where this information is derived from, yet no one is going through the data to remove the flaws of the past.
2) Whose definition of success are these algorithms based on? If the definition is racist or not inclusive in itself, why would the algorithm be any different?
Cathy O'Neil TED Talk
Today we can see many similarities as insurance companies offer rebates and incentives based on where you live and build them into their data. This information may be masked by other statistics, but all in all, the effects on us are the same. If you overlay this information on a map of minority populations, the truth is clear. The reasoning may use other statistics to back its claims, but the same system was originally formulated to keep Black people from financial gain. This same philosophy could be, and is, easily applied to all minorities today. We Toronto Tamils know that insurance rates in Scarborough and Markham (higher Tamil and minority population) are far higher than in the Niagara or Jackson's Point regions (primarily white).
Cathy uses this book to show how the algorithms behind job opportunities, bank loans, insurance, and almost everything else incorporate these same systemic disparities, which will widen the gap between the rich and the poor.
“Simpson’s Paradox: when a whole body of data displays one trend, yet when broken into subgroups, the opposite trend comes into view for each of those subgroups.”
― Cathy O'Neil
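The paradox in the quote above is easier to see with numbers. Here is a minimal sketch in Python. The counts are invented purely for illustration (they do not come from the book or from any real dataset), and the names `counts`, `subgroup_rates`, and `overall_rate` are my own. Option A has the higher success rate inside each subgroup, yet option B looks better once the subgroups are lumped together.

```python
# Hypothetical counts, invented for illustration only.
# Each entry is (successes, total) for one option within one subgroup.
counts = {
    "A": {"subgroup_x": (9, 10), "subgroup_y": (30, 100)},
    "B": {"subgroup_x": (80, 100), "subgroup_y": (2, 10)},
}


def subgroup_rates(option):
    """Success rate of an option within each subgroup."""
    return {name: s / t for name, (s, t) in counts[option].items()}


def overall_rate(option):
    """Success rate of an option with all subgroups pooled together."""
    successes = sum(s for s, _ in counts[option].values())
    total = sum(t for _, t in counts[option].values())
    return successes / total


for option in counts:
    per_group = ", ".join(
        f"{name}: {rate:.2f}" for name, rate in subgroup_rates(option).items()
    )
    print(f"{option} -> {per_group}, overall: {overall_rate(option):.2f}")
```

In this made-up example A wins in both subgroups (0.90 vs 0.80, and 0.30 vs 0.20), but its overall rate is about 35% against about 75% for B. The reversal happens because most of A's cases sit in the harder subgroup, while most of B's sit in the easier one. A statistic that only reports the pooled number hides that, which is how one set of figures can mask another.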