Okay, back to gender discrimination in hiring
Think back to the Amazon hiring-discrimination case: the algorithm learned that phrases such as "female soccer" or "women's chess club captain" were strong indicators that a candidate was female. At the same time, the model learned that women were under-represented in technology and that most top-performing employees were male, since men account for about 70% of the technology workforce on average [6].
Each of these things was true on its own, but combined, the model concluded that male candidates were preferable: men were more successful in tech largely because men were more represented in tech. Think about it: if you wanted to know which candidates in the tech field would be successful, you would train an algorithm on existing employees in the tech field (who are mostly male), pick the most successful ones, learn their traits, and compare those traits against new candidates.
The problem is that if you are picking from a majority-male pool, you will mostly pick males as top candidates, not females. Amazon tried to remove explicit gender signals because of this, but the algorithm learned other features that were highly correlated with being female and still discriminated against female candidates.
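To make the mechanism concrete, here is a minimal sketch in Python. The dataset is entirely made up, and the per-token "weight" is a deliberately naive stand-in for what a real model learns; the point is only to show how a proxy token like a women's-club mention ends up penalized when the historical hiring pool skews male, even with no gender column in the data:

```python
from collections import Counter

# Hypothetical toy dataset: (resume tokens, hired?) pairs.
# The pool is mostly male and historical hires skew male, so a
# gendered proxy token like "womens_club" happens to co-occur
# only with the "not hired" label.
resumes = [
    (["python", "aws"], 1),
    (["java", "aws"], 1),
    (["python", "leadership"], 1),
    (["java", "womens_club"], 0),
    (["python", "womens_club"], 0),
    (["java"], 0),
]

# Naive "weight": P(hired | token) estimated from raw counts.
token_totals, token_hired = Counter(), Counter()
for tokens, hired in resumes:
    for t in tokens:
        token_totals[t] += 1
        token_hired[t] += hired

weights = {t: token_hired[t] / token_totals[t] for t in token_totals}
print(weights["womens_club"])  # 0.0  -> the proxy token is penalized
print(weights["python"])       # ~0.67
```

Real models learn weights jointly rather than token by token, but the failure mode is the same: drop the explicit gender field, and correlated proxy tokens still carry the signal.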
You could go deep into the theory of why algorithms pay attention to different things, but for our purposes the takeaway is this: what an algorithm attends to can be positive, because it helps you discover something you never considered before, or negative, because it puts weight on a part of the input that is not actually important. This is where biases in artificial intelligence algorithms come from.
But first, let's talk about the good!
Artificial intelligence can process tons of data sources and correlate input-output pairs into usable solutions, and current technology does this very well. Being able to search with an image in Google, or hum a tune into Spotify and get a match, is something that was unimaginable 20 years ago. Most of the benefits of artificial intelligence come from letting people work more efficiently and from personalizing products and services. Artificial intelligence powers most recommendation systems, like TikTok, YouTube, and Netflix, and it controls the information we have access to. As you interact with the technology, the model's weights and biases are updated to your preferences, which eliminates the need for you to keep that information in your own head.
A great limitation of the human mind is that it is bounded by personal experience. It cannot generate completely new information it has never encountered, or make connections to data it has never interacted with. Artificial intelligence is not limited to one data source (i.e., a single human's experience); it can draw on multiple streams of information. It can generate novel ideas and concepts and connect the dots between information from varying sources. This is why technologies that surface new information from existing knowledge are popular. Generating video from a sentence you type, or understanding a document through an in-depth, well-written summary, gives people access to information that would have been almost impossible to tie together on their own.
This is also beneficial across industries where understanding more data leads to better outcomes, such as health care. Artificial intelligence has the potential to tie symptoms to diseases with far greater accuracy than humans can, because it is not constrained by any single person's experience. People who use artificial intelligence are more likely to discover novel ideas, products, and services than those who do not.
Okay, now the bad
I could go on and on with specific examples of how AI solves problems across industries by connecting dots from multiple sources more efficiently and repeatably, but that does not mean the technology is free of challenges. These challenges are what keep some users polarized and actively avoiding the technology. Artificial intelligence faces multiple challenges, as any technology does, but the ones touched on in this article are generalizability, bias, accessibility, and privacy.
Generalizability
In the artificial intelligence community, this refers to an algorithm's ability to take inputs from different areas and still produce usable outputs. For example, consider an algorithm trained to extract text from legal documents and summarize them so a layperson can read them (if you have ever tried to read terms and conditions or an apartment lease, you know how useful that would be). But the same algorithm may fail when asked to summarize a medical document.
This is because, again, it all comes down to data. The data used to train the legal-document summarizer to impressive accuracy may have deliberately excluded other fields so as not to confuse the input-output correlations. The majority of the work being published right now is based on specific, narrow applications that work really well on those specific tasks. Artificial intelligence is bounded by the data presented to it at training time.
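A toy way to see why a domain-specific model breaks on out-of-domain input: the vocabulary it learned from legal text simply does not cover medical text. The two one-line "corpora" below are invented for illustration, and the word-coverage score is a crude stand-in for a real model's competence:

```python
# Hypothetical one-sentence "training corpus" and out-of-domain input.
legal_train = "the lessee shall indemnify the lessor under this agreement"
medical_doc = "the patient presented with acute myocardial infarction symptoms"

# Everything the "model" knows was fixed at training time.
vocab = set(legal_train.split())

def coverage(text: str) -> float:
    """Fraction of words in `text` the trained vocabulary covers."""
    words = text.split()
    return sum(w in vocab for w in words) / len(words)

print(coverage(legal_train))  # 1.0 on in-domain text
print(coverage(medical_doc))  # far lower out of domain
```

Real summarizers fail in subtler ways than missing vocabulary, but the principle is the same: the model's competence ends where its training data ends.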
Some technologies are now being introduced that bridge these gaps in specificity because they are trained on large (really large) amounts of data, giving the machine learning algorithm enough examples of input-output pairs to relate multiple adjacent concepts. The catch is that you need an enormous amount of data to begin with, so only the AI powerhouses (Amazon, Google, Meta, Microsoft, and IBM) have been able to produce these models, which raises further issues of bias and accessibility.
Bias
The term bias here refers not to a number but to a preference for something. All humans have ingrained biases formed by their life experiences. Bias is not always bad: it helped us survive by steering us toward things that brought benefit (pleasure) and away from things that did not (pain). Humans can find it difficult to recognize exactly what they are biased toward, but most of the time, once it is pointed out, there is typically a reason the bias exists. Unfortunately, with computers it is hard to point out and understand why a bias exists in an algorithm, though a lot of research is going into ways to prevent biases from slipping into algorithms.
Good data does not just mean data free of technical errors; it also means data that is representative enough for the prediction task. Biases that exist in the human world can easily be transferred to a computer. For example, data drawn from text conversations on the internet under-represents non-English speakers, people without internet access, and people who rarely contribute to platforms that have historically been gender-skewed. Calling such data a good representation of human communication is not accurate, and it can leave the program trained from a limited perspective.
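One practical guard is to compare a corpus's group shares against the population it claims to represent before training on it. A minimal sketch follows; the shares below are invented round numbers for illustration, not real statistics:

```python
# Hypothetical group shares: what the world looks like vs. what the
# scraped training corpus looks like (all numbers are illustrative).
population_share = {"english_speakers": 0.20, "non_english_speakers": 0.80}
corpus_share     = {"english_speakers": 0.70, "non_english_speakers": 0.30}

under_represented = []
for group, pop in population_share.items():
    ratio = corpus_share[group] / pop  # >1 over-, <1 under-represented
    if ratio < 0.5:                    # arbitrary alarm threshold
        under_represented.append(group)

print(under_represented)  # groups the corpus fails to represent
```

A check like this cannot fix the skew, but it flags it at data-gathering time, before the "limited perspective" is baked into the model.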
These are the issues that typically surface in popular media when a company's algorithm takes input from a user and produces racist output, as happened at Google [7]. The algorithm had learned to associate images of Black men with criminality (mug shots), most likely because text on the internet made the same association. This is why some researchers are calling for a more diverse workforce of machine learning scientists, who can pinpoint these issues at data-gathering time and reduce some of these negative effects.