By R. Greg Szrama III
Moore’s law, proven out over the past five decades, describes exponential growth in the computing power of microprocessors. This vast increase in computing technology is powering ever-faster global networks, massive data centers for e-commerce behemoths like Amazon, and of course artificial intelligence (AI). Even as computing power continues to grow exponentially, more recently driven by graphics processors than by traditional CPUs, modern AI processes are still inevitably bottlenecked by available processing power.
But what exactly do we mean by artificial intelligence? Hollywood has trained us to think of Skynet when imagining what artificial intelligence looks like—the malevolent network behind the dystopia in the Terminator franchise. News media doesn’t help when it highlights sensational possibilities like Microsoft demonstrating live real-time audio translation from one language to another. These are not mainstream or even necessarily realistic visions (sorry, Skynet) of what AI does for us.
In the everyday world, artificial intelligence systems focus on understanding complex datasets and relationships, then deriving some conclusion about them. These processes often lack even the most basic of user interfaces. One common use case for artificial intelligence, and one that made unfortunate news for Microsoft in 2016 (which we’ll look at later), is automated chatbots, with the ability to understand basic questions and provide relevant links for further information. Even these systems tend to operate silently on a server, interacting through a messaging program developed by someone else.
How Artificial Intelligence Is Used Today
If artificial intelligence is so different from common expectation, what form does it take? Today, AI systems largely analyze and provide details on large or complex datasets. These datasets may be sensor inputs, natural language patterns, or even photographs and video. These systems may be thought of as complex models tuned by exposure to relevant data and which possess the ability to self-tune and become more accurate over time.
In one example, Liberty Mutual is experimenting with an AI-powered smartphone app that analyzes photographs of collision damage to provide an initial damage assessment and a high-level estimate of repair costs. The system is trained using photographs from thousands of previous car crashes and their associated damages. End users interact with the AI system by taking collision photos on their smartphones, which are uploaded to a remote server where the analysis is performed. These types of AI models can be continuously calibrated during use by feeding final assessments into the system as additional comparative data points.
Assurant’s selection of Shift Technology’s FORCE fraud detection platform is another example of AI use in the insurance industry. Assurant hopes to mitigate current fraud while gaining insight into developing fraud patterns; analysis of these developing patterns can then be used to protect against future fraud. Claims are sent to the FORCE platform for analysis and fraud handlers at Assurant then receive real-time alerts of suspicious activity. They are then able to adjudicate the flagged cases, allowing analysts to focus their time and energy on cases most likely to contain fraudulent activity.
What Powers Artificial Intelligence Solutions?
Before beginning any implementation, the problem to be solved or analyzed must be carefully defined and controlled. An overly broad problem statement will quickly become too complex for current machine learning algorithms to process. Consider natural language processing: We may need multiple AI models to handle parsing parts of speech, determining where colloquialisms are in use, understanding context relationships, etc. The results of these individual models then must be compiled and interpreted in a way that produces accurate and relatable results. The first step of implementing any AI system, therefore, is breaking larger problems down into individual steps that are simple, specific, and directed enough for a given algorithm to solve.
Once the problem is broken down and the correct models applied, the next step is training the system. This involves feeding the system different inputs and telling it how they should be classified. In most cases this is a manual process called supervised learning, where a user acting as trainer or teacher must classify inputs for the model until it learns to detect similar inputs on its own. Preparing data for this training step is crucial to getting accurate results and involves selecting and tagging inputs for both positive and negative cases. Training an image recognition system to detect cats, for example, will require supplying thousands of images of different types of cats, along with images of dogs and other animals and images with no animals at all. Care must also be given to how new data points are collected, so the model can be continually refined to better represent current inputs and evolve to process new inputs over time.
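As a rough illustration of the supervised learning loop described above, the sketch below trains a single perceptron, one of the simplest learning algorithms, on a handful of labeled examples. Everything in it is an assumption made for illustration: the two numeric features (ear_pointiness and snout_length), the six data points, and the learning rate are hypothetical, and a real image recognition system would learn from raw pixels using a far larger neural network.

```python
# Hypothetical toy data: each input is (ear_pointiness, snout_length) on a 0-1 scale.
# Label 1 = cat (positive case), label 0 = not a cat (negative case).
TRAINING_DATA = [
    ((0.9, 0.2), 1),
    ((0.8, 0.3), 1),
    ((0.7, 0.1), 1),
    ((0.2, 0.9), 0),
    ((0.3, 0.8), 0),
    ((0.1, 0.7), 0),
]


def predict(weights, bias, features):
    """Return 1 if the weighted sum of the features is positive, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0


def train(data, epochs=20, learning_rate=0.1):
    """Perceptron rule: adjust the weights whenever a prediction
    disagrees with the label supplied by the human trainer."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, label in data:
            error = label - predict(weights, bias, features)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias


weights, bias = train(TRAINING_DATA)

# Classify two inputs the model has never seen.
unseen_cat_like = predict(weights, bias, (0.85, 0.15))
unseen_dog_like = predict(weights, bias, (0.15, 0.85))
```

The negative examples matter as much as the positive ones here: without the second half of the training data, nothing would ever push the weights away from labeling every input a cat.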
Google’s spam filtering in its Gmail platform showcases the value of massive amounts of data in developing a high-quality artificial intelligence model. Google performed the initial calibration of its filter before fielding the system with a feedback loop for users to “Report Spam” or mark a message as “Not Spam.” As a result, fewer than 1 spam message in 1,000 makes it to the average inbox, while fewer than 5 messages in 10,000 are improperly marked as spam messages. The system is flexible enough that, along with ongoing updates from user feedback, it’s able to evolve to detect new patterns in spam or malicious emails and thwart new threats as they’re deployed by malicious actors.
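The feedback loop itself can be sketched in a few lines of Python. This is not Google’s algorithm, which is not described here; it is a toy word-count (naive Bayes) classifier, with a hypothetical class name and made-up messages, meant only to show how each “Report Spam” or “Not Spam” click becomes one more labeled training example.

```python
import math
from collections import Counter


class FeedbackSpamFilter:
    """Toy word-count classifier for illustration; NOT Gmail's actual method."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def record_feedback(self, text, label):
        """Store one user report: label is 'spam' (Report Spam) or 'ham' (Not Spam)."""
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def _log_score(self, words, label):
        counts = self.word_counts[label]
        total_words = sum(counts.values())
        vocabulary = len(set(self.word_counts["spam"]) | set(self.word_counts["ham"]))
        total_messages = sum(self.message_counts.values())
        # Smoothed prior: how common this label is among reported messages.
        score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
        # Add-one smoothing keeps unseen words from zeroing out the score.
        for word in words:
            score += math.log((counts[word] + 1) / (total_words + vocabulary + 1))
        return score

    def is_spam(self, text):
        words = text.lower().split()
        return self._log_score(words, "spam") > self._log_score(words, "ham")


spam_filter = FeedbackSpamFilter()
for text, label in [
    ("win a free prize now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]:
    spam_filter.record_feedback(text, label)

# A new spam pattern appears; the filter has never seen these words.
flagged_before = spam_filter.is_spam("crypto giveaway")
# One user reports it, and the model updates immediately.
spam_filter.record_feedback("crypto giveaway today", "spam")
flagged_after = spam_filter.is_spam("crypto giveaway")
```

In this sketch a single report is enough to change the verdict because the dataset is tiny. At real scale, many reports would be needed to shift a model, which is part of why a large user base is so valuable.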
The Tay chatbot, deployed by Microsoft, provides a counterexample showing why appropriate input data and proper calibration of responses are so critical to training an AI system. Launched in 2016 to “converse” with the general public on various social media platforms, the system was intended to learn to communicate in a personalized way with millennials. The staff training Tay’s responses included, among others, improvisational comedians. Online users wasted no time in teaching Tay to respond with racist and/or sexist comments. Microsoft took Tay offline after less than 24 hours. A lack of control over Tay’s responses, coupled with the general nature of internet users, caused a PR nightmare for Microsoft.
Once the problem is defined and plans are laid for training an AI system, the next question is allocation of appropriate computing resources. How are data points collected—for example, are inputs sent from a computer system as in the fraud detection example, or are they collected from real-time sensor input, as in Progressive’s collection of over 22 billion miles of driving data through its Snapshot app? The large data sets involved in training AI systems demand massive amounts of storage and archiving capability. Training an AI model is also incredibly computationally expensive. A voice recognition model for a given language, for example, may refine down to a size that fits comfortably on a standard smartphone, but doing so requires complex neural networks utilizing thousands of CPUs and many thousands of hours of recorded audio.
How Do We Do All This?
The various datacenter requirements for establishing a complex AI system can be overwhelming to plan, not to mention expensive to deploy and maintain. Thankfully, growing alongside machine learning technologies, and sometimes leading and enabling their growth, are various publicly available computing clouds. Additionally, the major players (including Amazon, Microsoft, Google, and IBM) are focusing on AI capabilities as a major differentiator and driver of growth in their cloud computing businesses.
These cloud providers are particularly well suited to deliver on the technical requirements for AI processing. Those requirements include plenty of storage space, data collection networks, raw processing power, and machine learning engines. Understanding these requirements and how they’re delivered helps illustrate why “the cloud” is inevitably used in the same breath as any mention of deploying new AI solutions.
In a sense, data storage and data collection go hand in hand. A major selling point of cloud storage providers is their claim of theoretically unbounded storage capacity. Amazon’s Simple Storage Service (S3), for example, allows for single files up to 5 terabytes in size, with no limit on the number of such files. For certain data mining use cases, petabyte-scale (that is, thousands of terabytes) storage is common. This storage elasticity is critically important when collecting real-time data, as with Progressive and others collecting telematics.
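To put those figures side by side, the short calculation below works out the minimum number of maximum-size files needed to hold a dataset of a given size. It assumes decimal units (1 petabyte = 1,000 terabytes) and the 5-terabyte per-file ceiling cited above; the function name is hypothetical, and real datasets are normally split into far more, far smaller files.

```python
import math

TB_PER_PB = 1_000      # decimal units, matching "thousands of terabytes"
MAX_FILE_TB = 5        # per-file ceiling cited above for S3


def min_files_needed(dataset_tb, max_file_tb=MAX_FILE_TB):
    """Smallest number of files that can hold dataset_tb terabytes
    when no single file may exceed max_file_tb terabytes."""
    return math.ceil(dataset_tb / max_file_tb)


files_for_one_petabyte = min_files_needed(1 * TB_PER_PB)
files_for_twelve_terabytes = min_files_needed(12)
```

Because the number of files is unlimited, the per-file ceiling never caps total capacity; it only determines how the data must be divided.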
Collecting data and getting it into your storage solution is almost always a harder problem to solve than allocating the raw storage itself. If users are submitting claims information, for example, they will expect the same level of responsiveness wherever on the globe they may be. Cloud providers ease this friction by hosting entry points to their networks in globally distributed locations. In the case of Google Cloud, these entry points are gateways to Google’s own private global internet. This ability to receive, process, and then act on customer data is key to engaging and retaining customers.
Just as with storage, cloud providers also sell their services on the ability to ramp up computing capacity to whatever scale a particular problem requires. Returning to the example of training a speech recognition model for a given language, AI applications frequently see peak demand for computing resources while the model is actively being trained. Neural networks, like those built with Google’s TensorFlow, may also rely on graphics processing units (GPUs) to perform such training efficiently. Cloud providers offer the ability to create new virtual machines, including ones with expensive GPUs attached, on demand, typically within minutes. This responsiveness drastically shortens the time otherwise spent purchasing, shipping, installing, and configuring these resources in traditional data centers, which may take weeks to complete.
The final piece is the machine learning engines themselves. Considering the technical expertise required to select and tune a particular model, this may be the most difficult piece in establishing a new AI solution. Fortunately, as mentioned before, this is also a key area in which cloud providers are focusing their attention. Microsoft, for example, has developed a drag-and-drop-style interface in its Azure Machine Learning Studio product to aid in selecting, training, tuning, and deploying AI models.
Now What?
We know what AI is, what it requires, and how it can be delivered. The next inevitable questions are: Can it really deliver in a way that makes all this worth it? And how many jobs will intelligent systems replace? The answer to the first is an unqualified yes. The second is trickier and is probably the wrong question to ask. As with the advent of micro-computing, or really any technological innovation, there is some challenge to the status quo but, just like micro-computing, AI mainly serves as a tool to let us focus our energy on more complex situations.
The evolution of chess serves as a compelling example of how AI models will be used in the future. After losing to IBM’s Deep Blue in 1997, the first time a computer chess player beat a world champion, Garry Kasparov developed a method of play, known as advanced chess, where human and computer players cooperate to plan moves, allowing players to fine-tune their strategies and thereby play better games. This partnership lets the human player focus on developing overall strategy, reading his or her opponent, and playing to human strengths, while the computer calculates and ranks possible moves for the human to consider, working through movement possibilities far more efficiently than a human could.
This type of AI/human collaboration will become much more commonplace as data collection systems continue to evolve. We discussed Progressive and its collection of data through the Snapshot program, but this technology is still in its early phases, collecting only basic telematics data on acceleration, braking, and trip distances. As this technology matures, involving more sensors measuring the vehicle and the environment around it, the amount of data collected will grow exponentially. A human would need hours to work through this volume of data and gain an understanding of a single customer. A properly trained AI model, however, can process the same data in a fraction of the time, allowing the insurer to focus on delivering value to the customer through coverage more precisely tailored to the customer’s needs.
This concept of using AI to model and understand behavior will prove useful in more complex undertakings as well, such as underwriting international shipping. Systems powered by AI, trained using combinations of proprietary data and publicly available datasets from governments and nongovernment organizations, will help better assess risks involving foreign governments, regulatory churn, natural disasters, criminal elements, exchange rates, and more. Teams of actuaries with different specialties can then use these AI models to refine their piece of the overall risk assessment and, as with advanced chess, arrive at a more accurate and complete work product.
Other areas of insurance, such as claims processing or usage-based policies, will also see benefits from AI systems. A major area of focus for AI systems is training them to understand legal contracts, as with IBM’s Watson Compare & Comply technology. As these systems come to understand and apply more complex legal requirements, they will also begin powering smart contracts using secure blockchain technology.[i] These contracts will be written in a way that events immutably recorded in a blockchain ledger trigger execution of payments. These same types of systems could also automatically record the time and place for the usage of a vehicle or residence via the sharing economy. Using AI to automate risk assessment processes, an actuary could go so far as to break down the risk of failure in a complex system by pricing a vehicle, for example, based on the failure rate of a certain brand of tire independent of the failure rate of the transmission or fuel injector system.
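The component-level pricing idea in the last sentence above comes down to simple probability arithmetic. The sketch below assumes component failures are independent of one another, and every failure probability and repair cost in it is a made-up placeholder rather than real actuarial data.

```python
from math import prod

# Hypothetical inputs: component -> (annual failure probability, repair cost in dollars)
BASE_VEHICLE = {
    "tires": (0.05, 600.0),
    "transmission": (0.02, 3500.0),
    "fuel_injectors": (0.03, 900.0),
}


def expected_annual_loss(components):
    """Sum of probability-weighted repair costs across all components."""
    return sum(p * cost for p, cost in components.values())


def prob_any_failure(components):
    """Chance that at least one component fails, assuming independence:
    1 minus the chance that every component survives the year."""
    return 1 - prod(1 - p for p, _ in components.values())


# Swap in a hypothetical tire brand with a lower failure rate;
# the transmission and fuel injector inputs are left untouched.
BETTER_TIRES = dict(BASE_VEHICLE, tires=(0.03, 600.0))

base_loss = expected_annual_loss(BASE_VEHICLE)
better_loss = expected_annual_loss(BETTER_TIRES)
base_risk = prob_any_failure(BASE_VEHICLE)
```

Keeping each component as its own input is what lets a change in one part, such as the tire brand, flow through to the price without re-estimating the rest of the vehicle.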
Where Does This Leave Us?
For the foreseeable future, individual artificial intelligence systems will remain specialized, capable of solving only well-defined, properly structured problems. As Microsoft’s Tay experiment illustrates, AI systems are also only as good and as accurate as the data with which they are trained. While we may be afraid of being replaced by new technological developments, the truth is our experience is more valuable than ever. Freed from the routine parts of our jobs, we will have greater time to focus on analyzing systemwide trends, and to solve more complex problems with a greater degree of accuracy than ever before.
GREG SZRAMA III is a software engineering manager at Accenture, where he focuses on high-throughput data processing applications. He is also a Google certified professional cloud architect.
References
Moore’s law and AI: technologyreview.com/s/607917/how-ai-can-keep-accelerating-after-moores-law/
Liberty Mutual and damage assessment: www.libertymutualgroup.com/about-lm/news/news-release-archive/articles/solaria-labs-unveils-new-developer-portal
Assurant adopts Shift’s FORCE platform: shift-technology.com/us-based-assurant-selects-shift-technologys-force-claims-fraud-detection-solution/
Gmail spam filter: gmail.googleblog.com/2015/07/the-mail-you-want-not-spam-you-dont.html
Microsoft and the Tay chatbot: techcrunch.com/2016/03/24/microsoft-silences-its-new-a-i-bot-tay-after-twitter-users-teach-it-racism/
Progressive/Snapshot: media.corporate-ir.net/media_files/irol/81/81824/arInter/17_annual/assets/pdf/Progressive-2017-AR.pdf
Cloud computing expansion and AI: zdnet.com/article/cloud-providers-ranking-2018-how-aws-microsoft-google-cloud-platform-ibm-cloud-oracle-alibaba-stack/
AWS 5 TB per file, unlimited files: aws.amazon.com/s3/features/
Amazon Redshift Spectrum, exabyte analysis: aws.amazon.com/blogs/aws/amazon-redshift-spectrum-exabyte-scale-in-place-queries-of-s3-data/
IBM Watson for contract analysis: ibm.com/blogs/watson/2018/02/watson-compare-comply/
AI for smart contracts: legalconsortium.org/uncategorized/combining-blockchain-and-ai-to-make-smart-contracts-smarter/
[i] For more, see “The Promise of Blockchain,” Actuarial Software Now; Contingencies supplement; November/December 2017.
Mainstream AI Capabilities
Many everyday technologies are powered by AI systems; digital assistants like Siri or Alexa are one good example. Siri uses a voice recognition model created through use of a neural network AI system. While Siri has a wide range of capabilities built in, like taking photos or performing web searches, it must be made aware of new apps or phone features through extensions to the voice recognition model.
Another mainstream technology is computer vision, which allows computers to identify objects and people in still images or video. In the future, such systems will allow autonomous vehicles to navigate our streets. In current usage, hospital robots like Tug use computer vision systems to learn their environment, avoid obstacles, and perform basic tasks like delivering food or medications to patients. Computer vision is also used in tagging or processing of images and video. Facebook uses vision models to automatically identify people in uploaded images. Some smart home security platforms use computer vision to identify things like people or pets in recorded video. Lighthouse’s system even allows you to make voice queries like, “What happened when Bob the handyman was here last Tuesday?” This robust capability combines a voice recognition model with computer vision tagging of videos to perform fast searches for specific data.
How Secure Are Public Cloud Providers?
Cloud service providers stake their reputation, and therefore their entire business, on their ability to keep private data secure. They provide numerous tools for securing data in transit (to and from the cloud) and at rest (in a data store). These tools carry the risk of being misused, however, which can lead to unwanted data disclosures. In business domains such as insurance, which carry intense regulatory scrutiny and a high risk of damage to clients should data be disclosed, understanding these security considerations is critical.
Methods for protecting data in transit vary based on the type of data and where it’s going. Client data accessed via a webpage primarily uses industry-standard HTTPS (typically encrypted via transport layer security) connections. When dealing with sensitive financial data or personally identifiable information (PII), virtual private networks (VPNs) provide another layer of security. Where organizations more often run into issues with cloud storage is in managing their data at rest. Cloud storage locations (called buckets) have powerful access controls that allow fine-tuning who has access to which files. This control allows ready enforcement of the principle of least privilege. Unfortunately, when these controls are misconfigured, data may be exposed. In May 2017, for example, Booz Allen Hamilton inadvertently exposed battlefield imagery and certain administrator credentials through a misconfigured Amazon S3 account. Organizations using cloud storage must pay careful attention to security hygiene to prevent these disclosures.
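The principle of least privilege can be pictured as a deny-by-default lookup: a request succeeds only if an explicit grant covers it. The sketch below is a simplified model written for illustration, not any cloud provider’s actual policy format or API; the service names, bucket names, and grant table are all hypothetical.

```python
# Hypothetical grant table: (principal, bucket, key prefix) -> allowed actions.
GRANTS = {
    ("claims-intake-service", "claims-photos", "incoming/"): {"write"},
    ("damage-model-trainer", "claims-photos", "incoming/"): {"read"},
    ("fraud-analyst", "claims-reports", ""): {"read"},
}


def is_allowed(principal, action, bucket, key, grants=GRANTS):
    """Deny by default; allow only when an explicit grant matches the
    principal, the bucket, the key prefix, and the requested action."""
    for (who, grant_bucket, prefix), actions in grants.items():
        if (who == principal
                and grant_bucket == bucket
                and key.startswith(prefix)
                and action in actions):
            return True
    return False


# The intake service may upload photos but cannot read them back.
can_upload = is_allowed("claims-intake-service", "write",
                        "claims-photos", "incoming/claim-1.jpg")
can_read_back = is_allowed("claims-intake-service", "read",
                           "claims-photos", "incoming/claim-1.jpg")
# A caller with no grant at all is refused.
anonymous_read = is_allowed("anonymous", "read",
                            "claims-photos", "incoming/claim-1.jpg")
```

A misconfiguration of the kind described above amounts to adding a grant that is far broader than intended, such as one that gives every caller read access to an entire bucket.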