Cihan Biyikoglu is the Executive Vice President of Product for RMS, responsible for product management across the full suite of RMS models and risk management tools. He has extensive experience of leading product management for innovative machine learning and big data analytics solutions at Fortune 500 companies over the last 20 years.
As a former Vice President of Product at Databricks and Redis Labs, Cihan both developed the product strategy and roadmap for open source technologies such as Apache Spark and Redis and respective enterprise offerings in the public and private cloud platforms.
Cihan also work on products at Microsoft, Couchbase and Twitter, where he focused on on-premise and cloud offerings in data and analytics space. At Microsoft, Cihan focused on the incubation of the Azure Cloud Platform in its early days and the SQL Server product line both of which have grown to multi-billion dollar businesses for Microsoft.
Cihan holds a number of patents in the data management and analytics space, and has a master’s degree in Database Systems and a bachelor’s degree in Computer Engineering.
The Risk Data Open Standard (RDOS) Steering Committee that guides and oversees this standard was busy in 2019 working to shape the RDOS, to enable the risk industry to simplify risk management and risk data portability. The RDOS has traveled a long journey and is now proudly an open standard.
Much has been
published already about what the RDOS is. So, for this post I won’t focus on
the mechanics of the RDOS, instead I’ll focus on the “why and how” the RDOS is
open, and how it enables anyone in the industry to use and contribute to the
Just in case you are not familiar with this new standard, I will quickly introduce the RDOS, but the majority of the post will be on diving into the “open” part of the RDOS, and to unpack how we are reusing learnings from existing “open source and open standards” that have successfully created international and industry wide collaborations.
The picture below on the left shows the extensive flooding at industrial parks north of Bangkok, Thailand. Western Digital had 60 percent of its total hard drive production coming from the country – floods disrupted production facilities at multiple sites to dramatically affect a major, global supply chain. And the picture on the right – showing flooding on the New York Subway from Hurricane Sandy, caused widespread disruption and nearly US$70 billion of losses across the northeastern U.S.
In both examples, the analysis of risk should not only help with physical protection measures such as stronger buildings through improved building codes or better defenses, but also the protection available through financial recovery. Providing financial protection is the job of the financial services and insurance industries. Improving our understanding of and practices in risk analytics as a field is one of the most interesting problems in big data these days, given the increasing set of risks we have to watch for.
How Does Risk Analytics Work?
Obviously, the risk landscape is vast. It stretches from “natural” events – such as severe hurricanes and typhoons, to earthquakes to “human-generated” disasters, such as cyberattacks, terrorism and so on.
The initial steps of risk analytics start with understanding the exposure – this is the risks a given asset, individual etc. are exposed to. Understanding exposure means detailing events that lead to damage and the related losses that could result from those events. Formulas get more complicated from here. There is a busy highway of data surrounding this field. Data engineers, data scientists, and others involved in risk analytics work to predict, model, select, and price risk to calculate how to provide effective protection.
Data Engineering for Risk Analytics
Let’s look at property-focused risks. In this instance, risk analytics starts with an understanding of how a property – such as a commercial or a residential building, is exposed to risk. The kind of events that could pose a risk and the associated losses that could result from those events depends on many variables.
The problem is that in today’s enterprise, if you want to work with exposure data, you have to work with multiple siloed systems that have their own data formats and representations. These systems do not speak the same language. For a user to get a complete picture, they need to go across these systems and constantly translate and transform data between them. As a data engineer, how do you provide a unified view of data across all systems? For instance, how can you enable a risk analyst to understand all kinds of perils – from a hurricane, a hailstorm to storm surge, and then roll this all up so you can guarantee the coverage on these losses?
There are also a number of standards used by the
insurance industry to integrate, transfer, and exchange this type of
information. The most popular of these formats is the Exposure Data Model (EDM).
However, EDMs and some of their less popular counterparts (Catastrophe Exposure
Database Exchange – CEDE, and Open Exposure Data – OED) have not aged well and
have not kept up with the industry needs:
These older standards are property centric; risk analytics requires an accommodation and understanding of new risks, such as cyberattacks, liability risks, and supply chain risk.
These older standards are propriety-designed for single systems that do not take into account the needs of various systems, for example, they can’t support new predictive risk models.
These standards don’t come with the right containment to represent high fidelity data portability – the exposure data formats do not usually represent losses, reference data, and settings used to produce the loss information that can allow for data integrity.
These standards are not extensible. Versioning and dependencies on specific product formats (such as database formats specific to version X of SQL Server etc) constantly make data portability harder.
This creates a huge data engineering challenge. If you
can’t exchange information with high fidelity, forget getting reliable
insights. As anyone dealing with data will say: garbage in, garbage out!
For any data engineer dealing with risk analytics, there is great news. There is a new open standard that is designed to remove shortcomings of the EDM and other similar formats. This new standard has been in the works for several years. It is the Risk Data Open Standard. The Risk Data Open Standard (RDOS) is designed to simplify data engineering. It is designed to simplify integrating data between systems that deal with exposure and loss data. It isn’t just RMS working to invent and validate this standard in isolation. A steering committee of thought leaders from influential companies is working on validating the Risk Data OS.
The Risk Data OS will allow us to work on risk analytics much more effectively. This is the way we can better understand the type of protection we need to create to help mitigate against climate change and other natural or human-made disasters. You can find details on the Risk Data OS here. If you are interested in the Risk Data OS, have feedback, or would like to help us define this standard, you can email the Risk Data OS steering committee by clicking here .