Introducing a graph data-model for market analytics: The emerging hydrogen economy in 2020

Lukas Strohmeier
Published in delphidata · 5 min read · Jan 21, 2021

Over the past years, I have worked with several methodologies, tools, and approaches to increase the accuracy of industrial market models and forecasts. In this article I introduce a new approach in the form of a case study: a graph-based market model, developed to use the power of a graph database for market analytics. Unlike the commonly used relational database (tables, typically queried with SQL), a graph database stores data as nodes and the relationships between them, loosely similar to the associative way the human brain links concepts. Graph databases are built to store and navigate relationships, which lets analysts explore useful connections in the data. For many applications, the links between data points are as important as the data itself. Exploring connections comes naturally in a graph database, whereas a relational database typically needs much more complex queries that join several tables for the same task. As a result, complex market models are more convenient to develop in a graph database.
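To make the difference concrete, here is a minimal Python sketch (standard library only). It is not the database used for this case study; the company names and relationship types are placeholders I made up for illustration. The same records are first held as table-like rows and then as a graph, where following connections over several hops is a short loop rather than a chain of joins.

```python
from collections import defaultdict

# Hypothetical relationship records (placeholder names, not real data).
# In a relational database these would be rows in a "relationships" table.
rows = [
    ("Company A", "Company B", "supplies"),
    ("Company B", "Company C", "joint venture"),
    ("Company C", "Company D", "invests in"),
]

# Graph view: every node keeps direct references to its neighbours.
neighbours = defaultdict(set)
for source, target, _kind in rows:
    neighbours[source].add(target)
    neighbours[target].add(source)


def within_hops(start, max_hops):
    """Return every node reachable from `start` in at most `max_hops` steps."""
    seen = {start}
    frontier = {start}
    for _ in range(max_hops):
        # Expand one hop outwards and drop nodes we have already visited.
        frontier = {n for node in frontier for n in neighbours[node]} - seen
        seen |= frontier
    return seen - {start}


two_hops = within_hops("Company A", 2)  # Company B and Company C
```

In a table-based model, each additional hop would usually mean another self-join on the relationships table, which is the extra query complexity mentioned above.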

Although graph databases are still not widely known, scientists and innovative companies have used them for a long time. Graph algorithms are essential pillars in the success stories of Amazon and Google. Google’s famous PageRank algorithm was developed by Larry Page and Sergey Brin in 1996. Graph theory itself dates back to the 18th century, when Leonhard Euler mathematically analyzed the Seven Bridges of Königsberg problem.

Nowadays, the graph is quickly gaining importance in applications such as fraud detection or supply chain management. In the area of market research, however, I couldn’t find any relevant references yet, so I started from scratch.

I opted to analyze the development of the hydrogen economy in 2020, as this is still a young industry, with fewer market participants and less complexity than mature industries.

Data collection was the first task. After an initial market analysis to identify the key players in the hydrogen segment, I searched for and stored all relevant publications and press releases. In a second step, the information was clustered and the relationships between the companies were mapped. For the clustering, it was important to define a set of industries that segments the dataset in a meaningful way. Therefore, I decided to use the following set of labels:

In total, only 181 companies are included in the analysis, as I decided to focus on important innovators in the hydrogen segment as well as major enterprises.

In the next step, the data was imported into a graph tool, in order to visualize & analyze the set:

The picture above shows that large parts of the industry have already formed a common network. As the hydrogen economy evolves, the network will grow quickly and more connections between the players will be created.
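One way to quantify how much of a network already forms a common ecosystem is to look at its connected components. The following is a minimal sketch under my own assumptions: the edges are placeholder data, not the 181 companies of the study, and the breadth-first search stands in for whatever the graph tool does internally.

```python
from collections import defaultdict, deque


def connected_components(edges):
    """Group nodes into components with breadth-first search, largest first.

    Nodes without any relationship do not appear in `edges` and are ignored.
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    components, visited = [], set()
    for start in adjacency:
        if start in visited:
            continue
        component, queue = {start}, deque([start])
        visited.add(start)
        while queue:
            node = queue.popleft()
            for nxt in adjacency[node]:
                if nxt not in visited:
                    visited.add(nxt)
                    component.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Placeholder network: one main cluster (A-D) and one remote pair (X, Y).
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("X", "Y")]
components = connected_components(edges)
main_share = len(components[0]) / sum(len(c) for c in components)

# A single new relationship is enough to pull the remote pair into the main network.
merged = connected_components(edges + [("D", "X")])
```

Here the main component holds four of the six nodes, and after the extra edge only one component remains.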

Within a network, centrality is the key to success, and centrality always depends on the direct and indirect connections of a node. The model assumes that the companies with the highest centrality will become the market leaders of the future. This is essentially a translation of social network theory into a business context.

This implies the following: companies that are organized in remote sub-networks need to quickly develop business relations with corporations that are already part of the main ecosystem. In the first week of 2021, the Korean SK Group announced a significant investment in Plug Power. As a result, Bloom Energy’s community (right-hand side of the chart) will be integrated into the main ecosystem. In our model, this move by SK Group not only boosts its own business prospects but also significantly improves the long-term business outlook for Bloom Energy.

Let’s get back to our basic 2020 model: If we just focus on 2020, the following companies are in the best position to achieve long-term success in the hydrogen sector. Interestingly, not only pure-play hydrogen companies make the list.

The centrality indicator used here is called closeness. It measures how short the paths are from one company to all other companies in the network. A node with a high closeness value can easily distribute information across the network and act as a strong influencer (McKnight 2014). Additionally, high closeness provides good insight into developments within the whole system. It is therefore assumed that the enterprises with the highest closeness centrality will be the most effective at marketing their own products, as well as the most successful at monitoring technological trends within the network.
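For readers who want to see the calculation, here is a minimal closeness sketch in plain Python. It is an illustration under stated assumptions, not the exact routine of the graph tool: the five-node chain is placeholder data, all relationships are treated as undirected and unweighted, and the scaling for networks with separate sub-networks (the Wasserman and Faust variant) is my own choice.

```python
from collections import defaultdict, deque


def closeness_centrality(edges):
    """Closeness per node: reachable / sum of distances, scaled by reachable / (n - 1)."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    n = len(adjacency)
    scores = {}
    for source in adjacency:
        # Breadth-first search gives shortest path lengths in an unweighted graph.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in adjacency[node]:
                if nxt not in distance:
                    distance[nxt] = distance[node] + 1
                    queue.append(nxt)

        reachable = len(distance) - 1
        total = sum(distance.values())
        if total == 0 or n <= 1:
            scores[source] = 0.0
        else:
            # The second factor penalises nodes that can only reach a small sub-network.
            scores[source] = (reachable / total) * (reachable / (n - 1))
    return scores


# Placeholder chain A - B - C - D - E: the middle node is closest to everyone.
scores = closeness_centrality([("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")])
most_central = max(scores, key=scores.get)  # "C"
```

In this toy chain, C scores 4/6 (about 0.67) while the end node A scores 4/10 (0.4), which mirrors the intuition that well-placed companies reach the rest of the network in fewer steps.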

My model suggests that the Nordic countries are currently in pole position to become the “Silicon Valley of hydrogen.” Nevertheless, this position will be challenged by many other actors.

Within the network, 42 communities have been detected. A community within a network is defined as a set of nodes that are more densely connected to each other than to the rest of the network (Radicchi et al. 2004). In network theory, communities affect various processes, such as the flow of information.
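The definition above can be turned into a simple test. The sketch below does not detect communities and is not the algorithm that produced the 42 communities (which I do not reproduce here); it only checks whether a given group of nodes satisfies a "weak" community criterion in the spirit of that definition: the members' connections inside the group, summed up, outnumber their connections to the outside. The toy network and the function names are hypothetical.

```python
def internal_external_degree(edges, community):
    """Return (sum of internal degrees, sum of external degrees) for a node set."""
    community = set(community)
    internal = external = 0
    for a, b in edges:
        a_in, b_in = a in community, b in community
        if a_in and b_in:
            internal += 2  # the edge adds to the internal degree of both endpoints
        elif a_in or b_in:
            external += 1  # the edge leaves the group through exactly one member
    return internal, external


def is_weak_community(edges, community):
    """True if the group is more connected internally than to the rest."""
    internal, external = internal_external_degree(edges, community)
    return internal > external


# Placeholder network: two triangles (A, B, C) and (D, E, F) joined by one bridge C - D.
edges = [
    ("A", "B"), ("B", "C"), ("A", "C"),
    ("C", "D"),
    ("D", "E"), ("E", "F"), ("D", "F"),
]

triangle_ok = is_weak_community(edges, {"A", "B", "C"})  # True: 6 internal vs 1 external
bridge_ok = is_weak_community(edges, {"C", "D"})         # False: 2 internal vs 4 external
```

Each triangle passes the test, while the two bridge nodes taken together do not, even though they are directly connected.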

The model assumes that communities will help their members grow their hydrogen business together. Being part of the same community doesn’t mean that a company maintains business relations with every other member, but developing those strategic ties could strengthen the community as a whole.

In the emerging hydrogen economy, we will not only see rivalry among individual companies; we might in fact witness rivalry between competing company communities. Suppliers and clients across industries will move hand in hand to emerge as leaders in hydrogen.

One of the 42 communities:

What’s next for this model?

I plan to develop this model further and evaluate its accuracy over the next few years. Currently, the main focus is on integrating financial figures as well as regional long-term forecasts for hydrogen into the dataset. Furthermore, the dataset is being extended to include business relations from 2015–2020.

This kind of market analysis could develop into a compelling forecasting method for emerging industries, in which traditional forecasts are not reliable enough. It could also be used as a capable tool for competitive intelligence since its main purpose is to identify the market potential of individual companies.

In case you have questions, don’t hesitate to contact me at lukas@delphidata.io

Disclaimer: We assume no liability for the accuracy, completeness, or timeliness of this content. All content published by delphi data labs is for informational purposes only. You should not construe any such information or other material as legal, tax, investment, financial, or other advice.
