In-depth blockchain network analysis with AI and Big Data

By
Blaise from Nyctale

iExec is a blockchain-based decentralized cloud computing platform. The company conducted an ICO in April 2017. Its token, RLC, is meant to be used through the smart contract associated with the iExec platform. This report studies the RLC token and the behavior of actors within its network.

For this study, we interpret the transactional data as directed graphs and apply state-of-the-art machine learning for graph analysis. Treating the network as a directed graph gives us a visual representation of its overall structure. Combined with other algorithms such as centrality analysis and outlier detection, this allowed us to identify wallets with specific behavior, such as exchange platforms’ wallets or wallets used by the iExec team.

We perform this identification as a preliminary step: it lets us exclude a few wallets (around 30 in this case) from our data set. The rest of this study therefore focuses on the network activity of users and individual investors.
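As a rough illustration of the graph view described above, the sketch below builds a directed graph from a handful of transfers, computes a simple degree centrality and sets aside highly connected wallets. The sample transfers, the wallet names and the 0.75 cut-off are all assumptions made for the example; the report does not detail which centrality measure or outlier detection method Nyctale actually uses.

```python
from collections import defaultdict

# Hypothetical token transfers: (sender, receiver, amount in RLC).
transfers = [
    ("exchange", "alice", 500.0),
    ("exchange", "bob", 120.0),
    ("carol", "exchange", 900.0),
    ("alice", "bob", 50.0),
    ("exchange", "dave", 75.0),
]

def build_graph(transfers):
    """Directed graph as an adjacency map: sender -> {receiver: total amount}."""
    graph = defaultdict(lambda: defaultdict(float))
    for sender, receiver, amount in transfers:
        graph[sender][receiver] += amount
    return graph

def degree_centrality(graph):
    """Incoming plus outgoing edges of each wallet, divided by (n - 1)."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    degree = dict.fromkeys(nodes, 0)
    for sender, targets in graph.items():
        for receiver in targets:
            degree[sender] += 1
            degree[receiver] += 1
    scale = max(len(nodes) - 1, 1)
    return {node: d / scale for node, d in degree.items()}

graph = build_graph(transfers)
centrality = degree_centrality(graph)

# Hypothetical cut-off: wallets connected to most of the network are set aside.
HUB_THRESHOLD = 0.75
hubs = {wallet for wallet, score in centrality.items() if score >= HUB_THRESHOLD}

# Keep only transfers between the remaining (individual) wallets.
individual_transfers = [t for t in transfers if t[0] not in hubs and t[1] not in hubs]
print(sorted(hubs), individual_transfers)
```

In this toy data set only the wallet named "exchange" passes the cut-off, so a single transfer between individual wallets remains.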

Introduction: General statistics about iExec network activity

The numbers of new wallets and of transactions show how intense the activity of the iExec network is. Over the period considered, these numbers varied by a factor of seven between the quietest and the busiest periods. These metrics mainly capture newcomers; they do not show that some wallets become inactive, whether or not they still hold tokens.

Over this period, several key events on the project development roadmap generated activity peaks:

1. Bittrex listing

2. Release v1 + Airdrop

3. Binance listing

4. Ubisoft partnership announcement

5. Intel partnership announcement at Consensus 2018

6. 1st worker drop

Looking at the average and total volume of transactions on the RLC network, it is clear that the ICO was the most intense period for the network.

If we set the ICO period aside, both metrics (average and total volume of transactions) fluctuate strongly. At the same time, these variations are only weakly correlated with the activity related to new wallets and transactions. This means hype periods attract new users more than they generate volume.

This is confirmed by the fact that average and total volumes have different behaviors. During highly speculative periods, the average volume remains low whereas the total volume reaches a peak. During these periods, there is a surge in active small actors attracted by hype around the project.

Looking at the number of senders and receivers of the RLC token, we see that there are almost always more receivers than senders, which indicates a steady expansion of the iExec network.

This is computed for each period on active wallets without considering inactive ones.

Thus, we formalize an expansion ratio to highlight this point:

• When the ratio is below 1, the network is expanding;

• When the ratio is above 1, the network is contracting.

Note that this expansion indicator does not ensure that active wallets remain active. It rather tells us whether more people hold the token at the end of the period than at the beginning.
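The report does not spell out the formula behind the expansion ratio. A minimal sketch, assuming it is the number of distinct senders divided by the number of distinct receivers among active wallets in a period (which is consistent with "more receivers than senders" giving a value below 1), could look like this. The function name and the sample transfers are hypothetical.

```python
def expansion_ratio(transfers):
    """Distinct senders / distinct receivers for one period (assumed definition)."""
    senders = {sender for sender, _, _ in transfers}
    receivers = {receiver for _, receiver, _ in transfers}
    if not receivers:
        raise ValueError("no transfers in this period")
    return len(senders) / len(receivers)

# Hypothetical transfers for one period: (sender, receiver, amount in RLC).
period = [("a", "b", 10.0), ("a", "c", 5.0), ("d", "e", 1.0)]

ratio = expansion_ratio(period)
status = "expanding" if ratio < 1 else "contracting or stable"
print(round(ratio, 2), status)
```

Here two wallets send tokens and three receive them, so the ratio is below 1 and the period counts as an expansion.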

We can also look at the general distribution of the RLC token within the community (exchange platforms’ wallets excluded). Wallets with a balance below 10 RLC are not considered.

We note here that around 90% of wallets currently hold fewer than 10,000 tokens, and the large majority own between 10 and 1,000 RLC.

*The number of wallets represented here covers wallets with a balance above 10 tokens at the end of each month. Some active wallets don’t meet this criterion.

The distribution of weight (tokens held) across categories is the reverse of the distribution of wallet counts. Around 90% of the network value is in the hands of the 10% of wallets that hold more than 10,000 tokens.
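To make the contrast between wallet counts and wallet weights concrete, here is a small sketch that buckets balances and reports, for each bucket, its share of wallets and its share of tokens. The bucket boundaries reuse the thresholds quoted in the text (10, 1,000, 10,000 and 1,000,000 RLC), but the exact buckets of the original charts are not given, so the labels and the sample balances are assumptions.

```python
from bisect import bisect_right

# Assumed bucket boundaries (lower bounds, in RLC) and their labels.
EDGES = [10, 1_000, 10_000, 1_000_000]
LABELS = ["10-1k", "1k-10k", "10k-1M", ">1M"]

def distribution(balances):
    """Return {label: (share of wallets, share of tokens)} per balance bucket."""
    counts = dict.fromkeys(LABELS, 0)
    weights = dict.fromkeys(LABELS, 0.0)
    for balance in balances:
        if balance < EDGES[0]:
            continue  # wallets below 10 RLC are ignored, as in the report
        label = LABELS[bisect_right(EDGES, balance) - 1]
        counts[label] += 1
        weights[label] += balance
    total_count = sum(counts.values()) or 1
    total_weight = sum(weights.values()) or 1.0
    return {
        label: (counts[label] / total_count, weights[label] / total_weight)
        for label in LABELS
    }

# Hypothetical end-of-month balances.
balances = [5, 50, 200, 800, 3_000, 40_000]
shares = distribution(balances)
for label, (count_share, weight_share) in shares.items():
    print(f"{label:>7}: {count_share:.0%} of wallets, {weight_share:.0%} of tokens")
```

Even in this tiny sample, the smallest bucket holds most of the wallets while a single larger wallet holds most of the tokens.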

This representation also highlights the impact of December’s hype and the Binance listing. The beginning of 2018 is characterized by a sudden surge in the number of wallets holding RLC tokens, mainly with small balances. This evolution has nearly no impact on the weight distribution. On the other hand, the weight distribution evolved constantly from April 2017 to December 2017, with a decrease for wallets with a balance above 1,000,000 RLC.

Part 1: Behavior analysis

To go further in understanding the RLC network’s activity, we used advanced machine learning algorithms to batch wallets according to different kinds of behavior. This was computed for each month, and we simplified the representation into four types of behavior (incoming investor, trader, outgoing investor, and holder), weighted by four levels of activity (detailed below).

We saw that activity peaks mainly attract incoming investors and traders, and create holders over the next period. Outgoing investors cash out more intensely during and after these peaks, while trading activity decreases quickly after each peak.

For the last few months, the number of active wallets has been low and most of the network is composed of holders. There are currently almost 7,000 wallets on the iExec network, while a total of 27,000 have been used since the ICO.

The algorithms used give us very precise batches of wallets that behave similarly. For each cluster, we gather activity details (number of transactions, total volume involved, balance) to characterize it precisely each month. The number of clusters varies from one period to the next.

We have built several indicators to interpret the type, size and intensity of a wallet’s activity, and created labels to determine its main orientation.

Here are the main rules used for each period:

• We consider wallets as incoming / outgoing investors when their balance increases / decreases by more than 25% of its initial value, with more than 100 tokens added / removed.

• When the balance remains stable (-25% to +25%), we label wallets as holders when their trading volume is below 10% of their balance and their balance is above 100 tokens.

• Traders are thus the wallets with a stable balance and a trading volume above 10% of their balance.

• We have also batched the remaining clusters into two categories, for inactive wallets and for wallets with residual activity (balance < 100 tokens and volume < 250 tokens). They are not represented in the previous and following graphs, as they have very little impact on the global behavior of the network.
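The rules above can be sketched as a small labelling function. This is only an illustration: the function name and signature are hypothetical, the report applies its rules to clusters rather than to single wallets, and several details are assumptions (the order in which rules are checked, the use of the end-of-period balance for the holder and trader rules, the treatment of wallets that start from zero, and the fallback to residual activity for cases the rules do not cover).

```python
def label_behavior(start_balance, end_balance, volume):
    """Label one wallet (or cluster) for one period from its aggregated stats."""
    delta = end_balance - start_balance
    if start_balance > 0:
        rel_change = delta / start_balance
    else:
        # Assumption: any growth from an empty wallet counts as a large change.
        rel_change = float("inf") if delta > 0 else 0.0

    # Investors: balance moved by more than 25% and by more than 100 tokens.
    if rel_change > 0.25 and delta > 100:
        return "incoming_investor"
    if rel_change < -0.25 and -delta > 100:
        return "outgoing_investor"

    # Inactive and residual wallets (checked before holders and traders here).
    if end_balance == 0 and volume == 0:
        return "inactive"
    if end_balance < 100 and volume < 250:
        return "residual"

    # Stable balance: holder if volume is under 10% of the balance, else trader.
    if abs(rel_change) <= 0.25:
        return "holder" if volume < 0.10 * end_balance else "trader"

    # Assumption: anything not covered by the stated rules is residual.
    return "residual"

# Hypothetical wallets: (start balance, end balance, traded volume).
examples = {
    "w1": (1_000, 2_000, 1_000),
    "w2": (5_000, 1_000, 4_000),
    "w3": (1_000, 1_000, 20),
    "w4": (1_000, 1_050, 600),
    "w5": (0, 0, 0),
    "w6": (20, 30, 10),
}
labels = {wallet: label_behavior(*stats) for wallet, stats in examples.items()}
print(labels)
```

Changing the order of the checks would move some borderline wallets between categories, which is one reason the thresholds are best treated as tunable parameters.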

We set up a second labelling system to characterize clusters by size / intensity. The rules do not use the same indicator for every type of behavior, so that the labels are easier to interpret while remaining in the same order of magnitude. We use the trading volume to characterize traders, the balance for holders, and the number of tokens added / removed for investors:

• Micro: between 100 and 1,000 tokens;

• Little: between 1,000 and 10,000 tokens;

• Medium: between 10,000 and 100,000 tokens;

• Big: more than 100,000 tokens.
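The size labels can be expressed as a second function that picks the relevant indicator for each behavior and maps it to a level. The function name, its arguments and the handling of exact boundary values (for instance whether exactly 1,000 tokens is "micro" or "little") are assumptions; the report only gives the ranges.

```python
def size_label(behavior, end_balance, volume, delta):
    """Return micro / little / medium / big, or None when no level applies."""
    if behavior == "trader":
        metric = volume            # traders are sized by trading volume
    elif behavior == "holder":
        metric = end_balance       # holders are sized by balance
    elif behavior in ("incoming_investor", "outgoing_investor"):
        metric = abs(delta)        # investors by tokens added or removed
    else:
        return None                # inactive and residual wallets get no level

    if metric < 100:
        return None
    if metric < 1_000:
        return "micro"
    if metric < 10_000:
        return "little"
    if metric <= 100_000:
        return "medium"
    return "big"

# Hypothetical cases: (behavior, end balance, volume, balance change).
sizes = [
    size_label("holder", 5_000, 0, 0),
    size_label("trader", 50, 150_000, 0),
    size_label("outgoing_investor", 0, 300, -300),
    size_label("inactive", 0, 0, 0),
]
print(sizes)
```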

We are also currently working on a user interface for actors to determine their own rules while interacting with the different graphs.

Looking at the same behavior categories while counting the total balance in number of tokens reveals that big holders continuously own a large majority of tokens.

Network activity seems to have reached an all-time low over the last few months, with very few active actors.

We also note a decreasing number of tokens in the hands of the community, which means that more and more tokens are held in a few top exchange wallets (the roughly 30 wallets excluded from this study).

Finally, the volume exchanged by wallets in our different behavior categories shows the differences between the major activity periods. The volume considered here is the sum of inbound and outbound volumes for each category. This means there is double counting in these figures.
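The double counting mentioned above can be seen in a short sketch: each transfer is added once to the sender's category and once to the receiver's category, so the per-category totals add up to more than the amount actually transferred. The wallet names, categories and amounts are hypothetical, and wallets without a category (for example excluded exchange wallets) are simply skipped.

```python
from collections import defaultdict

def volume_by_category(transfers, category_of):
    """Sum inbound and outbound volume for each behavior category."""
    volumes = defaultdict(float)
    for sender, receiver, amount in transfers:
        for wallet in (sender, receiver):
            category = category_of.get(wallet)
            if category is not None:
                volumes[category] += amount
    return dict(volumes)

# Hypothetical data: "x" stands for an excluded exchange wallet.
transfers = [("a", "b", 100.0), ("b", "x", 40.0)]
category_of = {"a": "trader", "b": "holder"}

volumes = volume_by_category(transfers, category_of)
transferred = sum(amount for _, _, amount in transfers)
print(volumes, "sum of categories:", sum(volumes.values()), "transferred:", transferred)
```

The 100-token transfer between two categorized wallets appears in both categories, which is why the category totals exceed the transferred amount.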

The first three months were the most active for the overall iExec network, around the ICO and the first exchange platform listing. The December hype and the Binance listing did not push volume above the level of these first three months. Intel’s partnership announcement is also clearly visible in May 2018.

For several months now, the network has remained stable, with a low exchanged volume.

Part 2: Behavior’s paths cross

When representing different metrics for the different batches (number of wallets, balance or volume), we set aside the relations between these categories. With the behavior’s paths cross view, we represent how each category feeds the others in the next period.

In addition to our previous categories, this view includes three new categories (in black, grey and white): one for inactive wallets (no balance / no volume), one for residual activity (linked to our labelling rules), and one for new wallets that become active in the next period (NOT_CREATED). This last label is used to represent the behavior distribution of new wallets in each following period.

With this representation, we can analyze the consistency of each category over time, and their mutual relations from one period to the next.

We then use the transfer matrix to analyze the behavior’s paths cross representation.

For each wallet, we consider each pair of consecutive months and note its categories: this gives a set of ordered pairs in which every element represents the category transition of one wallet from one month to the next. In the transfer matrix, we represent the distribution of these pairs: rows correspond to the first month and columns to the second. The matrix is normalized row-wise.
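The construction described above can be sketched in a few lines: count every (category this month, category next month) pair over all wallets, then divide each row by its total. The function name, the category identifiers other than NOT_CREATED and the three sample histories are hypothetical.

```python
from collections import Counter

CATEGORIES = [
    "NOT_CREATED", "incoming_investor", "trader", "outgoing_investor",
    "holder", "residual", "inactive",
]

def transfer_matrix(histories, categories):
    """Row-normalized matrix of month-to-month category transitions."""
    counts = Counter()
    for labels in histories.values():
        # Pair each month's label with the label of the following month.
        counts.update(zip(labels, labels[1:]))
    matrix = {}
    for src in categories:
        row_total = sum(counts[(src, dst)] for dst in categories)
        matrix[src] = {
            dst: counts[(src, dst)] / row_total if row_total else 0.0
            for dst in categories
        }
    return matrix

# Hypothetical monthly labels for three wallets.
histories = {
    "w1": ["NOT_CREATED", "incoming_investor", "holder", "holder"],
    "w2": ["NOT_CREATED", "residual", "inactive", "inactive"],
    "w3": ["incoming_investor", "holder", "outgoing_investor", "inactive"],
}

matrix = transfer_matrix(histories, CATEGORIES)
for src, row in matrix.items():
    print(src, {dst: round(p, 2) for dst, p in row.items() if p})
```

Each non-empty row sums to 1, so an entry reads as the share of wallets in the row category that move to the column category the following month.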

The transfer matrix gives us more insight into the way batches feed each other:

– Incoming investors mainly feed the holder category, with a few directly becoming outgoing investors;

– Traders and outgoing investors largely become inactive;

– Holders largely remain holders;

– Inactive wallets largely remain inactive.

We also note that 40% of new wallets (labeled as NOT_CREATED) go directly to the residual activity category. Many wallets have a balance below 100 tokens and a volume below 250 tokens.

This transfer matrix sums up all the behavior evolution during our period of interest. Using it as a network performance indicator enables us to analyze the impact of the main news regarding the project development.

Conclusion

To conclude, the analytic tools we have developed can precisely describe and interpret the activity of a whole token network. We have shown in this report that, with a rigorous and methodical process and an adequate data science approach, transactional flows reveal their full meaning.

As for iExec, its token is clearly suffering from the bear market. This is not specific to RLC: it seems to be the case for all other tokens as well. For now, developers mainly access iExec’s marketplace through testnets (Kovan or Rinkeby). For this reason, we did not focus on the utility smart contract in this report. However, once the market is more mature, it will definitely be interesting to monitor the evolution of real usage compared with all financial transactions.

From now on, we are interested in performing advanced comparisons between different token networks, and in following the evolution of other promising projects to measure and analyze their adoption rate. We are therefore looking for other tokens to study, to deliver different kinds of insights and to support the overall blockchain ecosystem as it matures.

If you are interested in getting your own token study, feel free to contact us for more details on our pricing: contact@nyctale.io


All the data highlighted in this report comes from Nyctale’s own algorithms, fed with Ethereum blockchain data. While producing this report, we made assumptions for several important analytic steps:

• Excluding the top exchanges’ wallets;

• Clustering all the wallets based on similar behaviors;

• Characterizing and labelling behavior categories.

Other methods could deliver different outcomes, depending on the assumptions used.

Access the full report on SlideShare: https://www.slideshare.net/BlaiseCavalli/indepth-blockchain-network-analysis-with-ai-and-bigdata