Subsquid Processor

Repository

litentry/squid (https://github.com/litentry/squid): Litentry's Squid is a multi-chain data indexer for Substrate-based networks

Overview

A Node.js application built with Subsquid, using Postgres for data storage, GraphQL for exposing the data to clients, and Docker, Nginx and Let's Encrypt for deployment.

Getting Started

    git clone https://github.com/litentry/squid
    cd squid
    yarn
    yarn db:init
    yarn dev:kusama # runs the Kusama processor

    # open a new terminal
    yarn query-node # runs the GraphQL server

Workflow

Generating Metadata

Chain metadata is required to build the types for event and extrinsic data. We store it in chains/{network}/{network}Versions.json. To generate (or update) this file, run:
    make exploreKusama # replace the network name for other networks
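To illustrate why per-version metadata matters, here is a minimal sketch of finding the spec version in effect at a given block. The entry shape (specVersion and blockNumber fields) is an assumption about the versions file, not a documented format, and the sample values are invented.

```typescript
// Assumed shape of one entry in chains/{network}/{network}Versions.json.
// The real file may carry more fields; only these two are used here.
interface ChainVersion {
  specVersion: number;
  blockNumber: number; // assumed: first block at which this version applies
}

// Returns the spec version in effect at `height`, or undefined if the
// height precedes every known version.
function specVersionAt(versions: ChainVersion[], height: number): number | undefined {
  const sorted = [...versions].sort((a, b) => a.blockNumber - b.blockNumber);
  let active: number | undefined;
  for (const v of sorted) {
    if (v.blockNumber > height) break;
    active = v.specVersion;
  }
  return active;
}

// Illustrative values only, not real Kusama data.
const exampleVersions: ChainVersion[] = [
  { specVersion: 1, blockNumber: 0 },
  { specVersion: 2, blockNumber: 100 },
  { specVersion: 3, blockNumber: 250 },
];
```

With these sample values, specVersionAt(exampleVersions, 120) selects version 2, which is the version whose types a handler would need for that block.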

Generating Types

Type generation relies on the metadata file (see above) and the typegen file (chains/{network}/{network}Typegen.json). The properties of the typegen file are:
  • outDir: the folder to generate the types files in (../../src/types/{network})
  • chainVersions: the metadata file generated in the section above
  • typesBundle: the Polkadot API types (the ones used in ApiPromise.create). If Subsquid already has the types, you can simply specify the network name; if not, you have to download them and enter the relative path to the file here.
  • events: an array of events you want to process using the format pallet.event
  • extrinsics: an array of extrinsics you want to process using the format pallet.extrinsic
To index new events/extrinsics on an existing processor, add them to the relevant array and run:
    make typegenKusama # replace the network name for other networks
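The typegen properties described above can be pictured as a TypeScript type with an example value. This is only a sketch: the file name kusamaVersions.json and the event and extrinsic names are illustrative assumptions, and invalidNames is a hypothetical helper, not part of Subsquid.

```typescript
// Shape of chains/{network}/{network}Typegen.json, using the property
// names listed in this section.
interface TypegenConfig {
  outDir: string;
  chainVersions: string;
  typesBundle: string;
  events: string[];
  extrinsics: string[];
}

// Hypothetical Kusama config; the values are examples, not the repository's.
const kusamaTypegen: TypegenConfig = {
  outDir: "../../src/types/kusama",
  chainVersions: "kusamaVersions.json",
  typesBundle: "kusama",
  events: ["balances.Transfer"],
  extrinsics: ["balances.transfer"],
};

// Hypothetical helper: returns the entries that do not follow the
// pallet.name format.
function invalidNames(names: string[]): string[] {
  return names.filter((n) => !/^[A-Za-z0-9_]+\.[A-Za-z0-9_]+$/.test(n));
}
```

Running invalidNames over the events and extrinsics arrays before typegen is one cheap way to catch a malformed entry early.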

Defining the Schema

In schema.graphql we define the entities for our graph, then use codegen to generate models. As the processor graph is designed to be used within Litentry Graph, it is vital that everything in the schema is prefixed with "Substrate" to avoid conflicts.
All blockchain indexers must be idempotent, otherwise the data will not be accurate. This is especially important in our multi-chain Squid. The IDs of the entities must be unique to the network, otherwise the individual network processors will overwrite each other's records.
In the processors we set the Postgres isolation level to "REPEATABLE READ" instead of the default "SERIALIZABLE" level, because the default level prevents us from running multiple processors at the same time. This makes idempotent schema design and event/extrinsic processing logic all the more important.
For entities that have a many-to-one relationship with SubstrateAccount, "network:blockNumber:index" (where index is the index of the event or extrinsic within the block) is a reliable ID.
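A minimal sketch of this ID scheme, and of why it keeps writes idempotent, might look like the following. The entityId and saveTransfer functions and the in-memory Map are hypothetical stand-ins for illustration; the real processors write through the database.

```typescript
// Builds an ID of the form "network:blockNumber:index".
function entityId(network: string, blockNumber: number, index: number): string {
  return `${network}:${blockNumber}:${index}`;
}

// In-memory stand-in for a table keyed by ID (hypothetical, not the real store).
const rows = new Map<string, { id: string; amount: number }>();

// Saving by ID acts as an upsert, so replaying an event overwrites the
// existing row instead of adding a duplicate.
function saveTransfer(network: string, blockNumber: number, index: number, amount: number): void {
  const id = entityId(network, blockNumber, index);
  rows.set(id, { id, amount });
}

saveTransfer("kusama", 100, 2, 5);
saveTransfer("kusama", 100, 2, 5);   // replay of the same event
saveTransfer("polkadot", 100, 2, 7); // same position on another network
```

After these three calls the map holds two rows: the replay collapsed into the existing Kusama row, and the network prefix kept the Polkadot row from overwriting it.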
Once the schema is defined we generate the models (in src/models) by running:
    yarn codegen

Creating Migrations

After updating the schema and generating the models we need to create a new migration by running:
    yarn migration
To apply the new migration, run:
    yarn migrate

    # or do this instead for a fresh database:
    yarn db:init

Handling Events & Extrinsics

The handlers are in src/handlers and they are consumed in src/processors/{network}Processor.ts.
For each event or extrinsic, the handler must decode the data using the types for the metadata version in effect at the block being indexed. We do this at the top of the handler file: first we declare a consistent type the handler will use, then a function that uses the types generated by typegen to extract the data accurately.
The functions to get the types for the events and extrinsics are a little verbose and tediously repetitive, but if we don't use accurate types based on versioned metadata we are likely to introduce bugs.
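As a rough sketch of that pattern, the example below declares a consistent type and a function that maps each metadata version onto it. BalancesTransferEvent here is a hand-written, hypothetical stand-in for a typegen-generated class; the version numbers, field layouts, and the use of number for amounts are invented for illustration.

```typescript
// Hypothetical stand-in for a class generated by typegen.
class BalancesTransferEvent {
  constructor(private specVersion: number, private args: unknown) {}

  // Version 1 (invented): arguments arrive as a tuple.
  get isV1(): boolean { return this.specVersion === 1; }
  get asV1(): [string, string, number] {
    return this.args as [string, string, number];
  }

  // Version 2 (invented): arguments arrive as a named object.
  get isV2(): boolean { return this.specVersion === 2; }
  get asV2(): { from: string; to: string; amount: number } {
    return this.args as { from: string; to: string; amount: number };
  }
}

// The consistent type the rest of the handler works with.
interface TransferData {
  from: string;
  to: string;
  amount: number;
}

// Maps each known metadata version onto the consistent type and fails
// loudly on a version the handler has not been taught about.
function getTransferData(event: BalancesTransferEvent): TransferData {
  if (event.isV1) {
    const [from, to, amount] = event.asV1;
    return { from, to, amount };
  }
  if (event.isV2) {
    const { from, to, amount } = event.asV2;
    return { from, to, amount };
  }
  throw new Error("Unsupported spec version for balances.Transfer");
}
```

Throwing on an unknown version is a deliberate choice in this sketch: a runtime upgrade that changes the event layout stops the processor rather than silently writing wrong data.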