Documentation
Data Sources
Most of the data used through the portal comes from the Filecoin chain. There is also some off-chain data, like Datacap applications or Storage Providers' reputation scores, that is collected from other places.
Deals
Deals data is available on chain and can be obtained in different ways:
- Doing a StateMarketDeals JSON-RPC call and parsing the returned JSON. If you don't have a node running, you can use Glif nodes' StateMarketDeals periodic dump on S3 (direct link).
- Using an oracle like fil-naive-marketwatch.
- Reconstructing the deals state from Lily tables.
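As a sketch of the first option, the returned JSON can be filtered with a few lines of Python. The response shape (a map of deal ID to `Proposal` and `State` objects) follows the Lotus API, but treat the exact field names as assumptions and verify them against a live response:

```python
import json

# JSON-RPC payload for Filecoin.StateMarketDeals; params is the tipset key,
# where null means the current chain head (assumption based on Lotus docs).
payload = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "Filecoin.StateMarketDeals",
    "params": [None],
}

# A tiny sample shaped like the real result: a map of deal ID to
# {"Proposal": {...}, "State": {...}}. The real response is very large.
sample = json.loads("""
{
  "1234": {
    "Proposal": {"PieceSize": 34359738368, "Client": "f01234", "Provider": "f05678"},
    "State": {"SectorStartEpoch": 100, "SlashEpoch": -1}
  },
  "1235": {
    "Proposal": {"PieceSize": 34359738368, "Client": "f01234", "Provider": "f09999"},
    "State": {"SectorStartEpoch": -1, "SlashEpoch": -1}
  }
}
""")

def active_deals(deals):
    """Keep deals whose sector has started and that were never slashed."""
    return {
        deal_id: deal
        for deal_id, deal in deals.items()
        if deal["State"]["SectorStartEpoch"] >= 0 and deal["State"]["SlashEpoch"] == -1
    }

print(len(active_deals(sample)))  # 1
```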
Clients
Clients can be derived from the deals dataset and expanded with the following sources:
- Datacap. From the Datacap Stats API, calling https://api.datacapstats.io/api/getVerifiedClients. You get a JSON of verified clients in the FIL+ program containing client names, Datacap application data, and other self-reported data. Alternatively, this data can be obtained by parsing issues and comments in the relevant GitHub repositories.
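As an illustration, a response from that endpoint can be summarized in Python. The field names below (`data`, `address`, `name`, `initialAllowance`) are assumptions modeled on the API; check them against a live response:

```python
# Sample shaped like a getVerifiedClients response (field names assumed).
sample_response = {
    "data": [
        {"address": "f1abc", "name": "Example Client", "initialAllowance": "1099511627776"},
        {"address": "f1def", "name": "", "initialAllowance": "549755813888"},
    ]
}

def total_datacap(response):
    """Sum the initial Datacap allowance (bytes) across verified clients."""
    return sum(int(client["initialAllowance"]) for client in response["data"])

print(total_datacap(sample_response) / 2**40)  # 1.5 (TiB)
```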
Storage Providers
Storage Providers can be derived from the deals dataset. More information about providers can be collected from the following sources:
- Location, using the various provider.quest endpoints/datasets.
Reputation Data
Reputation data is obtained from FilRep (methodology) and augmented with custom metrics around deals; for example, the average replication of the deals made with a given SP.
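A metric like that average replication can be sketched with pandas over a minimal, made-up deals table (the column names here are illustrative, not the portal's actual schema):

```python
import pandas as pd

# Toy deals table: which provider stores which piece.
deals = pd.DataFrame({
    "piece_cid": ["A", "A", "B", "B", "B", "C"],
    "provider":  ["f01", "f02", "f01", "f02", "f03", "f01"],
})

# Replication of a piece = number of distinct providers storing it.
replication = deals.groupby("piece_cid")["provider"].nunique()
rep_df = replication.rename("replication").reset_index()

# Average replication of the pieces each provider stores.
avg_replication = (
    deals.merge(rep_df, on="piece_cid")
         .groupby("provider")["replication"].mean()
)
print(avg_replication["f01"])  # pieces A (2x), B (3x), C (1x) -> 2.0
```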
Energy Data
Energy data is available from Filecoin Green (Model API and Green Scores API).
FVM
Filecoin Virtual Machine data is trickier to get. Some sources:
- Directly from the FVM dashboard.
- Some metrics are available in the Spacescope API.
Messages
A few teams across the ecosystem are indexing Filecoin messages. The most comprehensive sources are Beryx and FilInfo.
Data Indexers
Besides the data sources mentioned above, there are a few data indexers that provide data in a more structured way.
- Starboard - FVM
- Dev Storage
- Beryx
- Spacegap
- Filecoin Green
- Filecoin CID Checker
- File.app
- Filecoin Traces
- Filecoin RPC Providers
- Glif Explorer
- DMOB Messages Database powering FilInfo
- Filrep
- FilFox and API
- FilScan and API
- Filutils decoding message params (example).
- Blockscout and API for FEVM.
JSON-RPC Endpoints
Nodes usually implement all the JSON-RPC methods needed to get the data.
- Glif - https://api.node.glif.io
- Zondax - https://api.zondax.ch/fil/node/mainnet/rpc/v1
- Laconic - https://fil-mainnet-1.rpc.laconic.com/rpc/v1
- Provider Quest - https://lotus.miner.report/mainnet_api/0/node/rpc/v0
- More at filecoin.io docs!
- More at Chainlist!
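All of these endpoints accept standard JSON-RPC 2.0 requests. A minimal Python sketch using only the standard library (Filecoin.ChainHead is a lightweight method returning the current tipset; the live network call is left commented out):

```python
import json
import urllib.request

def build_payload(method, params=None):
    """Standard JSON-RPC 2.0 request body."""
    return {"jsonrpc": "2.0", "id": 1, "method": method, "params": params or []}

def rpc_call(endpoint, method, params=None):
    """POST a JSON-RPC request to a node endpoint and return the result field."""
    req = urllib.request.Request(
        endpoint,
        data=json.dumps(build_payload(method, params)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["result"]

# Uncomment to hit a live endpoint:
# head = rpc_call("https://api.node.glif.io", "Filecoin.ChainHead")
# print(head["Height"])
```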
Code
Using Datasets
The Filecoin Data Portal publishes up-to-date datasets on a daily basis as static Parquet files. You can then use any tool you want to explore these datasets! Let's go through some examples.
Python
You can use the pandas library to read the Parquet files, and you can play with the datasets in Google Colab for free. Check this sample notebook.
JavaScript
You can use the DuckDB Observable client library to read the Parquet files and run SQL queries on them. Check this sample Observable JS notebook to see how to explore and visualize the datasets.
Dune
Some of the datasets built by the pipelines are also available in Dune. You can use the Dune SQL editor to run queries on these datasets. Check this one on Dune.
Google Sheets
The pipelines that are executed to generate the datasets are also pushing the data to Google Sheets. You can access the data directly from these Google Sheets:
You can create a new personal Google Sheet and use the IMPORTRANGE function to read data from these sheets, letting you plot the data or add more transformations on top.
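For example, with a placeholder spreadsheet URL and range (swap in the real sheet's URL and tab):

```
=IMPORTRANGE("https://docs.google.com/spreadsheets/d/SHEET_ID", "Sheet1!A1:F")
```

Google Sheets will prompt you to allow access the first time the two sheets are connected.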
BI Tools
Depending on the BI tool you are using, you can connect to the Parquet files directly, use the Google Sheets as a data source, or load the data into a database like PostgreSQL or BigQuery.
Evidence
Filecoin Pulse is a website built with Evidence using the Filecoin Data Portal datasets. You can check the source code on GitHub to see how to use the datasets in Evidence.
Observable Framework
Another option is to use Observable Framework to create dashboards and visualizations. You can use Parquet files as data sources and generate beautiful static websites with dashboards and reports, like Filecoin in Numbers, a dashboard built with Observable Framework on top of the Portal's open datasets. You can check the source code on GitHub too.
Others
Do you have any other tool you want to use to explore the datasets? Reach out and let's explore how to use the datasets with your favorite tools!