Blockscan
The txs.blockscan spider collects transaction data from Blockscan explorers such as Etherscan.
Specifically, txs.blockscan traces the sources and destinations of funds for a given address.
Usage
To run the spider, use the following command:
scrapy crawl txs.blockscan \
-a source=0xYourSourceAddress \
-a apikeys=YourApiKey1,YourApiKey2 \
-a endpoint=https://api.etherscan.io/api \
-a strategy=BlockchainSpider.strategies.txs.BFS \
-a allowed_tokens=0xTokenAddress1,0xTokenAddress2 \
-a out=/path/to/your/data \
-a out_fields=hash,address_from,address_to,value,token_id,timestamp,block_number,contract_address,symbol,decimals \
-a enable=BlockchainSpider.middlewares.txs.blockscan.ExternalTransferMiddleware,BlockchainSpider.middlewares.txs.blockscan.Token20TransferMiddleware \
-a start_blk=1000000 \
-a end_blk=1500000 \
-a max_pages=1 \
-a max_page_size=10000
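When the spider is launched from a script rather than typed by hand, the same command can be assembled programmatically. The following is a minimal sketch and not part of BlockchainSpider: the helper name `build_crawl_command` is hypothetical, and it only mirrors the `-a key=value` form of the command shown above.

```python
import shlex


def build_crawl_command(source, apikeys, **options):
    """Assemble the `scrapy crawl txs.blockscan` argument list.

    Hypothetical helper for illustration; it only mirrors the
    `-a key=value` form of the command shown above.
    """
    if not source:
        raise ValueError("source is required")
    if not apikeys:
        raise ValueError("at least one API key is required")

    args = {"source": source, "apikeys": ",".join(apikeys)}
    for key, value in options.items():
        # Lists and tuples become comma-separated values.
        if isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        args[key] = str(value)

    command = ["scrapy", "crawl", "txs.blockscan"]
    for key, value in args.items():
        command += ["-a", f"{key}={value}"]
    return command


command = build_crawl_command(
    "0xYourSourceAddress",
    ["YourApiKey1", "YourApiKey2"],
    start_blk=1000000,
    end_blk=1500000,
    out="/path/to/your/data",
)
# Print a shell-ready version of the command for inspection.
print(shlex.join(command))
```

Passing the resulting list to `subprocess.run(command, check=True)` from the project directory would then start the crawl, assuming Scrapy is installed and the project is set up.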
Parameters
source: The source address from which to start collecting transactions. This parameter is required.
apikeys: A comma-separated list of API keys for accessing the blockchain explorer. At least one API key is required.
endpoint: (optional) The API endpoint of the blockchain explorer. The default is https://api.etherscan.io/api. See the supported chains section for more endpoints.
strategy: (optional) The traversal strategy for collecting transactions. The default is BlockchainSpider.strategies.txs.BFS (breadth-first search). BlockchainSpider also implements more advanced strategies, e.g., Poison, APPR, and TTR. For details, see the Transaction tracing section.
allowed_tokens: (optional) A comma-separated list of token contract addresses used to filter token transfers. If not provided, all tokens are included. Note that some strategies (especially the TTR methods) may automatically analyze token types and collect token transfer records whose type differs from the given ones.
out: (optional) The output directory for storing the collected data. The default is ./data.
out_fields: (optional) A comma-separated list of fields to include in the output. The default is address_from,address_to,block_number,contract_address,decimals,gas,gas_price,hash,id,symbol,timestamp,token_id,value. Other available fields include isError, input, and nonce.
enable: (optional) A comma-separated list of middlewares to enable during the spider run. Add different middlewares to trace different types of assets; see the Available middlewares section for details. The default is BlockchainSpider.middlewares.txs.blockscan.ExternalTransferMiddleware.
start_blk: (optional) The first block number to search. If not specified, the spider starts from the first block.
end_blk: (optional) The last block number to search. If not specified, the spider searches up to the latest block.
max_pages: (optional) The maximum number of pages of transfers fetched per middleware request for an address. The default is 1.
max_page_size: (optional) The maximum number of transfers per page (<=10000). The default is 10000.
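Taken together, max_pages and max_page_size cap how many transfers are fetched for each address. As a rough sketch (the function name `max_transfers_per_address` is hypothetical, and it assumes that each enabled middleware issues its own paged requests and that `max_pages` must be at least 1), the upper bound works out as follows:

```python
MAX_PAGE_SIZE_LIMIT = 10000  # upper limit for max_page_size stated above


def max_transfers_per_address(max_pages=1, max_page_size=10000, n_middlewares=1):
    """Upper bound on transfers fetched for one address.

    Hypothetical helper: assumes every enabled middleware requests up to
    `max_pages` pages of at most `max_page_size` transfers each.
    """
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    if not 1 <= max_page_size <= MAX_PAGE_SIZE_LIMIT:
        raise ValueError("max_page_size must be between 1 and 10000")
    if n_middlewares < 1:
        raise ValueError("at least one middleware must be enabled")
    return max_pages * max_page_size * n_middlewares


# Defaults: one page of up to 10000 transfers from the single default middleware.
default_bound = max_transfers_per_address()
# Two enabled middlewares with the same paging settings.
two_middleware_bound = max_transfers_per_address(n_middlewares=2)
print(default_bound, two_middleware_bound)
```

Under these assumptions, an address with more transfers than the bound would be only partially covered, so raising max_pages is the lever for busy addresses.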
Available middlewares
The spider uses several middlewares to handle different types of transfers. The available middlewares and their functions are:
BlockchainSpider.middlewares.txs.blockscan.ExternalTransferMiddleware: Trace external transfers (e.g., ETH transfers) from the transaction data.
BlockchainSpider.middlewares.txs.blockscan.InternalTransferMiddleware: Trace internal transfers (e.g., contract-to-contract calls) from the transaction data.
BlockchainSpider.middlewares.txs.blockscan.Token20TransferMiddleware: Trace ERC-20 token transfers from the transaction data.
BlockchainSpider.middlewares.txs.blockscan.Token721TransferMiddleware: Trace ERC-721 (NFT) token transfers from the transaction data.
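Because the enable parameter takes full dotted paths, it can be convenient to build its value from short labels. The sketch below is illustrative only: the middleware paths are copied from the list above, while the short keys (`external`, `internal`, `erc20`, `erc721`) and the helper `enable_value` are hypothetical names chosen for this example.

```python
PREFIX = "BlockchainSpider.middlewares.txs.blockscan."

# Middleware class names come from the list above; the short keys are
# hypothetical labels used only in this example.
MIDDLEWARES = {
    "external": PREFIX + "ExternalTransferMiddleware",
    "internal": PREFIX + "InternalTransferMiddleware",
    "erc20": PREFIX + "Token20TransferMiddleware",
    "erc721": PREFIX + "Token721TransferMiddleware",
}


def enable_value(*asset_types):
    """Build the comma-separated value for the `enable` parameter."""
    unknown = [t for t in asset_types if t not in MIDDLEWARES]
    if unknown:
        raise ValueError(f"unknown asset types: {unknown}")
    return ",".join(MIDDLEWARES[t] for t in asset_types)


# Trace ETH transfers and ERC-20 transfers, as in the usage example.
print("-a enable=" + enable_value("external", "erc20"))
```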