您的位置:首页 > 其它

以太坊(Ethereum) - 节点时间未同步和区块同步失败案例分析

2018-06-26 15:24 627 查看

背景

以太坊技术搭建的区块链网络,节点间需要保证时间一致,才能正常有序的发送交易和生成区块,使得众多节点共同维护分布式账本(区块数据+状态数据)。但是,网络中节点的系统时间不一致回出现什么现象呢,我们通过2个例子来分析一下。

问题现象:

1. 以太坊节点启动时,报出时间未同步的警告日志:

WARN [03-20|17:20:54] System clock seems off by -1m7.049442998s, which can prevent network connectivity 

WARN [03-20|17:20:54] Please enable network time synchronisation in system settings. 

2. 以太坊节点间进行区块同步时,出现回滚和同步失败的警告日志:

WARN [03-19|18:07:10] Rolled back headers                      count=181  header=72832->72832 fast=74752->74752 block=0->0

WARN [03-19|18:07:10] Synchronisation failed, dropping peer    peer=d3f5b2f51ba7acc6 err="retrieved hash chain is invalid"

 

问题分析:

我们先来看看日志信息:

DEBUG[D 03-20|18:19:06] Synchronising with the network           peer=d3f5b2f51ba7acc6 eth=63 head=fea6e3…5abc2d td=533542548562 mode=full
DEBUG[03-20|18:19:06] Retrieving remote chain height           peer=d3f5b2f51ba7acc6
DEBUG[03-20|18:19:06] Fetching batch of headers                id=d3f5b2f51ba7acc6 conn=staticdial count=1   fromhash=fea6e3…5abc2d skip=0   reverse=false
DEBUG[03-20|18:19:06] Remote head header identified            peer=d3f5b2f51ba7acc6 number=80698 hash=fea6e3…5abc2d
DEBUG[03-20|18:19:06] Looking for common ancestor              peer=d3f5b2f51ba7acc6 local=80690 remote=80698
DEBUG[03-20|18:19:06] Fetching batch of headers                id=d3f5b2f51ba7acc6 conn=staticdial count=13  fromnum=80498 skip=15  reverse=false
DEBUG[03-20|18:19:06] Found common ancestor                    peer=d3f5b2f51ba7acc6 number=80690 hash=4cba72…c3fce5
DEBUG[03-20|18:19:06] Directing header downloads               peer=d3f5b2f51ba7acc6 origin=80691
DEBUG[03-20|18:19:06] Downloading block bodies                 origin=80691
DEBUG[03-20|18:19:06] Downloading transaction receipts         origin=80691
DEBUG[03-20|18:19:06] Fetching batch of headers                id=d3f5b2f51ba7acc6 conn=staticdial count=128 fromnum=80882 skip=191 reverse=false
DEBUG[03-20|18:19:06] Fetching batch of headers                id=d3f5b2f51ba7acc6 conn=staticdial count=192 fromnum=80691 skip=0   reverse=false
DEBUG[03-20|18:19:06] Fetching batch of headers                id=d3f5b2f51ba7acc6 conn=staticdial count=192 fromnum=80699 skip=0   reverse=false
DEBUG[03-20|18:19:06] Inserting downloaded chain               items=8  firstnum=80691 firsthash=4a10d7…4b050e lastnum=80698 lasthash=fea6e3…5abc2d
DEBUG[03-20|18:19:06] Trie cache stats after commit            misses=285 unloads=0
DEBUG[03-20|18:19:06] Dereferenced trie from memory database   nodes=4 size=1.20kB  time=11.957µs gcnodes=18 gcsize=5.92kB  gctime=68.067µs livenodes=528 livesize=149.81kB
DEBUG[03-20|18:19:06] Inserted new block                       number=80691 hash=4a10d7…4b050e uncles=0 txs=0 gas=0 elapsed=10.024ms
DEBUG[03-20|18:19:06] Trie cache stats after commit            misses=285 unloads=0
DEBUG[03-20|18:19:06] Dereferenced trie from memory database   nodes=5 size=1.23kB  time=11.865µs gcnodes=23 gcsize=7.16kB  gctime=79.686µs livenodes=526 livesize=149.56kB
DEBUG[03-20|18:19:06] Inserted new block                       number=80692 hash=c0f08c…f3196f uncles=0 txs=0 gas=0 elapsed=1.194ms
DEBUG[03-20|18:19:06] Downloaded item processing failed        number=80694 hash=5802eb…4fc564 err="future block: 1521541188 > 1521541176"
DEBUG[03-20|18:19:06] Header download terminated               peer=d3f5b2f51ba7acc6
DEBUG[03-20|18:19:06] Transaction receipt download terminated  err="receipt download canceled (requested)"
DEBUG[03-20|18:19:06] Block body download terminated           err="block body download canceled (requested)"
DEBUG[03-20|18:19:06] Synchronisation terminated               elapsed=102.384528ms
WARN [03-20|18:19:06] Synchronisation failed, dropping peer    peer=d3f5b2f51ba7acc6 err="retrieved hash chain is invalid"
DEBUG[03-20|18:19:06] Removing Ethereum peer                   peer=d3f5b2f51ba7acc6
DEBUG[03-20|18:19:06] Ethereum message handling failed         id=d3f5b2f51ba7acc6 conn=staticdial err=EOF
DEBUG[03-20|18:19:06] Removing p2p peer                        id=d3f5b2f51ba7acc6 conn=staticdial duration=7.895s    peers=0 req=false err="useless peer"
DEBUG[03-20|18:19:09] Adding p2p peer                          name=Geth/v1.8.0-unstable...                           addr=128.32.37.209:40404   peers=1
DEBUG[03-20|18:19:09] Ethereum peer connected                  id=6bf05c91058746aa conn=dyndial    name=Geth/v1.8.0-unstable/linux-amd64/go1.9.1
DEBUG[03-20|18:19:09] Ethereum handshake failed                id=6bf05c91058746aa conn=dyndial    err="Genesis block mismatch - 5e31ddf9f93f71ee (!= 29ff4c23644fa9ee)"
DEBUG[03-20|18:19:09] Removing p2p peer                        id=6bf05c91058746aa conn=dyndial    duration=179.772ms peers=0 req=false err="Genesis block mismatch - 5e31ddf9f93f71ee (!= 29ff4c23644fa9ee)"
DEBUG[03-20|18:19:11] Removed dead node                        b=12 id=35f339215dc5d4ae ip=83.83.179.65
DEBUG[03-20|18:19:14] Trie cache stats after commit            misses=285 unloads=0
DEBUG[03-20|18:19:14] Dereferenced trie from memory database   nodes=1 size=564.00B time=13.063µs gcnodes=24 gcsize=7.72kB  gctime=92.277µs livenodes=529 livesize=150.20kB
DEBUG[03-20|18:19:14] Inserted new block                       number=80693 hash=01b472…8b7890 uncles=0 txs=0 gas=0 elapsed=28.188ms
INFO [03-20|18:19:14] Imported new chain segment               blocks=1  txs=0 mgas=0.000 elapsed=28.317ms     mgasps=0.000 number=80693 hash=01b472…8b7890 cache=150.62kB
INFO [03-20|18:19:14] Mining too far in the future             wait=14s
DEBUG[03-20|18:19:14] Reinjecting stale transactions           count=0
DEBUG[03-20|18:19:16] Removed dead node                        b=15 id=5dabdd5adb72cb1a ip=36.110.209.136
DEBUG[03-20|18:19:23] Found seed node in database              id=71e3445d0b9af99a addr=218.85.66.28:30399    age=8h25m6.047057248s
DEBUG[03-20|18:19:23] Found seed node in database              id=ba1ae648ebbd893d addr=106.167.183.143:30303 age=16h13m35.047096949s

 

   我们可以看到在同步第80694号区块时出现了错误,那我们通过web3.eth.getBlock接口查询一下此区块的内容。    
> > eth.getBlock(80694)
{
difficulty: 9770747,
extraData: "0xd983010802846765746887676f312e392e328777696e646f7773",
gasLimit: 4712388,
gasUsed: 0,
hash: "0x5802ebeb71ce647dd874b5c7d20130ea73e7f81ca71a1d86b78adce4a14fc564",
logsBloom: "0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
miner: "0x609d020ed7873ba9d5732d67109546365ef3c49c",
mixHash: "0x1f10d92fb51d6d7c522f3b1ac6f79bca752c04ae4b01381a9a2193c41c896beb",
nonce: "0x0255559b5d0e03c3",
number: 80694,
parentHash: "0x01b472d3ed73290ba95428179b958be6d4ffec4f790f0ebe417075f11e8b7890",
receiptsRoot: "0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421",
sha3Uncles: "0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347",
size: 540,
stateRoot: "0xf32f401b3c4ed1f14f22108c2f47e9cf77c4565738482459a84cb572b9ca79de",
  timestamp: 1521541188,
  totalDifficulty: 533503417848,
transactions: [],
transactionsRoot: "0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421",
uncles: []
}

 

我们可以看到此区块的时间戳为:1521541188,即时间是Tue Mar 20 2018 18:19:48 GMT+0800 (CST)。

但是,我们可以通过日志发现处理此块时,本节点的系统时间为18:19:06,区块的生成时间比本节点系统时间晚了42秒,其时间差超过了所允许的时间跨度(常量maxTimeFutureBlocks的默认值为30秒)

 

代码分析

在分析此问题时,需要查看的以太坊相关源代码才能了解其原理,有兴趣的读者可以一起来阅读一下。我将本文涉及的2处异常的直接相关的代码标记为红色。

节点启动时检查系统时间的逻辑:

go-ethereum/p2p/discover/ntp.go

// checkClockDrift queries an NTP server for clock drifts and warns the user if
// one large enough is detected.
func checkClockDrift() {
drift, err := sntpDrift(ntpChecks)
if err != nil {
return
}
if drift < -driftThreshold || drift > driftThreshold {
log.Warn(fmt.Sprintf("System clock seems off by %v, which can prevent network connectivity", drift))
log.Warn("Please enable network time synchronisation in system settings.")
} else {
log.Debug("NTP sanity check done", "drift", drift)
}
}

ntp.go的checkClockDrift函数所依赖的一些变量如下:

https://github.com/ethereum/go-ethereum/blob/9fd76e33af367752160ab0e33d1097e1e9aff6e4/p2p/discover/udp.go

// Timeouts
const (
respTimeout = 500 * time.Millisecond
sendTimeout = 500 * time.Millisecond
expiration  = 20 * time.Second
​
ntpFailureThreshold = 32               // Continuous timeouts after which to check NTP
ntpWarningCooldown  = 10 * time.Minute // Minimum amount of time to pass before repeating NTP warning
driftThreshold      = 10 * time.Second // Allowed clock drift before warning user
)

 

区块同步逻辑:

接收对等节点的区块报文后,开始导入本地账本。

 https://github.com/ethereum/go-ethereum/blob/77da203547b83c70d12552a86bb7beee52f6383e/eth/downloader/downloader.go  
func (d *Downloader) importBlockResults(results []*fetchResult) error {
// Check for any early termination requests
if len(results) == 0 {
return nil
}
select {
case <-d.quitCh:
return errCancelContentProcessing
default:
}
// Retrieve the a batch of results to import
first, last := results[0].Header, results[len(results)-1].Header
log.Debug("Inserting downloaded chain", "items", len(results),
"firstnum", first.Number, "firsthash", first.Hash(),
"lastnum", last.Number, "lasthash", last.Hash(),
)
blocks := make([]*types.Block, len(results))
for i, result := range results {
blocks[i] = types.NewBlockWithHeader(result.Header).WithBody(result.Transactions, result.Uncles)
}
if index, err := d.blockchain.InsertChain(blocks); err != nil {
log.Debug("Downloaded item processing failed", "number", results[index].Header.Number, "hash", results[index].Header.Hash(), "err", err)
return errInvalidChain
}
return nil
}

区块加入本地账本的主逻辑:

https://github.com/ethereum/go-ethereum/blob/63687f04e441c97cbb39d6b0ebea346b154d2e73/core/blockchain.go

......
const (
bodyCacheLimit      = 256
blockCacheLimit     = 256
maxFutureBlocks     = 256
maxTimeFutureBlocks = 30
badBlockLimit       = 10
triesInMemory       = 128
​
// BlockChainVersion ensures that an incompatible database forces a resync from scratch.
BlockChainVersion = 3
)

......

​
// insertChain will execute the actual chain insertion and event aggregation. The
// only reason this method exists as a separate one is to make locking cleaner
// with deferred statements.
func (bc *BlockChain) insertChain(chain types.Blocks) (int, []interface{}, []*types.Log, error) {
// Do a sanity check that the provided chain is actually ordered and linked
for i := 1; i < len(chain); i++ {
if chain[i].NumberU64() != chain[i-1].NumberU64()+1 || chain[i].ParentHash() != chain[i-1].Hash() {
// Chain broke ancestry, log a messge (programming error) and skip insertion
log.Error("Non contiguous block insert", "number", chain[i].Number(), "hash", chain[i].Hash(),
"parent", chain[i].ParentHash(), "prevnumber", chain[i-1].Number(), "prevhash", chain[i-1].Hash())
​
return 0, nil, nil, fmt.Errorf("non contiguous insert: item %d is #%d [%x…], item %d is #%d [%x…] (parent [%x…])", i-1, chain[i-1].NumberU64(),
chain[i-1].Hash().Bytes()[:4], i, chain[i].NumberU64(), chain[i].Hash().Bytes()[:4], chain[i].ParentHash().Bytes()[:4])
}
}
// Pre-checks passed, start the full block imports
bc.wg.Add(1)
defer bc.wg.Done()
​
bc.chainmu.Lock()
defer bc.chainmu.Unlock()
​
// A queued approach to delivering events. This is generally
// faster than direct delivery and requires much less mutex
// acquiring.
var (
stats         = insertStats{startTime: mclock.Now()}
events        = make([]interface{}, 0, len(chain))
lastCanon     *types.Block
coalescedLogs []*types.Log
)
// Start the parallel header verifier
headers := make([]*types.Header, len(chain))
seals := make([]bool, len(chain))
​
for i, block := range chain {
headers[i] = block.Header()
seals[i] = true
}
abort, results := bc.engine.VerifyHeaders(bc, headers, seals)
defer close(abort)
​
// Iterate over the blocks and insert when the verifier permits
for i, block := range chain {
// If the chain is terminating, stop processing blocks
if atomic.LoadInt32(&bc.procInterrupt) == 1 {
log.Debug("Premature abort during blocks processing")
break
}
// If the header is a banned one, straight out abort
if BadHashes[block.Hash()] {
bc.reportBlock(block, nil, ErrBlacklistedHash)
return i, events, coalescedLogs, ErrBlacklistedHash
}
// Wait for the block's verification to complete
bstart := time.Now()
​
err := <-results
if err == nil {
err = bc.Validator().ValidateBody(block)
}
switch {
case err == ErrKnownBlock:
// Block and state both already known. However if the current block is below
// this number we did a rollback and we should reimport it nonetheless.
if bc.CurrentBlock().NumberU64() >= block.NumberU64() {
stats.ignored++
continue
}
​
case err == consensus.ErrFutureBlock:
// Allow up to MaxFuture second in the future blocks. If this limit is exceeded
// the chain is discarded and processed at a later time if given.
max := big.NewInt(time.Now().Unix() + maxTimeFutureBlocks)
if block.Time().Cmp(max) > 0 {
return i, events, coalescedLogs, fmt.Errorf("future block: %v > %v", block.Time(), max)
}
bc.futureBlocks.Add(block.Hash(), block)
stats.queued++
continue
​
case err == consensus.ErrUnknownAncestor && bc.futureBlocks.Contains(block.ParentHash()):
bc.futureBlocks.Add(block.Hash(), block)
stats.queued++
continue
​
case err == consensus.ErrPrunedAncestor:
// Block competing with the canonical chain, store in the db, but don't process
// until the competitor TD goes above the canonical TD
currentBlock := bc.CurrentBlock()
localTd := bc.GetTd(currentBlock.Hash(), currentBlock.NumberU64())
externTd := new(big.Int).Add(bc.GetTd(block.ParentHash(), block.NumberU64()-1), block.Difficulty())
if localTd.Cmp(externTd) > 0 {
if err = bc.WriteBlockWithoutState(block, externTd); err != nil {
return i, events, coalescedLogs, err
}
continue
}
// Competitor chain beat canonical, gather all blocks from the common ancestor
var winner []*types.Block
​
parent := bc.GetBlock(block.ParentHash(), block.NumberU64()-1)
for !bc.HasState(parent.Root()) {
winner = append(winner, parent)
parent = bc.GetBlock(parent.ParentHash(), parent.NumberU64()-1)
}
for j := 0; j < len(winner)/2; j++ {
winner[j], winner[len(winner)-1-j] = winner[len(winner)-1-j], winner[j]
}
// Import all the pruned blocks to make the state available
bc.chainmu.Unlock()
_, evs, logs, err := bc.insertChain(winner)
bc.chainmu.Lock()
events, coalescedLogs = evs, logs
​
if err != nil {
return i, events, coalescedLogs, err
}
​
case err != nil:
bc.reportBlock(block, nil, err)
return i, events, coalescedLogs, err
}
// Create a new statedb using the parent block and report an
// error if it fails.
var parent *types.Block
if i == 0 {
parent = bc.GetBlock(block.ParentHash(), block.NumberU64()-1)
} else {
parent = chain[i-1]
}
state, err := state.New(parent.Root(), bc.stateCache)
if err != nil {
return i, events, coalescedLogs, err
}
// Process block using the parent state as reference point.
receipts, logs, usedGas, err := bc.processor.Process(block, state, bc.vmConfig)
if err != nil {
bc.reportBlock(block, receipts, err)
return i, events, coalescedLogs, err
}
// Validate the state using the default validator
err = bc.Validator().ValidateState(block, parent, state, receipts, usedGas)
if err != nil {
bc.reportBlock(block, receipts, err)
return i, events, coalescedLogs, err
}
proctime := time.Since(bstart)
​
// Write the block to the chain and get the status.
status, err := bc.WriteBlockWithState(block, receipts, state)
if err != nil {
return i, events, coalescedLogs, err
}
switch status {
case CanonStatTy:
log.Debug("Inserted new block", "number", block.Number(), "hash", block.Hash(), "uncles", len(block.Uncles()),
"txs", len(block.Transactions()), "gas", block.GasUsed(), "elapsed", common.PrettyDuration(time.Since(bstart)))
​
coalescedLogs = append(coalescedLogs, logs...)
blockInsertTimer.UpdateSince(bstart)
events = append(events, ChainEvent{block, block.Hash(), logs})
lastCanon = block
​
// Only count canonical blocks for GC processing time
bc.gcproc += proctime
​
case SideStatTy:
log.Debug("Inserted forked block", "number", block.Number(), "hash", block.Hash(), "diff", block.Difficulty(), "elapsed",
common.PrettyDuration(time.Since(bstart)), "txs", len(block.Transactions()), "gas", block.GasUsed(), "uncles", len(block.Uncles()))
​
blockInsertTimer.UpdateSince(bstart)
events = append(events, ChainSideEvent{block})
}
stats.processed++
stats.usedGas += usedGas
stats.report(chain, i, bc.stateCache.TrieDB().Size())
}
// Append a single chain head event if we've progressed the chain
if lastCanon != nil && bc.CurrentBlock().Hash() == lastCanon.Hash() {
events = append(events, ChainHeadEvent{lastCanon})
}
return 0, events, coalescedLogs, nil
}

 

区块头部校验逻辑:

https://github.com/ethereum/go-ethereum/blob/5f8888e11606296c9582496974c0f6b96a882146/consensus/clique/clique.go

// verifyHeader checks whether a header conforms to the consensus rules.The
// caller may optionally pass in a batch of parents (ascending order) to avoid
// looking those up from the database. This is useful for concurrently verifying
// a batch of new headers.
func (c *Clique) verifyHeader(chain consensus.ChainReader, header *types.Header, parents []*types.Header) error {
if header.Number == nil {
return errUnknownBlock
}
number := header.Number.Uint64()
​
// Don't waste time checking blocks from the future
if header.Time.Cmp(big.NewInt(time.Now().Unix())) > 0 {
return consensus.ErrFutureBlock
}
// Checkpoint blocks need to enforce zero beneficiary
checkpoint := (number % c.config.Epoch) == 0
if checkpoint && header.Coinbase != (common.Address{}) {
return errInvalidCheckpointBeneficiary
}
// Nonces must be 0x00..0 or 0xff..f, zeroes enforced on checkpoints
if !bytes.Equal(header.Nonce[:], nonceAuthVote) && !bytes.Equal(header.Nonce[:], nonceDropVote) {
return errInvalidVote
}
if checkpoint && !bytes.Equal(header.Nonce[:], nonceDropVote) {
return errInvalidCheckpointVote
}
// Check that the extra-data contains both the vanity and signature
if len(header.Extra) < extraVanity {
return errMissingVanity
}
if len(header.Extra) < extraVanity+extraSeal {
return errMissingSignature
}
// Ensure that the extra-data contains a signer list on checkpoint, but none otherwise
signersBytes := len(header.Extra) - extraVanity - extraSeal
if !checkpoint && signersBytes != 0 {
return errExtraSigners
}
if checkpoint && signersBytes%common.AddressLength != 0 {
return errInvalidCheckpointSigners
}
// Ensure that the mix digest is zero as we don't have fork protection currently
if header.MixDigest != (common.Hash{}) {
return errInvalidMixDigest
}
// Ensure that the block doesn't contain any uncles which are meaningless in PoA
if header.UncleHash != uncleHash {
return errInvalidUncleHash
}
// Ensure that the block's difficulty is meaningful (may not be correct at this point)
if number > 0 {
if header.Difficulty == nil || (header.Difficulty.Cmp(diffInTurn) != 0 && header.Difficulty.Cmp(diffNoTurn) != 0) {
return errInvalidDifficulty
}
}
// If all checks passed, validate any special fields for hard forks
if err := misc.VerifyForkHashes(chain.Config(), header, false); err != nil {
return err
}
// All basic checks passed, verify cascading fields
return c.verifyCascadingFields(chain, header, parents)
}

 

 

解决方案:

同步系统时间,以Ubuntu为例:

sudo apt install ntpdate

sudo ntpdate 0.asia.pool.ntp.org

 

内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: