-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(net): disconnect from malicious nodes if necessary #5899
Conversation
@Getter | ||
@Setter | ||
private long latestSaveBlockTime = System.currentTimeMillis(); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why define this variable? Can it be determined by the header block time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It cannot be determined by the header block time. The variable is used to determine how long ago I save the last block. Let's suppose that i am far from newest block, the header block time is useless in this scene.
common/src/main/java/org/tron/common/parameter/CommonParameter.java
Outdated
Show resolved
Hide resolved
|
||
@Getter | ||
@Setter | ||
private int checkInterval = 60; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not need to be defined as a parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The specification requires that the value of check interval is configurable.
|
||
@Getter | ||
@Setter | ||
private int blockNotChangeThreshold = 300; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is recommended to use inactive
instead, and merge the two parameters into one parameter inactiveThreshold
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They have completely different definition. blockNotChangeThreshold is the config parameter that means if latest block has stay unchanged in blockNotChangeThreshold seconds, we think the node is isolated by peers.
@@ -178,6 +184,7 @@ private P2pConfig updateConfig(P2pConfig config) { | |||
config.setPort(parameter.getNodeListenPort()); | |||
config.setNetworkId(parameter.getNodeP2pVersion()); | |||
config.setDisconnectionPolicyEnable(parameter.isOpenFullTcpDisconnect()); | |||
config.setNotActiveInterval(parameter.peerNoBlockTime * 1000L); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does libp2p need this parameter for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The strategy of Random disconnection use this parameter to filter no block peer. If we send or receive some blocks among peerNoBlockTime from this peer, it cannot be disconnect.
public class MaliciousFeature { | ||
|
||
@Setter | ||
private long advStartTime = System.currentTimeMillis(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is advStartTime the current time? If the peer is always in synchronization, will it affect other judgments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It has a precondition that the peer must be in adv status before judgments:
public void updateBadFeature3() {
long tempTime = Math.max(channel.getLastActiveTime(), advStartTime);
if (!needSyncFromPeer && !needSyncFromUs && System.currentTimeMillis() - tempTime
> resilienceConfig.getPeerNotActiveThreshold() * 1000L) {
zombieBeginTime = tempTime;
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not make sense logically. It also does not affect the following function.
@Setter | ||
private long stopBlockInvTime = -1; | ||
@Setter | ||
private long lastRecBlockInvTime = System.currentTimeMillis(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it defined as the current time? It is possible that the block has never been received.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use lastRecBlockInvTime = -1 instead.
|
||
@Getter | ||
@Setter | ||
private boolean testStopInv = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only a field in a feature. The feature should be configurable. If the feature is not enabled, defining this field is useless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is used to specify whether we test if the peer is malicious. There is alternative method. Only one of them can be used. It can be configured in config.conf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a field in the rule. What I mean is that you should configure it according to the rules, not the fields in the rules. When you want to enable or disable rules, you can just change the configuration without changing the code.
//if peer is not active for too long, test if peer will broadcast block inventory to me | ||
//after I stop broadcasting block inventory to it | ||
peer.getMaliciousFeature().setStopBlockInvTime(System.currentTimeMillis()); | ||
invCheckExecutor.schedule(() -> peer.getMaliciousFeature().updateBadFeature4(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use a scheduler to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We check that if we receive some inventory from peer in 10 seconds, it is a timer. It is scheduled only once.
public class MaliciousFeature { | ||
|
||
@Setter | ||
private long advStartTime = System.currentTimeMillis(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not make sense logically. It also does not affect the following function.
} | ||
|
||
//it can only be set from -1 to positive | ||
public void updateBadFeature1() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The design is unreasonable. All features should be moved to the ResilienceService you defined for implementation. This class only needs to define fields and set field values.
@@ -1391,6 +1391,7 @@ public void updateDynamicProperties(BlockCapsule block) { | |||
(chainBaseManager.getDynamicPropertiesStore().getLatestBlockHeaderNumber() | |||
- chainBaseManager.getDynamicPropertiesStore().getLatestSolidifiedBlockNum() | |||
+ 1)); | |||
chainBaseManager.setLatestSaveBlockTime(System.currentTimeMillis()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assignment statement should not be placed in this method, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We verify whether the node is isolated through the latest writing block time to database. If this time stay unchanged over some minutes, it's isolated. Do you have better solution or place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not deny the timing of assigning this variable, I just feel that the assignment statement is unreasonable in this method, I think you can put it near where this method is called, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can assign it in ResilienceService where it's used, but it's not accurate. Let's try it.
long tempTime = Math.max(channel.getLastActiveTime(), advStartTime); | ||
if (!needSyncFromPeer && !needSyncFromUs && System.currentTimeMillis() - tempTime | ||
> resilienceConfig.getPeerNotActiveThreshold() * 1000L) { | ||
zombieBeginTime = tempTime; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a peer has been set with zombieBeginTime, will it still be considered a malicious node even if it resumes block interaction before being selected to disconnect? I haven’t seen any other place that updates zombieBeginTime.
public void updateBadFeature4() { | ||
if (zombieBeginTime2 < 0 | ||
&& maliciousFeature.lastRecBlockInvTime < maliciousFeature.stopBlockInvStartTime) { | ||
zombieBeginTime2 = getLatestTime(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two questions:
- Same question as zombieBeginTime, is recovery not allowed?
- When the inventory is started to be sent but the test end time is not reached, will zombieBeginTime2 be assigned a value after this method is called? Is this assignment operation wrong at this time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
zombieBeginTime is not allowed to recover. zombieBeginTime2 will be assigned a value if no inventory received during 10 seconds. Is there any problem ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why aren't zombieBeginTime and zombieBeginTime2 allowed to recover? Even though the node triggered the rules you set, it quickly recovered, or even mistakenly identified as a malicious node by us.
What I mean is, will there be a scenario where updateBadFeature4 is called before 10 seconds? What consequences will occur if this scenario happens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems reasonable to recover zombieBeginTime, but zombieBeginTime2 cannot be recovered. updateBadFeature4 will be not called before 10 secondes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updateBadFeature4 will be not called before 10 secondes.
Okay, I took another careful look, and what I'm saying aligns with what you said.
What does this PR do?
To maintain the stability of the network node and prevent isolation, we need to disconnect from malicious nodes on a regular basis. Three strategies are adopted:
We also optimize the strategy of random disconnection.
Why are these changes required?
This PR has been tested by:
Follow up
Extra details