Currently, region info is stored in DatanodeTableValue, which is designed only to indicate region placement among the Datanodes. That works when it stores info only for the regions that are open on that Datanode. However, when we need to retrieve info for all regions, we face a dilemma: there is nowhere to store it except DatanodeTableValue. A recent example is #3149: to store the WAL options of all regions, we had to give up filtering by Datanode-owned regions and instead duplicate the WAL options in every DatanodeTableValue.
Storing the full region info in duplicate is clearly an anti-pattern. It adds complexity at every update site. Consider: if we need to update some WAL options of a region, or add more options to a region's info, do we have to retrieve all the DatanodeTableValues to do it? That is not scalable.
So we must find a way to overcome this. I'm thinking of redesigning the table metadata by adding a new key that stores each region's info, so that the table metadata is small, clear, and single-responsibility again.
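To make the idea concrete, here is a minimal sketch of what a separate per-region key could look like. Everything in it is an assumption for illustration: the `KvStore` type is an in-memory stand-in for the metadata store, and the key layouts (`placement/...`, `region_info/...`) and function names are hypothetical, not the existing key formats.

```rust
use std::collections::BTreeMap;

type TableId = u32;
type RegionNumber = u32;
type DatanodeId = u64;

/// In-memory stand-in for the metadata KV store (hypothetical).
#[derive(Default)]
struct KvStore {
    entries: BTreeMap<String, String>,
}

impl KvStore {
    fn put(&mut self, key: String, value: String) {
        self.entries.insert(key, value);
    }

    fn get(&self, key: &str) -> Option<&String> {
        self.entries.get(key)
    }
}

/// Hypothetical placement key: one entry per (Datanode, table),
/// holding only the list of regions placed on that Datanode.
fn placement_key(datanode_id: DatanodeId, table_id: TableId) -> String {
    format!("placement/{datanode_id}/{table_id}")
}

/// Hypothetical region info key: one entry per region,
/// independent of which Datanode currently hosts it.
fn region_info_key(table_id: TableId, region: RegionNumber) -> String {
    format!("region_info/{table_id}/{region}")
}

/// Updating a region's WAL options writes exactly one key,
/// no matter how many Datanodes hold regions of the table.
fn update_wal_options(
    kv: &mut KvStore,
    table_id: TableId,
    region: RegionNumber,
    wal_options: &str,
) {
    kv.put(region_info_key(table_id, region), wal_options.to_string());
}
```

With this split, the placement entries never need to be rewritten when a region's options change, which is the single-responsibility property the proposal is after.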
Implementation challenges
Adding a new region info key must not make opening regions in a Datanode too slow. Today, opening regions in a Datanode can depend solely on DatanodeTableValue. If region info is retrieved from other keys, that means more requests to the KV store. We must evaluate the extra time cost.
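The extra cost can be reasoned about by counting round trips. The sketch below compares one point read per region against a single batched read, assuming the KV store offers some form of batch get. `CountingKv`, the key layout, and both function names are hypothetical and exist only to illustrate the comparison; real latency would have to be measured.

```rust
use std::collections::BTreeMap;

/// Stand-in KV store that counts round trips (hypothetical).
#[derive(Default)]
struct CountingKv {
    entries: BTreeMap<String, String>,
    requests: usize,
}

impl CountingKv {
    /// One round trip for one key.
    fn get(&mut self, key: &str) -> Option<String> {
        self.requests += 1;
        self.entries.get(key).cloned()
    }

    /// One round trip for many keys (assumes the store supports batching).
    fn batch_get(&mut self, keys: &[String]) -> Vec<Option<String>> {
        self.requests += 1;
        keys.iter().map(|k| self.entries.get(k).cloned()).collect()
    }
}

/// Read the placement entry, then fetch each region's info one by one:
/// 1 + N requests for N regions.
fn open_with_point_gets(kv: &mut CountingKv, placement_key: &str) -> Vec<String> {
    let regions = kv.get(placement_key).unwrap_or_default();
    regions
        .split(',')
        .filter(|r| !r.is_empty())
        .filter_map(|r| kv.get(&format!("region_info/{r}")))
        .collect()
}

/// Read the placement entry, then fetch all region info in one batch:
/// 2 requests regardless of the number of regions.
fn open_with_batch_get(kv: &mut CountingKv, placement_key: &str) -> Vec<String> {
    let regions = kv.get(placement_key).unwrap_or_default();
    let keys: Vec<String> = regions
        .split(',')
        .filter(|r| !r.is_empty())
        .map(|r| format!("region_info/{r}"))
        .collect();
    kv.batch_get(&keys).into_iter().flatten().collect()
}
```

Under these assumptions, the batched variant adds one request over the current single read, while the point-read variant grows with the number of regions on the Datanode.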
What type of enhancement is this?
Tech debt reduction