-
Notifications
You must be signed in to change notification settings - Fork 367
[Nexthop] fboss2 CLI config prototype #668
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
0f14820 to
0ef3b1d
Compare
|
This is still WIP, I fixed some small bugs and I'm still working on the unit tests. |
0ef3b1d to
2e88716
Compare
|
Fixed some concurrency bugs found by adding unit tests, and enhanced test coverage further. |
2e88716 to
65f4c50
Compare
65f4c50 to
857d77d
Compare
joseph5wu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a huge PR with lots of changes, I might have a lot of comments all at once.
Overall I think the code structures and basic idea of how the prototype should be look good to me.
However, there're some clarification questions that I would like to hear your thought:
- How many writes/reads thrift object inside one session? My expectation is we only need one read to get the thrift object from the config file at
session startand then all the config commands will only need to update the memory thrift object; finally thesession committo actually write the thrift object to a new config file for the services to reload. I know it might be tricky to meet this expectation for a cli binary. But let's make sure your implementation can actually handle multiple commands but also avoid too many read/write - revision control. I'm fine with the tool automatically generates the revision id. However, I'm not sure it will be easy to identify such revision id in the history. Besides, I remember in your design doc, there's some config metadata to map the revision with the command list. I don't see such logic in this prototype.
- We don't recommend vendors to change our BUCK files as you won't be able to check the buck build signals. Besides I would also recommend to leave those existing unit tests change out of this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually don't recommend vendors to touch the BUCK file as they won't be able to verify the changes using buck command.
That's why once we finish landing the PRs, we will adjust the buck file in another diff or the same diff if our on-diff signal complains
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is going to save you time 😁
|
|
||
| namespace facebook::fboss { | ||
|
|
||
| const CommandTree& kConfigCommandTree() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see this function will become massive once we keep adding more and more sub-commands.
Maybe we can adjust the returned CommandTree as:
root = {
// Config related subcommands, like applied-info/ history/ reload/ rollback
// Session control subcommands
}
// make the sub class to manage the sub CommandTree so we don't have to update here every time whether there's a new sub command
root.add(CmdConfigInterface.getCommandTree());
root.add(CmdConfigXXX.getCommandTree());
return root;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest refactoring this in a separate change. We already have another change internally that is going to avoid mixing everything together like that.
| case utils::ObjectArgTypeId::OBJECT_ARG_TYPE_FAN_PWM: | ||
| subCmd->add_option("pwm", args, "Fan PWM (0..100) or 'disable'"); | ||
| break; | ||
| case utils::ObjectArgTypeId::OBJECT_ARG_TYPE_MTU: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about how to better manage the sub-commands and arg types of the new config.
We don't wanna blow up the base class with all the different sub-commands
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest refactoring this in a separate change. We already have another change that is going to avoid mixing everything together like that.
| // Get the path to the session config file (~/.fboss2/agent.conf) | ||
| std::string getSessionConfigPath() const; | ||
|
|
||
| // Get the path to the system config file (/etc/coop/agent.conf) | ||
| std::string getSystemConfigPath() const; | ||
|
|
||
| // Get the path to the CLI config directory (/etc/coop/cli) | ||
| std::string getCliConfigDir() const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead defining your own path/directory const function,
let's using https://github.com/facebook/fboss/blob/main/fboss/agent/AgentDirectoryUtil.h
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated the code to this:
ConfigSession::ConfigSession() {
username_ = getUsername();
std::string homeDir = getHomeDirectory();
// Use AgentDirectoryUtil to get the config directory path
// getConfigDirectory() returns /etc/coop/agent, so we get the parent to get
// /etc/coop
AgentDirectoryUtil dirUtil;
fs::path configDir = fs::path(dirUtil.getConfigDirectory()).parent_path();
sessionConfigPath_ = homeDir + "/.fboss2/agent.conf";
systemConfigPath_ = (configDir / "agent.conf").string();
cliConfigDir_ = (configDir / "cli").string();
initializeSession();
}however the CLI still needs some more paths of its own, do you want those paths to be defined in AgentDirectoryUtil.h as well even though they are specific to the CLI?
| #ifdef IS_OSS | ||
| class TBgpServiceSvIf { | ||
| // Stub because the Thrift model for BgpService is not open source yet. | ||
| public: | ||
| virtual ~TBgpServiceSvIf() {} | ||
| virtual void getRunningConfig(std::string&) = 0; | ||
| }; | ||
| #endif | ||
|
|
||
| class MockFbossBgpService : public TBgpServiceSvIf { | ||
| public: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you just put the definition of class MockFbossBgpService behind #ifdef IS_OSS?
And we can adjust CmdHandlerTestBase to only define bgp related class members when it's not IS_OSS?
I worry by using a stub TBgpServiceSvIf, you might miss some expected functions that the caller might try to call and have an unexpected behavior. While with the #ifdef IS_OSS, we can quickly identify whether the caller is trying to verify any bgp expectation and then we can adjust the test accordingly.
Eventually once we have BGP ready for OSS, we can just easily remove the #ifdef IS_OSS
| agentConfig_); | ||
| std::string prettyJson = | ||
| folly::toPrettyJson(folly::parseJson(std::move(json))); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's why some of the test is using EXPECT_THRIFT_EQ from nettools/netrisk/thrift/ThriftUtils.h.
We can introduce a helper function to convert thrift obj to pretty json with the same logic here.
| // Save the updated config | ||
| ConfigSession::getInstance().saveConfig(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think if we only saveConfig() before commit() to avoid multiple write in the file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have to call saveConfig() at the end of each config command otherwise we'll lose the current change.
| // Specific revision specified | ||
| newRevision = session.rollback(hostInfo, revisions[0]); | ||
| } | ||
| if (newRevision) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the rollback() return 0 to represent no new version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rollback() should never return 0 because a rollback always creates a new revision that is identical to the revision you rolled back to, so if it returns 0 something is really messed up.
|
|
||
| CmdConfigSessionCommitTraits::RetType CmdConfigSessionCommit::queryClient( | ||
| const HostInfo& hostInfo) { | ||
| auto& session = ConfigSession::getInstance(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this factory function work here in the cli binary?
I'm assuming the implementation will be like:
- User types
fboss2-dev config session start. Then the fboss2-dev binary will try to get this factory member instance, and no such session exists, so it create a new one. Then the binary exits? - Then user type
fboss2-dev config intf XXX description YYY. So the fboss2-dev binary will run again but because it exited previously, so the getInstance() will still think there's no instance, so it create another copy. But this time, the new copy will actually change the json content and serialize it on the device - Now if user type
fboss2-dev config intf XXX mtu ZZZ. So the fboss2-dev binary will run for the third time from scratch, IIUC, the getInstance() still think there's no instance and hence load the config again, does that mean the previous changing description is wiped?
My expectation of the config session is every time
- when a user run
session start, the fboss2-dev binary will load the thrift object from current config into memory; - and then a user can type as many as config cli they want to change such thrift object but only in memory to avoid too many unnecessary writes;
- finally when a user types
session commit, we actually write the thrift object from memory to this agent-rABC.config and then update the symlink before calling agent reloadConfig() thrift function.
I'm feeling there're some differences b.w your implementations and my expectation. Let me know what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I clarified that during our discussion earlier but getInstance() will automatically clone the config the first time, subsequent calls will returned the already cloned config, hence why when you do config intf XXX description YYY it will change the cloned config and then when you do config intf XXX mtu ZZZ it will further change the cloned config.
857d77d to
1ce685e
Compare
This command just calls `reloadConfig()` on the wedge_agent.
Sample output: ``` Config Applied Information: =========================== Last Applied Time: 2025-10-11 09:29:36.589 Last Coldboot Applied Time: 2025-10-11 06:44:36.741 ```
This can handle only one session per user, the current live config is copied into `~/.fboss2/agent.conf` and when committing an atomic symlink update is performed on /etc/coop/agent.conf, which is the only config file supported by this basic prototype.
Add some helper code to process interface-list arguments.
1ce685e to
79865dc
Compare
Pre-submission checklist
pip install -r requirements-dev.txt && pre-commit installpre-commit runSummary
This is an early prototype implementation of the
fboss2 configcommands, with very rudimentary support for config sessions and 2 config commands for interfaces.Note: the first 2 commits in this stack are actually #628
Test Plan
New unit tests added.