Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sdcard: solve backup bug when sd is re-inserted #1319

Merged
merged 1 commit into from
Oct 30, 2024

Conversation

asi345
Copy link
Collaborator

@asi345 asi345 commented Oct 16, 2024

This issue was existent for a long time. While using the BitBox02, when the sdcard is plugged into the device and the user unplugs and replugs it, backup operations (e.g., list backups, restore from backup) failed and throwed error from the firmware side.

Sdcard backup operations first check if the sdcard is inserted before calling sdcard interface funtions. This commit fixes the re-insertion problem by reinitializing the sdcard whenever it is checked if it is inserted.

The problem earlier most probably stems from sdcard being in an unexpected state when it is re-inserted. The forced initialization step fixes the broken states.

@asi345 asi345 requested review from NickeZ and benma October 16, 2024 14:34
@asi345 asi345 self-assigned this Oct 16, 2024
@asi345 asi345 force-pushed the restore-sd-reinsert branch 2 times, most recently from 4642b94 to c39697b Compare October 16, 2024 14:46
Copy link
Collaborator

@benma benma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing find!

@@ -23,6 +23,7 @@ use crate::workflow::sdcard;
pub async fn process(
&pb::InsertRemoveSdCardRequest { action }: &pb::InsertRemoveSdCardRequest,
) -> Result<Response, Error> {
bitbox02::sd::sdcard_init();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there maybe an interrupt or sth to call this when an sdcard is inserted?

If not - would it make sense to call this in sd.c inside the mount function, so it is called whenever an sdcard operation is about to begin? It would be more robust than calling it in the API layer for when we check sd card insertion - a client might never even call this API call.

Is there a cost to calling this function, or is it a no-op if it has already been called?

Also cc @NickeZ, you have probably better ideas than me about how this should work.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was also thinking of some kind of an interrupt when the card is inserted, but how do you implement a sdcard insertion interruption? That is why I'm only calling this function only when it is needed.

I can move it to _mount function too, but I just wanted to fix the broken state of the sdcard as early as possible. Maybe other operations can also cause some kind of wrong state, this can fix it for everywhere, if the operation first calls sd_card_inserted().

From what I learned from tracing, there should be a cost of calling this function, but it is less than doing it for the first time. But it is not a big cost I believe.

@NickeZ can have clearer suggestions on these too I think.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look around all the code involving the sd card. I don't think it is wrong to call sd_mmc_start() which calls sd_mmc_init() becuase sd_mmc_init() will force the state of the driver to NO_CARD. This in turn will force a detection sequence when sd_mmc_check is called later.

Without attaching a debugger and looking at where it fails, I cannot know for sure. But we could run into the same issue that others have, which is that Card Detect and Data 3 reuses the same pin. When the sd-card boots up (in 1-bit mode), that pin is used for Card Detect. Later when "4 bit data mode" is enabled, that pin is used for data. If we don't "reset the controller", it might still be in 4-bit data mode when a new card is inserted.

Not sure exactly what is the "correct" solution here, but I don't see any harm in calling the initialization sequence multiple times.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SDHC controller has an interrupt line, but I think that is more related to DMA / data transfers than card detect.

@asi345
Copy link
Collaborator Author

asi345 commented Oct 17, 2024

Also, I don't understand why CI fails (I haven't touched anything related to the error).
And can you please check if the commit successfully addresses problem by trying out these steps:

- Have microSD inserted
- Plug in BitBox02
- Click on choose "recover from microSD"
- Go back to main menu (just hit "Enter")
- Take out MicroSD
- Plug MicroSD back in
- Click on "recover from microSD"

This flow was reproducing the error. If that works now, it means it should be fixed.

@benma
Copy link
Collaborator

benma commented Oct 17, 2024

Also, I don't understand why CI fails

It's because our clang-tidy check runs only on the files that are in the diff, which in this case is sdcard.c. Please try to figure out the error/fix. Could just be casting to (void*) on first glance.

@NickeZ
Copy link
Collaborator

NickeZ commented Oct 17, 2024

I took an even closer look at this today. We are a bit sloppy with the return value of f_mount. So we missed that it returns an error message after reinserting the sd card. Since @asi345 pointed out that re-initializing the driver fixes the issue, I found another way to also solve the issue:

diff --git a/src/sd.c b/src/sd.c
index 9ad53f47..836962df 100644
--- a/src/sd.c
+++ b/src/sd.c
@@ -18,6 +18,7 @@
 #include <string.h>
 
 #ifndef TESTING
+#include "sd_mmc/sd_mmc_start.h"
 #include "driver_init.h"
 #include "sd_mmc.h"
 #endif
@@ -107,7 +108,12 @@ static bool _mount(void)
     sd_mmc_resume_clock();
 #endif
     memset(&fs, 0, sizeof(FATFS));
-    if (f_mount(&fs, "SD", 1) == FR_INVALID_DRIVE) {
+    int res = f_mount(&fs, "", 1);
+    if (res == FR_DISK_ERR) {
+        sd_mmc_start();
+        res = f_mount(&fs, "", 1);
+    }
+    if (res == FR_INVALID_DRIVE) {
 #ifndef TESTING
         sd_mmc_pause_clock();
 #endif
@@ -121,7 +127,8 @@ static bool _mount(void)
  */
 static void _unmount(void)
 {
-    f_mount(NULL, "SD", 1);
+    f_unmount("");
+
 #ifndef TESTING
     sd_mmc_pause_clock();
 #endif
  • Our unmount function is also incorrectly implemented. There is a helper macro that gets it right. I suggest we change to that.
  • I see now that I also removed SD from the argument to mount/umount. Empty string should mount the "default" volume. We anyway set FF_STR_VOLUME_ID to 0, so we don't have support for string volume names. I guess the correct "logical drive number" in our case is "0".

@asi345
Copy link
Collaborator Author

asi345 commented Oct 30, 2024

okay, I'm implementing this version that you suggested, and removing the extra rust-c function calls that I have added.

@asi345 asi345 force-pushed the restore-sd-reinsert branch 5 times, most recently from d7cd8d8 to be644b6 Compare October 30, 2024 12:53
This issue was existent for a long time. While using the BitBox02,
when the sdcard is plugged into the device and the user unplugs
and replugs it, backup operations (e.g., list backups, restore
from backup) failed and throwed error from the firmware side.

Sdcard backup operations first check if the sdcard is inserted
before calling sdcard interface funtions. This commit fixes
the re-insertion problem by reinitializing the sdcard whenever
it is checked if it is inserted.

The problem earlier most probably stems from sdcard being in an
unexpected state when it is re-inserted. The forced initialization
step fixes the broken states.

Signed-off-by: asi345 <[email protected]>
@asi345 asi345 force-pushed the restore-sd-reinsert branch from be644b6 to f4c67e8 Compare October 30, 2024 13:17
Copy link
Collaborator

@NickeZ NickeZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, utack

@asi345 asi345 merged commit d2bebf7 into BitBoxSwiss:master Oct 30, 2024
3 checks passed
#endif
res = f_mount(&fs, "", 1);
}
if (res == FR_INVALID_DRIVE) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if res is some other error?

Copy link
Collaborator

@NickeZ NickeZ Oct 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that check could be res != FR_OK. It would change the behavior of _mount to return false in more cases. But that is probably for the good. Right now we are ignoring errors.

@@ -107,7 +108,14 @@ static bool _mount(void)
sd_mmc_resume_clock();
#endif
memset(&fs, 0, sizeof(FATFS));
if (f_mount(&fs, "SD", 1) == FR_INVALID_DRIVE) {
int res = f_mount(&fs, "", 1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int here should probably be FRESULT.

@asi345
Copy link
Collaborator Author

asi345 commented Oct 31, 2024

okay, I'm updating the code according to these reviews, but I am not sure whether I can reopen this pr.
at the worst case I'll open a new pr.

@NickeZ
Copy link
Collaborator

NickeZ commented Oct 31, 2024

okay, I'm updating the code according to these reviews, but I am not sure whether I can reopen this pr. at the worst case I'll open a new pr.

Just make a new branch and a new PR with the fixes!

@asi345
Copy link
Collaborator Author

asi345 commented Oct 31, 2024

Please check : #1320 for these fixes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants