Skip to content

Conversation

@dhalbert
Copy link
Collaborator

This is a multi-part fix for race conditions occurring on VM shutdown when an SD card mounted in user code is exposed via USB MSC. There can be filesystem operations initiated by the host that are in process when code.py ends., which cause crashes or USB disconnects. In the process of debugging this, I found several other things to clean up and add.

SPI resetting when VM shuts down

Previously, many ports had spi_reset() or reset_spi() routines that reset all SPI buses that were not persistent when the VM shut down. If there were host-initiated USB MSC operations (usually just reads) on a user-mounted filesystem, these forced resets would interfere with those USB MSC operations, or vice versa. For instance, on Espressif, the forced resets would fail if an SPI operation was in progress.

  • Change busio.SPI() to have a finaliser. This obviates the need for a global SPI reset.
  • Remove all reset_spi() and spi_reset(). The common_hal_busio_spi_deinit() routines take care of all SPI resetting necessary when the finalisers are run.
  • Add common_hal_busio_spi_mark_deinit() routines to all busio.SPI implementations, and call them as appropriate in the various SPI.c files.
  • Add a call to common_hal_busio_spi_mark_deinit() in shared-module/displayio/__init__.c when a FourWire bus is copied. This mimics what was already done for displayio I2C bus copies, when busio.I2C() was changed to use a finaliser.
  • Use dynamically allocated mutexes for SPI locking instead of the previous statically allocated mutexes. This now mimics what is done in the I2C implementation.

Don't reset pins before finalisers run

reset_port() is called in main.c before finalisers are run. On all ports, reset_port() would call reset_all_pins(). However, this meant that SPI pins would get reset while SD card operations might still be in progress, allowing the kind of race conditions described above.

  • Move the call to reset_all_pins() out of the port-specific reset_port() routines, and instead call it in main.c after stop_mp(), and so therefore after the finalisers run. All ports provide reset_all_pins(), so it does not need to be called individually in each port. I was initially thinking about add a reset_port_after_finalisers() routine for each port that called reset_all_pins(), but this was not necessary.

sdcardio improvements

  • Fix signature of cmd_nodata, which was bool but should have been int.
  • Return -MP_EIO instead of -EIO several places. The value is the same.
  • Add a persistent_mount flag to sdcardio_sdcard_construct() to distinguish user-mounted SD cards from automounted SD card mounts that persist across VM's. This flag is used to implement the next item.
  • Acquire the SPI lock more carefully in lock_and_configure() in SDCard.c. Once it has been acquired, check whether the VM is still running and whether the mount is persistent before proceeding. If the mount is not persistent and the VM is no longer running, give up the lock.

supervisor/shared/usb/usb_msc_flash.c fixes

  • To be more correct, guard some code with CIRCUITPY_SDCARD_USB instead of CIRCUITPY_SDCARDIO.
  • To be more correct, guard some other code with #ifdef SDCARD_LUN instead of CIRCUITPY_SDCARD_USB.
  • get_vfs() will now check if the LUN passed in is for a user-mounted SD card but the VM is no longer running, and will return NULL. This prevents USB MSC operations that are attempted to be initiated after the VM starts to shut down. However, there are still races that can happen and the other checks above avoid those races.

Other

  • Add a bool vm_is_running() routine, which is true when the VM has finished starting up and is set to false when it starts to shut down. This supports the "VM is running" checks mentioned above. The flag is set and cleared in main.c.
  • For Espressif I2C, delete the mutex on deinit. This had been forgotten, and I stumbled on it it while studying the I2C finaliser implementation.

This is a draft to start with because I need to test the non-Espressif ports for SPI issues, especially with SD cards, and I can use the PR builds for that testing.

@dhalbert dhalbert force-pushed the spi-sd-race-fixes branch 3 times, most recently from f21a2e2 to d708bdf Compare October 31, 2025 16:41
Copy link
Member

@tannewt tannewt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks good. You should be able to remove more never_reset machinery. You will still need to pass never reset into the pin tracking because we still bulk reset them.

@dhalbert dhalbert marked this pull request as ready for review October 31, 2025 21:49
@dhalbert
Copy link
Collaborator Author

Tested on Metro ESP32-S3, Fruit Jam, Feather nRF52840 with AdaLogger FeatherWing, ESP32-S3 TFT Feather with AdaLogger, PyPortal. All mounted SD card manually except for Fruit Jam. The USB presentation worked and I was able to read files. I could write files when the SD card was mounted readonly=True. No crashes. The display worked fine on the TFT Feather (SPI) and on the PyPortal (not SPI).

Ready for approval to merge.

@dhalbert dhalbert requested a review from tannewt October 31, 2025 21:49
Copy link
Member

@tannewt tannewt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@tannewt tannewt merged commit 803be58 into adafruit:main Nov 1, 2025
505 checks passed
@dhalbert dhalbert deleted the spi-sd-race-fixes branch November 1, 2025 13:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Enabling microSD and Display on QtPy ESP32-S3 SPI Bus Results in Crash to Safe Mode

2 participants