Hi!
This is not exactly a regression, as I am not aware of a prior working state, but kernel documentation advises me to CC regressions list anyway¹.
I am trying to put data on an external Kingston XS-2000 4 TB SSD using self-compiled Linux 6.7.4 kernel and encrypted BCacheFS. I do not think BCacheFS has any part in the errors I see, but if you disagree feel free to CC the BCacheFS mailing list as you reply.
I am using a ThinkPad T14 AMD Gen 1 with AMD Ryzen 7 PRO 4750U and 32 GiB of RAM.
I connected the SSD directly to a USB-C port of the ThinkPad. lsusb lists it as:
Bus 007 Device 004: ID 0951:176b Kingston Technology XS2000
The SSD is detected as follows:
[20303.913644] usb 7-1: new SuperSpeed Plus Gen 2x1 USB device number 9 using xhci_hcd
[20303.926616] usb 7-1: New USB device found, idVendor=0951, idProduct=176b, bcdDevice= 1.00
[20303.926633] usb 7-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[20303.926641] usb 7-1: Product: XS2000
[20303.926647] usb 7-1: Manufacturer: Kingston
[20303.926652] usb 7-1: SerialNumber: […]
[20303.929078] scsi host0: uas
[20303.983859] scsi 0:0:0:0: Direct-Access Kingston XS2000 1000 PQ: 0 ANSI: 6
[20303.984426] sd 0:0:0:0: Attached scsi generic sg0 type 0
[20303.985197] sd 0:0:0:0: [sda] 8001573552 512-byte logical blocks: (4.10 TB/3.73 TiB)
[20303.985331] sd 0:0:0:0: [sda] Write Protect is off
[20303.985341] sd 0:0:0:0: [sda] Mode Sense: 43 00 00 00
[20303.985579] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[20303.989516] sda: sda1
[20303.989611] sd 0:0:0:0: [sda] Attached SCSI disk
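As a sanity check on the size the sd driver reports, the block count can be converted by hand. This is only a minimal Python sketch; the helper name capacity is made up for illustration, and the two input numbers are taken from the "[sda] ... 512-byte logical blocks" line above.

```python
# Convert the block count reported by the sd driver into decimal and binary units.
LOGICAL_BLOCKS = 8001573552   # from the "[sda] ... 512-byte logical blocks" line
BLOCK_SIZE = 512              # bytes per logical block, also from that line


def capacity(blocks: int, block_size: int) -> tuple[float, float]:
    """Return the capacity as (terabytes, tebibytes)."""
    total_bytes = blocks * block_size
    return total_bytes / 10**12, total_bytes / 2**40


tb, tib = capacity(LOGICAL_BLOCKS, BLOCK_SIZE)
print(f"{tb:.2f} TB / {tib:.2f} TiB")
```

The result agrees with the "(4.10 TB/3.73 TiB)" the kernel printed, so the device at least reports a plausible size for a 4 TB drive.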
BCacheFS is mounted as follows – but I suspect BCacheFS is not involved in those errors anyway:
[20310.437864] bcachefs (sda1): mounting version 1.3: rebalance_work opts=metadata_checksum=xxhash,data_checksum=xxhash,compression=lz4
[20310.437895] bcachefs (sda1): recovering from clean shutdown, journal seq 5094
[20310.450813] bcachefs (sda1): alloc_read... done
[20310.450851] bcachefs (sda1): stripes_read... done
[20310.450855] bcachefs (sda1): snapshots_read... done
[20310.470815] bcachefs (sda1): journal_replay... done
[20310.470824] bcachefs (sda1): resume_logged_ops... done
[20310.470835] bcachefs (sda1): going read-write
While rsync'ing about 1.4 TB of data, after about an hour I got errors like this:
[33963.462694] sd 0:0:0:0: [sda] tag#10 uas_zap_pending 0 uas-tag 1 inflight: CMD
[33963.462708] sd 0:0:0:0: [sda] tag#10 CDB: Write(16) 8a 00 00 00 00 00 82 c1 bc 00 00 00 04 00 00 00
[33963.462718] sd 0:0:0:0: [sda] tag#11 uas_zap_pending 0 uas-tag 2 inflight: CMD
[33963.462725] sd 0:0:0:0: [sda] tag#11 CDB: Write(16) 8a 00 00 00 00 00 82 c1 c8 00 00 00 04 00 00 00
[33963.462733] sd 0:0:0:0: [sda] tag#15 uas_zap_pending 0 uas-tag 3 inflight: CMD
[33963.462740] sd 0:0:0:0: [sda] tag#15 CDB: Write(16) 8a 00 00 00 00 00 82 c1 d2 4c 00 00 01 2f 00 00
[33963.462748] sd 0:0:0:0: [sda] tag#12 uas_zap_pending 0 uas-tag 4 inflight: CMD
[33963.462754] sd 0:0:0:0: [sda] tag#12 CDB: Write(16) 8a 00 00 00 00 00 82 c1 d0 00 00 00 02 4c 00 00
[33963.462762] sd 0:0:0:0: [sda] tag#13 uas_zap_pending 0 uas-tag 5 inflight: CMD
[33963.462769] sd 0:0:0:0: [sda] tag#13 CDB: Write(16) 8a 00 00 00 00 00 82 c1 d4 00 00 00 00 ff 00 00
[33963.462777] sd 0:0:0:0: [sda] tag#14 uas_zap_pending 0 uas-tag 6 inflight: CMD
[33963.462783] sd 0:0:0:0: [sda] tag#14 CDB: Write(16) 8a 00 00 00 00 00 82 c1 ce 00 00 00 00 cc 00 00
[33963.576991] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 9 using xhci_hcd
[33963.590793] scsi host0: uas_eh_device_reset_handler success
[33963.592857] sd 0:0:0:0: [sda] tag#10 timing out command, waited 180s
[33963.592872] sd 0:0:0:0: [sda] tag#10 FAILED Result: hostbyte=DID_RESET driverbyte=DRIVER_OK cmd_age=182s
[33963.592881] sd 0:0:0:0: [sda] tag#10 CDB: Write(16) 8a 00 00 00 00 00 82 c1 bc 00 00 00 04 00 00 00
[33963.592886] I/O error, dev sda, sector 2193734656 op 0x1:(WRITE) flags 0x104000 phys_seg 773 prio class 2
[33963.592898] bcachefs (sda1 inum 1073761281 offset 265216): data write error: I/O
[33963.592925] bcachefs (sda1 inum 1073761281 offset 467456): data write error: I/O
[33963.592933] bcachefs (sda1 inum 1073761281 offset 470016): data write error: I/O
[33963.592939] bcachefs (sda1 inum 1073761281 offset 471552): data write error: I/O
[33963.592949] bcachefs (sda1 inum 1073761281 offset 514560): data write error: I/O
[33963.592956] bcachefs (sda1 inum 1073761281 offset 517120): data write error: I/O
[33963.592963] bcachefs (sda1 inum 1073761281 offset 519168): data write error: I/O
[33963.592969] bcachefs (sda1 inum 1073761281 offset 521728): data write error: I/O
[33963.592976] bcachefs (sda1 inum 1073761281 offset 523776): data write error: I/O
[33963.592983] bcachefs (sda1 inum 1073761281 offset 526336): data write error: I/O
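For anyone reading the CDB lines: in a SCSI WRITE(16) command, byte 0 is the opcode (0x8a), bytes 2 to 9 carry the logical block address and bytes 10 to 13 the transfer length in blocks, both big-endian. The following is a minimal Python sketch for decoding them, assuming the 512-byte logical block size the kernel reported at attach time; the helper name decode_write16 is made up for illustration.

```python
def decode_write16(cdb_hex: str, block_size: int = 512) -> dict:
    """Decode the LBA and transfer length of a SCSI WRITE(16) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x8A:
        raise ValueError("not a 16-byte WRITE(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2..9: logical block address
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10..13: transfer length in blocks
    return {"lba": lba, "blocks": blocks, "bytes": blocks * block_size}


# CDB of the command that timed out (tag#10 in the log above)
print(decode_write16("8a 00 00 00 00 00 82 c1 bc 00 00 00 04 00 00 00"))
```

For the tag#10 command this yields LBA 2193734656, which is the sector named in the "I/O error, dev sda" line, and a length of 1024 blocks (512 KiB).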
The rsync completed but I did not trust the result, even though "bcachefs fsck" told me the filesystem structure is okay.
Thus I reran rsync with option "-c" for checksumming. After a long time with data that did match, it started to transfer a file again, which should not happen if the data had been identical. As it ran into I/O errors again, I stopped the rsync process.
I looked for that UAS error message and, following the article² I found, I disabled UAS as follows:
% cat /etc/modprobe.d/disable-uas.conf
# Does not work with external SSD Transcend XS2000 4TB
options usb-storage quirks=0951:176b:u
The quirk was applied as I reconnected the devices after unloading both usb-storage and uas modules:
[ 55.871301] usb 7-1: UAS is ignored for this device, using usb-storage instead
[ 55.871310] usb-storage 7-1:1.0: USB Mass Storage device detected
[ 55.871559] usb-storage 7-1:1.0: Quirks match for vid 0951 pid 176b: 800000
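The "800000" at the end of the "Quirks match" line is a hexadecimal flag mask. Below is a small Python sketch for decoding it. The two flag values are quoted from memory of include/linux/usb_usual.h and are assumptions to verify against your own kernel tree; the table is deliberately incomplete, and decode_quirks is a made-up helper name.

```python
# Assumed flag values (check include/linux/usb_usual.h in your kernel tree).
US_FL = {
    0x00800000: "IGNORE_UAS",   # what the 'u' quirk character is expected to set
    0x01000000: "BROKEN_FUA",
}


def decode_quirks(hex_mask: str) -> list[str]:
    """Translate a usb-storage quirks mask (hex string) into known flag names."""
    mask = int(hex_mask, 16)
    names = []
    for bit, name in sorted(US_FL.items()):
        if mask & bit:
            names.append(name)
            mask &= ~bit
    if mask:
        # Bits not covered by the (incomplete) table above
        names.append(f"unknown:{mask:#x}")
    return names


print(decode_quirks("800000"))
```

Under those assumptions the mask contains only the IGNORE_UAS bit, which fits the "UAS is ignored for this device" message right above it.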
I recreated the BCacheFS filesystem and tried again. This time it did not take more than 10 minutes for the first I/O error to appear. Unlike with UAS, it made rsync stop with an I/O error immediately. Before that there were several USB resets. Here is the excerpt from dmesg:
[ 795.768306] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 932.976677] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 963.189438] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 1000.057333] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 1036.917137] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 1073.782876] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 1110.647786] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 1117.163693] sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=214s
[ 1117.163718] sd 0:0:0:0: [sda] tag#0 CDB: Write(16) 8a 00 00 00 00 00 02 72 20 00 00 00 08 00 00 00
[ 1117.163725] I/O error, dev sda, sector 41033728 op 0x1:(WRITE) flags 0x104000 phys_seg 1551 prio class 2
[ 1117.163739] bcachefs (sda1 inum 1879048481 offset 2572800): data write error: I/O
[ 1117.163763] bcachefs (sda1 inum 1879048481 offset 2576384): data write error: I/O
[ 1117.163771] bcachefs (sda1 inum 1879048481 offset 2578432): data write error: I/O
[ 1117.163779] bcachefs (sda1 inum 1879048481 offset 2580480): data write error: I/O
[ 1117.163786] bcachefs (sda1 inum 1879048481 offset 2582528): data write error: I/O
[ 1117.163794] bcachefs (sda1 inum 1879048481 offset 2584576): data write error: I/O
[ 1117.163803] bcachefs (sda1 inum 1879048481 offset 2586624): data write error: I/O
[ 1117.163811] bcachefs (sda1 inum 1879048481 offset 2588672): data write error: I/O
[ 1117.163818] bcachefs (sda1 inum 1879048481 offset 2590720): data write error: I/O
[ 1117.163824] bcachefs (sda1 inum 1879048481 offset 2592768): data write error: I/O
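To see how the resets are spaced, the timestamps can be pulled out of the excerpt and differenced. This is just a rough Python sketch for reading the log; the regular expression only assumes the dmesg line format shown above, and reset_intervals is a made-up helper name.

```python
import re

# Matches the timestamp of "usb <port>: reset ..." lines as printed by dmesg.
RESET_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+usb \S+: reset ")

# Timestamps copied from the excerpt above, message text shortened.
SAMPLE = """
[ 795.768306] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 932.976677] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 963.189438] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 1000.057333] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 1036.917137] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 1073.782876] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
[ 1110.647786] usb 7-1: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
"""


def reset_intervals(log_text: str) -> list[float]:
    """Return the gaps in seconds (rounded to 0.1 s) between consecutive USB resets."""
    times = [float(t) for t in RESET_RE.findall(log_text)]
    return [round(later - earlier, 1) for earlier, later in zip(times, times[1:])]


print(reset_intervals(SAMPLE))
```

For this excerpt the gaps come out at roughly 137 s, 30 s and then about 37 s each, so after the first two resets they recur at a fairly regular interval.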
So even without UAS the device does not seem to like to write data on Linux.
Next steps may involve looking for a firmware update for the external SSD as well as trying to obtain its SMART status. So far I did not succeed in finding the right options for smartctl. In case there is enough evidence that the device is defective I'd try to RMA it.
I will keep a copy of kernel log and I could do some further tests as time permits. So let me know whether you need anything else, but for now the mail is long enough as it is.
[1] https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html
[2] How to disable USB Attached Storage (UAS) Last edited on 4 December 2022, at 14:00
https://leo.leung.xyz/wiki/How_to_disable_USB_Attached_Storage_(UAS)
Ciao,
On 2024-02-11 16:42, Martin Steigerwald wrote:
Hi! I am trying to put data on an external Kingston XS-2000 4 TB SSD using self-compiled Linux 6.7.4 kernel and encrypted BCacheFS. I do not think BCacheFS has any part in the errors I see, but if you disagree feel free to CC the BCacheFS mailing list as you reply.
This is indeed a known bug with bcachefs on USB-connected devices. Apply the following commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs...
This and some other commits are already scheduled for -stable.
Holger
Hi Holger!
CC'ing BCacheFS mailing list.
My original mail is here:
https://lore.kernel.org/linux-usb/5264d425-fc13-6a77-2dbf-6853479051a0@applied-asynchrony.com/T/#m5ec9ecad1240edfbf41ad63c7aeeb6aa6ea38a5e
Holger Hoffstätte - 11.02.24, 17:02:29 CET:
This is indeed a known bug with bcachefs on USB-connected devices. Apply the following commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/bcachefs?id=3e44f325f6f75078cdcd44cd337f517ba3650d05
This and some other commits are already scheduled for -stable.
Thanks!
Oh my. I was aware of some bug fixes coming for stable. I briefly looked through them, but now I did not make a connection.
I will wait for 6.7.5 and retry then I bet.
Best,
On Sun, Feb 11, 2024 at 06:06:27PM +0100, Martin Steigerwald wrote:
I will wait for 6.7.5 and retry then I bet.
That doesn't look related - the device claims to not support flush or fua, and the bug resulted in us not sending flush/fua devices; the main thing people would see without that patch, on 6.8, would be an immediate -EOPNOTSUPP on the first flush journal write.
He only got errors after an hour or so, or 10 minutes with UAS disabled; we send flushes once a second. Sounds like a screwy device.
Kent Overstreet - 11.02.24, 19:51:32 CET:
That doesn't look related - the device claims to not support flush or fua, and the bug resulted in us not sending flush/fua devices; the main thing people would see without that patch, on 6.8, would be an immediate -EOPNOTSUPP on the first flush journal write.
He only got errors after an hour or so, or 10 minutes with UAS disabled; we send flushes once a second. Sounds like a screwy device.
Thanks for that explanation, Kent.
I am the one with that external Kingston XS2000 4 TB SSD and I specifically did not CC the bcachefs mailing list at the beginning, because after seeing things like
[33963.462694] sd 0:0:0:0: [sda] tag#10 uas_zap_pending 0 uas-tag 1 inflight: CMD
[33963.462708] sd 0:0:0:0: [sda] tag#10 CDB: Write(16) 8a 00 00 00 00 00 82 c1 bc 00 00 00 04 00 00 00
[…]
[33963.592872] sd 0:0:0:0: [sda] tag#10 FAILED Result: hostbyte=DID_RESET driverbyte=DRIVER_OK cmd_age=182s
I thought some quirks in the device to be at fault.
However while Sandisk Extreme Pro 2 TB claims to support DPO and FUA I see
Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
also with other devices like external Toshiba Canvio 4 TB hard disks. Using LUKS encrypted BTRFS on those I never saw any timeout issue while writing out data with any of those hard disks. Also, with the write cache disabled, any cache flush / FUA request should be a no-op anyway? These hard disks have been doing a ton of backup workloads without any issues, but so far only with BTRFS.
I may test the Kingston XS2000 with BTRFS to see whether it makes a difference, however I would really like to use it with BCacheFS and I do not really like to use LUKS for external devices. According to the kernel log I still don't really think those errors at the block layer were about anything filesystem specific, but what do I know?
With UAS enabled for the Kingston XS2000 I see:
Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
This sounds about right: without cache flush / FUA support, the write cache is disabled.
With UAS disabled, using only usb-storage, however I see:
Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Which appears to be broken to me: If it cannot do cache flush / FUA it should have write cache disabled.
Thus I removed the quirk to disable UAS again. It did not help anyway.
However when I look at the output of "hdparm -I" for that Kingston XS2000 none of this makes sense, because it blatantly advertises support for
[…]
   * Mandatory FLUSH_CACHE
   * FLUSH_CACHE_EXT
[…]
   * WRITE_{DMA|MULTIPLE}_FUA_EXT
[…]
It has firmware revision S9K00107. I will see whether I can get this updated in case any update is available, which is not obvious to me as Kingston only offers a Windows application for download to update the firmware.
I asked them how to do an update on Linux, but I am also prepared to go to a friend with a Windows system to do the update.
There is no urgency in this, so let's see whether a firmware update may fix anything. In case someone has any additional insight, feel free to add it. Otherwise I consider it case closed unless I retest with either Linux kernel 6.7.5 or 6.8-rc4 and/or after having made a firmware update if available.
Maybe also some other quirks would need to be enabled for that device? I tested it with:
% cat /etc/modprobe.d/disable-uas.conf
# Does not work with external SSD Transcend XS2000 4TB
options usb-storage quirks=0951:176b:u
but as explained that did not help and thus I removed the UAS-disabling quirk again.
Best,
On Mon, Feb 12, 2024 at 04:52:09PM +0100, Martin Steigerwald wrote:
I may test the Kingston XS2000 with BTRFS to see whether it makes a difference, however I would really like to use it with BCacheFS and I do not really like to use LUKS for external devices. According to the kernel log I still don't really think those errors at the block layer were about anything filesystem specific, but what do I know?
It's definitely not unheard of for one specific filesystem to be tickling driver/device bugs and not others.
I wonder what it would take to dump the outstanding requests on device timeout.
Kent Overstreet - 12.02.24, 21:42:26 CET:
[thoughts about whether a cache flush / FUA request with write caches disabled would be a no-op anyway]
I may test the Kingston XS2000 with BTRFS to see whether it makes a difference, however I would really like to use it with BCacheFS and I do not really like to use LUKS for external devices. According to the kernel log I still don't really think those errors at the block layer were about anything filesystem specific, but what do I know?
It's definitely not unheard of for one specific filesystem to be tickling driver/device bugs and not others.
I wonder what it would take to dump the outstanding requests on device timeout.
I got a reply back from Kingston support.
They brought up two possible issues:
1) Copied too many files at once. I am not going to accept that one. An external 4 TB SSD should handle writing 1.4 TB in about 215000 files, coming from a slower Toshiba Canvio Basics external HD, just fine. About 90000 files were larger files like sound and video files or installation archives. The rest is from a Linux system backup, so smaller files. I will likely move those elsewhere before I try again as I do not need these on flash anyway. However, if the number of files or the amount of data matters, I could never know how much data I could write safely in one go. That is not acceptable to me.
2) Power management related to the USB port, because I am using a laptop. The Linux kernel may have decided to put the USB port the SSD was connected to into some kind of sleep state. However it was a constant rsync based copy workload. Yes, the kernel buffers data and the reads from the Toshiba HD should be quite a bit slower than the writes the Kingston SSD could handle. I saw no more than 80-90 MiB/s coming from the hard disk. However I doubt this led to pauses in write activity of more than 30 seconds. Still it could be a thing.
Regarding further testing I am unsure whether to first test with BTRFS on top of LUKS – I do not like to store clear text data on the SSD – or with BCacheFS plus the fixes which are in 6.7.5 or 6.8-rc4, just in case the flush handling fixes would still have an influence on the issue at hand.
First I will have a look at how to see what USB power management options may be in place and how to tell Linux to keep the USB port the SSD is connected to active at all times.
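One way to inspect this is through the runtime power management attributes the kernel exposes in sysfs for each USB device. The following Python sketch is only an illustration under assumptions: it uses the device name "7-1" from the kernel log above, reads the power/control, power/autosuspend_delay_ms and power/runtime_status attributes as I understand the USB power-management sysfs interface, and the helper names are made up. Writing to power/control needs root.

```python
from pathlib import Path

SYSFS_USB = "/sys/bus/usb/devices"


def usb_power_settings(device: str, sysfs_root: str = SYSFS_USB) -> dict:
    """Read the runtime power management attributes of one USB device."""
    power_dir = Path(sysfs_root) / device / "power"
    settings = {}
    for name in ("control", "autosuspend_delay_ms", "runtime_status"):
        try:
            settings[name] = (power_dir / name).read_text().strip()
        except OSError:
            settings[name] = None  # attribute missing or not readable
    return settings


def disable_autosuspend(device: str, sysfs_root: str = SYSFS_USB) -> None:
    """Write "on" to power/control so the device is kept out of runtime suspend."""
    (Path(sysfs_root) / device / "power" / "control").write_text("on\n")


if __name__ == "__main__":
    # "7-1" is the bus-port name of the SSD in the log above.
    print(usb_power_settings("7-1"))
```

A value of "auto" in power/control means the kernel may autosuspend the device when it is idle, while "on" keeps it active; the setting does not survive a replug or reboot.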
Let's see how this story unfolds. At least I am in no hurry about it.
Best,
On Thu, Feb 15, 2024 at 12:09:20PM +0100, Martin Steigerwald wrote:
First I will have a look at how to see what USB power management options may be in place and how to tell Linux to keep the USB port the SSD is connected to active at all times.
Let's see how this story unfolds. At least I am in no hurry about it.
This may not be an issue of power management but rather one of insufficient power. A laptop may not provide enough power through its USB ports for the Kingston SSD to work properly under load.
You can test this by connecting a powered USB-3 hub between the laptop and the drive.
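Before looking for a hub, it may help to check how much current the drive asks for in its USB configuration descriptor. A minimal Python sketch follows, assuming the bMaxPower sysfs attribute with its usual "<number>mA" format and the device name "7-1" from the log in this thread; requested_current_ma is a made-up helper name. Note that this is only the value the device requests, not a measurement of what it actually draws under load.

```python
import re
from pathlib import Path


def requested_current_ma(device: str, sysfs_root: str = "/sys/bus/usb/devices") -> int:
    """Return the maximum current (in mA) the device requests in its descriptor."""
    text = (Path(sysfs_root) / device / "bMaxPower").read_text().strip()
    match = re.fullmatch(r"(\d+)\s*mA", text)
    if match is None:
        raise ValueError(f"unexpected bMaxPower format: {text!r}")
    return int(match.group(1))


if __name__ == "__main__":
    try:
        print(requested_current_ma("7-1"), "mA requested")
    except OSError as err:
        print("device not present:", err)
```

If the requested value is close to what the laptop port can supply, a powered hub is a cheap way to rule this out.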
Alan Stern
Alan Stern - 15.02.24, 16:19:54 CET:
First I will have a look at how to see what USB power management options may be in place and how to tell Linux to keep the USB port the SSD is connected to active at all times.
Let's see how this story unfolds. At least I am in no hurry about it.
This may not be an issue of power management but rather one of insufficient power. A laptop may not provide enough power through its USB ports for the Kingston SSD to work properly under load.
You can test this by connecting a powered USB-3 hub between the laptop and the drive.
Interesting idea. Maybe the Kingston XS-2000 4TB needs more power than the Sandisk Extreme Pro 2TB.
Not sure whether I have one at hand with USB-C here, because my regular USB hub only has USB-A connectors. I need to look for one with enough USB-A and USB-C connectors as I use a USB hub as a replacement for a docking station. But I do have at least one optionally powered hub with USB-C at another place. It does not have many ports, but for the task ahead one USB-C port is sufficient.
I will try this as well. Thanks.
Best,
Hi!
Kent Overstreet - 11.02.24, 19:51:32 CET:
He only got errors after an hour or so, or 10 minutes with UAS disabled; we send flushes once a second. Sounds like a screwy device.
Kingston support intends to RMA the XS-2000 4 TB SSD with a variant with a newer firmware version, in case they have it available, while they work on a newer firmware version for the device variant the error happened on.
So it appears the device has a bug. I will keep you posted, once I either receive that other variant or a firmware upgrade for the existing one.
I am happy with Kingston support so far. It takes quite a while, but they are taking the issue seriously instead of telling me to use Windows instead of Linux or something like that :) - like I have read before on other occasions with hardware from other suppliers. Thanks!
Best,
linux-stable-mirror@lists.linaro.org