On 08/05/19 15:52, Guilherme G. Piccoli wrote:
Hi, I understand your concern. But all other raid levels contain failure-event mechanisms. For example, in all my tests with raid5 or raid1, it first complained the device was removed, then it failed in super_written() when no other device was available. On the other hand, raid0 does "blind writes": it just selects the device to which that bio should be written (given the stripe math) and changes the bio's device, sending it back via generic_make_request(). It's dumb, not in a bad way but for performance reasons. Unlike all other raid levels, it has no "intelligence" for failures.
That said, we could fix md.c for all raid levels, but I personally think that's a bazooka shot; only raid0 consistently shows this issue.
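To make the "stripe math" concrete, here is a rough sketch of how a striped array might pick a member device for a given sector. It is only an illustration, not the code in raid0.c: it assumes equal-sized members and a single zone, and the names stripe_target and map_sector() are made up for this example.

```c
#include <stdint.h>

/* Hypothetical, simplified model of raid0 striping. Assumes every member
 * device has the same size (a single zone); the real driver also handles
 * multiple zones of differently sized devices. */
struct stripe_target {
	unsigned int dev;     /* index of the member device to write to */
	uint64_t dev_sector;  /* sector offset within that member device */
};

/* Map an array-relative sector to a member device and an offset on it.
 * chunk_sectors is the chunk size in sectors, nr_devs the member count. */
static struct stripe_target map_sector(uint64_t sector,
				       uint32_t chunk_sectors,
				       unsigned int nr_devs)
{
	uint64_t chunk = sector / chunk_sectors;   /* which chunk of the array */
	uint64_t offset = sector % chunk_sectors;  /* position inside the chunk */
	struct stripe_target t;

	t.dev = (unsigned int)(chunk % nr_devs);   /* chunks rotate over members */
	t.dev_sector = (chunk / nr_devs) * chunk_sectors + offset;
	return t;
}
```

Nothing in this mapping asks whether the chosen member is still present, which is the "blind" part: the bio is redirected purely by arithmetic. With 128-sector chunks and two members, for instance, sector 300 falls in chunk 2, so it goes to member 0 at sector 172.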
The academic in me says we should push that error handling into generic_make_request() or some raid function in md.c that deals with those problems. It sounds like there's a fair bit of duplicated functionality that could be refactored out.
Academic purity versus engineering practicality :-)
Heheh you have good points here! Thanks for the input =) Cheers,
It doesn't help when there's no architect to design an overall "grand scheme", but my usual way of working is to design top-down academically, and then ask myself "what do I need?" before implementing bottom-up. Hopefully with a load of documentation saying "I haven't done this because I don't need it, but this is where it goes".
Cheers, Wol