On Fri, Jun 2, 2017 at 2:18 PM, Yan, Zheng ukernel@gmail.com wrote:
On Fri, Jun 2, 2017 at 7:33 PM, Arnd Bergmann arnd@arndb.de wrote:
On Fri, Jun 2, 2017 at 1:18 PM, Yan, Zheng ukernel@gmail.com wrote:
What I meant is another related problem in ceph_mkdir() where the i_ctime field of the parent inode is different between the persistent representation in the mds and the in-memory representation.
I don't see any problem in the mkdir case. The parent inode's i_ctime in the mds is set to r_stamp. When the client receives the request reply, it sets its in-memory inode's ctime to the same time stamp.
Ok, I see it now, thanks for the clarification. Most other file systems do this the other way round and update all fields in the in-memory inode structure first and then write that to persistent storage, so I was getting confused about the order of events here.
If I understand it all right, we have three different behaviors in ceph now, though the differences are very minor and probably don't ever matter:
- in setattr(), we update ctime in the in-memory inode first and then send the same time to the mds, and expect to set it again when the reply comes.
- in ceph_write_iter() (write) and mmap/page_mkwrite(), we call file_update_time() to set i_mtime and i_ctime to the same timestamp once a write is observed by the fs. We then take two other timestamps that we send to the mds, and update the in-memory inode a second time when the reply comes. ctime is never older than mtime here, as far as I can tell, but it may be newer when the timer interrupt happens between taking the two stamps.
- in all other calls, we only update the inode (and/or parent inode) after the reply arrives.
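To make the three orderings concrete, here is a small userspace toy model. It is only a sketch of the sequence of events as described above, not the actual ceph code: toy_inode, toy_mds, toy_clock and the toy_*() functions are all made up for illustration, timestamps are plain tick counts, and the "request" and "reply" are simple assignments.

```c
#include <assert.h>

/* Toy model, not kernel code: all names here are hypothetical. */
struct toy_inode { long mtime; long ctime; };  /* in-memory inode */
struct toy_mds   { long mtime; long ctime; };  /* persistent copy in the mds */

static long toy_clock;                         /* coarse clock, advanced by a "timer tick" */
static long toy_now(void) { return toy_clock; }

/* setattr-style: update in-memory first, send the same stamp, set again on reply */
static void toy_setattr(struct toy_inode *in, struct toy_mds *mds)
{
	long stamp = toy_now();

	in->ctime = stamp;        /* in-memory inode updated first */
	mds->ctime = stamp;       /* request carries the same time */
	in->ctime = mds->ctime;   /* reply sets it a second time */
}

/* write-style: one stamp for the in-memory inode, two more for the mds */
static void toy_write(struct toy_inode *in, struct toy_mds *mds, int tick_between)
{
	long t = toy_now();
	long m, c;

	in->mtime = in->ctime = t;    /* what file_update_time() does in this model */

	m = toy_now();                /* first stamp sent to the mds (mtime) */
	if (tick_between)
		toy_clock++;          /* timer interrupt between the two stamps */
	c = toy_now();                /* second stamp sent to the mds (ctime) */
	assert(c >= m);               /* ctime is never older than mtime */

	mds->mtime = m;
	mds->ctime = c;
	in->mtime = mds->mtime;       /* reply updates the inode a second time */
	in->ctime = mds->ctime;
}

/* all other calls (e.g. mkdir): parent inode only updated after the reply */
static void toy_mkdir(struct toy_inode *parent, struct toy_mds *mds)
{
	long r_stamp = toy_now();

	mds->ctime = r_stamp;         /* mds sets the parent's ctime to r_stamp */
	parent->ctime = mds->ctime;   /* client copies it when the reply arrives */
}
```

In this model the only case where the two stamps can differ is toy_write() with tick_between set, which is the "timer interrupt between taking the two stamps" situation; in all three cases the in-memory inode ends up matching what the mds has once the reply is processed.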
Arnd