Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others



c - What are the reasons to check for error on close()?

Note: Please read to the end before marking this as duplicate. While it's similar, the scope of what I'm looking for in an answer extends beyond what the previous question was asking for.

Widespread practice, which I tend to agree with, is to treat close purely as a resource-deallocation function for file descriptors rather than as a potential I/O operation with meaningful failure cases. Indeed, prior to the resolution of issue 529, POSIX left the state of the file descriptor (i.e. whether it was still allocated or not) unspecified after errors, making it impossible to respond portably to errors in any meaningful way.

However, a lot of GNU software goes to great lengths to check for errors from close, and the Linux man page for close calls failure to do so "a common but nevertheless serious programming error". The man page cites NFS and quotas as circumstances under which close might produce an error, but it does not give details.
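
For concreteness, the kind of check at issue might look like the sketch below. It assumes a POSIX system, and `close_checked` is a hypothetical helper name used only for illustration. It reports the error but deliberately does not retry, on the assumption that the descriptor's state after a failed close cannot be relied upon.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: close fd exactly once and report any error.
 * Returns 0 on success, -1 on failure with errno preserved.
 * It does not retry: after a failed close() the descriptor may already
 * have been released, and another thread could have reused the number. */
static int close_checked(int fd, const char *what)
{
    if (close(fd) == -1) {
        int saved = errno;   /* fprintf may clobber errno */
        fprintf(stderr, "close(%s) failed: %s\n", what, strerror(saved));
        errno = saved;
        return -1;
    }
    return 0;
}
```

A caller would treat a -1 return as "the data may not have reached storage" and react accordingly, for example by not deleting an older copy of the file.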

What are the situations under which close might fail, on real-world systems, and are they relevant today? I'm particularly interested in knowing whether there are any modern systems where close fails for any non-NFS, non-device-node-specific reasons, and as for NFS or device-related failures, under what conditions (e.g. configurations) they might be seen.



1 Reply


Once upon a time (24 March 2007), Eric Sosman had the following tale to share in the comp.lang.c newsgroup:

(Let me begin by confessing to a little white lie: It wasn't fclose() whose failure went undetected, but the POSIX close() function; this part of the application used POSIX I/O. The lie is harmless, though, because the C I/O facilities would have failed in exactly the same way, and an undetected failure would have had the same consequences. I'll describe what happened in terms of C's I/O to avoid dwelling on POSIX too much.)

The situation was very much as Richard Tobin described. The application was a document management system that loaded a document file into memory, applied the user's edits to the in-memory copy, and then wrote everything to a new file when told to save the edits. It also maintained a one-level "old version" backup for safety's sake: the Save operation wrote to a temp file, and then if that was successful it deleted the old backup, renamed the old document file to the backup name, and renamed the temp file to the document. bak -> trash, doc -> bak, tmp -> doc.

The write-to-temp-file step checked almost everything. The fopen(), obviously, but also all the fwrite()s and even a final fflush() were checked for error indications -- but the fclose() was not. And on one system it happened that the last few disk blocks weren't actually allocated until fclose() -- the I/O system sat atop VMS' lower-level file access machinery, and a little bit of asynchrony was inherent in the arrangement.

The customer's system had disk quotas enabled, and the victim was right up close to his limit. He opened a document, edited for a while, saved his work thus far, and exceeded his quota -- which went undetected because the error didn't appear until the unchecked fclose(). Thinking that the save succeeded, the application discarded the old backup, renamed the original document to become the backup, and renamed the truncated temp file to be the new document. The user worked a little longer and saved again -- same thing, except you'll note that this time the only surviving complete file got deleted, and both the backup and the master document file are truncated. Result: the whole document file became trash, not just the latest session of work but everything that had gone before.

As Murphy would have it, the victim was the boss of the department that had purchased several hundred licenses for our software, and I got the privilege of flying to St. Louis to be thrown to the lions.

[...]

In this case, the failure of fclose() would (if detected) have stopped the delete-and-rename sequence. The user would have been told "Hey, there was a problem saving the document; do something about it and try again. Meanwhile, nothing has changed on disk." Even if he'd been unable to save his latest batch of work, he would at least not have lost everything that went before.
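
The checked sequence described in the tale can be sketched in C roughly as follows. This is a minimal illustration, not the original application's code: `write_temp` and `save_document` are hypothetical names, the document is assumed to exist on disk already, and the file names are whatever the caller passes in.

```c
#include <stdio.h>

/* Hypothetical helper: write len bytes to tmp_path, checking every step
 * including fclose(). Returns 0 on success, -1 on any failure. */
static int write_temp(const char *tmp_path, const void *data, size_t len)
{
    FILE *fp = fopen(tmp_path, "wb");
    if (fp == NULL)
        return -1;

    int ok = 1;
    if (len > 0 && fwrite(data, 1, len, fp) != len)
        ok = 0;
    if (ok && fflush(fp) != 0)
        ok = 0;
    /* Always close, and treat a close failure as a failed save. */
    if (fclose(fp) != 0)
        ok = 0;

    if (!ok) {
        remove(tmp_path);   /* best-effort cleanup of the partial file */
        return -1;
    }
    return 0;
}

/* Hypothetical save routine: bak -> trash, doc -> bak, tmp -> doc,
 * started only after the temp file is known to be completely written. */
static int save_document(const char *doc, const char *bak, const char *tmp,
                         const void *data, size_t len)
{
    if (write_temp(tmp, data, len) != 0)
        return -1;          /* nothing else on disk has changed */

    remove(bak);            /* result ignored: there may be no backup yet */
    if (rename(doc, bak) != 0)
        return -1;          /* document untouched, temp file kept */
    if (rename(tmp, doc) != 0)
        return -1;          /* old document survives under the backup name */
    return 0;
}
```

A call such as `save_document("report.doc", "report.bak", "report.tmp", buf, n)` returns -1 without touching the document or its backup if any write, flush, or close step fails, which is the behaviour the last paragraph of the tale asks for.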

