This article is interesting, especially because it uses NFS as its example. I was doing some reading today about the differences between NFS v2, 3, and 4. It seems that NFS, especially in its older versions, is very different from the other file service protocols we're used to, like SMB or AFP.
The way it's described in the NFS documentation, NFS is a "stateless" protocol; SMB, in comparison, is "stateful".
Ever notice that in Windows you can go on the file server and look at a list of all the clients connected to it, including which shares are mounted, which files are open, and how long each client has been idle? You can do the same thing on Samba using the SWAT web interface. All of this information is available to you because SMB is "stateful": the server and client both maintain a long-term awareness of what the client is doing with its connection to the server.
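You don't even need SWAT for this, actually; Samba ships a command-line tool that dumps the same session state. A quick sketch (assumes a running smbd on the box you run it from):

```shell
# List current SMB sessions: which users are connected, from which machines
smbstatus --brief

# Show which shares each client has connected to
smbstatus --shares

# Show every file currently open or locked by a client
smbstatus --locks
```

The point is that the server *can* answer these questions at all, because it keeps per-client state for the lifetime of the connection.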
NFS is a different beastie entirely. Ever notice that you cannot produce a list of connected clients and the files they have open, like you can in SWAT with SMB? I'm getting the idea that by "stateless" they mean the server genuinely has no idea. Client access to files on the NFS server is a series of very short-lived events: the file is opened, read or written, and closed quickly. Actually, for large files, a single event doesn't even cover the whole file; it covers much smaller pieces of it (with a limit of about 1MB per request, which might explain the results found in this article). No longer-term knowledge of the event, or of its relation to future or competing file operations, is maintained.
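That per-request transfer size is something you can see (and tune) in the mount options. A hedged sketch — the 1MB figure is the usual Linux maximum for `rsize`/`wsize`, and the server name and paths here are made up for illustration:

```shell
# Mount an NFS export asking for 1 MB read/write chunks per RPC;
# the server may negotiate these values down to what it supports
mount -t nfs -o rsize=1048576,wsize=1048576 server:/export /mnt/nfs

# Check what was actually negotiated for this mount
grep /mnt/nfs /proc/mounts
```

Each READ or WRITE RPC moves at most one such chunk, which is why a big file turns into a long stream of independent requests rather than one long-lived "open file" the server tracks.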
Because of this, it is much more important with NFS to make sure writes are committed quickly, because there isn't really any orderly tracking of who is accessing what. This is why the official NFS specification requires that all disk writes be done in "sync" mode. As I understand it, this doesn't mean caching is forbidden outright — clients can still cache reads — but the server must not acknowledge a write to the client until the data is actually on stable storage. Linux NFS has an "async" flag that can be used in the exports file, but this is technically a violation of the NFS spec. I guess the only way to get around it is to build aggressive caching into the server's file system, but I'm not even sure that would help, unless the file system "lies" to the NFS service about having really written to disk.
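For what it's worth, that choice shows up directly in /etc/exports on a Linux server. A sketch — the paths and client hostnames are placeholders:

```shell
# /etc/exports
#
# sync: the server replies to a write only after the data has reached
# stable storage (the spec-compliant behavior, and the default on modern Linux)
/srv/data     client1(rw,sync)

# async: the server may acknowledge writes while they are still only in RAM --
# much faster, but a server crash can silently lose data the client
# believes was committed
/srv/scratch  client2(rw,async)
```

After editing the file, `exportfs -ra` re-reads it without restarting the service.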
After reading the http://nfs.sourceforge.net page top to bottom carefully, I got the overall impression that NFS is a very exact, paranoid protocol, designed for near-absolute data integrity, NOT for performance. Heck, with version 2 in sync mode, you can reboot your NFS server in the middle of operations and nothing will be lost. The only fool-proof way to get high performance out of it is to get really, really fast disks.
But this is the real world, and my server is protected by a UPS. For all intents and purposes, it doesn't go down. Based on that, I think some caching can be used in the right way.
NFSv4 is supposed to be stateful, like SMB, and as such should be quite different.
Correct me if I've misunderstood NFS at all in this ramble.