Network Working Group E. Van Hensbergen Internet-Draft IBM Research Updates: January 2010 experimental-draft-9P2000-protocol (if approved) Intended status: Experimental Expires: July 5, 2010 Plan 9 Remote Resource Protocol experimental-draft-9p2000.L-protocol Status of this Memo This document is an Internet-Draft and is NOT offered in accordance with Section 10 of RFC 2026, and the author does not provide the IETF with any rights other than to publish as an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. This Internet-Draft will expire on July 5, 2010. Van Hensbergen Expires July 5, 2010 [Page 1] Internet-Draft 9p2000.L January 2010 Abstract 9P is a distributed resource sharing protocol developed as part of the Plan 9 research operating system at AT&T Bell Laboratories (now a part of Lucent Technologies) by the Computer Science Research Center. It can be used to distributed file systems, devices, and application services. It was designed as an interface to both local and remote resources, making the transition from local to cluster to grid resources transparent. In my use of 9P within the Linux and IBM communities there have been several requests for extensions to better support Linux environments or features not currently a part of the protocol (locking, caching, batch operations, storage networking, etc.) As such, I'd like to work towards a specification of the protocol which can accomodate these extensions in as non-intrusive a way as possible (the counter- example being 9P2000.u which while negotiated ends up complicating clients and servers who wish to use it). 9p2000.L is an expanded version of the original protocol with 9P2000 as its core. Extended operations representing either support for UNIX environments, or even alternate approaches to the protocol (such as Op from the Octopus project) exist in complimentary operand spaces. It briefly was referenced as the 9p2010 protocol, but this was retracted in form of referencing it as an extension (.L) Van Hensbergen Expires July 5, 2010 [Page 2] Internet-Draft 9p2000.L January 2010 Table of Contents 1. Requirements notation . . . . . . . . . . . . . . . . . . . . 4 2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 2.1. Overview of Differences . . . . . . . . . . . . . . . . . 5 2.1.1. Numeric Versus String IDs . . . . . . . . . . . . . . 5 2.1.2. Error Strings Versus Error Codes . . . . . . . . . . . 5 2.1.3. Permission Modes . . . . . . . . . . . . . . . . . . . 6 2.1.4. Special Files and Links . . . . . . . . . . . . . . . 6 2.1.5. ioctl . . . . . . . . . . . . . . . . . . . . . . . . 6 2.1.6. Extended Attributes . . . . . . . . . . . . . . . . . 7 2.2. Changed Messages . . . . . . . . . . . . . . . . . . . . . 7 3. Protocol Data Types . . . . . . . . . . . . . . . . . . . . . 8 3.1. Changed Basic Data Types . . . . . . . . . . . . . . . . . 8 3.2. Changed Structured Data Types . . . . . . . . . . . . . . 8 4. Changed File Attributes . . . . . . . . . . . . . . . . . . . 9 5. Versioning . . . . . . . . . . . . . . . . . . . . . . . . . . 10 6. Error Definitions . . . . . . . . . . . . . . . . . . . . . . 11 7. Changed Protocol Operations . . . . . . . . . . . . . . . . . 12 7.1. version - negotiate protocol version . . . . . . . . . . . 12 7.2. error - return an error . . . . . . . . . . . . . . . . . 12 7.3. auth/attach - messages to establish a connection . . . . . 13 7.4. open, create - prepare a fid for I/O on an existing or new file . . . . . . . . . . . . . . . . . . . . . . . . . 13 7.5. stat, wstat - inquire or change file attributes . . . . . 14 8. Security Considerations . . . . . . . . . . . . . . . . . . . 16 9. Protocol Definition Differences . . . . . . . . . . . . . . . 17 10. Normative References . . . . . . . . . . . . . . . . . . . . . 19 Appendix A. Acknowledgements . . . . . . . . . . . . . . . . . . 20 Appendix B. Copyright . . . . . . . . . . . . . . . . . . . . . . 21 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 22 Van Hensbergen Expires July 5, 2010 [Page 3] Internet-Draft 9p2000.L January 2010 1. Requirements notation The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC2119. Van Hensbergen Expires July 5, 2010 [Page 4] Internet-Draft 9p2000.L January 2010 2. Introduction Many modern UNIX systems, including Linux use a virtual file system (VFS) layer as a basic level of abstraction for accessing underlying implementations. Implementing 9p2000.L under Linux is a matter of mapping VFS operations to their associated 9P operations. The problem, however, is that 9P2000 was designed for a non-UNIX system so there are several fundamental differences in the functional semantics provided by 9P. To preserve compatibility with these pre-existing features we propose a transparent extension to the file system semantics which minimally effects the protocol syntax. 9p2000.L is a dialect of 9P2000 negotiated in the Tversion/Rversion exchange. If the server agrees to 9p2000.L, then the wire protocol that follows is 9P2000 with these changes. 2.1. Overview of Differences 2.1.1. Numeric Versus String IDs Under Plan 9 [PLAN9], user names as well as groups are represented by strings, while on Unix they are represented by unique numbers. This is complicated by Linux making it exceedingly difficult to map these numeric identifiers to their string values in the kernel. Many of the available UNIX network file systems avoid this issue and simply use numeric identifiers over the wire, hoping they map to the remote system. NFSv4 has provisions for sending string group and user info over the wire and then contacting a user-space daemon which attempts to provide a valid mapping. 9p2000.L provides a dual approach to this problem, changes have been made to the protocol to allow for numeric identifiers to be used along-side strings. Servers should make every attempt to provide valid information for both the numeric and string form of identifiers. Clients can then use the available information to best map the identifiers to their local environment. Use of extended form string user names (e.g. user@domain) as specified in the NFSv4 environment is also valid. 2.1.2. Error Strings Versus Error Codes A similar problem exists for error numbers, by default VFS operations return error numbers, whereas Plan 9 (and 9P) use error strings to describe failure conditions. This problem is further exacerbated by the potential for Plan 9 synthetic file servers to return custom error strings which may not match any pre-defined set (or pattern) of standard error messages. Van Hensbergen Expires July 5, 2010 [Page 5] Internet-Draft 9p2000.L January 2010 Again, our approach is to provide as much information as possible in order to help clients properly convey error conditions to end- applications and end-users. As such both error strings and error codes are conveyed in the 9p2000.L version of the protocol. A suggested implementation is that non-standard/unrecognized error strings/codes be made available to applications via some mechanism in order to communicate application synthetic file system events (via sysfs or procfs in Linux, or perhaps via a special extended attribute). 2.1.3. Permission Modes Plan 9 has a different user security model [P9SEC], so there is no such concept as set-uid or set-gid permissions. There is also no equivalent for the sticky bit. Luckily, Plan 9 has plenty of space in higher permission mode bit space for such extensions. 2.1.4. Special Files and Links One of the unique aspects of the Plan 9 name space is that it is dynamic. Users are able to bind files and directories over each other in stackable layers similar to union file systems. This aspect of Plan 9 name spaces has obviated the need for symbolic or hard links. Symlinks on a remote UNIX file server will be traversed transparently as part of a walk - there is no native ability within Plan 9 to create symlinks. This breaks many assumptions for Linux file-systems and many existing applications (for example the kernel build creates a symlink in the include directory as part of the make process). Another unique aspect is that named pipes and devices look just like other files in Plan 9, there are no special mode bits and no concept of major and minor block numbers. To solve both of these problems special mode bits are introduced to mark special file types and a variable-length string field is added to the file attribute structure which is used to store the additional information associated with these special files. In the case of symlinks, for instance, the variable length string contains the full path target of the symbolic link. 2.1.5. ioctl A VFS function not accounted for in the 9P infrastructure is the ioctl command. No attempt has been made at this time to accomodate its functionality. Plan 9 has traditionally used elements within a synthetic file system to provide for similar functionality, with the added benefit of getting transparent network distribution of the Van Hensbergen Expires July 5, 2010 [Page 6] Internet-Draft 9p2000.L January 2010 control path. For such devices/files that still require the use of ioctl, gateway synthetic file systems are suggested to provide analogous functionality. 2.1.6. Extended Attributes A relatively recent addition to VFS interfaces is the ability to have arbitrary key/value pairs added to file meta-data. 9p2000.L doesn't have provisions to accomodate this feature, but it may be added in the future as more file systems adopt extended attributes. 2.2. Changed Messages size[4] Tversion tag[2] msize[4] version[s] size[4] Rversion tag[2] msize[4] version[s] size[4] Rerror tag[2] ename[s] errno[4] size[4] Tauth tag[2] afid[4] uname[s] aname[s] size[4] Rauth tag[2] aqid[13] size[4] Tattach tag[2] fid[4] afid[4] uname[s] aname[s] size[4] Rattach tag[2] qid[13] size[4] Tcreate tag[2] fid[4] name[s] perm[4] mode[1] extension[s] size[4] Rcreate tag[2] qid[13] iounit[4] size[4] Tstat tag[2] fid[4] size[4] Rstat tag[2] stat[n] size[4] Twstat tag[2] fid[4] stat[n] size[4] Rwstat tag[2] Van Hensbergen Expires July 5, 2010 [Page 7] Internet-Draft 9p2000.L January 2010 3. Protocol Data Types 3.1. Changed Basic Data Types TODO: cover changed data types in our protocol synopsis 3.2. Changed Structured Data Types TODO: cover changed structs (like stat) that the protocol uses Van Hensbergen Expires July 5, 2010 [Page 8] Internet-Draft 9p2000.L January 2010 4. Changed File Attributes TODO: discuss changes Van Hensbergen Expires July 5, 2010 [Page 9] Internet-Draft 9p2000.L January 2010 5. Versioning TODO: describe protocol versioning in detail (steal from Protocol section) Van Hensbergen Expires July 5, 2010 [Page 10] Internet-Draft 9p2000.L January 2010 6. Error Definitions TODO: enumerate new standard file system error strings and describe possibility of application error strings Van Hensbergen Expires July 5, 2010 [Page 11] Internet-Draft 9p2000.L January 2010 7. Changed Protocol Operations 7.1. version - negotiate protocol version SYNOPSIS size[4] Tversion tag[2] msize[4] version[s] size[4] Rversion tag[2] msize[4] version[s] DESCRIPTION Support for 9p2000.L is an optional extension which isn't quite covered in the existing version semantics for 9P2000. The original protocol specification describes discarding anything after an initial period in the version string. We use the information following the period as a method for negotiating optional protocol extensions. If a client desires support for the UNIX extensions, it will send the add a .u to the end of the version string (e.g. 9p2000.L). If a server is capable of supporting the extension, it will return 9p2000.L back to the client. If the server is incapable or unwilling to support the extensions, it will return the version string without the extension specification (e.g. 9P2000). Clients should be implemented in such a way to be able to operate without the extensions in some degraded form of operation. Specifics for how to gracefully degrade operation without specific extensions are suggested by this document. 7.2. error - return an error SYNOPSIS size[4] Rerror tag[2] ename[s] errno[4] DESCRIPTION An errno field has been added to the message in order to provide a hint of the underlying UNIX error number which caused the error on the server. Due to consistency problems of mapping error numbers betweeen different versions of UNIX, clients should give preference to the error string in attempting to report the error, however, in the event that they are unable to map an error string, they may return the errno to the application. Van Hensbergen Expires July 5, 2010 [Page 12] Internet-Draft 9p2000.L January 2010 A special errno (ERRUNDEF) is returned by servers who do not wish to return raw error numbers. In the event that clients can not interpret the error string, they should somehow make the error string available to end-application/end-user via dynamic custom error codes. 7.3. auth/attach - messages to establish a connection SYNOPSIS size[4] Tattach tag[2] fid[4] afid[4] uname[s] aname[s] n_uname[4] size[4] Rattach tag[2] qid[13] size[4] Tauth tag[2] afid[4] uname[s] aname[s] n_uname[4] size[4] Rauth tag[2] aqid[13] DESCRIPTION A numeric uname field has been added to the attach and auth messages in order to provide hints to map a string to a numeric id if such a facility is not available. The numeric uname should be given preference over the uname string unless n_uname is unspecified (~0). 7.4. open, create - prepare a fid for I/O on an existing or new file SYNOPSIS size[4] Tcreate tag[2] fid[4] name[s] perm[4] mode[1] extension[s] size[4] Rcreate tag[2] qid[13] iounit[4] DESCRIPTION The most signifigant change to the create operation is the new permission modes which allow for creation of special files. In addition to creating directories with DMDIR, 9p2000.L allows the creation of symlinks (DMSYMLINK), devices (DMDEVICE), named pipes (DMNAMEPIPE), and sockets (DMSOCKET). extension[s] is a string describing special files, depending on the mode bit. For DSYMLINK files, the string is the target of the link. For DMDEVICE files, the string is "b 1 2" for a block device with major 1, minor 2. For normal files, this string is empty. Van Hensbergen Expires July 5, 2010 [Page 13] Internet-Draft 9p2000.L January 2010 7.5. stat, wstat - inquire or change file attributes SYNOPSIS size[4] Tstat tag[2] fid[4] size[4] Rstat tag[2] stat[n] size[4] Twstat tag[2] fid[4] stat[n] size[4] Rwstat tag[2] DESCRIPTION There are four new fields in the stat structure supporting 9P2000 extensions - as well as new qid.type bits and mode bits. size[2] total byte count of the following data type[2] for kernel use dev[4] for kernel use qid.type[1] the type of the file (directory, etc.), represented as a bit vector corresponding to the high 8 bits of the file's mode word. qid.vers[4] version number for given path qid.path[8] the file server's unique identification for the file mode[4] permissions and flags atime[4] last access time Van Hensbergen Expires July 5, 2010 [Page 14] Internet-Draft 9p2000.L January 2010 mtime[4] last modification time length[8] length of file in bytes name[ s ] file name; must be / if the file is the root directory of the server uid[ s ] owner name gid[ s ] group name muid[ s ] name of the user who last modified the file extension[ s ] For use by the UNIX extension to store data about special files (links, devices, pipes, etc.) n_uid[4] numeric id of the user who owns the file n_gid[4] numeric id of the group associated with the file n_muid[4] numeric id of the user who last modified the file The n_uid, n_gid, and n_muid are numeric hints that clients may use to map numeric ids when a string to numeric id mapping facility is not available. extension[s] is a string describing special files, depending on the mode bit. For DSYMLINK files, the string is the target of the link. For DMDEVICE files, the string is "b 1 2" for a block device with major 1, minor 2. For normal files, this string is empty. Van Hensbergen Expires July 5, 2010 [Page 15] Internet-Draft 9p2000.L January 2010 8. Security Considerations TODO: Talk about specific security considerations of 9P under UNIX, specifically about the different auth models [P9SEC] Van Hensbergen Expires July 5, 2010 [Page 16] Internet-Draft 9p2000.L January 2010 9. Protocol Definition Differences /* permissions */ enum { DMDIR = 0x80000000, DMAPPEND = 0x40000000, DMEXCL = 0x20000000, DMMOUNT = 0x10000000, DMAUTH = 0x08000000, DMTMP = 0x04000000, DMSYMLINK = 0x02000000, /* 9p2000.L extensions */ DMDEVICE = 0x00800000, DMNAMEDPIPE = 0x00200000, DMSOCKET = 0x00100000, DMSETUID = 0x00080000, DMSETGID = 0x00040000, }; /* qid.types */ enum { QTDIR = 0x80, QTAPPEND = 0x40, QTEXCL = 0x20, QTMOUNT = 0x10, QTAUTH = 0x08, QTTMP = 0x04, QTLINK = 0x02, QTFILE = 0x00, }; struct v9fs_stat { uint16_t size; uint16_t type; uint32_t dev; struct v9fs_qid qid; uint32_t mode; uint32_t atime; uint32_t mtime; uint64_t length; char *name; char *uid; char *gid; char *muid; /* 9p2000.u extensions */ char *extension; uint32_t n_uid; uint32_t n_gid; uint32_t n_muid; Van Hensbergen Expires July 5, 2010 [Page 17] Internet-Draft 9p2000.L January 2010 char data[0]; }; struct Rerror { char *error; uint32_t errno; /* 9p2000.u extension */ }; Van Hensbergen Expires July 5, 2010 [Page 18] Internet-Draft 9p2000.L January 2010 10. Normative References [P9SEC] Cox, R., Grosse, E., Pike, R., Presotto, D., and S. Quinlan, "Security in Plan 9", Proceedings of the Usenix Security Symposium, 2002, . [PLAN9] Lucent Technologies, "Plan 9 Home Page", . Van Hensbergen Expires July 5, 2010 [Page 19] Internet-Draft 9p2000.L January 2010 Appendix A. Acknowledgements The 9p2000.L protocol extensions evolved out of contributions from a number of people including: Abhishek Kilkarni Russ Cox Lachester Ionkov Ron Minnich Andrey Mirtchoviski Rob Pike Eric Van Hensbergen Uriel (Contentious Objector) The 9P protocol was the result of work done by the Computing Science Research Center of AT&T Bell Labs (now a part of Lucent Technologies) and in particular: Rob Pike Dave Presotto Sape Mullender Ken Thompson Russ Cox Van Hensbergen Expires July 5, 2010 [Page 20] Internet-Draft 9p2000.L January 2010 Appendix B. Copyright This specification is derrived from the Plan 9 Documentation and Manual Pages. The source material is Copyright (C) 2003, Lucent Technologies Inc. and others. All Rights Reserved. Extensions to the original source material are Copyright (C) 2010, the authors of this document (as specified in Authors List in the References section). Van Hensbergen Expires July 5, 2010 [Page 21] Internet-Draft 9p2000.L January 2010 Author's Address Eric Van Hensbergen IBM Research Email: ericvh@gmail.com URI: http://www.research.ibm.com/people/b/bergevan Van Hensbergen Expires July 5, 2010 [Page 22]