Outline of a spam-proof replacement for SMTP.
$Id: new_smtp.txt,v 1.7 2003/09/19 10:31:02 warner Exp $

0: Introduction

I assembled this scheme in late June 2003 after pondering what I consider to be the best features of our current email system. In my mind, the greatest thing about networked computers is their ability to help humans communicate. Speed and efficiency are one aspect that email improves over phone and paper mail, but more important than those is the ability to bring together thinly-dispersed groups of people who share a common interest. Mailing lists, newsgroups, and web forums all support groups with a mere tens to hundreds of members, separated across distances (physical and organizational) that would normally prohibit these people from ever discovering each other's existence, much less communicating and working together in a meaningful way.

In thinking about the openness that makes email such a powerful tool for human communication, and how it is that exact openness which provides the spammers with the ability to annoy millions of people per hour, I started to wonder what would happen if you gave up that property. What if the email system only let you communicate with a fixed set of people? How would it work, how would you select those people? How evil would it be? What properties would such a system have?

My goal was to document the design of such a system with enough detail that its properties, both good and bad, could be evaluated. Perhaps we could learn something from such a design, for example that there are features we are unwilling to give up. I also decided to ignore deployment issues, and to pretend that we were designing the first electronic mail system. It needed to look enough like something from the real world for users to be able to relate to it, but I decided that a new system could come with a slightly different set of attitudes towards mail and the ability to send it.
In the end, I think the amount of openness available in this system is actually pretty good. If you look at email in the context of earlier forms of human interaction, the extra effort that a sender has to go through to establish a relationship with a new recipient is fairly small.

My personal spam level has hit the 500-per-day mark, with about 10-20 making it past spamassassin. Bill Richardson made the suggestion [1] that companies which claim to be addressing the spam problem are more interested in control and a revenue stream than actually improving (or salvaging) communication between humans. He put up a challenge to implement a better system. Others have suggested that many people are considering abandoning email altogether because of the level of spam they receive.

Once we get to a point where we're willing to give up the whole system, then it is clear we are also at a point where we'd be willing to move to a new system. Then it becomes a question of what properties and features we want in a new protocol and a new infrastructure. This document is intended to sketch out a possible new system so its good and bad points can be addressed.

-Brian Warner, 14 Jul 2003

1: Motivation

1.1: Problems with existing spam-abatement systems

Filtering is a short-term solution which has already turned into an arms race. Just like advertising embedded in television programs, spam can be expressed in such a way that a recipient, without comparing the message against those received by many others, cannot tell the difference between mass-produced spam and normal messages from new correspondents with an unusually strong focus on product names.

Some schemes are designed to catch spam in which the same message is sent to large numbers of people (razor). Some people feel that small-scale spamming is more acceptable than large-scale spamming.
Razor is nice in that it raises the effort level for the spammers by a little bit (they must send slightly different messages to each person), but random numbers are easy to come by and the resulting overall network traffic is actually larger (they must send a different message to each person).

Authentication based schemes seek to provide accountability, distinguishing between established correspondents and new senders. Most such systems allow new senders at least one chance to send an initial message (the trust-first, reject-later model), in the name of being easier to use and to provide a way for new senders to become established (and trusted) correspondents. Such systems break down in the face of automatically-mass-created identities.

Other authentication schemes try to tie the sender's identity (email address) to a "real world" truename (a name, government-issued ID number, etc). The idea is that every human gets one and only one identity, so they have something to lose if they spend it on spamming. These schemes impose heavy privacy and logistical costs, as well as giving unacceptable amounts of control to whichever agency is charged with granting those identity certificates.

Other schemes are based on the identity of the computer being used to send the message (its IP address). They restrict the ability to send mail to certain classes of machines: those which are administered by "responsible" personnel, or those with some long-term reputation to lose. These systems suffer from the difficulty of how to measure this level of responsibility, and deciding who gets to measure it.

1.2: A New System

This system uses authentication to tie each message to a particular sender. There is no such thing as "real world" identity: the sender is merely identified by a public key.
Rather than accept messages from all senders unless specifically blocked (which is clearly vulnerable to mass-produced identities), this system accepts unlimited mail *only* from senders to whom the recipient has granted permission.

The ability to send mail to someone is the ability to consume some of their time. Time is the only real currency we have: about 2.7e9 seconds per lifetime. Nobody other than the recipient should have the ability to decide how that time is spent.

The working title for this new system is PETmail, which (if pressed) could stand for Permission, Encryption, and Transparency.

1.3: Additional Features

Given that we are rebuilding a mail system from scratch, what features should we add? Apart from spam, where else does our current SMTP/POP/IMAP scheme fall short?

Privacy: all messages should be encrypted, transparently. All messages should be authenticated to the extent that is reasonable for automated processes (asserting that two messages are from the same sender is reasonable, trying to claim a real-world identity of the sender through some kind of certificate chain is not). Pseudonymous mail (through a remailer network) should be easy to use.

Mobility: it should be easy and straightforward for users to change email providers, with a minimum of disruption. Easier mobility means better competition, leading to better service and lower prices. This can be improved by decoupling the addressing and routing of individual messages.

1.4: Features To Retain

Apart from the general "get a message from one human to another" goal, what other nice features of SMTP-based email would we like to retain?

Queueing: The sender should be able to generate messages offline and then send them in a batch when they plug their laptop into the network. The recipient should be able to retrieve their messages and then read them offline.

### Architecture

2: Permission to Send

The ability to send mail to a person must derive from the recipient.
A human (or an authorized computer agent acting upon their behalf) must grant that permission to each sender. The fundamental problem with SMTP is that this permission (the ability to send a message to a recipient R) is equivalent to knowing R's email address, expressed as a short string which combines both identity and addressing (routing). The permission is not tied to the sender: anyone who knows the address can use it. That ability is easily given away to others, intentionally or otherwise. It is small enough to store millions on a CD. The act of sending a message (unless an anonymous remailer is used) automatically gives the recipient permission to send a message back. The act of referring to another user (referring to a human by their email address, to avoid ambiguity) also gives away that permission. This permission is unlimited (once obtained, millions of messages can be sent), and irrevocable. In PETmail, this "Permission" is more restricted. The most frequently used type of Permission is called Specific Permission, and is defined as the ability of an individual sender S to send a message to a recipient R. It is tied to S's private key, so it is only useful to a single sender and cannot be given away. The permission is checked by the recipient's agent, so the recipient can revoke it at any time. Specific Permission can be granted to others directly, or extended (if permitted) by third parties who already have that permission. New senders must use the other kind of permission, Generic Permission, to get a message to the recipient. Since PETmail is fundamentally about helping humans communicate with other humans, Generic Permission should only be usable by humans. Unattended machines are prevented from using it with CAPTCHA[2] challenges (or at least slowed down significantly by requiring hashcash[3]). 
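The hashcash idea mentioned above can be illustrated with a minimal proof-of-work sketch. This is a simplification under stated assumptions, not the real hashcash stamp format: it uses SHA-256, a hypothetical "resource:counter" stamp layout, and the function names (leading_zero_bits, mint, verify) are invented for illustration.

```python
import hashlib
import itertools


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the front of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def mint(resource: str, bits: int) -> str:
    """Search for a stamp whose SHA-256 hash starts with `bits` zero bits.

    The expected cost is about 2**bits hash computations.
    The "resource:counter" layout is an assumption of this sketch.
    """
    for counter in itertools.count():
        stamp = f"{resource}:{counter}"
        digest = hashlib.sha256(stamp.encode()).digest()
        if leading_zero_bits(digest) >= bits:
            return stamp


def verify(stamp: str, resource: str, bits: int) -> bool:
    """Check a stamp with a single hash: it must name the resource and carry enough work."""
    if not stamp.startswith(resource + ":"):
        return False
    digest = hashlib.sha256(stamp.encode()).digest()
    return leading_zero_bits(digest) >= bits
```

The asymmetry is the point: the sender performs many hashes to mint a stamp, while the recipient's agent checks it with one. A real agent would also need to remember spent stamps so that each one is accepted only once; that bookkeeping is omitted here.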
Once a message arrives with Generic Permission, the recipient can choose to grant Specific Permission to its sender (or may instruct their agent to automatically grant such permission). They can later revoke it at any time.

Both types of Permission can specify limits on message size and MIME Content-Type (to express a preference for HTML or plain ASCII mail). Specific Permission can also set a limit on message rate.

Specific Permission is expressed as a simple record in the recipient's address book which indicates the conditions of the permission and the public key of the sender. Generic Permission is part of the identifying record published by each recipient.

3: Email Addresses

To accomplish the "mobility" goal, email addresses are broken up into two parts: Addresses and Identities. In addition, a Routing Record is used to indicate how the final message should be transported, which may be included in the Identity or looked up separately. There are specific mechanisms for looking up each component.

Traditional email addresses have several properties which make them useful identifiers.

They are short, somewhat personalized, and broken up into small pieces with independent meanings (.com, a company name, a person's name). This makes it fairly easy for humans to remember them.

They are short, making them suitable for use on business cards and in document citations.

They are globally unique, making them convenient as identifiers. If they were not unique, senders would have to go through an additional step to resolve the ambiguity before the mail system could deliver the message.

They also have other useful properties:

They are hierarchical, making it easy to build a distributed lookup system. SMTP email addresses are built upon DNS names, which provides a scalable base for this lookup.

They also contain routing information (the address plus the SMTP specification plus a DNS resolver provide everything the MTA needs to deliver the message).
This is a convenience for SMTP, because no additional protocol is needed to get the message to the designated recipient.

The corresponding problems are:

The hierarchical model means that everybody is paying *somebody* for their namespace. The entity in charge of the top-level has an unreasonable amount of power and will be tempted to abuse it. Politics will always be involved in contention for popular names.

By tying address and routing together, the entity which issues (and controls) the address must also provide a mailbox and way to access it (usually POP or IMAP). This means megabytes of storage and significant bandwidth. (This is not strictly true: a domain name owner can sub-contract mail transit to another provider, use MX records to send incoming mail to them, and point their pop.example.com address at that provider's server. However, this indirection must be done per-domain instead of per-user, and it requires active involvement by the domain name owner, rather than being controlled by the recipient.)

By tying address and identity together, the address issuer controls the recipient's identity. If they revoke that address, senders can no longer reach the recipient, forcing them to do out-of-band searches in hopes of finding another working email address. This works to the issuer's advantage: recipients would rather put up with poor service or high prices than go through the hassle of changing email addresses, which requires manually updating all their correspondents and sacrificing all published instances of their address.

In PETmail, Addresses are still used for the first three properties (memorizable, citable, and unique). However, address and identity are split to prevent address issuers from holding their users hostage with poor service.
Address and routing are split so both address services and transit services can be run independently, allowing them to be cheaper (address service does not require a lot of bandwidth and needs only minimal storage, while transit service does not require a cool or memorable domain name).

The local user will use "Pet Names"[4] to refer to the entries in their address book. A 'Pet Name' is a (sender-)local nickname for the recipient, like "Mom" or "Shashi's friend Drew" or "that Guy from the coffee shop last Thursday". It is memorizable but not globally unique.

3.1: Addresses

PETmail addresses are short strings in the usual user@host format, and have all the useful identifying properties of traditional SMTP email addresses. They are used to refer to a recipient in print, in a document, or by word of mouth. On the receiving end, they can be typed into a PETmail system and it will figure out where to send the message.

These addresses are mapped (through Identity Services) to an Identity Record, described below, which provides the rest of the information necessary to send a message to the recipient behind the address. Multiple addresses can point to the same record.

This is the only use of the Address. Note that once the ID Record is acquired, the address is no longer needed. If the ID Record is acquired in some other way (included in an email message, perhaps), the address isn't needed at all. It serves as a pointer to the Identity, nothing else.

Identity Services provide this mapping, and users can rent, buy, be assigned, or otherwise acquire cool names from the usual suspects, with the usual contention for popular ones. The usual DNS-style scheme is used to locate the Identity Server for any given name, based upon the host portion. The Identity Server then provides a mapping from the user portion to the Identity Number (a hash of the public key, therefore a unique key to identify the Identity), or to the complete Identity Record.
Identity Servers do not have to provide mail transit. Their entire computational burden is reduced to about that of an LDAP server. Their involvement is merely to manage allocation of these names and map them to IDs.

Companies can provide Addresses for their employees, ISPs can provide Addresses for their customers, free email providers can provide them for their users. Organizations can be formed which do nothing but rent names within their own domain.

More importantly, losing a name (because of a change in employment, change of ISP, or collapse of a free email provider) is less traumatic, for several reasons. Address books are indexed by pet name or ID number rather than an Address, so the sender's local name for the recipient does not depend upon which global names point to them. The transport service (how to get a message into the mailbox) is separate, so losing a name does not mean losing the mailbox. Finally, the recipient has enough information in their address book to automatically send updates to all their established correspondents (those who have Permission to send to them).

Therefore, the only consequence of losing an Address is that any published, non-updatable copies of that Address (on business cards, in documents sitting on somebody's hard drive, etc) become invalid. Established correspondents are not affected at all. New correspondents must acquire the ID Record in another way, perhaps by downloading a copy off a web page, or through some kind of lookup service that searches by personal name (like the PGP keyservers).

Note that PETmail might use a slightly different address syntax, like 'user.host' instead of 'user@host', which would allow new PETmail addresses to be distinguished from old SMTP ones.

3.2: Identity

The Identity record is what actually describes the recipient. Each recipient has one. A human who wants to pose as multiple recipients ("Brian at work" vs. "Brian at home" vs. "Brian doing PETmail stuff") can create multiple Identities.
The Identity is described by a data structure called the Identity Record. It consists of a public key, identifying information (names), a permission record, and a transport record. The Identity is indexed by an Identity Number, which is simply the hash of the public key. Any PETmail recipient (or sender) can be uniquely identified by their ID Number.

The entire Identity record is signed by the private key, and includes an expiration date. These records should expire after about a month; their owner's agent will automatically update the record before then.

Addresses and Identities in PETmail correspond to Names and IP Addresses in DNS. A (PETmail Address / DNS Name) is mapped into a (PETmail Identity / IP Address) by an (Identity Server / DNS Server). The mapping is many-to-one. The results of the mapping are used by the lower level to actually reach the target.

Identities are created by the user's agent, which sends updates to the various places which need them: established correspondents and Identity Servers.

3.2.1: Public Key

The public key is unsigned, and is created by the user's mail agent once, when they create the email account. It is the most long-term member of the address record, but even so it can be changed by distributing a message which updates all senders with the new key. The keypair is created secretly: no one other than the user and their agent learns the private key. ID Numbers are known to be unique by the cryptographic properties of the hash used to construct them.

3.2.2: Identifying Information

The identifying information is an optional list of attributes that can help a sender make sure they've retrieved the right Identity: real name, company affiliation, etc. This information is signed by the private key but is not otherwise validated: it is entirely under the control of the recipient and they can describe themselves in any way (or as anyone) they like.
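As a rough illustration of the Identity Record and Identity Number from section 3.2, here is a minimal sketch. The field names, the use of SHA-256 as "the hash", and the POSIX-timestamp expiry are all assumptions of this example; the signature is carried as an opaque placeholder and is not verified.

```python
import hashlib
from dataclasses import dataclass, field

ONE_MONTH = 30 * 24 * 3600  # seconds; "about a month" per the text above


@dataclass
class IdentityRecord:
    public_key: bytes                               # raw key bytes (encoding is hypothetical)
    names: dict = field(default_factory=dict)       # identifying information
    permission: dict = field(default_factory=dict)  # generic permission record
    transport: dict = field(default_factory=dict)   # transport record
    expires: float = 0.0                            # POSIX timestamp
    signature: bytes = b""                          # placeholder, not checked in this sketch

    def identity_number(self) -> str:
        """The ID Number is simply a hash of the public key."""
        return hashlib.sha256(self.public_key).hexdigest()

    def is_expired(self, now: float) -> bool:
        """True once the record's expiration date has been reached."""
        return now >= self.expires
```

Because the ID Number depends only on the public key, the names, permission, and transport fields can all change without affecting how correspondents index the record.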
The identifying information may also contain a list of Addresses which currently map to this identity. If one of the names goes away and the Identity Record is somehow unobtainable through the usual channels, the sender can attempt a lookup through the other names listed. As long as not all of the names go away at the same time, the Identity Record should remain accessible.

3.2.3: Permission Record

The permission record indicates what a new sender must do to be allowed to send a message to this recipient. This encodes the "Generic Permissions" used by every sender who doesn't have more specific permission.

Recipients indicate that they are willing to accept mail from unknown senders (senders which do not exist in the user's address book) as long as each message fulfills certain requirements. One kind of requirement would state that the message must include a hashcash string. Another could require a CAPTCHA ticket from a particular server. In general, only one of the listed methods would be required. The requirement is per-message: each message needs a new (unique) ticket or piece of hashcash. These requests are handled as described in section 6.3, below.

Once one of these messages has gotten through, and the recipient does not tell their agent that the message was undesirable, their agent will automatically add the sender to their address book and send back a Specific Permission record. This sender-specific permission overrides the generic one listed in the Identity Record, and will generally accept all messages without additional per-message effort (although it may still impose rate, size, or Content-Type restrictions).

Only messages signed by the sender's key are eligible to use those sender-specific permissions: if the message cannot be matched to someone in the address book, the receiving agent falls through to the generic Permissions. If it does not meet the requirements imposed by those, it is dropped.
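The fall-through from Specific to Generic Permission described above can be sketched as a small decision function. All names here (SpecificPermission, Message, decide, the "accept"/"drop" strings) are hypothetical, and signature and ticket checking are assumed to have happened elsewhere. The sketch also assumes that a known sender who exceeds their limits is dropped rather than retried under the generic rules, which the text does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpecificPermission:
    max_size: int       # largest acceptable message, in bytes
    max_per_hour: int   # message rate limit


@dataclass
class Message:
    signer_id: Optional[str]  # ID Number of the verified signing key, or None
    size: int                 # message size in bytes
    has_valid_ticket: bool    # hashcash or CAPTCHA ticket, verified elsewhere


def decide(msg, address_book, sent_this_hour, generic_max_size):
    """Return 'accept', 'accept-generic', or 'drop' for an incoming message."""
    slip = address_book.get(msg.signer_id)
    if slip is not None:
        # Known sender: the sender-specific permission overrides the generic one.
        if msg.size > slip.max_size:
            return "drop"
        if sent_this_hour.get(msg.signer_id, 0) >= slip.max_per_hour:
            return "drop"
        return "accept"
    # Unknown sender: fall through to the Generic Permission requirements.
    if msg.has_valid_ticket and msg.size <= generic_max_size:
        return "accept-generic"
    return "drop"
```

A message accepted under the generic rules is the point at which the recipient's agent could add the sender to the address book and send back a Specific Permission record.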
3.2.4: Transport Records

Finally, the transport record indicates how to actually get mail to the recipient. This may be an SMTP mailbox which accepts PETmail-encapsulated messages, or a new kind of transport (described in section 8, below). It may also point to a Transport Server, which can provide a new record for each message (to support anonymous remailer single-use blocks). The transport server may require hashcash or a CAPTCHA ticket before giving out those transport records.

([TODO]: perhaps allow multiple transport records, use MX priorities to decide which to use?)

By making the transport information a separate component (which can be updated at any time), the user can rent mailbox space from arbitrary providers. Changing Transport providers does not require changing Names.

3.2.5: Retrieving the Identity

[TODO]: this section needs more thought.

To get from an Identity ID (as provided by the Name Server) to an Identity Record, the agent uses one of two methods. The first is a straightforward LDAP-style lookup which is requested from the same machine that provided the Name service. Organizations willing to provide name->ID mapping are also likely to provide ID lookup. Such hosts must also offer a protocol for the recipient's agent to update the Identity record (signed by the private key, of course).

The second method is a distributed hash table, such as the ones provided by Mojo Nation, freenet, etc. The Name server is likely to participate in this network, as well as the recipient's agent. Even the sender's agent may be a node. The distributed hash table simply maps ID number to an Identity record.

The Identity record can include hints to suggest where later updates should be retrieved.
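The two-step lookup (Address to ID Number, then ID Number to Identity Record) can be sketched with in-memory dictionaries standing in for the Identity Server and the record store. Everything here is hypothetical: the function names, the dictionary layout, and SHA-256 as the hash. The one property the sketch does demonstrate is that the sender's agent can check a retrieved record against the ID Number without trusting whoever served it.

```python
import hashlib


def split_address(address: str):
    """Split a user@host Address into its user and host portions."""
    user, _, host = address.partition("@")
    if not user or not host:
        raise ValueError(f"not a user@host address: {address!r}")
    return user, host


def resolve(address, identity_servers, record_store):
    """Map an Address to an Identity Record and check it against its ID Number.

    identity_servers: {host: {user: id_number}}  (stand-in for Identity Servers)
    record_store:     {id_number: record dict}   (stand-in for LDAP-style lookup or a DHT)
    """
    user, host = split_address(address)
    id_number = identity_servers[host][user]
    record = record_store[id_number]
    # The ID Number is a hash of the public key, so a substituted record is detectable.
    if hashlib.sha256(record["public_key"]).hexdigest() != id_number:
        raise ValueError("record does not match Identity Number")
    return record
```

Several Addresses can map to the same ID Number, which mirrors the many-to-one relationship between Addresses and Identities.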
3.3: Address Component Lookup

[TODO: image of how pieces map to other pieces]

  names all map (through name server) to an id (hash of public key)

  id maps (through search engines, mnet-style network, well-known servers) to key record (public key, name information, ticket vendors, routing info?)

4: Messages

Messages in PETmail are broken into two classes: control and data. Control messages are sent from one MUA agent program to another, and are not directly delivered to the user (some of their contents may be displayed, but the majority of the message is machine-parseable and will be consumed by the agent program). Data messages convey human-to-human communications.

All messages are signed by the sender's private key and encrypted with the recipient's public key. The data encapsulation depends upon the transport protocol being used (as determined by the Transport Record part of the Identity Record), but the most typical is expected to be normal SMTP, with a MIME Content-Type of application/x-petmail, in which the MIME body is a signed/encrypted encapsulation of the original email message.

Actually the body will be two separate pieces, each signed and encrypted identically. The first will contain the header information (including a hash of the second part), while the second will contain the message body. This allows the recipient to decrypt the headers and verify permission before doing a potentially long decryption of the actual message body.

5: Permission

Permission is expressed in a record called a Permission Slip which simply states "sender S has permission to send email to recipient R". This is signed by the recipient's private key and simply names the sender (by public key id) who is thus granted permission to send.
It also contains a list of parameters which govern the permission being granted:

  maximum message rate
  maximum message size
  a list of per-message requirements (hashcash, captcha ticket from a trusted server, etc)
  ability to extend sending privileges to third parties
  HTML preference (or list of acceptable MIME content-type values)

Permission is enforced by the recipient's MUA, which has a Permission Table that is simply a list of all the Permission Slips which have been granted. Each message is decrypted, the signature verified, and the signing key's ID looked up in the local Permission Table. That table is checked for attributes like maximum message rate, and this policy is enforced. If the message is accepted, it is conveyed to the user. If not, it is discarded (possibly with an error message to the sender's MUA indicating the reason for refusal). Rejection indicates either a disagreement between the sender and the recipient (different attributes in the Permission Slip, for example), or a sender who is willfully ignoring their lack of permission. The recipient is not directly exposed to messages which fail the Permission test (although they would probably be noted in a log file).

A copy of the Permission Slip is also held by the sender named by the slip. The sending MUA uses this to determine whether an outbound message will be accepted (as well as letting it check the other parameters which might result in a message being dropped). The only time messages will arrive at a recipient without valid Permission is if they are sent by rogue clients (spammers), and they will all be simply dropped.

The permission slip names the sender by public key fingerprint (a secure hash of the public key), not by Name or Routing Information. This allows those two fields to change. The public key can change too, but this must be implemented by sending a signed message to everyone in the Permission Table, telling them to update their permission slips.
This process can be completely automated by the MUA.

6: Granting Permission

6.1: Granting It To A Recipient

Assume that you ("A") already have permission to send mail to a recipient "B". If you send them a message, it would be polite to give them the ability to respond immediately. When sending the first message, your MUA can generate a Permission Slip for them and include it along with your data message.

If you have Permission to send them a message, you must have their Permission Slip in your address book, so your agent will already know their public key. Your MUA can create a Permission Slip which grants user "B" permission to send mail to user "A", sign it with your private key, and encapsulate it as a control-message attachment to the regular data message.

When B receives the mail, B's agent will see the Permission Slip and add it to her address book. When B responds to your message, her MUA will use that Permission to do so.

Your MUA agent records brief information about the message which caused this grant of permission (timestamp, subject line, perhaps a brief summary of the content), so you can go back later and find out how they acquired this permission.

No extra effort is required on B's part to respond to your message.

6.2: Granting It To A Friend

Suppose that you ("A") have permission to send mail to both "B" and "C". You want to tell "B" to talk to "C" about something. You send a message to B and include C's email address.

Your MUA notices the address embedded in your message to B. It looks at the Permission Slip you have for C and sees that the "may extend sending privilege to third parties" flag is set. This indicates that C is willing to give permission to new senders based upon a request by an existing sender. Your MUA then generates a permission request that asks to give B the ability to send messages to C, and sends this request to C.
C's MUA receives it, verifies (through the signature) that it came from a valid sender ("A"), checks that the 3rd-party-permission flag is set on that sender's record, and then creates B's Permission Slip. It sends the slip directly to B.

C's agent records the identity of the requesting sender when it creates these automatically-granted privileges. This gives C the ability to find out how any given sender acquired their permission, as later they may want to reevaluate their decision to set the 3rd-party-permission flag on the person who granted that permission. Correspondents who are fond of submitting your address to mailing lists or other "opt-them-in" schemes would probably have this privilege revoked, while still maintaining their ability to send their own messages to you. All the addresses on C's inbound permission list show their pedigree, starting with either a Permission Server's request or an outbound Permission Slip. This ability could be turned on and off by changing a flag in the local Slip (and sending an update to the newly upgraded/downgraded sender).

B will then receive two messages: the data message from A (perhaps with an attached note that says permission was requested on B's behalf), and the B->C Permission Slip from C. The permission request/grant process happens automatically, so the Permission Slip should arrive well before B ever sends a message to C.

No extra effort is required on B's part to send a message to your correspondent C.

6.3: Granting It To A Stranger

The third category of new correspondents are those that are not already in your address book and are not introduced by someone who is. These new senders start with nothing more than your Identifier, found on a mailing list, inside a file, or printed on a business card. This is also the category which all spammers will fall into, so it will have the highest barrier to entry.

You ("S") meet someone ("R") at a party, and exchange Identifiers.
Then you go home and want to establish a connection with them so you can exchange messages:

  You type their R.Identifier into your local $NEW MUA.

  It uses the $NEW-MX records to find a Permission Server for them. From that server, it retrieves the public key and the routing information.

  It tells the server that "S" wishes to receive a Permission Slip from R. It sends your Address components.

  The server presents a CAPTCHA challenge, which you must pass (an image of letters which must be recognized, or a sound-to-text conversion). This test exists to prevent machines from mass-requesting permission. It could also be a simple hashcash submission, with a value high enough to make it prohibitive for spammers to efficiently request a lot of permission at once. The challenge need only be passed once for any sender, so several minutes of computer time would not be an unreasonable requirement.

  It then asks you for an "introduction message". You type in "Hey, this is Bob, we met at Fred's party last night". This is your chance to convince the recipient that you are worth talking to, and not some random (low-volume) spammer.

  The server creates and signs a message for R that says "S would like permission to send mail to you". It includes S's address, and the introduction. It uses the routing information to send this message to R, marked as a Permission Request.

That's the end of the request phase.

  At some point R gets the message. R's $NEW-MUA sees that the message is a Permission Request, and makes sure the signer is on a list of Permission Servers that it uses. (This provides protection against rogue permission servers.)

  The MUA displays the request, along with the introduction, to the user R in a special folder/category of "permission requests". R just clicks "yes" or "no".

  If yes, the MUA creates the Permission Slip with a set of default attributes, and adds it to the local Permission Table. It sends a copy to S.

  Eventually S's MUA receives the Permission Slip message.
S keeps R's permission slip and address components in its Addressbook list. Now when S wants to send a message to R: S's MUA composes the message. The message is encrypted and signed. The message is sent using the Routing Information. The routing information is retrieved from the Permission Server just as DNS records are, so it can be cached for limited periods of time. When R's MUA receives the message, it decrypts it, and checks to see if the signing keyid is on the valid permission list. If not, the message is silently dropped. It enforces the other attributes (checks to see whether too many messages are being sent too quickly, or are too large). If everything checks out, the message is displayed to the user. Strangers are required to prove that they are human to a server, and then are allowed to send a rate-limited introduction to the potential recipient. 7: Ticket Vendors The Identifier Record (and public key) stick with the recipient. The set of ticket vendors that provide services for that recipient can come and go. A separate protocol is used between the recipient's PETmail MUA and the Ticket Vendor: it has messages like "please subscribe me" and "please remove me". The recipient should also be able to set the bar for how easy/difficult it should be to make those requests: CAPTCHA severity, hashcash expense, IQ test, etc :). ISPs are expected to provide free permission servers for their customers, and 3rd party permission servers should exist too. Existing nymservers should also become permission servers. This MUA-vendor protocol may just be a web page that the recipient uses to add their information, which would probably be easier and more flexible than trying to establish a complete machine-machine protocol. The Ticket Vendor records may state that the prospective sender must either submit M bits of hashcash or a human must pass a CAPTCHA challenge.
The sender's agent knows how fast their computer is, so it can look at the hashcash requested and estimate how many CPU seconds that will consume. The agent should be configured with a threshold of CPU time above which the user is willing to use their own time to complete the request. The issue here is in comparing human time against CPU time. Human time is precious, computer time less so. [TODO: explore this further. The hashcash must be expensive enough to prevent a spam attack, assuming that spammers are able to dedicate more computer resources than most individual senders]. 8: Routing Information The routing information specifies a transport. I can think of two at the moment: SMTP and mixminion. The routing information is retrieved from the Permission Server in a way similar to how DNS records are retrieved, so it can be cached for limited periods of time determined by the owner of the record. This makes it possible to change the routing information, easing changes of email provider. 8.1: SMTP The PETmail messages are framed somehow, signed, encrypted, and then sent as a normal MIME email message (with, perhaps, a content-type of "application/x-petmail") to the agent specified by an rfc822 email address in the "routing information". To handle transport errors, VERP is used (with a nonce suffix), keeping information about the delivery attempt for a reasonable period (10 days?). If a bounce comes back, the envelope dest address is used to determine which PETmail-level message was involved, and the message is considered to have failed. Any PETmail-level responses are made using the Routing Information obtained for the sender's address, so they may use an unrelated SMTP-level address (or a different transport protocol altogether). 8.2: Mixminion Mixminion is a type-3 remailer network [5]. The Permission Server returns Single Use Reply Blocks (SURBs) on request. The sender is then responsible for implementing the mixminion protocol.
The SURBs are stocked by the recipient, possibly triggered automatically by "I'm running low" messages sent by the PS. To prevent bad guys from mounting a DoS attack by depleting the SURB stock, the PS can be configured to require something (CAPTCHA or hashcash) before giving up a SURB. To provide sender anonymity, both the SURB request (really a routing information request) and whatever kind of anti-DoS requirement must be provided in a non-realtime transaction. For CAPTCHA, that means sending one message with the image (tarball of html objects?) and including a signature of the message that must be included with the response (you have to avoid fake [pre-generated] queries and replay attacks). For hashcash it might mean two round trips: request, response (includes hash target), hashcash, SURB. Also, it must be decided whether these control messages are subject to the same Permission rules as regular messages, and if not, how to prevent DoS attacks using them. 8.3: Other possibilities djb's QMQP? direct TCP? Jabber? 9: Design Requirements We need several protocols and data structures to implement PETmail. This list is from the bottom up.
  Message format, for both control and data messages
  ID Record
  Permission Record
  Transport Record
  Ticket
  various Transport encapsulations and protocols
  Address Server: lookup, publish, subscribe, unsubscribe
  Identity Server: lookup, publish, subscribe, unsubscribe
  Transport Server: lookup, publish, subscribe, unsubscribe
  Ticket Vendor: issue, subscribe, unsubscribe
Requirements and likely design choices are listed below. In all cases it is best to avoid creating new protocols when existing ones are sufficient. 9.1: Message Format All messages consist of a signed and encrypted payload. The payload is a serialized tree structure with named nodes, each with named attributes which are either strings, numbers, or other nodes. Control messages and data messages are contained in the same payload, with different top-level node types.
Data messages (actual human-targeted email) are carried as "rfc822" nodes, with sub-nodes that carry Tickets. ID Record updates are stored in 'idRecord' nodes, and Permission Slips go in 'permission' nodes. Other plugins are allowed to insert additional top-level nodes, and the recipient will silently ignore any for which it does not have a handler. The serialization protocol might be simplified by requiring that the tree structure be limited to simple ASCII strings. To accomplish this, at a minimum, the nodes must be allowed to point to one of several external sections (which will immediately follow the serialized tree structure). This would allow bulk binary data (such as an rfc822 email message) to be carried without requiring cumbersome escape sequences in the serialized tree structure. This may not be sufficient. It would be nice if the nodes could contain UTF-8 encoded strings (particularly for the Identifying Information portion of the ID Record). Note that XML may or may not be a useful serialization mechanism. It may have problems with the UTF-8 strings, and character escapes (to remove the ">" characters from string attributes) must be dealt with. 9.2: Encryption The plan is to simply use OpenPGP (hopefully by spawning off GPG subprocesses). The PETmail agent should use its own keyring. The necessary operations are: create keypair, insert public key, extract public key, encrypt+sign, verify, decrypt+verify. 9.3: ID Record These will be serialized with the same tree structure as used for the message bodies. The data fields put into the serialized form are:
  ID number
  public key
  names (multiline string of Identifying Information)
  addresses (list of strings)
  generic Permission
  transport record
  timestamp
  expiration date
It may be useful to put a "source" field in the ID Record that indicates where an updated version can probably be found. This would reduce the need for Addresses a bit. 9.4: Permission Record Same serialization code as elsewhere.
Generic and Specific permissions use the same record; they differ only in where the record will be found (generic permissions are found on ID Records, specific permissions are found in Permission Slips and attached to address book entries). All the permissions described here are actually 'rfc822' permissions, leaving room for other kinds. Extensions announce their presence to correspondents by granting permissions of other types: the new "foobar" plugin in the sender's agent knows it is allowed to send new feature nodes (named "foobar" instead of "rfc822" or other pre-defined top-level message nodes) to a recipient by virtue of the "foobar-permission" block it was given along with the rfc822 specific permissions. The rfc822 permissions include the following fields:
  minimum inter-message-gap (implements rate control)
  maximum message size
  acceptable MIME Content-Types
  list of Ticket Vendor URLs
  "can grant permission to others" flag
9.5: Transport Record Same serialization as elsewhere. Data contents must indicate a method to get messages to the recipient. Probably a name and some arguments, but might include more data if necessary for remailer hops (could include a SURB or secondary encryption key to use).
  SMTP-encapsulation: email address to deliver encapsulated body
  TCP: hostname and portnumber
  mixminion remailer chain: response block
  Usenet (alt.anonymous.messages): newsgroup name, subject line
  transport server: provides more transport records
  others: jabber thingy, IRC gatherer bot, HTTP POST cgi
9.6: Ticket The ticket is just a signed message from the Ticket Vendor indicating that the requirements for this kind of ticket have been met. The Vendor is indicated by a URI which must be included in the signed region of the ticket, as must be the recipient's ID Number. The tickets must be given unique numbers or ticket-IDs, and timestamps, so duplicates can be rejected.
Timestamps greater than one day in the future or five days in the past will simply be dropped, to minimize the storage requirements needed to implement duplicate rejection. The protocol used by the potential sender to obtain a ticket is unclear. It needs to be extensible (to handle new kinds of CAPTCHA challenges), but must be universally accessible (no one-platform binary plugins). One possible scheme involves a local web browser. The agent listens on a local HTTP port. The protocol would specify a way to construct a URL that includes all the information that must be placed inside the ticket (like the recipient's ID number), as well as a local URL to which the ticket should be delivered when the process is complete. Assuming the Ticket Vendor's URL was something like "http://vendor.com/type3", the browser would be sent to a URL like: http://vendor.com/type3?recip=1abc33d9&output=http://localhost:9999/ticket?num=1234 The vendor would return HTML that displays the challenge and presents a form box for the response. Upon receiving the proper response, the vendor would return HTML with a small form and a button that says "press here to retrieve the ticket". The button would POST the form to the 'output' URL, with the ticket contents as base64-encoded text in a 'hidden' input field. This scheme is a gnarly kludge but it would work. We treat it as an existence proof of a workable ticket-granting scheme and hope that something more useable will become obvious later on. The ticket vendor might require hashcash instead of a CAPTCHA challenge. In this case, the vendor provides the prefix string and the target hash bits, and expects a response with the complete hashable string. The hashcash is verified by the vendor, which then issues a ticket. 
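The hashcash variant of the ticket exchange can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SHA-1 is the hash named in note [3], but the counter-suffix encoding, the choice of "all zero bits" as the match target, the function names (leading_zero_bits, verify, mint) and the example prefix are invented here and are not part of any PETmail specification.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the front of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # For a non-zero byte, bit_length() locates its first 1 bit.
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(prefix: str, target_bits: int, answer: str) -> bool:
    """Vendor side: a single hash is enough to check the sender's work."""
    if not answer.startswith(prefix):
        return False
    digest = hashlib.sha1(answer.encode("utf-8")).digest()
    return leading_zero_bits(digest) >= target_bits


def mint(prefix: str, target_bits: int) -> str:
    """Sender side: brute-force search over suffixes (hypothetical format)."""
    for counter in count():
        answer = f"{prefix}:{counter:x}"
        if verify(prefix, target_bits, answer):
            return answer


if __name__ == "__main__":
    # Assumed prefix: recipient ID and request number from the example URL.
    prefix = "1abc33d9:1234"
    answer = mint(prefix, 12)
    print(answer, verify(prefix, 12, answer))
```

The vendor would hand out the prefix and target_bits, the sender's agent would run mint, and the vendor would call verify before signing a ticket. Binding the prefix to the recipient's ID number (as assumed above) keeps one piece of work from being reused for other recipients.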
9.7: Protocols It would be nice if the various protocols necessary to implement PETmail were usable by detached nodes, defined as agents which are separated from the servers they want to reach by a queue that delays messages by some interval longer than humans are willing to wait. This also includes nodes behind a remailer chain. It should be possible for pseudonymous users to take advantage of all PETmail features despite never having a realtime TCP connection to the internet. One way this might be possible is to declare protocol queries to be PETmail-encoded messages (signed and encrypted headers of a tree structure, as described above). The queries are sent over a Transport, which could either be a "realtime" transport (like a direct TCP port) or a "queued" one (like SMTP encapsulation). The message would include a response Transport Record to tell the recipient how to get the response back to the sender. As an optimization, realtime transports could take advantage of a special "reply" Transport type, which simply means the recipient should send the response back over the same connection. To use this, recipients should hold such connections open for a little while after the message has been received, to see if a response will be generated. Messages should allow multiple Transport Records, so that nodes which cannot use all types (offline nodes could not use "TCP" transports, for example) still have something to fall back upon. This scheme would allow Address lookups and Ticket requests to be performed by nodes which are mostly offline, at the expense of requiring multiple round trips per message. 9.8: Address Server Protocols The means by which the client agent turns an Address into an ID Record must be specified. It should be easy for Address Services to implement and easy for large sites to glue to a database. HTTP with a well-known scheme for constructing the URL is a possibility. LDAP and DNS records are others.
An out-of-band protocol will be used to subscribe and unsubscribe (you would have to pay your ISP to get an address, go through an IT department for a work-related address, etc). No matter how this protocol works, it should wind up giving the agent a Transport Record where updates can be sent, and should accept a public key from the agent. The transport record is called the "publisher". After subscription, the agent can send the completed Identity Record to the transport record, which will accept anything signed by the appropriate public key. Each time the ID Record changes or needs to be refreshed, the agent delivers a new copy to the publisher. 9.9: Identity Server Protocols The idea here is to reduce the need for traditional hierarchical (and therefore inevitably political) addresses. If complete ID Records are used everywhere instead of just ID Numbers, and if the records contain an accurate "source" field where updates could be retrieved, then there would probably not be much need for the Identity Servers. If used, these servers need to map ID Numbers to ID Records. The nature of the ID Numbers (hashes of the public key) makes DHTs ideal storage mechanisms. One could even imagine a large-scale p2p network (consisting of all the nodes running PETmail agents) which provided both the ID Record lookup as well as a search service that allowed senders to look for ID Records based upon the Identifying Information. This would act as a large distributed white pages or name directory, and could remove the need for preassigned Addresses altogether. Subscription and Unsubscription protocols are not necessary: records are simply published and updated, and disappear when they expire. 9.10: Transport Protocols Each recipient needs some way of receiving messages. These methods are called "Inlets", and each one is closely tied to a "Transport".
Recipients who have static, globally-reachable IP addresses and can run servers all the time can simply publish a TCP host and port number, and their agent can listen for connections there. Others will probably need to have a queue on some better-connected machine. The "SMTP-encapsulated" transport method uses just such a queue, retrieved by the agent using POP or IMAP. The recipient will need to somehow subscribe to use such a service, using an out-of-band means (buy an account from an ISP, get hired by a company, etc). Once acquired, the recipient must describe the Inlet to their agent (provide the POP host, username, and password) so it can use the Inlet to retrieve messages. They must also describe what the corresponding Transport Record should look like (in this case, the SMTP address which is routed into that POP mailbox). Unsubscription (to shut down the Inlet) is also beyond the scope of PETmail, but the agent must provide a way to remove Inlets from the active list. No update or publish protocol is necessary. 9.11: Transport Server Protocols Transport Servers are servers which issue Transport Records to senders. The anticipated use of these is to dispense remailer "SURBs": Single Use Reply Blocks. To prevent certain kinds of traffic analysis attacks, these blocks can only be used once, and are therefore unsuitable for inclusion in the Identity Record as a transport record. Instead, the Transport Record will be a special type that indicates the Transport Server which should be asked for the real record. The Transport Server can give out a different record for each request. It should also be possible for the Transport Server to require a Ticket before giving out a SURB, to prevent automated DoS attacks intended to exhaust all the recipient's SURBs.
In addition to a retrieval protocol that agents can use to acquire a real Transport Record, there must be subscription and unsubscription protocols to create the stockpile of SURBs, an update protocol to add more SURBs to the pile, and probably a query or notification protocol so the agent can know when the stock is running low. In this aspect, Transport Servers behave a little bit like nymservers. 9.12: Ticket Vendor Protocols It should be possible for the Ticket Vendors to get paid for their service, to provide for something other than a pure advertising-revenue driven service model. This means recipients who accept a given Vendor's tickets must be known to the Vendor. So there must be a subscription protocol of some sort, whereby the recipient's public key is added to the Vendor's list. The Vendor will only dispense tickets that are for recipients on this list. The means by which the ticket is actually retrieved by the sender (including how the CAPTCHA challenge is presented) is thorny, as described above. However, it must be well-specified as part of the PETmail protocol suite to assure interoperability and universal accessibility. 10: Complaints/Issues 10.1: The registration is just like TMDA. Yes, but with differences. Request/response messages are distinct types from normal communication. The confirmation step ("are you human?", which for TMDA is reduced to "will you respond to an email that I send you?") is performed by the permission server, which can do arbitrary additional steps as required by the recipient. Request/responses are always treated differently from regular messages, eliminating the possibility of loops. Confirmation is *always* required, which removes the objection that a TMDA user asks more of their correspondents than they are willing to do themselves. 10.2: What text gets through? Can't someone send spam through the permission requests?
In $NEW, the only arbitrary text that a spammer can send to a recipient is in an introduction message. These are rate-limited by the permission servers, because each permission request is signed by the permission server, and the server requires a CAPTCHA challenge to be passed. So an actual human must be involved in the creation of each such "introduction spam". The fastest way to send a great deal of spam is for the spammer to offer a lot of money to the operators of the permission server and buy their private key. Alternatively, they could break into a permission server and steal the key. This will allow them to send an arbitrary number of introduction messages to anyone who is subscribed to that permission server. A permission server which is compromised will be detectable fairly quickly (the operators can compare their logs against the tickets being received by subscribers: if there are any which were not actually issued by the server, the key has been compromised). One which goes rogue (selling the private key) will be detectable by the large amount of spam being sent using their tickets. In either case, the recipient can remove that server's public key from their list. The next best way to send spam in $NEW is for a spammer to buy the private key from a correspondent to whom you've already granted permission (or, again, break into their computer and steal it). This will let them send an unlimited number of messages to anyone who has given permission to the victim/rogue. Once those recipients start seeing the spam, that correspondent will probably be cut off quickly. If that "sell-out sender" had their 3rd-party-permission flag turned on, they could request (and get) permission for a large number of new senders, all of which could start flooding the recipient with spam. The MUA will have a feature to revoke any permission derived from a particular sender's key. 
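The "revoke any permission derived from a particular sender's key" feature can be pictured as a walk over the recorded pedigrees. The following Python sketch is hypothetical: the PermissionTable class, its method names, and the use of plain strings as key IDs are illustrative assumptions, not part of the $NEW design.

```python
from collections import defaultdict


class PermissionTable:
    """Hypothetical inbound permission list that records pedigrees."""

    def __init__(self):
        # sender keyid -> keyid (or server label) that requested the grant
        self.granted_by = {}
        # requester -> set of keyids whose permission it requested
        self.children = defaultdict(set)

    def grant(self, sender, requested_by):
        """Record a permission together with who asked for it."""
        self.granted_by[sender] = requested_by
        self.children[requested_by].add(sender)

    def is_permitted(self, sender):
        return sender in self.granted_by

    def revoke_derived(self, sender, include_self=True):
        """Revoke every permission that traces back to sender's key.

        Returns the set of revoked keyids. With include_self=False the
        sender keeps their own permission and only the senders they
        introduced (directly or indirectly) are removed.
        """
        revoked = set()
        stack = list(self.children.pop(sender, ()))
        if include_self:
            stack.append(sender)
        while stack:
            keyid = stack.pop()
            if keyid in revoked:
                continue
            revoked.add(keyid)
            parent = self.granted_by.pop(keyid, None)
            if parent is not None:
                self.children[parent].discard(keyid)
            # Follow the pedigree down to anyone this key introduced.
            stack.extend(self.children.pop(keyid, ()))
        return revoked


if __name__ == "__main__":
    table = PermissionTable()
    table.grant("alice", "permission-server")
    table.grant("bob", "alice")
    table.grant("spam1", "bob")
    table.grant("spam2", "spam1")
    print(sorted(table.revoke_derived("bob")))
```

Because each entry stores who requested it, cutting off a sold-out sender removes the whole subtree of senders they introduced while leaving unrelated correspondents untouched.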
The only remaining way to send spam is to hire people to answer CAPTCHA challenges, and send the message as the introduction text of a permission request. This will cost 10-15 seconds of a human's time per message (each of which can only be sent to a single recipient). At 10 seconds each, a million pieces of spam will cost about 2,800 person-hours (roughly 4 months of round-the-clock work). At $5 per hour, this raises the cost of each spam to about 1.4 cents. An article at Wired [6] shows one spammer receiving about $10 per successful order. This leads to a breakeven point of 1/720 (they must make at least one sale out of every 720 messages sent to pay their human operator $5 per hour), far greater than the current 1/100000 to 1/1000000 rate most spammers manage. 10.3: How to interface with existing MUAs? The permission process must involve a human, so the $NEW-MUA looks more like a web browser. A program can be developed which functions as a local proxy for the existing MUA. It can speak pop/imap/smtp to a local MUA (where envelope-sender is then indexed into the permission list), so existing MUAs should be able to work with it just fine. The $NEW-MUA must be a GUI, a CAPTCHA displayer, speak $NEW with other hosts, and speak pop/imap/smtp to the local MUA. Twisted to the rescue! 10.4: Mailing Lists? The mailing list is a sender, and has a public key. The recipient has to give a Permission to the list when they subscribe. All messages coming through the list are signed by the list manager software before transmission. The list can perform whatever inlet filtering it desires. Perhaps posters must acquire a Permission with the list, in which case the human list manager gets to decide whether or not the poster should be allowed in. It is also possible that the mailing list software will simply accept messages from anyone (a Generic Permission record with no requirements listed). 10.5: Automated tools? Cron jobs and the like must send mail from well-known addresses, which must be added explicitly to a whitelist.
Given adequate protection against spoofing at the MTA level, it would be reasonable to whitelist all the internal addresses at an organization. 10.6: Anonymous Senders? Remailers? If the recipient requires Tickets in their Generic Permissions (necessary to avoid being open to unlimited spam), then each anonymous message will require a ticket. Pseudonymous senders can acquire Permission just like known senders, as long as there is a return path along which the permission slip can be sent. The sender is not required to reveal anything beyond their ID Record (which is entirely under their control, they could leave all the name fields blank) and the return Transport (which could be through a remailer response block). Transport Servers could dispense a supply of type-3 network SURBs, allowing recipients behind remailer chains to be just as accessible as non-anonymous recipients. 10.7: Blind People? http://yro.slashdot.org/yro/03/07/02/1940226.shtml?tid=111&tid=126 A CNET article points out that current CAPTCHA schemes are not accessible by the blind, and proposed audio schemes aren't really workable. Comments point out workarounds. Hashcash is a good one, but the size must be carefully chosen to be expensive for spammers to throw computer power at, but feasible for valid senders to do occasionally. Worst-case fallback is for a Ticket Vendor to have a phone number or chat room where they employ humans to administer Turing tests. These vendors would probably expect real money for this service. 10.8: Control Messages? PETmail messages basically form an MUA-to-MUA protocol. Some of these messages contain blocks that are shown to humans (the 'rfc822' email message node), and these blocks are rate-limited and filtered to enforce the recipient's choices. The others (permission slips, ID Record updates, etc) are consumed by the PETmail agent.
Permission Slips can be updated (to upgrade/downgrade capabilities, or change attributes of the Identity) by simply sending out messages to everybody in the address book. These updates are accepted silently by the receiving MUA. 10.9: DoS attacks If incoming Permission Slips are automatically added to the database, a bad guy could send lots of them, and make that database very large or slow. It may be reasonable to reject permission slips that aren't accompanied by accepted mail. 10.10: Virus attacks Windows boxes might be vulnerable to a virus modifying the permission list, changing the MUA to give out permission to anybody who asked, or (worst case) broadcasting the user's private key. It would be worse if the permission pedigrees were altered, making all the senders look the same (both the new spammers and the pre-existing real ones), and so making it harder to auto-revoke the bogus permissions. 10.11: Address migration, just how bad is this relative to an open network; how much does this stifle new communication?: Take a look at your address book and think about how you acquired the addresses therein. I think they can be broken into a few distinct categories:
  senders of inbound messages: people who sent you a message, now you can reply to them
  addresses sent by a mutual friend / third party: entries which were copied out of somebody else's address book to your own
  addresses found from web pages, mailing list postings, README files, source code, etc: addresses which were posted by their owners to solicit mail from people they don't yet know about. They probably expect mail about the project or page which contained their address
In PETmail, inbound messages automatically grant permission to reply (by default). People with permission to send you mail can ask that their correspondents also get that permission, and the PETmail agent will automatically (by default) grant that permission.
The only case where users must go through hoops beyond the usual technical issues of using an MUA is the last case, where there is no pre-existing relationship between sender and recipient, not even a mutual friend. In this situation, I think it is reasonable to expect the sender to put slightly more effort into requesting permission to transmit. The expectation is this: if you, as a sender, want to use some of the recipient's time, you must be willing to spend some of your own. There are precedents in (pre-computer) human communication: think of letters of introduction that an individual might carry, from one organization to another, which help to add that person into the new group. Also think of calling cards, the original paper kind that a butler would deliver, and secretaries who mediate all attempts to communicate with their boss. 10.12: Disaster Recovery If the recipient loses their private key, their account is worthless, and they will have to start again from scratch. The ID Records contained in their correspondents' address books will eventually expire. If the recipient loses their Address Book (along with the specific permissions they granted to their correspondents), it can be regenerated if the permitted senders each send a copy of their Permission. The slips can be verified (since they are all signed). (Note: this requires that the protocol be designed such that these slips are signed separately from the rest of the message). What is missing is a way to find out who the senders are. The archived email can provide a partial list. It may be useful for the MUA to respond to inbound mail that is signed by a key that is not in the address book by asking for a copy of the Permission Slip (perhaps only when the table is known to be damaged). The MUA should copy the private key and Permission Table into a simple, single, easy to back-up file, and make it straightforward for the user to copy this file to safety.
10.13: Web mail Web-based mail systems mean the MUA can't actually run directly on the end user's computer. In this case, the agent must be run on the mail provider's system, and a web page used as the owner's interface to the agent. The down side is that the user is completely dependent upon their mail provider for security. If their private key lives on the server, the administrator of that server can sell out to whomever they want. One advantage, however, is that the user's email is then available from many locations, something made difficult by having a local private key. 10.14: End-to-end Acknowledgment Rather than using SMTP bounce messages (sent by gateways or end systems) to indicate a possible failure, I think PETmail should use positive receipt acknowledgments (sent only by the end system) to indicate success. The message will be held by the sending system for a period of time (perhaps 5 days) and removed when the receipt is received. The motivation for this: spurious bounce messages (fake bounces which are not associated with an actual outbound message) must be dropped without displaying them to the user, otherwise they can be used as a spam channel. Real bounces can be generated by too many kinds of systems to be accurately correlated with outbound messages (although using VERP envelope sender addresses might help) or machine interpreted as meaning "temporary failure" or "permanent failure". The fact that you don't know who is going to send a bounce means that there is no way to authenticate the bounce sender, which creates a spam channel (albeit a slightly difficult one to exploit) in which the spammer finds one message that you've sent and then sends spam which looks like bounces of that message. Finally, the default reaction to an incoming message that does not have the proper Permissions is to silently drop it (although this could be changed). 
If the sender and the receiver have mismatched ideas about what those permissions are, the sender will not directly know that they are being treated as a spammer. The ack lets them at least know that their message is not being accepted (although it won't let them distinguish between an intermediate network problem and a permission problem at the recipient). Comments: please let me know what you think! -Brian, warner-newsmtp @ lothar.com http://www.lothar.com/tech/spam/index.html [1]: http://bothan.net/~bill/ [2]: http://www.captcha.net/ CAPTCHA challenges are computer-administrated Turing tests. A problem is presented which is easy for a human to solve but infeasibly difficult for a computer to solve. The most common type in current use is a character recognition problem, where an image of distorted characters is presented and the human is asked to identify the letters and type them into a text box. An alternative one uses a speech-recognition problem: it plays an audio segment with computer-synthesized voice speaking some letters and requests that the human type in the letters that they hear. These challenges do not have to be administered by a computer, but hiring a human to assert the humanity of potential senders is considerably more expensive. It may be necessary as a fallback, however, both to provide for senders who cannot see or hear, and in case advances in OCR or AI render the usual tests vulnerable to the spammer's bots. [3]: hashcash Hashcash uses a problem which is time-consuming (but not impossible) to solve, for which the solution is easy and quick to verify: obtaining a partial collision with a hash function. The server gives a prefix string and a match string to the client. The client is required to return an answer which satisfies two properties: it starts with the prefix string, and the hash of the answer starts with the match string. 
Since the hash function used (SHA-1) is cryptographically strong, the only way to find a collision is to perform a brute-force search of suffixes until one happens to hash to an appropriate value. By choosing the length of the match string (n bits), the server can require the client to perform an average of about 2**n hash operations, since each attempt succeeds with probability 2**-n, but can verify the result with just one.

[4]: http://zooko.com/distnames.html
     http://www.erights.org/elib/capability/pnml.html

These papers describe three kinds of names and three properties: each kind has exactly two of the properties. The "decentralized/global" property holds if a name is globally unique, so it can be given to anyone and used as an unambiguous pointer. The "secure/non-political" property holds if no third party must be trusted or relied upon to provide the name. Finally, the "human memorizable" property requires the name to be short and made of vaguely meaningful components.

The first type of name is a "Key", which is global and non-political but not human memorizable. Keys are usually randomly generated, or hashes of other personal data. The second is a "Nickname", which is basically equivalent to a current email address: global and memorizable but not free of political influence, because you have to get it from someone who owns the domain, and they can take it away from you. The third type is a "Pet Name", which is non-political and memorizable but not globally unique. Pet Names are only really meaningful in the context of the person who assigned them.

In PETmail, ID numbers are "Keys" (both globally unique and non-political), and if a Distributed Hash Table or other scalable peer-to-peer lookup method were used to store the ID records, the ID number alone would be sufficient to retrieve the record and deliver the message. The PETmail Addresses are "Nicknames" (unique and memorizable), just like traditional SMTP email addresses. The PETmail user agent uses Pet Names to refer to Identities in the local addressbook.
These are non-political and memorizable, but are most useful to the owner of that addressbook. They could be somewhat useful to others, however: the PNML paper referenced above describes a scheme to chain these pet names in a way that could let your neighbor refer to one of your correspondents (presuming you'd sent them the identity record). You could wind up with something like "my friend bob's boss's sister's dentist" in your address book, and once you'd established a different connection with them, you could assign a different name ("Dr. Smith").

[5]: http://www.mixminion.net/

The Mixminion remailer network uses messages which have been sliced into identical-sized encrypted fragments. The pieces are sent over a variety of paths in ways that prevent correlation by adversaries monitoring the links between intermediate remailer nodes. When the fragments converge on the output node, they are reassembled into the final message. This technique prevents both observers and the recipient from determining anything about where the message came from. The only information the recipient can obtain is what the sender chooses to reveal in their message (which is presumably encrypted to the recipient).

[6]: http://www.wired.com/news/business/0,1367,59907,00.html