DirectConnect TNG

A call for a new protocol

Draft, version 0.3 July 25th, 2003

By Jan Vidar Krey <janvidar at extatic dot org>

1.0 Introduction

1.1 Background

I have been developing QuickDC for over a year, writing it from scratch with the help of a packet sniffer and by reading through other implementations from projects like javaDC, dctc and DC++.
During this period other developers and I have found and discussed numerous problems in the current Direct Connect protocol. We have suggested and implemented many extensions to the protocol, like the "Extendedprotocol" (implemented first in DC++), "Quicklist" and many more.

These are all working solutions, but continuing with this strategy will ultimately make the system unnecessarily complex and eventually break compatibility with the Neo-Modus clients.

1.2 Status of this document

This document does not specify an implemented protocol. It is only a proposed outline for a modern, future peer-to-peer network.
There are many issues that need to be worked out.

Please discuss this with me and others if you have suggestions or questions.

1.3 The dark side of Direct Connect

Here is a short list of the problems encountered in the current Direct Connect protocol (in no particular order).

The protocol doesn't handle error conditions (if something goes wrong, just ignore it; that's the implementation).
Implementations don't handle different states of the protocol; command scopes and order are undefined in most cases and implemented differently between projects. *
The whole network relies on a simple nickname. This is problematic when somebody has taken your name, when you use the same nickname from different computers, or when someone uses a different nickname on different hubs (so you are unable to download).
Searches are difficult to track (which search result belongs to which search request). This makes automated searches difficult.
No implementation of hashes to uniquely identify files.
The default ports are 411 and 412 which are privileged ports on Unix and commonly blocked by firewalls. *
The protocol doesn't support multisource downloads (correlated with file hashing)
The protocol doesn't natively support secure communication (plain text passwords etc)
The protocol doesn't support IPv6
Hubs aren't uniquely identified.
The Client-Client protocol is unnecessarily complex, with many interactions, which causes significant delays before transfers start.
No native support for multiple hubs.
MultiSearch and MultiConnectToMe should be handled transparently by hubs. *
The protocol doesn't escape special characters, like: |, $, <, >, character 0x5 or even spaces in some cases
No Unicode support.
Hubs should probe and disconnect ineffective connections (with large send queues and slow response) *
No way to protect your shares from outsiders (i.e. RIAA scanning your network without being on the same hub as you).

Items marked with a * may be fixed in the implementation without changing the protocol.

1.4 A proposed solution

Instead of adding bloat to the current protocol, which seems to be the trend these days, I propose creating a totally new system based on the DC model (clients and hubs), which should be reasonably easy to implement in current open source DC clients.
In short, this means a brand new protocol, which addresses most of the annoying parts in the current Direct Connect protocol.
Note: not all the flaws in the protocol are addressed in this draft.

1.5 Executive summary

In general we remove the nickname dependencies, add hashes, support full Unicode and make a more stateful and secure protocol. For the client-client protocol we are moving towards a (semi-)stateless protocol and adding the option to deny connections from "outsiders".

1.6 Thanks to

Special thanks to Sandos, BlackClaw and Sedulus for constructive feedback on this document.

2.0 Language and standards

2.1 Messaging

First of all, the protocol is text based, like most other protocols (FTP, SMTP, HTTP, etc.). The protocol uses Unicode (UTF-8) for all interactions except binary transfers. All binary data is little-endian (least significant byte first).

Commands are (unlike in the current DC protocol) transmitted as lines separated by the common Internet standard, CRLF (carriage return and linefeed), which makes it easier to debug implementations using a telnet session. Special characters need to be escaped using C-style backslashes.
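The framing rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the draft does not list which characters count as "special", so the escape set used here (backslash, CR and LF) and the function names are hypothetical.

```python
# Assumed escape set: the draft only says "C-style backslashes" and does not
# enumerate the characters that must be escaped.
_ESCAPES = {"\\": "\\\\", "\r": "\\r", "\n": "\\n"}
_UNESCAPES = {"\\": "\\", "r": "\r", "n": "\n"}


def escape(text: str) -> str:
    """Backslash-escape characters that would break line framing."""
    return "".join(_ESCAPES.get(ch, ch) for ch in text)


def unescape(text: str) -> str:
    """Reverse escape(); unknown escape sequences are kept verbatim."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch != "\\":
            out.append(ch)
            continue
        nxt = next(chars, "")
        out.append(_UNESCAPES.get(nxt, "\\" + nxt))
    return "".join(out)


def encode_line(command: str, *args: str) -> bytes:
    """Build one CRLF-terminated, UTF-8 encoded protocol line."""
    fields = [command] + [escape(a) for a in args]
    return (" ".join(fields) + "\r\n").encode("utf-8")


def split_lines(buffer: bytes):
    """Split received bytes into complete lines plus the unfinished tail."""
    *complete, tail = buffer.split(b"\r\n")
    return [line.decode("utf-8") for line in complete], tail
```

A receiver would keep the returned tail and prepend it to the next chunk read from the socket, so that a command split across two reads is still parsed as one line.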

The lock/key challenge response system is gone.

The connecting party (client) will always talk first when connected (this is to make automated scans more difficult).

All error/info strings from special commands should not exceed 50 bytes.

2.2 Address notation

The server (hub) does not listen to a standard port any more.
This is up to the implementation and/or the system administrator. Thus any network node should always be addressed with both their IP and port. IPv4, IPv6 and DNS addressing is allowed.

The addresses should be in this format:

IPv4: "n.n.n.n:port"
IPv6: "[x:x:x:x:x:x:x:x]:port" (follows the RFC 2732 guidelines).
DNS: "fully.qualified.hostname:port" (Note: The DNS name must match your reverse DNS for security reasons).
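As a rough sketch of how a client might parse the three notations, the Python function below splits an address into its kind, host and port. The function name and the returned tuple layout are illustrative assumptions, and the reverse DNS check mentioned in the note is not performed here.

```python
import ipaddress


def parse_address(address: str):
    """Return (kind, host, port) where kind is 'ipv4', 'ipv6' or 'dns'."""
    if address.startswith("["):
        # Bracketed IPv6 literal: "[x:x:x:x:x:x:x:x]:port"
        host, sep, port_text = address[1:].partition("]:")
        if not sep:
            raise ValueError("IPv6 address must look like [addr]:port")
        ipaddress.IPv6Address(host)  # raises ValueError if malformed
        kind = "ipv6"
    else:
        # The port follows the last colon for IPv4 and DNS names
        host, sep, port_text = address.rpartition(":")
        if not sep or not host:
            raise ValueError("address must include a port")
        try:
            ipaddress.IPv4Address(host)
            kind = "ipv4"
        except ValueError:
            kind = "dns"  # anything else is treated as a hostname
    port = int(port_text)
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    return kind, host, port
```

Because the port is mandatory in this notation, an address without one is rejected instead of falling back to a default.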

2.3 New concepts

The first new concept is the GUID, which is short for "Globally Unique ID". Similar functionality exists in the recent Gnutella protocol extensions as well. Each node (client and server) has a unique ID. All references in the protocol should use this ID instead of the nickname. This way we can have a network that doesn't rely on the nickname. You can join multiple hubs with multiple nicknames, and still be recognized as one user.

The GUID is calculated as a SHA-1 (160 bit) hash of the address notation "ip:port" and a timestamp (whether the address is IPv4 or IPv6 doesn't really matter). The GUID is usually represented as a 40-character hexadecimal text string, but that might be too large. The size can be reduced by using a more compact encoding, although the raw hash is 20 bytes, so that is the minimum.
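The GUID calculation could look roughly like the Python sketch below. The draft does not say how the address and the timestamp are combined before hashing, so joining them with a space is an assumption, as are the function names. Base32 is shown as one possible "squeezed" text encoding; it is not prescribed by this document.

```python
import base64
import hashlib
import time
from typing import Optional


def make_guid(address: str, timestamp: Optional[int] = None) -> bytes:
    """Return the raw 20-byte SHA-1 digest of 'ip:port' plus a timestamp."""
    if timestamp is None:
        timestamp = int(time.time())
    # Assumption: address and timestamp are joined by a space, UTF-8 encoded
    material = f"{address} {timestamp}".encode("utf-8")
    return hashlib.sha1(material).digest()


def guid_hex(guid: bytes) -> str:
    """Hexadecimal text form: 40 characters for a 20-byte GUID."""
    return guid.hex()


def guid_base32(guid: bytes) -> str:
    """A more compact text form: 32 characters for a 20-byte GUID."""
    return base64.b32encode(guid).decode("ascii")
```

A 20-byte digest is a multiple of five bytes, so the Base32 form needs no padding characters.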

The GUID is created the first time the client is run, and it is stored and reused for each later session. The user will be identified by this ID across different hubs and names. All login systems (registered users, etc.) will use the GUID to identify a user.

Another new concept is "channels", similar to IRC channels or chat rooms. The reason for this is to ease the bandwidth load for userinfo, since the GUID would otherwise generate a lot of traffic. After connecting to a hub a user may join several channels (limited by the server). The user will only receive userinfo for the users on those channels.

Unlike in IRC, the channels are not created ad hoc, meaning the server admin has to set up the channels. An operator is an operator for all channels.
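To illustrate how channels limit userinfo traffic, here is a small hub-side sketch in Python. The class name, the method names and the per-user limit of five channels are assumptions made for the example; the draft defines no commands for joining channels yet.

```python
class ChannelRegistry:
    """Tracks which GUIDs have joined which admin-defined channels."""

    def __init__(self, channel_names, max_per_user=5):
        # Channels are fixed up front: they are not created ad hoc
        self._members = {name: set() for name in channel_names}
        self._max = max_per_user  # assumed server-side limit

    def join(self, guid, channel):
        if channel not in self._members:
            raise KeyError("unknown channel: " + channel)
        joined = [c for c, m in self._members.items() if guid in m]
        if channel not in joined and len(joined) >= self._max:
            raise PermissionError("channel limit reached")
        self._members[channel].add(guid)

    def userinfo_recipients(self, guid):
        """GUIDs that share at least one channel with the given user."""
        recipients = set()
        for members in self._members.values():
            if guid in members:
                recipients |= members
        recipients.discard(guid)
        return recipients
```

With this structure the hub only needs to send a user's Info to the set returned by `userinfo_recipients`, instead of to every connected client.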

2.4 New server functionality

The "multisearch" and "multiconnecttome" are now gone. This is handled transparently by the server (if it is linked with other servers).

The server has enhanced flow control mechanisms. For example, a hub can send a "Ping" requesting a "Pong" from each client to determine whether the host is responding within a reasonable amount of time. If it isn't, it should be disconnected.
The hub can also disconnect clients that are flooding it with messages, or whose send queue is filling up. The hub may optionally forward search commands to linked hubs, but it may just as well discard them if it is reaching some bandwidth, memory or CPU usage limit.
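The Ping/Pong check could be tracked on the hub roughly as in the Python sketch below. The draft defines neither the syntax of these commands nor a timeout, so the class, its method names and the 30 second default are hypothetical. Time is passed in explicitly to keep the example easy to test.

```python
class PingTracker:
    """Tracks outstanding Ping requests per client GUID."""

    def __init__(self, timeout: float = 30.0):
        self.timeout = timeout  # assumed value; the draft gives none
        self._pending = {}      # guid -> time the oldest unanswered Ping was sent

    def ping_sent(self, guid: str, now: float) -> None:
        # Keep the oldest unanswered Ping so repeated pings don't reset the clock
        self._pending.setdefault(guid, now)

    def pong_received(self, guid: str) -> None:
        self._pending.pop(guid, None)

    def expired(self, now: float):
        """GUIDs whose Pong is overdue and that should be disconnected."""
        return [g for g, sent in self._pending.items()
                if now - sent > self.timeout]
```

A hub loop would call `expired` periodically and disconnect every client it returns.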

3.0 Command reference

3.1 DCTNG
Syntax:	DCTNG version
Parameters:	version a positive integer representing the protocol version supported by the client
Context:	Client to server and server to client, during initial handshake
Errors:	If a problem persists, disconnect.

3.2 HubName
Syntax:	HubName string
Parameters:	string name of the hub. This string should not exceed 50 bytes.
Context:	Server to client, during handshake, or at any time after a successful login (if the name changes).
Errors:	none

3.3 Full
Syntax:	Full
Parameters:	none
Context:	Server to client, during initial handshake only.
Errors:	none

3.4 Redirect
Syntax:	Redirect address
Parameters:	address Standard address notation (DNS, IPv4 or IPv6)
Context:	Server to client at any time.
Errors:	none

3.5 AccessDenied
Syntax:	AccessDenied reason
Parameters:	reason Text string describing the reason why an operation was refused
Context:	Server to client after certain requests or during handshake/login.
Errors:	none

3.6 Login
Syntax:	Login
Parameters:	none
Context:	Server to client to prompt for user data.
Errors:	AccessDenied

3.7 Password
Syntax:	Password nonce
Parameters:	nonce Random data sent by the server as a password request. The password should be hashed with MD5 together with this nonce to prevent replay attacks and password cracking. The same command should be sent back to the server with the hashed password for verification.
Context:	Server to client after the Login request has been answered, and client to server in reply to this request.
Errors:	Password
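The nonce exchange of the Password command can be sketched in Python as below. The draft does not define how the password and the nonce are combined or how the result is encoded, so hashing the password followed by the nonce and sending the digest as hexadecimal text are assumptions, as are the function names.

```python
import hashlib
import hmac
import os


def make_nonce() -> str:
    """Server side: fresh random data for each password request."""
    return os.urandom(16).hex()


def password_response(password: str, nonce: str) -> str:
    """Client side: MD5 over password + nonce, as hexadecimal text."""
    # Assumption: the two values are simply concatenated before hashing
    return hashlib.md5((password + nonce).encode("utf-8")).hexdigest()


def verify(stored_password: str, nonce: str, response: str) -> bool:
    """Server side: recompute the digest and compare in constant time."""
    expected = password_response(stored_password, nonce)
    return hmac.compare_digest(expected.encode("utf-8"),
                               response.encode("utf-8"))
```

Because the nonce changes with every request, a captured response cannot simply be replayed in a later session.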

3.8 Info
Syntax:	Info GUID nick bytes maxu maxd slots hubs mode flags description
Parameters:
	GUID	The globally unique ID
	nick	The nickname to use for this user
	bytes	Number of shared bytes (64-bit integer)
	maxu	Maximum upload speed ever reached by this host, in bytes per second
	maxd	Maximum download speed ever reached by this host, in bytes per second
	slots	The number of possible simultaneous downloads from this host (slots)
	hubs	The number of hubs this host is connected to
	mode	One character mode: 1 = Active mode, 2 = Passive mode, 3 = Proxy mode
	description	Description of host / shared files
Context:	Client to server immediately after login and server to client at any time later.
Errors:	none
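A receiver might split an Info line into its fields along the lines of the Python sketch below. It assumes the CRLF has already been removed, that fields are separated by single spaces, and that the description is the rest of the line. The flags field is not described in this draft, so it is kept as an opaque string. The class and function names are illustrative.

```python
from dataclasses import dataclass

MODES = {"1": "active", "2": "passive", "3": "proxy"}


@dataclass
class Info:
    guid: str
    nick: str
    shared_bytes: int
    max_upload: int
    max_download: int
    slots: int
    hubs: int
    mode: str
    flags: str        # format not defined in the draft; kept verbatim
    description: str


def parse_info(line: str) -> Info:
    """Parse 'Info GUID nick bytes maxu maxd slots hubs mode flags description'."""
    parts = line.split(" ", 10)
    if len(parts) < 10 or parts[0] != "Info":
        raise ValueError("not a complete Info command")
    _, guid, nick, nbytes, maxu, maxd, slots, hubs, mode, flags, *rest = parts
    if mode not in MODES:
        raise ValueError("unknown mode: " + mode)
    return Info(guid, nick, int(nbytes), int(maxu), int(maxd),
                int(slots), int(hubs), MODES[mode], flags,
                rest[0] if rest else "")
```

Limiting the split to ten separators keeps any spaces inside the description intact.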

3.9 Search
Syntax:	Search Destination ID mimelist Mode ModeData Pattern
Parameters:
	Destination	The destination of the search result. It can be one of two things: "Hub:GUID" for passive searches, or standard address notation for active searches (replied to by UDP).
	ID	A 1 to 8 byte alphanumeric ID which clients should use to track results and determine which search request a result is a reply to.
	mimelist	Comma separated list of mime types (RFC 2045, 2046). Currently wildcards are added to the mime types, similar to HTTP encoding types. This means you can search for "*/*", which means everything, or "video/*", which means any video type. The list is limited to a maximum of three mime types.
	Mode	The search mode, one of: any size (default), size at least, size at most, exact size match, HASH.
	ModeData	This field depends on the search mode. It can be the size for search modes 1-3, a hash for search mode 4 (example: SHA1:f2a394e49ac933602cb46897b780425ed73ed6f4) or simply 0 for search mode 1.
	Pattern	The search pattern is the rest of the string, so no characters (except CRLF) need to be escaped.
Context:	To both client and server after a successful login.
Errors:	none
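The wildcard handling of the mimelist field could be implemented roughly like this in Python. Using shell-style matching from the standard `fnmatch` module is a choice made for this sketch, not something the draft prescribes, and the function names are hypothetical.

```python
from fnmatch import fnmatchcase

MAX_MIME_TYPES = 3  # limit stated for the mimelist field


def parse_mimelist(mimelist: str):
    """Split a comma separated mimelist and enforce the three-type limit."""
    patterns = [m.strip().lower() for m in mimelist.split(",") if m.strip()]
    if not 1 <= len(patterns) <= MAX_MIME_TYPES:
        raise ValueError("mimelist must contain one to three mime types")
    return patterns


def mime_matches(patterns, mime_type: str) -> bool:
    """True if the file's mime type matches any pattern in the list."""
    # Note: fnmatch also treats '?' and '[...]' as wildcards, which goes
    # beyond the '*' wildcard described for this field.
    return any(fnmatchcase(mime_type.lower(), p) for p in patterns)
```

A client answering a search would compare each shared file's mime type against the parsed list before looking at the size or hash constraints.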

3.10 SR
Syntax:	SR destination ID data
Parameters:	description not completed

3.11 Connect
Syntax:	Connect GUID address
Parameters:	GUID Globally unique ID for the user you want to connect to address Standard address notation for your client.
Context:	documentation not completed
See also:	ReqConnect

3.12 ReqConnect
Syntax:	ReqConnect GUID GUID
Parameters:	GUID Globally unique ID for the user you want to connect to GUID Your GUID
Context:	In passive mode you can request the other party to set up a connection for you. The other party will respond by using the Connect command, unless it is firewalled (then it would return this command).
See also:	Connect
