Monday 3 December 2012

WebRTC Conference and Expo 2012: Mapping SIP to WebRTC - a misunderstanding?


I was fortunate enough to attend the first WebRTC Conference and Expo in San Francisco last month.  This fascinating conference demonstrated the growing hype and interest around WebRTC.

One thing I observed at the conference was a lot of discussion about "WebRTC obsoleting SIP" and "mapping SIP to WebRTC".  Both topics seemed to indicate that those discussing them misunderstood SIP and WebRTC alike.

SIP and WebRTC are very different (in fact, WebRTC is a set of protocols rather than a single protocol).  SIP relates to the signalling plane and WebRTC to the media plane.  This means that SIP and WebRTC are complementary, not competing.

There was a lot of discussion about WebRTC's ability to establish media sessions without signalling or server infrastructure - this is not possible.  When you use the PeerConnection API in a browser, it gives you a local SDP description and expects you to supply the remote one, so the peers must exchange SDP before the media connection can be established.  How can the peers exchange this SDP without signalling or server infrastructure?
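To make the point concrete, here is a minimal sketch of the offer/answer exchange with the signalling path modelled as an in-memory relay.  Everything in it (SignallingRelay, demoExchange, the peer names and the placeholder SDP strings) is hypothetical and only for illustration.  In a real browser application the descriptions would come from the PeerConnection API (createOffer/createAnswer) and the relay would be a server reached over HTTP or WebSocket.

```typescript
// An SDP offer or answer, treated as an opaque blob by the signalling layer.
type SessionDescription = { type: "offer" | "answer"; sdp: string };
type Handler = (from: string, desc: SessionDescription) => void;

// Hypothetical stand-in for a signalling server: it only relays descriptions
// between registered peers and never touches media.
class SignallingRelay {
  private handlers = new Map<string, Handler>();

  register(peerId: string, onMessage: Handler): void {
    this.handlers.set(peerId, onMessage);
  }

  // Returns false when the target peer is unknown to the relay.
  send(from: string, to: string, desc: SessionDescription): boolean {
    const handler = this.handlers.get(to);
    if (!handler) return false;
    handler(from, desc);
    return true;
  }
}

// Walks through one offer/answer round trip and records who received what.
function demoExchange(): string[] {
  const log: string[] = [];
  const relay = new SignallingRelay();

  relay.register("alice", (from, desc) => {
    // In a browser, alice would now apply the answer as her remote description.
    log.push(`alice got ${desc.type} from ${from}`);
  });

  relay.register("bob", (from, desc) => {
    // In a browser, bob would apply the offer, then create and send an answer.
    log.push(`bob got ${desc.type} from ${from}`);
    relay.send("bob", from, { type: "answer", sdp: "(placeholder answer SDP)" });
  });

  // Alice starts the session; without the relay this offer has nowhere to go.
  relay.send("alice", "bob", { type: "offer", sdp: "(placeholder offer SDP)" });
  return log;
}
```

Take the relay away and neither peer ever sees the other's SDP, which is exactly why some signalling and server infrastructure is needed.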

It is clear that some signalling and server infrastructure is required for WebRTC, and this can take one of three obvious forms:
  • Proprietary HTTP/REST signalling
  • XMPP/Jingle over WebSocket
  • SIP over WebSocket
Each of these has advantages and disadvantages.  HTTP/REST signalling is very simple, but as HTTP is a transaction (not a session) protocol, dialog state must be maintained within the network - this could easily result in resiliency and scaling issues, or complex server infrastructure that replicates dialogs between servers.  XMPP/Jingle is a session protocol, but like HTTP/REST it is lacking in terms of support for common regulatory and privacy features[1].  SIP in many ways is the ideal protocol - the hint is in the name, "Session Initiation Protocol" - as well-established extensions (such as outbound) provide resilience and scaling, while regulatory and privacy issues have long been catered for.
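As a rough illustration of the third option, the sketch below wraps an SDP offer in a SIP INVITE of the kind a browser could send as a WebSocket message.  The helper names (buildInvite, utf8ByteLength), the addresses, tags and the truncated SDP body are all made up for the example, and I am assuming the "WSS" Via transport used by the SIP over WebSocket work.  A real client would normally leave all of this to a SIP library.

```typescript
// SIP's Content-Length counts bytes, not characters, so count the UTF-8 bytes.
// encodeURIComponent emits one %XX escape per non-ASCII or reserved byte.
function utf8ByteLength(text: string): number {
  return encodeURIComponent(text).replace(/%[0-9A-F]{2}/g, "x").length;
}

// Hypothetical parameter bag; every value is supplied by the caller.
interface InviteParams {
  fromUri: string;
  toUri: string;
  contactUri: string;
  viaHost: string;
  branch: string; // must start with the magic cookie "z9hG4bK"
  fromTag: string;
  callId: string;
  sdp: string;
}

// Builds a minimal INVITE whose body is the SDP offer from the browser.
function buildInvite(p: InviteParams): string {
  const headers = [
    `INVITE ${p.toUri} SIP/2.0`,
    `Via: SIP/2.0/WSS ${p.viaHost};branch=${p.branch}`,
    "Max-Forwards: 70",
    `From: <${p.fromUri}>;tag=${p.fromTag}`,
    `To: <${p.toUri}>`,
    `Call-ID: ${p.callId}`,
    "CSeq: 1 INVITE",
    `Contact: <${p.contactUri}>`,
    "Content-Type: application/sdp",
    `Content-Length: ${utf8ByteLength(p.sdp)}`,
  ];
  // Headers are CRLF-separated; a blank line separates them from the body.
  return headers.join("\r\n") + "\r\n\r\n" + p.sdp;
}

// Example values only; the SDP is truncated to its first line for brevity.
const exampleInvite = buildInvite({
  fromUri: "sip:alice@example.com",
  toUri: "sip:bob@example.com",
  contactUri: "sip:alice@client.invalid;transport=ws",
  viaHost: "client.invalid",
  branch: "z9hG4bKexample1",
  fromTag: "a1b2c3",
  callId: "call-1@client.invalid",
  sdp: "v=0\r\n",
});
```

The SDP travels untouched as the message body; SIP only adds the session and routing information around it, which is the sense in which the two are complementary.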

Some people indicated that they believed that the use of SIP with WebRTC would be limited due to its complexity and the unwillingness of web-developers to use it.  However, I am not entirely convinced by this argument.  A web-developer is unlikely to want to implement any of the signalling necessary for their WebRTC application, but they will make use of any good libraries that are available.  I believe that SIP will be extensively used in WebRTC applications as long as a good enough set of libraries is available so that web-developers do not need to deal with it directly.

[1] A lot of web-developers ignore regulatory and privacy requirements simply because the large OTT companies have not had to do anything yet.  "Yet" is the operative word here - regulations and privacy are important and cannot be ignored indefinitely.  Using a protocol like SIP for the signalling future-proofs web apps and services.  With a good SDK, using SIP for the signalling is no harder than using anything else.